
NWCLARK TPF grant report #83

Nicholas Clark
June 10, 2013 13:04
[Hours]		[Activity]
2013/04/02	Tuesday
 0.75		Unicode Names
 2.25		reading/responding to list mail

2013/04/03	Wednesday
 8.00		Unicode Names

2013/04/04	Thursday
 5.25		reading/responding to list mail

2013/04/05	Friday
 1.50		RT #117501 (Open64 compiler)
 6.00		Unicode Names (No \N{} in miniperl)
 0.25		caller with undef package
 0.25		mktables

2013/04/06	Saturday
 0.50		RT #117501 (Open64 compiler)
 2.25		Unicode Names
 0.75		s390

2013/04/07	Sunday
 0.25		Unicode Names
 0.75		s390

Which I calculate is 28.75 hours

There were two things of note that I worked on this week. One was the Unicode
Names code. At Karl's suggestion I changed it to parse the UnicodeData.txt
file properly. Previously it had hardcoded various constants, particularly
related to the CJK ideographs and Hangul syllables. The CJK ranges in
Unicode have increased in the past, and so it's possible that they will
increase again. Not only is it (more) future proof, it also made it simpler
to detect significant gaps in the allocated character ranges, which is
useful for size optimisations. By handling the gaps better I reduced the data
size by 13K, and by using two sizes of arrays for the trie structure, saved
a further 25K.
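
To illustrate the idea (this is not the generator code itself, which is
written in Perl), here is a minimal Python sketch of reading ranges out of
data laid out like UnicodeData.txt and reporting the large unallocated gaps
between them. The file marks big blocks such as the CJK ideographs with a
pair of "<..., First>" and "<..., Last>" entries instead of listing every
code point. The names parse_ranges and find_gaps, the embedded SAMPLE lines,
and the gap threshold of 256 are all assumptions made for the example.

```python
# Sketch only: SAMPLE mimics the UnicodeData.txt layout, and the function
# names and threshold are illustrative, not taken from the real generator.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FCC;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
"""

FIRST = ", First>"
LAST = ", Last>"


def parse_ranges(text):
    """Return (first, last, label) tuples for allocated code points.

    Assumes the input is in ascending code point order, as the real file is.
    The label is the block name for First/Last pairs, and None for runs of
    individually listed characters.
    """
    ranges = []
    pending = None  # (first code point, label) of an open First/Last pair
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp = int(fields[0], 16)
        name = fields[1]
        if name.startswith("<") and name.endswith(FIRST):
            pending = (cp, name[1:-len(FIRST)])
        elif name.startswith("<") and name.endswith(LAST):
            label = name[1:-len(LAST)]
            if pending is None or pending[1] != label:
                raise ValueError("unmatched range end: " + name)
            ranges.append((pending[0], cp, label))
            pending = None
        elif ranges and ranges[-1][2] is None and ranges[-1][1] + 1 == cp:
            # Extend the current run of adjacent single code points.
            ranges[-1] = (ranges[-1][0], cp, None)
        else:
            ranges.append((cp, cp, None))
    return ranges


def find_gaps(ranges, min_size):
    """Return (first, last) for each unallocated gap of at least min_size."""
    gaps = []
    for (_, prev_last, _), (next_first, _, _) in zip(ranges, ranges[1:]):
        if next_first - prev_last - 1 >= min_size:
            gaps.append((prev_last + 1, next_first - 1))
    return gaps


if __name__ == "__main__":
    for first, last in find_gaps(parse_ranges(SAMPLE), 256):
        print("gap U+%04X..U+%04X" % (first, last))
```

Because the range ends come from the data rather than from constants in the
code, a later Unicode version that extends a block needs no code change, and
the same pass yields the gaps that a compact table can skip.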

The intent of all this is to provide the data needed for the \N{} syntax
directly as C code and static data, to avoid the tokeniser needing to load
charnames if it sees \N{}. Given that the C code in question is generated by
Perl, but to compile the Perl you need the C code, there's a potential
bootstrapping problem here. Not wishing to ship >1M of generated code if
avoidable, I experimented to see whether the \N{} escape syntax is needed by
miniperl. It turns out that if you replace the \N{} handler by abort() in
miniperl, you can still build perl perfectly. Excellent! Also, telling the
Makefile to build distinct toke.o and tokemini.o is a one line change - it's
nice when easy things *are* easy.

Frustratingly the work is not yet ready to merge into blead, as it's not yet
finished enough, and other things keep taking priority.

The other thing of note this week was a digression involving perl on s390.
Merijn was given a CD for OpenSUSE for s390, and soon had it running on the
emulator "Hercules". What does one do next? Obviously - try to build Perl.
So he built blead (which worked) and ran the tests (which mostly worked).
My initial reaction was:

    Is anyone actually using Perl on it?

    In that, we've not had any bug reports about this before, and for a lot
    of these somewhat esoteric platforms I'm sort of wondering at what point
    do they stop being "fun", and start being "work". Right now, I think
    it's more at the "fun" level, and it might be telling us something
    interesting about portability, as it might be that all these tests
    failing are down to the same problem.

The problems all seemed to involve conversion of large floating point
values to integers. Which we do quite a lot of, and should work, but
historically we've been skirting the boundary of what's conformant ANSI C,
sometimes on the wrong side. So Merijn and I exchanged code and results as I
tried to remote debug what the cause was. We tried to establish whether it
was even handling large unsigned integers correctly (it was). We tried to
rule out unions and ? : ternaries (which have confused compilers in the
past). Nope. In the end, we ascertained that it was a bug in the supplied
gcc 4.3.4 - it generated bad code for casting from unsigned integers to

At which point Niko Tyni replied that the particular problem was already
diagnosed as a compiler bug, and had been fixed. Debian was building on s390
with gcc 4.6.3, and he believed that gcc 4.4.7 was fixed.

So that all ended up being rather a waste of time, thanks to the
distribution's continued use of an obsolete and buggy compiler. Particularly
frustrating given that a fix exists in a version that is at least a year old.

Nicholas Clark