develooper Front page | perl.perl5.porters | Postings from July 2013

Compiled-in POSIX character class inversion lists are now fully const in blead

Karl Williamson
July 4, 2013 04:04
The header file charclass_invlists.h contains the definitions for some 
of the POSIX character classes, such as [:xdigit:].  These are now 
declared as

static const UV foo[]

Not being fully const created problems for -DPERL_GLOBAL_STRUCT_PRIVATE, 
and it meant that these were not in the read-only text segment portion 
of the program.  Now multiple instances of Perl running the same 
executable can share these.  It appears to me from code reading that 
these also now aren't copied when the scalars containing them are dup'd, 
as SvLEN is set to 0 in those scalars.
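
For readers unfamiliar with the data structure, here is a minimal sketch of what a fully const inversion list and a lookup against it might look like. It is not the code in charclass_invlists.h: the UV typedef, the array name, and invlist_contains() are stand-ins made up for illustration, any header elements the real arrays carry are omitted, and only the ASCII portion of [:xdigit:] is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t UV;   /* assumption: stand-in for Perl's UV type */

/* Inversion list: sorted code points, each one toggling membership.
 * Even indexes start a range that is in the class, odd indexes start
 * a range that is out.  ASCII part of [:xdigit:] only (illustrative). */
static const UV xdigit_ascii_invlist[] = {
    0x30, 0x3A,   /* '0'..'9' */
    0x41, 0x47,   /* 'A'..'F' */
    0x61, 0x67    /* 'a'..'f' */
};

/* Hypothetical helper: binary search for how many elements are <= cp.
 * cp is in the class exactly when that count is odd. */
static int invlist_contains(const UV *list, size_t len, UV cp)
{
    size_t lo = 0, hi = len;

    assert(list != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo & 1) != 0;
}
```

Because both the array and its elements are const, a compiler is free to place the table in a read-only section, which is what lets several processes running the same executable share it.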

To save memory, only the POSIX classes with smaller representations have 
been compiled-in.  The larger classes have only their Latin1 range 
values compiled.  If the program needs to access something outside that 
range, the appropriate tables must be loaded from disk.
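
The split between a compiled-in Latin1 table and a table loaded on demand can be sketched roughly as follows. This is only an illustration of the idea, not how the core does it: every name here is hypothetical, load_full_table() merely simulates the disk read, and upper_full_excerpt is a tiny hand-picked excerpt (Latin1 plus a few Greek capitals), not the real [:upper:] data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t UV;   /* assumption: stand-in for Perl's UV type */

/* Compiled-in part: [:upper:] restricted to the Latin1 range. */
static const UV upper_latin1[] = { 0x41, 0x5B, 0xC0, 0xD7, 0xD8, 0xDF };

/* Stand-in for the on-disk table.  Tiny excerpt, NOT the real data. */
static const UV upper_full_excerpt[] = {
    0x41, 0x5B, 0xC0, 0xD7, 0xD8, 0xDF, 0x391, 0x3A2, 0x3A3, 0x3AC
};

static const UV *loaded_table = NULL;  /* NULL until first needed */
static size_t loaded_len = 0;
static int load_count = 0;             /* how often the "disk" was hit */

/* Hypothetical loader: a real one would read the table from disk. */
static void load_full_table(void)
{
    loaded_table = upper_full_excerpt;
    loaded_len = sizeof upper_full_excerpt / sizeof upper_full_excerpt[0];
    load_count++;
}

/* cp is in the class when an odd number of elements are <= cp. */
static int invlist_contains(const UV *list, size_t len, UV cp)
{
    size_t lo = 0, hi = len;

    assert(list != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo & 1) != 0;
}

static int is_upper(UV cp)
{
    if (cp <= 0xFF)   /* fast path: compiled-in table, no I/O */
        return invlist_contains(upper_latin1,
                   sizeof upper_latin1 / sizeof upper_latin1[0], cp);
    if (loaded_table == NULL)   /* slow path: load once, then reuse */
        load_full_table();
    return invlist_contains(loaded_table, loaded_len, cp);
}
```

In this sketch a program that only ever sees Latin1 input never triggers the load, which is the memory saving described above.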

I'm thinking, as David Mitchell suggested some time ago, that this 
change should affect our calculation of which ones should get compiled. 
I think \d should definitely be compiled, and probably \w (the sizes 
being 84 and 1130 UVs respectively).

The other candidates, with the number of UVs they occupy in Unicode 6.2, 
are:

alnum 1132
alpha 1080
graph 1088
lower 1236
print 1082
punct 272
upper 1220
cased 238  (used internally, a combination of lower + upper)

These numbers will grow in future Unicode releases.
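
To put those counts in terms of bytes, here is a small back-of-the-envelope sketch. The UV counts are the ones listed above; the struct and function names are made up for illustration, the UV size is passed in as a parameter (8 bytes is typical on a 64-bit build, but that is an assumption), and any per-list header overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: a class name and its size in UVs (Unicode 6.2). */
struct class_size {
    const char *name;
    size_t      n_uvs;
};

static const struct class_size candidates[] = {
    { "alnum", 1132 }, { "alpha", 1080 }, { "graph", 1088 },
    { "lower", 1236 }, { "print", 1082 }, { "punct",  272 },
    { "upper", 1220 }, { "cased",  238 }
};

static const size_t n_candidates = sizeof candidates / sizeof candidates[0];

/* Sum the UV counts of a set of classes. */
static size_t total_uvs(const struct class_size *c, size_t n)
{
    size_t sum = 0;
    size_t i;

    assert(c != NULL);
    for (i = 0; i < n; i++)
        sum += c[i].n_uvs;
    return sum;
}

/* Bytes needed for n_uvs elements, given the size of one UV. */
static size_t invlist_bytes(size_t n_uvs, size_t uv_size)
{
    return n_uvs * uv_size;
}
```

With 8-byte UVs, the eight candidates together come to 7348 UVs, or 58784 bytes; \d and \w add another 1214 UVs, or 9712 bytes.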

More than 90% of Unicode properties are implemented using inversion 
lists (with the largest < 10% using hashes).  These are all read-only. 
From looking at the code it appears that the COW mechanism could be 
used to avoid having multiple copies of these.  But the only 
documentation I found was a few references in perlapi to functions.  Is 
there some documentation other than this on how to use this, or is 
perlapi the only thing necessary to know?
