[perl #107008] UTF8 patches for 5.16

Father Chrysostomos via RT
March 25, 2012 17:39
The commit IDs for the applied patches are:

15d94df60eb merge commit
ce16c625ecb op.c warnings
84cf752cf46 fix-up for the prev. patch
70558906b0d merge commit
5db1eb8d3ec labels
f232b41c679 test fix-up
fefd015fc8d copy label in parse_label

The unapplied patches are attached as one file (trying to make things
less chaotic :-).

Here is a list, including my reason for not applying each:

6961eeb7b3 toke.c: S_missingterm cleanup
fails without the next one

4929775f62 toke.c and parser.h: Make multi_(open|close) UVs instead of chars
causes utf8.t failures

bf1b3b8908 toke.c: S_check_uni cleanup.
According to the commit message, this warning is triggered only for
built-ins, so it does not need to be made UTF8-clean: all built-ins
have ASCII names.

129e573435 sv.c: "isn't numeric" warning for Latin-1 data.
Tests don’t match changes.  Changes are unnecessary for 5.16.

4b9fa9d371 utf8.[ch]: Add a UNI_DISPLAY_FORMAT_UVX flag for pv_display.
Supporting patch for the following commit.

190301c569 sv.c: Make S_not_a_number use UNI_DISPLAY_FORMAT_UVX
Unnecessary for 5.16, if at all.

a885abdbb7 toke.c: [RT#73022] Make \N{...} UTF-8 clean.
This should wait until non-ASCII char names are validated properly.

c69d7c3d97 toke.c: Make S_scan_ident actually do what we've been saying
it does.
I think this change is unsafe as-is, and should be enabled via, after 5.16.  See the plan I outlined at

c5a4d957a1 toke.c: Change "Unrecognized character" to "Unrecognized
I don’t think this is a good idea.  I wouldn’t consider ; an operator. 
Characters that are not recognized as part of Perl syntax are just that:
unrecognized characters.  What they might be used for in the future is
not known yet.

f7b53923f4 toke.c: S_tokeq cleanup
I don’t see how this is necessary.  The code is just scanning for
backslashes, so using utf8 skip routines shouldn’t make any difference,
except in speed.
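
To illustrate the point about backslash scanning, here is a minimal sketch, not the code in toke.c.  It assumes ASCII-based UTF-8, where a backslash (0x5C) can never appear as a byte inside a multi-byte sequence, and the names utf8_seq_len, count_backslashes_bytewise and count_backslashes_charwise are hypothetical, made up for this example.  A byte-at-a-time scan and a character-at-a-time scan find the same backslashes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that
 * starts with lead byte b.  Malformed lead bytes are stepped over
 * one byte at a time. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;   /* ASCII, includes '\\' (0x5C) */
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;                           /* stray continuation or invalid byte */
}

/* Scan every byte and count backslashes. */
size_t count_backslashes_bytewise(const char *s, size_t len)
{
    size_t n = 0;
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (s[i] == '\\')
            n++;
    return n;
}

/* Scan one UTF-8 character at a time and count backslashes.
 * Only the first byte of each sequence is inspected. */
size_t count_backslashes_charwise(const char *s, size_t len)
{
    size_t n = 0;
    size_t i = 0;
    assert(s != NULL || len == 0);
    while (i < len) {
        if (s[i] == '\\')
            n++;
        i += utf8_seq_len((unsigned char)s[i]);
    }
    return n;
}
```

For well-formed UTF-8 input both functions return the same count, because every byte of a multi-byte sequence is 0x80 or above.  The only difference is how many bytes each loop iteration looks at.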

I think this ticket can stop being a blocker.


Father Chrysostomos

via perlbug:  queue: perl5 status: open
