[perl #107008] UTF8 patches for 5.16

From: Father Chrysostomos via RT
Date: March 25, 2012 17:39
Subject: [perl #107008] UTF8 patches for 5.16
Message ID: rt-3.6.HEAD-4610-1332722380-1814.107008-15-0@perl.org

The commit IDs for the applied patches are:

15d94df60eb merge commit
ce16c625ecb op.c warnings
84cf752cf46 fix-up for the prev. patch
70558906b0d merge commit
5db1eb8d3ec labels
f232b41c679 test fix-up
fefd015fc8d copy label in parse_label

The unapplied patches are attached as one file (trying to make things
less chaotic :-).

Here is a list, including my reason for not applying each:

6961eeb7b3 toke.c: S_missingterm cleanup
fails without the next one

4929775f62 toke.c and parser.h: Make multi_(open|close) UVs instead of chars
causes utf8.t failures

bf1b3b8908 toke.c: S_check_uni cleanup.
According to the commit message, this warning is triggered only for
built-ins, so it does not need to be made UTF8-clean: all built-ins
have ASCII names.

129e573435 sv.c: "isn't numeric" warning for Latin-1 data.
Tests don’t match changes.  Changes are unnecessary for 5.16.

4b9fa9d371 utf8.[ch]: Add a UNI_DISPLAY_FORMAT_UVX flag for pv_display.
Supporting patch for the following commit.

190301c569 sv.c: Make S_not_a_number use UNI_DISPLAY_FORMAT_UVX
Unnecessary for 5.16, if it is needed at all.

a885abdbb7 toke.c: [RT#73022] Make \N{...} UTF-8 clean.
This should wait until non-ASCII char names are validated properly.

c69d7c3d97 toke.c: Make S_scan_ident actually do what we've been saying
it does.
I think this change is unsafe as-is, and should be enabled via
feature.pm, after 5.16.  See the plan I outlined at
<https://rt.perl.org/rt3/Ticket/Display.html?id=89032#txn-1097256>.

c5a4d957a1 toke.c: Change "Unrecognized character" to "Unrecognized
operator."
I don’t think this is a good idea.  I wouldn’t consider ; an operator. 
Characters that are not recognized as part of Perl syntax are just that:
unrecognized characters.  What they might be used for in the future is
not known yet.

f7b53923f4 toke.c: S_tokeq cleanup
I don’t see how this is necessary.  The code is just scanning for
backslashes, so using utf8 skip routines shouldn’t make any difference,
except to speed.
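
To illustrate the point about backslash scanning: in well-formed UTF-8,
every byte of a multi-byte sequence is 0x80 or higher, so the byte 0x5C
can only ever be a literal backslash. The sketch below is not the code in
toke.c; the function names are made up for illustration, and it assumes
well-formed input. It only shows that a byte-at-a-time scan and a
character-at-a-time scan find the same backslashes.

```python
# Illustration only: hypothetical helpers, not Perl's S_tokeq.

def utf8_skip(lead):
    """Length of the UTF-8 sequence starting at this lead byte.
    Assumes well-formed UTF-8 (hypothetical helper)."""
    if lead < 0x80:
        return 1
    if lead >= 0xF0:
        return 4
    if lead >= 0xE0:
        return 3
    return 2

def backslash_offsets_bytewise(buf):
    """Scan one byte at a time, ignoring character boundaries."""
    return [i for i, b in enumerate(buf) if b == 0x5C]

def backslash_offsets_charwise(buf):
    """Scan one character at a time, hopping over multi-byte sequences."""
    offsets = []
    i = 0
    while i < len(buf):
        if buf[i] == 0x5C:
            offsets.append(i)
        i += utf8_skip(buf[i])
    return offsets

# Mixed ASCII, 2-byte, 3-byte and 4-byte characters, with literal backslashes.
sample = "caf\u00e9\\n \u65e5\u672c\\\\ \U0001F600\\t".encode("utf-8")

print(backslash_offsets_bytewise(sample))
print(backslash_offsets_charwise(sample))
```

Both scans report the same offsets, so the choice between them affects
only how many bytes are examined, not which backslashes are found.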


I think this ticket can stop being a blocker.

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=107008
