TONYC TPF Grant 6 report #5
From:
Tony Cook
Date:
November 25, 2015 09:20
Subject:
TONYC TPF Grant 6 report #5
Message ID:
20151125092004.GK8246@mars.tony.develop-help.com
[Hours] [Activity]
2015/11/09 Monday
0.32 cygwin failures – re-test original fix and apply to blead
0.53 #126403 review and comment
0.57 #126469 review, re-test and apply to blead
0.37 #126474 comment
1.03 #126544 research, comment
1.00 #126474 research, comment
0.17 #122251 review upstream tickets and resolve
0.50 #126368 review, test all three modules with blead and
     close
=====
4.49
2015/11/10 Tuesday
1.27 #126240 (camel issues), testing and comment
0.63 khw's cygwin issues
1.10 #124080 review, testing
0.05 #124080 review test results, push to blead
1.83 #124068 testing, fix issues, more testing
=====
4.88
2015/11/11 Wednesday
0.68 #124068 another fix, testing, push to blead
0.50 #126608 review and comment
1.55 #126602 review, produce a patch and comment
0.52 #126325 review patch, testing and apply to blead
0.48 jhi's scandir thread – research and comment
1.62 #126193 review code, work on a patch, testing, comment
     with patch
=====
5.35
2015/11/12 Thursday
0.75 #125619 review discussion and comment
0.43 #126609 review, test and apply to blead
0.23 #126611 review and comment
2.67 #126593 debugging, research, longer comment
=====
4.08
Which I calculate is 18.8 hours.
Approximately 17 tickets were reviewed or worked on, and 3 patches
were applied.
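
As a quick cross-check of the total, the four daily subtotals listed above can be summed; a minimal sketch, where the variable names are purely illustrative:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Daily subtotals copied from the log above (Monday to Thursday).
my @daily = (4.49, 4.88, 5.35, 4.08);

my $total = sum(@daily);
printf "total hours: %.2f\n", $total;
```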
[perl #126593] illustrates how some of perl's internal tools need to
be careful of which parts of the language they use.
The tr/// operator can do its job in one of two ways: if all the code
points are between 0 and 255 it uses a 256-entry table of shorts;
otherwise it uses a swash, which is created by SWASHNEW in
lib/utf8_heavy.pl.
tr/// uses the UTF-8 flag on the search and replacement strings to
decide whether to use the look-up table or the swash, so it's possible
for the swash to be used even when the search and replacement strings
are representable as bytes.
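
To illustrate that the UTF-8 flag is independent of whether a string fits in bytes, here is a minimal sketch using the built-in utf8::upgrade() and utf8::is_utf8() functions. It only shows the flag on ordinary run-time strings (the variable names are illustrative); it does not reproduce how toke.c or op.c handle tr/// operands.

```perl
use strict;
use warnings;

# Two copies of the same ASCII-only string.
my $bytes    = "a-b_c";
my $upgraded = "a-b_c";

# utf8::upgrade() switches the internal representation to UTF-8 and
# sets the UTF-8 flag; the characters themselves do not change.
utf8::upgrade($upgraded);

printf "bytes:    flag=%d\n", utf8::is_utf8($bytes)    ? 1 : 0;
printf "upgraded: flag=%d\n", utf8::is_utf8($upgraded) ? 1 : 0;
print  "equal:    ", ($bytes eq $upgraded ? "yes" : "no"), "\n";
```

Both strings compare equal and every character is in the ASCII range, yet only one of them carries the flag.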
aa8f6cef changed an s/// operator in lib/utf8_heavy.pl to a tr///
operator. All of the characters in the search and replacement strings
can be represented as bytes - they're all ASCII range, so at first
sight the implementation should be using the lookup table rather than
the swash.
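
For context, deleting a fixed set of ASCII characters can be written with either operator. The sketch below is illustrative only: the helper names are hypothetical, and the s/// pattern is an assumption for comparison, not the line that aa8f6cef actually replaced.

```perl
use strict;
use warnings;

# Hypothetical helpers; neither is taken from lib/utf8_heavy.pl.
sub strip_with_subst {
    # Uses the regex engine, so no tr/// table or swash is involved.
    (my $loose = $_[0]) =~ s/[-_ \t]//g;
    return $loose;
}

sub strip_with_tr {
    # The /d modifier deletes every character in the search list.
    (my $loose = $_[0]) =~ tr/-_ \t//d;
    return $loose;
}

print strip_with_subst("Latin-1 Supplement"), "\n";
print strip_with_tr("Latin-1 Supplement"), "\n";
```

The two forms give the same result on ordinary input, which is why the change looked safe.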
The problem is code that uses the deprecated ${^ENCODING} variable, in
this case the encoding::warnings module. encoding::warnings sets
${^ENCODING} to a filter that warns (or croaks) when a non-UTF-8
marked PV containing non-ASCII characters is combined with UTF-8
marked PVs.[1]
When parsing string literals, including tr/// operators, S_scan_str()
in toke.c always returns UTF-8 marked strings when ${^ENCODING} is true.
The module that started [perl #126593] loads encoding::warnings, so
${^ENCODING} is now set, and then loads Fatal, which loads Carp, which
includes the line:
    $VERSION =~ tr/_//d;
so both the search and replacement strings are passed to S_pmtrans()
(op.c) as UTF-8 marked strings.
S_pmtrans() attempts to create a swash, which starts to load
utf8_heavy.pl, until we get to the line:
    (my $loose = $_[0]) =~ tr/-_ \t//d;
where things break.
[1] The current recommended practice is that the UTF-8 flag controls
internal representation only, and combining two such strings isn't an
issue. Don't use encoding::warnings.