TONYC TPF Grant 6 report #5

Tony Cook
November 25, 2015 09:20
[Hours]         [Activity]
2015/11/09      Monday
 0.32           cygwin failures – re-test original fix and apply to blead
 0.53           #126403 review and comment
 0.57           #126469 review, re-test and apply to blead
 0.37           #126474 comment
 1.03           #126544 research, comment
 1.00           #126474 research, comment
 0.17           #122251 review upstream tickets and resolve
 0.50           #126368 review, test all three modules with blead and

2015/11/10      Tuesday
 1.27           #126240 (camel issues), testing and comment
 0.63           khw's cygwin issues
 1.10           #124080 review, testing
 0.05           #124080 review test results, push to blead
 1.83           #124068 testing, fix issues, more testing

2015/11/11      Wednesday
 0.68           #124068 another fix, testing, push to blead
 0.50           #126608 review and comment
 1.55           #126602 review, produce a patch and comment
 0.52           #126325 review patch, testing and apply to blead
 0.48           jhi's scandir thread – research and comment
 1.62           #126193 review code, work on a patch, testing, comment
                with patch

2015/11/12      Thursday
 0.75           #125619 review discussion and comment
 0.43           #126609 review, test and apply to blead
 0.23           #126611 review and comment
 2.67           #126593 debugging, research, longer comment

Which I calculate is 18.8 hours.

Approximately 17 tickets were reviewed or worked on, and 3 patches
were applied.

[perl #126593] illustrates how some of perl's internal tools need to
be careful of which parts of the language they use.

The tr/// operator can do its job in one of two ways: if all the code
points are between 0 and 255, with a 256-entry table of shorts;
otherwise using a swash, which is created by SWASHNEW in

tr/// uses the UTF-8 flag on the search and replacement strings to
decide whether to use the look-up table or the swash, so it's possible
for the swash to be used even when the search and replacement strings
are representable as bytes.
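
A minimal sketch of the distinction above: the UTF-8 flag describes how
a string is stored, not which characters it holds.  The helper name
flag_state() is hypothetical; utf8::upgrade() and utf8::is_utf8() are
core functions.  This only shows the flag from Perl space, it does not
show which tr/// implementation gets picked.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a scalar carries the UTF-8 flag.
sub flag_state {
    return utf8::is_utf8($_[0]) ? "UTF-8 flagged" : "bytes";
}

my $bytes = "abc_def";    # pure ASCII literal, stored as bytes
my $wide  = "abc_def";
utf8::upgrade($wide);     # same characters, internal storage is now UTF-8

printf "%-7s %s\n", "bytes:", flag_state($bytes);
printf "%-7s %s\n", "wide:",  flag_state($wide);

# The two strings still compare equal; only the representation differs.
print "contents are equal\n" if $bytes eq $wide;
```

Both strings are entirely ASCII, yet only one of them carries the flag,
which is the situation where the swash path can be chosen for
byte-representable operands.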

aa8f6cef changed a s/// operator in lib/ to a tr///
operator.  All of the characters in the search and replacement strings
can be represented as bytes - they're all ASCII range, so at first
sight the implementation should be using the lookup table rather than
the swash.

The problem is code that uses the deprecated ${^ENCODING} variable, in
this case the encoding::warnings module.  encoding::warnings sets
${^ENCODING} to a filter that warns (or croaks) when a non-UTF-8
marked PV with non-ASCII is used with UTF-8 marked PVs.[1]

When parsing string literals, including tr/// operators, S_scan_str()
in toke.c always returns UTF-8 marked strings when ${^ENCODING} is true.

The module that started [perl #126593] loads encoding::warnings, so
${^ENCODING} is now set, and then loads Fatal, which loads Carp, which
includes the line:

  $VERSION =~ tr/_//d;

so both the search and replacement strings are passed to S_pmtrans()
(op.c) as UTF-8 marked strings.
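
For reference, a small sketch of what that Carp line does at run time.
The wrapper sub strip_underscores() and the sample version string are
hypothetical, made up for illustration.  It runs without ${^ENCODING}
set, so it does not reproduce the failure.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the same tr///d as the Carp line:
# the search list is "_", the replacement list is empty, and /d
# deletes every matched character.
sub strip_underscores {
    my ($version) = @_;
    $version =~ tr/_//d;
    return $version;
}

print strip_underscores("1.38_01"), "\n";    # prints 1.3801
```

The only character in either list is an ASCII underscore, so nothing in
the operator itself needs more than a byte.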

S_pmtrans() attempts to create a swash, which starts to load, until we
get to the line:

    (my $loose = $_[0]) =~ tr/-_ \t//d;

where things break.
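
As a sketch of what that line is meant to do when it works: it copies
its argument and deletes hyphens, underscores, spaces and tabs from the
copy.  The sub name loose_name() and the sample input are hypothetical;
only the tr///d expression comes from the line quoted above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted line.  The leading "-" in the
# search list is a literal hyphen, not a range.
sub loose_name {
    (my $loose = $_[0]) =~ tr/-_ \t//d;
    return $loose;
}

print loose_name("Greek_And-Coptic \t"), "\n";    # prints GreekAndCoptic

# Every character in the search list is in the ASCII range.
my @over = grep { ord($_) > 127 } split //, "-_ \t";
print "non-ASCII search characters: ", scalar(@over), "\n";    # prints 0
```

Like the Carp line, this tr/// has only ASCII operands, so it only
needs a swash when its literals come back UTF-8 marked.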

[1] the current recommended practice is that the UTF-8 flag controls
internal representation only and combining two such strings isn't an
issue.  Don't use encoding::warnings.