Re: [perl #130655] Bleadperl v5.25.8-68-g94749a5ed2 breaks MAUKE/Quote-Ref-0.03.tar.gz
From: ilmari
Date: February 3, 2017 01:24
Subject: Re: [perl #130655] Bleadperl v5.25.8-68-g94749a5ed2 breaks MAUKE/Quote-Ref-0.03.tar.gz
Message ID: d8j37fz9yjq.fsf@dalvik.ping.uio.no
"James E Keenan via RT" <perlbug-followup@perl.org> writes:
> On Fri, 27 Jan 2017 07:45:57 GMT, andreas.koenig.7os6VVqR@franz.ak.mind.de wrote:
>> bisect
>> ------
>> commit 94749a5ed2171bb6de72e384a78f5df552d812bb
>> Author: Karl Williamson <khw@cpan.org>
>> Date: Tue Dec 20 13:41:58 2016 -0700
>>
>> Deprecate non-grapheme string delimiter
>>
>> diagnostics
>> -----------
>> Wide character in print at t/03-unicode.t line 12.
>> # Looks like your test exited with 2 before it could output anything.
>> t/03-unicode.t ..
>> Dubious, test returned 2 (wstat 512, 0x200)
>> Failed 6/6 subtests
>>
>
> This test fails because the author has enabled fatalization of warnings in tests.
That's just masking the real error message, which happens to contain a
wide character. If you disable the fatal warnings, you get:
> #####
> 1 use warnings;
> 2 use strict;
> 3 use utf8;
> 4
> 5 use Test::More tests => 6;
> 6
> 7 use Quote::Ref;
> 8
> 9 is_deeply qwa foo bar baz , [qw foo bar baz ];
> 10 is_deeply qwh foo bar baz " , {qw foo bar baz " };
> 11
> 12 is_deeply qwa foo bar , [qw foo bar ];
> 13 is_deeply qwh foo bar , {qw foo bar };
> ...
> #####
ilmari@garkbit:~/.cpanm/work/1485526294.17204/Quote-Ref-0.03$ prove -bv t/03-unicode.t
t/03-unicode.t ..
1..6
Wide character in print at t/03-unicode.t line 12.
Unrecognized character \x{2665}; marked by <-- HERE after [qw foo <-- HERE near column 39 at t/03-unicode.t line 12.
# Looks like your test exited with 2 before it could output anything.
Dubious, test returned 2 (wstat 512, 0x200)
Failed 6/6 subtests
This reduces to the following:
$ perl5.25.9 -CS -Mutf8 -wE 'say qw foo bar '
Unrecognized character \x{2665}; marked by <-- HERE after qw foo <-- HERE near column 14 at -e line 1.
Which is a regression:
$ perl5.25.8 -CS -Mutf8 -wE 'say qw foo bar '
foo bar
In fact, it seems that if the opening delimiter is above U+100, any
character in the same U+x000 range is accepted as the closing delimiter.
From U+10000 upward, even delimiters from different ranges match.
#!/usr/bin/env perl
use utf8;
use strict;
use warnings;
use open qw(:std :utf8);
use experimental qw(regex_sets);
use feature qw(unicode_eval);

# Pick two candidate delimiters from each 0x1000-sized block
my @delims = map {
    my $s = $_ * 0x1000;
    my $e = $s + 0xfff;
    # Get the first two acceptable delimiters in this block
    my ($o, $c) = grep /(?[ \p{Assigned} & !(
        \p{Letter} | \p{Number} | \p{Space} |
        \p{Nonspacing_Mark} | \p{Spacing_Mark} | \p{Format} |
        \p{Private_Use}
    ) ])/x,
        map chr, $s..$e;
    defined $o && defined $c
        ? ($o, $c)
        : ();
} 0..0xff;
splice @delims, 2, 0, "\N{U+2C2}", "\N{U+2F5}"; # between U+100 and U+1000

print "perl $]\n";
for my $i (0..$#delims-1) {
    my ($o, $c) = @delims[$i, $i+1];
    # $o and $c differ, so the eval should fail with an unterminated string
    my $ok = eval "my \$x = q${o}foo${c}" ? "not ok" : "ok ";
    # Report any failure other than the expected missing terminator
    warn "$@" if $ok ne "not ok" and $@ !~ /string terminator/;
    printf "$ok - U+%04X U+%04X\n", ord $o, ord $c;
}
On perl 5.25.9, we get the following failures:
perl 5.025009
ok - U+0000 U+0001
ok - U+0001 U+02C2
not ok - U+02C2 U+02F5
ok - U+02F5 U+104A
not ok - U+104A U+104B
ok - U+104B U+2010
not ok - U+2010 U+2011
ok - U+2011 U+3001
not ok - U+3001 U+3002
ok - U+3002 U+4DC0
not ok - U+4DC0 U+4DC1
ok - U+4DC1 U+A490
not ok - U+A490 U+A491
ok - U+A491 U+D800
not ok - U+D800 U+D801
ok - U+D801 U+FB29
not ok - U+FB29 U+FBB2
ok - U+FBB2 U+10100
not ok - U+10100 U+10101
not ok - U+10101 U+11047
not ok - U+11047 U+11048
not ok - U+11048 U+12470
not ok - U+12470 U+12471
not ok - U+12471 U+16A6E
not ok - U+16A6E U+16A6F
not ok - U+16A6F U+1BC9C
not ok - U+1BC9C U+1BC9F
not ok - U+1BC9F U+1D000
not ok - U+1D000 U+1D001
not ok - U+1D001 U+1E95E
not ok - U+1E95E U+1E95F
not ok - U+1E95F U+1F000
not ok - U+1F000 U+1F001
While on perl 5.25.8 all is good:
$ ~/tmp/delimwtf.pl
ok - U+0000 U+0001
ok - U+0001 U+02C2
ok - U+02C2 U+02F5
ok - U+02F5 U+104A
ok - U+104A U+104B
ok - U+104B U+2010
ok - U+2010 U+2011
ok - U+2011 U+3001
ok - U+3001 U+3002
ok - U+3002 U+4DC0
ok - U+4DC0 U+4DC1
ok - U+4DC1 U+A490
ok - U+A490 U+A491
ok - U+A491 U+D800
ok - U+D800 U+D801
ok - U+D801 U+FB29
ok - U+FB29 U+FBB2
ok - U+FBB2 U+10100
ok - U+10100 U+10101
ok - U+10101 U+11047
ok - U+11047 U+11048
ok - U+11048 U+12470
ok - U+12470 U+12471
ok - U+12471 U+16A6E
ok - U+16A6E U+16A6F
ok - U+16A6F U+1BC9C
ok - U+1BC9C U+1BC9F
ok - U+1BC9F U+1D000
ok - U+1D000 U+1D001
ok - U+1D001 U+1E95E
ok - U+1E95E U+1E95F
ok - U+1E95F U+1F000
ok - U+1F000 U+1F001
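
For what it's worth, the pattern described above can be written down as a
small predicate and compared against a sample of the 5.25.9 results. This
is only a sketch of the guess, not of what the tokenizer actually does:
predicted_to_match() is a hypothetical helper, and the rule that both code
points must be at or above U+10000 for a cross-range match is an
assumption (the script above only tests adjacent pairs).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper encoding the guessed pattern. Takes the opening and
# closing delimiters as integer code points and returns true if 5.25.9 is
# expected to (wrongly) accept the pair.
sub predicted_to_match {
    my ($open, $close) = @_;
    return 0 if $open < 0x100;                          # low openers behave
    return 1 if $open >= 0x10000 && $close >= 0x10000;  # assumed cross-range rule
    return ($open >> 12) == ($close >> 12) ? 1 : 0;     # same U+x000 range
}

# A sample of (open, close, observed) rows copied from the 5.25.9 output.
my @observed = (
    [0x0000,  0x0001,  'ok'],
    [0x0001,  0x02C2,  'ok'],
    [0x02C2,  0x02F5,  'not ok'],
    [0x02F5,  0x104A,  'ok'],
    [0x104A,  0x104B,  'not ok'],
    [0xFB29,  0xFBB2,  'not ok'],
    [0xFBB2,  0x10100, 'ok'],
    [0x10100, 0x10101, 'not ok'],
    [0x10101, 0x11047, 'not ok'],
    [0x1E95F, 0x1F000, 'not ok'],
);

my $mismatches = 0;
for my $row (@observed) {
    my ($o, $c, $seen) = @$row;
    my $predicted = predicted_to_match($o, $c) ? 'not ok' : 'ok';
    $mismatches++ if $predicted ne $seen;
    printf "%-6s - U+%04X U+%04X (observed: %s)\n", $predicted, $o, $c, $seen;
}
print "mismatches: $mismatches\n";
```

For the sampled rows the prediction agrees with the observed results, so
the final line reads "mismatches: 0". That only shows the guess is
consistent with these pairs; it says nothing about pairs the script did
not try, such as an opener above U+10000 with a closer below it.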
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl