[perl #119499] $! returned with UTF-8 flag under UTF-8 locales only under 5.19.2+
From:
Victor Efimov via RT
Date:
August 29, 2013 22:27
Subject:
[perl #119499] $! returned with UTF-8 flag under UTF-8 locales only under 5.19.2+
Message ID:
rt-3.6.HEAD-1873-1377815238-982.119499-15-0@perl.org
On Thu Aug 29 14:06:57 2013, vsespb wrote:
> On Thu Aug 29 13:05:00 2013, public@khwilliamson.com wrote:
> > 2) As Victor notes, the commit does a UTF-8 validity check, so it is
> I agree that it's pretty reliable. However different languages and
Here is a generator of byte sequences that are valid both in UTF-8 and in
another encoding, and that represent letters (\w) in that other encoding:
#!/usr/bin/env perl
use strict;
use warnings;
use Encode;
use utf8;
binmode STDOUT, ":encoding(UTF-8)";
my @A = grep { /\w/ } map { chr($_) } (128..1024);
for my $z1 (@A) { for my $z2 ('', @A) { for my $z3 ('', @A) {
    for my $encoding (qw/ISO-8859-1 ISO-8859-2 ISO-8859-3 ISO-8859-4
                         ISO-8859-7 ISO-8859-8 ISO-8859-9 ISO-8859-10/) {
        my $S = $z1.$z2.$z3;
        # Encode the candidate string; skip it if the encoding cannot represent it.
        my $e = eval { encode($encoding, "$S", Encode::FB_CROAK); };
        next unless $e;
        # Hex-escaped form of the encoded bytes, used in the generated one-liner.
        my $xx = $e;
        $xx =~ s/(.)/sprintf("\\x%02X",ord($1))/eg;
        # Treat the same bytes as UTF-8 and keep only well-formed sequences.
        Encode::_utf8_on($e);
        if (utf8::valid($e)) {
            print "# $encoding [$S]".(length($S))." [$e] [$xx]\n";
            print <<"END";
perl -e 'use Encode; binmode STDOUT, ":encoding(UTF-8)"; my \$z = "$xx";
print "[", decode("UTF-8", "\$z", Encode::FB_CROAK), "]\\t[",
decode("$encoding", "\$z", Encode::FB_CROAK), "]\\n"'
END
        }
    }
}}}
__END__
example output:
perl -e 'use Encode; binmode STDOUT, ":encoding(UTF-8)"; my $z =
"\xC3\xBE"; print "[", decode("UTF-8", "$z", Encode::FB_CROAK), "]\t[",
decode("ISO-8859-2", "$z", Encode::FB_CROAK), "]\n"'
perl -e 'use Encode; binmode STDOUT, ":encoding(UTF-8)"; my $z =
"\xC3\xBC"; print "[", decode("UTF-8", "$z", Encode::FB_CROAK), "]\t[",
decode("ISO-8859-2", "$z", Encode::FB_CROAK), "]\n"'
perl -e 'use Encode; binmode STDOUT, ":encoding(UTF-8)"; my $z =
"\xC3\xA1"; print "[", decode("UTF-8", "$z", Encode::FB_CROAK), "]\t[",
decode("ISO-8859-2", "$z", Encode::FB_CROAK), "]\n"'
Output of running one of the generated commands:
$ perl -e 'use Encode; binmode STDOUT, ":encoding(UTF-8)"; my $z =
"\xC3\xBC"; print "[", decode("UTF-8", "$z", Encode::FB_CROAK), "]\t[",
decode("ISO-8859-2", "$z", Encode::FB_CROAK), "]\n"'
[ü] [Ăź]
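The same ambiguity can be checked in the other direction: given a byte string (for example, a message returned in $!), list the single-byte encodings under which it decodes to word characters only, while also being well-formed UTF-8. This is a minimal sketch, assuming the Encode module bundled with Perl and Perl 5.14+ for the /u regex modifier. The helper name ambiguous_encodings is hypothetical and not part of the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns those candidate encodings under which $bytes
# decodes to word characters only, provided $bytes is also well-formed UTF-8.
# Returns an empty list if $bytes is not valid UTF-8.
sub ambiguous_encodings {
    my ($bytes, @candidates) = @_;

    # Strict UTF-8 decode of a copy; FB_CROAK dies on malformed input
    # and may modify its argument, hence the copy.
    my $utf8_copy = $bytes;
    my $as_utf8 = eval { Encode::decode('UTF-8', $utf8_copy, Encode::FB_CROAK) };
    return () unless defined $as_utf8;

    my @hits;
    for my $enc (@candidates) {
        my $copy = $bytes;
        my $text = eval { Encode::decode($enc, $copy, Encode::FB_CROAK) };
        next unless defined $text;
        # /u forces Unicode rules for \w regardless of the string's internal form.
        push @hits, $enc if $text =~ /\A\w+\z/u;
    }
    return @hits;
}

my @found = ambiguous_encodings("\xC3\xBC", qw/ISO-8859-1 ISO-8859-2/);
print "ambiguous with: @found\n";
```

For "\xC3\xBC" this should report ISO-8859-2 (where the bytes read as Ăź) but not ISO-8859-1, where 0xBC is the non-word character ¼. A UTF-8 validity check alone cannot tell the two readings apart.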
---
via perlbug: queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=119499