perl.perl5.porters | Postings from October 2000

[ID 20001028.003] Another UTF-8 upgrade bug

From: andreas.koenig
Date: October 28, 2000 23:30
Subject: [ID 20001028.003] Another UTF-8 upgrade bug
Message ID: m3aebo88hb.fsf@ak-71.mind.de

This should go into the test suite when the bug is fixed unless it's
covered elsewhere:

% /usr/local/perl-5.7.0@7471/bin/perl -e ' 
$X = $Y = "=E0 U+05D0";
$X =~ s/=(..)/chr(hex($1))/e;
$X =~ s/U\+(....)/chr(hex($1))/e;
$Y =~ s/U\+(....)/chr(hex($1))/e;
$Y =~ s/=(..)/chr(hex($1))/e;
print "not " unless $X eq $Y;
print "ok 1\n";
'
not ok 1


Here is a demo that shows what goes wrong:


% /usr/local/perl-5.7.0@7471/bin/perl -le '
 $X = "=E0 U+05D0 ";    
 $X =~ s/=(..)/chr(hex($1))/e;
 print $X;
 $X =~ s/U\+(....)/chr(hex($1))/e;
 print $X;
' | /usr/local/perl-5.7.0@7471/bin/perl -ple '
use bytes;
s/(.)/ord($1) < 127 ? $1 : sprintf("%02x",ord($1))/ge;
'
e0 U+05D0 
c3a0 d790 


The e0 character is converted to UTF-8 (c3a0) when the second
substitution is applied and the string is upgraded to hold U+05D0
(d790).
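
For anyone who wants to turn this into a test file, here is a rough
sketch of the same check as a standalone script. It compares the two
strings by code point rather than by raw bytes, so the internal
representation does not matter. The helper names decode_in_order and
code_points are made up for this sketch, and it assumes a perl where
utf8::is_utf8 is available (not the 5.7.0 snapshot above).

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitutions from the report in the
# order given, replacing each match with chr(hex(...)) of its capture.
sub decode_in_order {
    my ($str, @patterns) = @_;
    for my $re (@patterns) {
        $str =~ s/$re/chr(hex($1))/e;
    }
    return $str;
}

# Hypothetical helper: render a string as a list of code points, which
# is independent of whether the string is stored as bytes or UTF-8.
sub code_points {
    my ($str) = @_;
    return join(" ", map { sprintf("U+%04X", ord($_)) } split(//, $str));
}

my $byte_re = qr/=(..)/;       # matches "=E0"
my $wide_re = qr/U\+(....)/;   # matches "U+05D0"

# Same input, opposite substitution order, as in the one-liner.
my $x = decode_in_order("=E0 U+05D0", $byte_re, $wide_re);
my $y = decode_in_order("=E0 U+05D0", $wide_re, $byte_re);

print "not " unless $x eq $y;
print "ok 1\n";

# Diagnostics: the flag may differ between perls, the code points
# should not.
printf "X: utf8 flag=%d, code points=%s\n",
    (utf8::is_utf8($x) ? 1 : 0), code_points($x);
printf "Y: utf8 flag=%d, code points=%s\n",
    (utf8::is_utf8($y) ? 1 : 0), code_points($y);
```

If the two code point lists differ, the order of the substitutions
changed the result, which is the failure reported above.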


