develooper Front page | perl.perl5.porters | Postings from April 2003

chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/; # match!

Dan Kogai
April 7, 2003 01:36

A Perl 5.8.0 user found this by accident.

use strict;
use warnings;
$\ = "\n";

use encoding "utf8";
my $e = chr(0xE3).chr(0x81).chr(0x82);
print $e                            =~ /^\x{3042}$/ ? 'true' : 'false';
print chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/ ? 'true' : 'false';

This prints "false" for the first test but "true" for the second.  U+3042
(HIRAGANA LETTER A) is \xE3\x81\x82 in UTF-8, so the two sides are equal
bytewise, but the UTF8 flag of chr(0xE3).chr(0x81).chr(0x82) is off, so it
should not match (regardless of use utf8 or use bytes).  The first result
is correct; the second is not.
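
For comparison, here is a minimal sketch of the behaviour I would expect. It is my own illustration, not part of the original report: it deliberately leaves out "use encoding", so it does not reproduce the bug, and the variable names ($bytes, $char and so on) are made up for the example. It assumes only the core Encode module and the built-in utf8::is_utf8().

```perl
use strict;
use warnings;
use Encode qw(decode);

# Three separate characters, each below 0x100; the UTF8 flag stays off.
my $bytes = chr(0xE3) . chr(0x81) . chr(0x82);

# Decoding the same three bytes as UTF-8 gives one character, U+3042.
my $char = decode('UTF-8', $bytes);

# Only the decoded string should match the single-character pattern.
my $bytes_match = $bytes =~ /^\x{3042}$/ ? 'true' : 'false';
my $char_match  = $char  =~ /^\x{3042}$/ ? 'true' : 'false';

printf "bytes: length=%d flag=%s match=%s\n",
    length($bytes), (utf8::is_utf8($bytes) ? 'on' : 'off'), $bytes_match;
printf "char:  length=%d flag=%s match=%s\n",
    length($char), (utf8::is_utf8($char) ? 'on' : 'off'), $char_match;
```

The byte string is three characters long and the decoded string is one, which is why a pattern anchored around a single \x{3042} should only ever match the latter.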

my $name = "\x{5c0f}\x{98fc} \x{5f3e}"; # KOGAI, Dan
