perl.perl5.porters | Postings from April 2003

chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/; # match!

From: Dan Kogai
Date: April 7, 2003 01:36
Subject: chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/; # match!
Message ID: 0F03F42F-68D4-11D7-A4D4-000393AE4244@dan.co.jp

Porters,

   One of the perl 5.8.0 users accidentally found this.

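# Reproduction script (perl 5.8.0)
use strict;
use warnings;
$\ = "\n";    # append a newline to every print

use encoding "utf8";
my $e = chr(0xE3).chr(0x81).chr(0x82);
# Case 1: the string is stored in a variable first.
print $e                            =~ /^\x{3042}$/ ? 'true' : 'false';
# Case 2: the same expression is matched directly.
print chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/ ? 'true' : 'false';
__END__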

This prints "false" for the first but "true" for the second.  U+3042
(HIRAGANA LETTER A) in UTF-8 is \xE3\x81\x82, so bytewise the two may
match, but the UTF8 flag for chr(0xE3).chr(0x81).chr(0x82) is off, so it
should not match (regardless of use (utf8|bytes)).  So the first result
is okay but the second one is not.
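
For comparison, here is a minimal sketch of the behaviour I would expect, written without "use encoding" so that the pragma is out of the picture.  It assumes the core Encode module and utf8::is_utf8 are available; the variable names ($bytes, $char, @report) are only illustrative.  The octet string should not match until it has been explicitly decoded.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Three separate characters U+00E3, U+0081, U+0082; the UTF8 flag is off.
my $bytes = chr(0xE3) . chr(0x81) . chr(0x82);

# Decoding the same octets as UTF-8 yields the single character U+3042.
my $char = decode('UTF-8', $bytes);

my @report = (
    utf8::is_utf8($bytes) ? 'flag on' : 'flag off',   # internal flag of the octet string
    length($bytes),                                   # three characters before decoding
    $bytes =~ /^\x{3042}$/ ? 'true' : 'false',        # should not match
    length($char),                                    # one character after decoding
    $char  =~ /^\x{3042}$/ ? 'true' : 'false',        # should match
);
print join(', ', @report), "\n";
```

The point of the sketch is that a match against \x{3042} should depend on the characters in the string, not on whether its octets happen to spell U+3042 in UTF-8.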

my $name = "\x{5c0f}\x{98fc} \x{5f3e}"; # KOGAI, Dan

