
UTF8 problem with Perl 5.10.0

Phil Harvey
February 21, 2008 05:48
I am trying to decode a series of bytes that I know to be UTF-8 to
obtain the numerical code point of each character (if this makes
sense).  In previous versions of Perl (back to 5.6.1), this was the
result:

 > perl -e 'print unpack("H*", pack("n*",unpack("U0U*","\xc3\xb6")))'

Which is what I expected, and what I require.

But in Perl 5.10.0, this happens:

 > perl-5.10.0 -e 'print unpack("H*", pack("n*",unpack("U0U*","\xc3 

Which obviously hasn't interpreted the string as UTF8.

Needless to say, this change in behaviour is rather distressing.  How  
can I change my unpack call so that this works again for all versions  
of Perl (>=5.6.1)?

TIA for any help you can provide.

	- Phil
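
One version-independent approach, offered only as a sketch and not as an answer from the thread: avoid the "U" template altogether and decode the UTF-8 by hand from the byte values that unpack("C*") returns. The helper name utf8_bytes_to_codepoints is hypothetical, the input is assumed to be a plain byte string (not a string already flagged as characters), and the validation is deliberately minimal. On Perl 5.8 and later, the core Encode module is another option, but it is not available on 5.6.1.

```perl
use strict;

# Hypothetical helper (not from the original post): decode a string of
# UTF-8 bytes into a list of code points using only byte-level unpack
# and bit arithmetic, so it does not depend on "U0"/"U" semantics.
# Assumption: the argument is a byte string. Overlong encodings and
# surrogates are NOT rejected; this is a sketch, not a full validator.
sub utf8_bytes_to_codepoints {
    my @bytes = unpack("C*", $_[0]);
    my @cps;
    while (@bytes) {
        my $lead = shift @bytes;
        my ($cp, $extra);
        # The lead byte determines how many continuation bytes follow.
        if    ($lead < 0x80)           { ($cp, $extra) = ($lead,        0) }
        elsif (($lead & 0xE0) == 0xC0) { ($cp, $extra) = ($lead & 0x1F, 1) }
        elsif (($lead & 0xF0) == 0xE0) { ($cp, $extra) = ($lead & 0x0F, 2) }
        elsif (($lead & 0xF8) == 0xF0) { ($cp, $extra) = ($lead & 0x07, 3) }
        else { die sprintf("invalid UTF-8 lead byte 0x%02x\n", $lead) }
        # Each continuation byte (10xxxxxx) contributes six more bits.
        while ($extra-- > 0) {
            my $cont = shift @bytes;
            die "truncated or malformed UTF-8 sequence\n"
                unless defined $cont && ($cont & 0xC0) == 0x80;
            $cp = ($cp << 6) | ($cont & 0x3F);
        }
        push @cps, $cp;
    }
    return @cps;
}

# Same pipeline as in the post, with the hand decoder in place of
# unpack("U0U*", ...). Prints 00f6 (U+00F6).
print unpack("H*", pack("n*", utf8_bytes_to_codepoints("\xc3\xb6"))), "\n";
```

As in the original one-liner, pack("n*") can only represent code points up to 0xFFFF, so anything outside the BMP would need a wider template such as "N*".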
