
Re: The State of The Unicode

From: Graham Barr
Date: April 3, 2001 01:34
Subject: Re: The State of The Unicode
Message ID: 20010403093327.B82879@pobox.com
On Tue, Feb 20, 2001 at 02:53:42PM +0000, Nick Ing-Simmons wrote:
> (3.1)
>    One true "bug":
> 
>    unpack('C',$str) != ord($str)   in some cases.
>    Despite perlfunc saying 
> "
>     sub ordinal { unpack("c",$_[0]); } # same as ord()
> 
> "
>    As far as I am aware this is the only remaining wart in the ASCII world.
>    (I was not aware of it till this thread started as I am pack/unpack phobic.)
> 
>    (We can make it do what it does now in scope of 'use bytes' of course.)

I think this is one place where the docs should have a clarification added:
"For characters with an ordinal value < 256"

I say that because pack is mainly used for packing structures to pass to C
routines. Those routines may not like it very much if chr(202) is encoded
as utf8.

IMO, "U" should be used for packing/unpacking utf8 characters, and "C" should
continue to pack the value & 255.
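
A rough sketch of that split, assuming a reasonably recent perl (the exact
behaviour of the 2001-era releases under discussion may differ). The helper
names pack_octet and pack_char are made up for illustration and are not part
of perlfunc. The mask is written out explicitly so that the result does not
depend on how pack itself treats out-of-range values.

```perl
use strict;
use warnings;

# Hypothetical helper: always produce exactly one octet, suitable for a
# C struct field, by masking the value to its low 8 bits before packing.
sub pack_octet {
    my ($value) = @_;
    return pack("C", $value & 255);
}

# Hypothetical helper: produce one (possibly wide) character from a
# Unicode code point, using the "U" template.
sub pack_char {
    my ($code_point) = @_;
    return pack("U", $code_point);
}

my $octet = pack_octet(202);       # one octet with value 202
my $wide  = pack_char(0x263A);     # one character, code point U+263A

printf "octet: length=%d value=%d\n", length($octet), unpack("C", $octet);
printf "wide:  length=%d value=0x%X\n", length($wide), unpack("U", $wide);
```

For values below 256, unpack("C", ...) and ord() agree on the result of
pack_octet, which is the case the proposed doc wording covers.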

Graham.


