
Re: [perl #33734] unpack fails on utf-8 strings

Nicholas Clark
January 13, 2005 05:56
On Tue, Jan 11, 2005 at 06:08:26PM +0100, Marc A. Lehmann wrote:

> I cannot find any conversion operator that would make sense when fed with
> non-octet data (in the perlfunc manpage, except maybe "U", but even "U"
> should work on octets, not on a utf-8 string, i.e. it should generate two
> characters for \x80, not one).

I didn't know, but looking at the pack implementation, it's 'U', and only 'U':

$ ./perl -Ilib -we 'use Devel::Peek; Dump pack "U", 256'
SV = PV(0x1801448) at 0x1801240
  REFCNT = 1
  PV = 0x600350 "\304\200"\0 [UTF8 "\x{100}"]
  CUR = 2
  LEN = 14

It expects UTF8 on the way back in, *marked* with the UTF8 flag.

$ ./perl -Ilib -MCarp -we 'use Devel::Peek; $a = pack "U", 256; utf8::encode $a; Dump $a; Dump unpack "U", $a'
SV = PV(0x1801478) at 0x1808e70
  REFCNT = 1
  PV = 0x6016a0 "\304\200"\0
  CUR = 2
  LEN = 3
SV = IV(0x1809f68) at 0x1801090
  REFCNT = 1
  IV = 196
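
The difference between the two strings dumped above can also be seen without Devel::Peek. The following is only a minimal sketch, assuming a perl where utf8::encode and utf8::is_utf8 are available; the variable names $chars and $bytes are made up for illustration. It prints what unpack "U" returns for the encoded byte string rather than asserting a value, since the 196 shown above is what the perl under test here produced.

```perl
use strict;
use warnings;

# pack "U" yields a character string: one character, U+0100, UTF8 flag on.
my $chars = pack "U", 256;

# Copy it and encode the copy in place: the same two octets, but now a
# plain byte string with the UTF8 flag off.
my $bytes = $chars;
utf8::encode($bytes);

# Octal form of each octet, to compare with the "\304\200" in the dumps.
my $octets = join " ", map { sprintf "%03o", ord($_) } split //, $bytes;

printf "chars: length=%d flagged=%d ord=%d\n",
    length($chars), (utf8::is_utf8($chars) ? 1 : 0), ord($chars);
printf "bytes: length=%d flagged=%d octets=%s\n",
    length($bytes), (utf8::is_utf8($bytes) ? 1 : 0), $octets;

# Round trip on the flagged string.
my ($from_chars) = unpack "U", $chars;
# Same template on the unflagged byte string; printed, not asserted.
my ($from_bytes) = unpack "U", $bytes;

print "unpack U on chars: $from_chars\n";
print "unpack U on bytes: $from_bytes\n";
```

The first octet of the encoded string is octal 304, which is decimal 196, the IV in the second dump.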

I'm about to test a hack that might make most things work.

Nicholas Clark
