perl.perl5.porters | Postings from April 2007

pack/unpack feature suggestion (was: Re: perl, the data, and the utf8 flag)

Juerd Waalboer
April 3, 2007 05:54
Glenn Linderman wrote 2007-04-01 16:34 (-0700):
> Aha!  OK, this is a way that unpack could successfully operate on a 
> multi-bytes buffer.  But I think it is also equivalent to downgrading it 
> (with a warning for values > 255) and then processing it as bytes.  

Not if you also have the "U" in the template somewhere, in addition to
other letters. (Bad idea anyway!)

> I think that pack-U should be defined to produce "encoded bytes"

It doesn't do that, though. It produces encodingless characters, not
bytes. However, you inspired me to come up with the following:

    $byte_string =   pack "a*[UTF-8]", $text_string;
    $text_string = unpack "a*[UTF-8]", $byte_string;

Likewise for "A" and "Z", and for arbitrary encodings. This would just
call Encode::encode (for pack) or Encode::decode (for unpack)
transparently, before doing the actual packing or unpacking.
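
A rough sketch of what that would amount to with what exists today. The
bracket syntax above is only a proposal, so this emulates it with two
hypothetical helpers (pack_encoded and unpack_encoded are made-up names,
not part of pack, unpack or Encode):

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: emulate the proposed pack "a*[ENCODING]"
# by encoding the text string to bytes first, then packing.
sub pack_encoded {
    my ($encoding, $text) = @_;
    return pack "a*", encode($encoding, $text);
}

# Hypothetical helper: emulate the proposed unpack "a*[ENCODING]"
# by unpacking the bytes first, then decoding them to a text string.
sub unpack_encoded {
    my ($encoding, $bytes) = @_;
    my ($raw) = unpack "a*", $bytes;
    return decode($encoding, $raw);
}

my $text  = "caf\x{e9} \x{20ac}";              # 6 characters
my $bytes = pack_encoded("UTF-8", $text);      # 9 bytes of UTF-8
my $round = unpack_encoded("UTF-8", $bytes);   # text string again
```

With "*" as the count, the round trip gives back the original text
string; only the encode and decode calls are real API here.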

The quantifier is a number of bytes, not characters. This means that it
can be in the middle of a multibyte encoding for a character. When that
happens, tough luck. We can't help that. (In other words: this really
only makes a lot of sense for multibyte packing if the quantifier is *)
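
To illustrate the "tough luck" case with existing functions only (the
proposed bracket syntax is not used, and the string is just an assumed
example): a fixed count of 4 bytes cuts the two-byte UTF-8 sequence for
the last character in half.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text  = "caf\x{e9}";                 # 4 characters
my $bytes = encode("UTF-8", $text);      # 5 bytes: the last character takes 2

# A count of 4 means 4 bytes, so the final byte of the last
# character is dropped and the sequence is left incomplete.
my $cut  = pack "a4", $bytes;

# Decoding the truncated bytes cannot give back the original text.
my $back = decode("UTF-8", $cut);
```

Here $cut is 4 bytes long and $back differs from $text, which is why a
count other than "*" only really makes sense for single-byte encodings.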

Kind regards,

  juerd waalboer:  perl hacker
  convolution:     ict solutions and consultancy

I don't trust voting computers.
