perl.perl5.porters | Postings from February 2007

Re: Future Perl development

February 5, 2007 14:10
Gerard Goossen wrote:
> Dr.Ruud:
>> Gerard Goossen:

>>> If we make \x{?..}? really insert codepoints, and not sometimes
>>> bytes, we need an escape sequence for bytes.
>> That is thinking the wrong way around, because it should only depend
>> on the encoding at hand. And the encoding of the source file does
>> not have to be equal to the encoding of the referenced data, for
>> example a file that is written to. So if the source file is in UTF-8
>> and the data is in Latin-1, then an "Ä" will be built from multiple
>> bytes for the source file but be only a single byte in the data.
> Sometimes you need to have a byte-string. But \x.. generates a character.
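
The quoted point about source encoding versus data encoding can be sketched as follows. This is only an illustration, assuming the core Encode module; the variable names are made up for the example. The same character "Ä" (U+00C4) is one character in the program, two bytes when written as UTF-8, and one byte when written as Latin-1.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character, U+00C4 ("Ä"), independent of how the source file is encoded.
my $char = "\x{C4}";

# Encode the same character for two different target encodings.
my $utf8   = encode("UTF-8",      $char);   # two bytes: c3 84
my $latin1 = encode("ISO-8859-1", $char);   # one byte:  c4

printf "UTF-8:   %d byte(s), hex %s\n", length($utf8),   unpack("H*", $utf8);
printf "Latin-1: %d byte(s), hex %s\n", length($latin1), unpack("H*", $latin1);
```

The number of bytes is decided at the point of encoding, not by the escape that produced the character.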

perl -wle '
  print pack "H*",

> In Perl 5 \xFF generates a byte. But if your target encoding is UTF-8,
> \xFF generates two bytes. And there is no way to insert the byte FF
> into the string, because this isn't a valid codepoint in UTF-8.

Doing something like that should turn it into a byte buffer, because it
is no longer valid UTF-8. So just use unpack.
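
A minimal sketch of that idea, assuming the core Encode module (the variable names are only for illustration): build the single byte FF with pack, confirm that a strict UTF-8 decode rejects it, and inspect the buffer with unpack.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# A byte buffer holding the single byte 0xFF.
my $bytes = pack "H*", "ff";

# A strict decode dies on malformed input; decode on a copy because
# decode() with a CHECK argument may modify its source.
my $copy = $bytes;
my $ok = eval { decode("UTF-8", $copy, FB_CROAK); 1 };

print length($bytes), ":", unpack("H*", $bytes), "\n";   # prints 1:ff
print $ok ? "valid UTF-8" : "not valid UTF-8", "\n";     # prints not valid UTF-8
```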

> In Perl 5 \xFF inserts a byte, because 0xFF is smaller than 256


perl -wle '
  $s = substr "\x{100}\xFF", 1;
  print length $s, ":", unpack "H*", $s;
'
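
The same string can be inspected more explicitly. The sketch below uses only built-in functions (the utf8:: functions need no module load); the variable names are made up for the example. It avoids relying on what unpack "H*" does with a character string, which may depend on the Perl version, and instead encodes a copy to UTF-8 octets before dumping it.

```perl
use strict;
use warnings;

# Same expression as in the one-liner: drop the U+0100 and keep "\xFF".
my $s = substr "\x{100}\xFF", 1;

my $chars = length $s;   # number of characters
my $cp    = ord $s;      # code point of the remaining character

# Make the UTF-8 encoding explicit on a copy, then dump its octets.
my $octets = $s;
utf8::encode($octets);

printf "chars=%d codepoint=U+%04X utf8-octets=%s\n",
    $chars, $cp, unpack("H*", $octets);
# prints chars=1 codepoint=U+00FF utf8-octets=c3bf
```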

Anyway, Ruud

"Ordinary is a tiger."
