On Mon, Feb 05, 2007 at 10:41:15PM +0100, Juerd Waalboer wrote:
> Gerard Goossen skribis 2007-02-05 20:39 (+0100):
> > Sometimes you need to have a byte-string.
>
> Indeed.
>
> > But \x.. generates a character.
>
> (Note that \xFF and \x{ff} are the same, for any capitalization of ff.)
>
> Or a byte. Because of the clever Unicode implementation in Perl, you get
> a character if you use the return value in a unicode string, and a byte
> if you use the return value in a byte string.
>
> This is not a matter of context, by the way. Instead, the value "\xFF"
> is polymorphic. It's both a unicode string representing code point
> U+00FF, and the single byte 0xFF.

No. \xFF creates the character represented by FF according to the native
encoding. If your native encoding is EBCDIC, this does NOT correspond to
U+00FF (instead it corresponds to U+007E or U+009F, depending on the
flavor of EBCDIC you're on).

You also assume (like everybody else) that in the native encoding a
character corresponds to a byte with the same numeric value. This
assumption is what makes the transition to UTF-8 so difficult, because
in the UTF-8 encoding the assumption is NOT correct.

Gerard Goossen
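
P.S. A rough sketch of the point about encodings, for anyone who wants to see the bytes. It is written in Python rather than Perl, purely as an illustration: Python's codecs stand in for the "native encoding" here, and nothing in it describes how Perl stores strings internally. The choice of cp037 as the EBCDIC flavor is an assumption; other flavors map byte FF differently.

```python
# Illustration only: Python codecs stand in for the "native encoding".
# This does not model Perl's internal string representation.

CODE_POINT = 0xFF
char = chr(CODE_POINT)  # the character U+00FF

# Latin-1: the character's number and its single byte coincide,
# which is the assumption most people carry around.
latin1_bytes = char.encode("latin-1")

# UTF-8: the same character is encoded as two bytes, so the
# "one character == one byte of the same value" assumption fails.
utf8_bytes = char.encode("utf-8")

# EBCDIC (cp037 chosen as an example flavor): decode the single
# byte 0xFF and see which code point it stands for.
ebcdic_char = bytes([0xFF]).decode("cp037")

print("latin-1 bytes for U+00FF:", latin1_bytes.hex())
print("utf-8 bytes for U+00FF:  ", utf8_bytes.hex())
print("cp037 byte FF decodes to: U+%04X" % ord(ebcdic_char))
```

Only in the Latin-1 case does the byte value match the code point; in the other two the numeric identity between character and byte breaks down.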