On Mon, Feb 05, 2007 at 12:54:32AM +0100, Dr.Ruud wrote:
> Gerard Goossen wrote:
>
> > If we make \x{?..}? really insert codepoints, and not sometimes
> > bytes, we need an escape sequence for bytes.
>
> That is thinking the wrong way around, because it should only depend on
> the encoding at hand. And the encoding of the source file does not have
> to be equal to the encoding of the referenced data, for example a file
> that is written to. So if the source file is in UTF-8 and the data is in
> Latin-1, then an "Ä" will be built from multiple bytes for the source
> file but be only a single byte in the data.

Sometimes you need to have a byte-string, but \x.. generates a
character. In Perl 5 \xFF generates a byte. But if your target encoding
is UTF-8, \xFF generates two bytes, and there is no way to insert the
byte FF into the string, because FF is not a valid byte in UTF-8.

So I proposed to use \x[FF] in Perl 7 to insert the byte FF.
In Perl 5 \xFF inserts a byte, because 0xFF is smaller than 256, but
having \x[FF] to be explicit that you want a byte would be nice.

PS. This would also solve some EBCDIC problems, where in Perl 5 \x41
does not generate an 'A' on EBCDIC platforms.

Gerard Goossen
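
For illustration, a minimal Perl 5 sketch of the difference between a character written as \xFF and a byte with value FF. It assumes an ASCII platform and the core Encode module. bytes_hex is a hypothetical helper name, and pack 'C' only stands in for the proposed \x[FF], which does not exist in Perl 5.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub bytes_hex {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
}

my $char   = "\xFF";                       # one character with code point 0xFF
my $latin1 = encode('ISO-8859-1', $char);  # the character encoded as Latin-1
my $utf8   = encode('UTF-8', $char);       # the same character encoded as UTF-8
my $byte   = pack 'C', 0xFF;               # explicitly one byte with value 0xFF

print "Latin-1: ", bytes_hex($latin1), "\n";
print "UTF-8:   ", bytes_hex($utf8),   "\n";
print "pack:    ", bytes_hex($byte),   "\n";
```

The Latin-1 and pack results are the single byte FF, while the UTF-8 result is the two bytes C3 BF, so the byte FF itself never appears in the UTF-8 output.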