Marc Lehmann wrote 2007-03-31 1:53 (+0200):
> So you force people to know about the internal flag, lest they cannot avoid
> the die.

No, you don't have to know about the UTF8 flag, just that Perl can't
always know whether your string is a text string, but is there to help
you when it does.

> > Besides that, the "C" in Perl's pack() is documented as a single byte.
> "A C "char" is a byte".
> Your words.
> But here you say a byte is not a character. Thats a contradiction.

"C char" ne "Perl character".

> No, I asked for UTF-8 encoded characters. Again, read the documentation:
> * If the pattern begins with a "U", the resulting string will
> * be treated as UTF-8-encoded Unicode.

Resulting string, not input string. The word "internally" is missing
here. I will do my best to correct that.

> thats for pack, unfortunately.
> U A Unicode character number. Encodes to UTF-8
> internally
> uh, that internal thing again. So how many characters will pack "U", 200
> give me? According to the documentation, 2, as UTF-8 requires that.

One character. Note again that "character" isn't the same as a "C char".

We in Perl land and the people over in Unicode land sometimes use
different words. Most of the time, a Perl "character" means a code point.

> > > Right, while the documentation on unpack "U" disagrees with it, as it talks
> > > about UTF-8.
> > That would be a bug, but I can't find it in my copy (5.8.8). It only
> > says "Encodes to UTF-8 internally" for pack(), which as far as I can
> > tell, is true.
> So it talks about using UTF-8, so, according to you, it is a bug. Fine
> with me.

This was for pack; you were talking about unpack. Also, the word
"internally" was probably not added without reason.
-- 
Kind regards,

juerd waalboer: perl hacker <juerd@juerd.nl> <http://juerd.nl/sig>
convolution: ict solutions and consultancy <sales@convolution.nl>

I don't trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.
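
To illustrate the point about pack "U", 200 yielding one character and not two, here is a minimal sketch. It assumes a Perl 5.8 or later interpreter; the variable names are only illustrative and do not come from the thread. It counts the string first in Perl characters, then in bytes after an explicit UTF-8 encoding step.

```perl
use strict;
use warnings;

# pack "U" takes a code point number and returns a one-character string.
my $s = pack "U", 200;

my $chars = length $s;   # length in Perl characters (code points)
my $ord   = ord $s;      # the code point of that single character

# Encode a copy explicitly to UTF-8 to see the byte representation.
my $bytes = $s;
utf8::encode($bytes);
my $octets = length $bytes;   # length in bytes after encoding

printf "characters: %d, code point: %d, UTF-8 bytes: %d\n",
    $chars, $ord, $octets;
```

This should print "characters: 1, code point: 200, UTF-8 bytes: 2". The two bytes only show up once the string is encoded explicitly, which is what "internally" in the pack documentation is hinting at.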