I started off trying to fix this paragraph:

    In C<quotemeta> or its inline equivalent C<\Q>, all characters whose
    code points are above 127 are not quoted in UTF-8 encoded strings,
    but all are quoted in UTF-8 strings.

That (still) makes no sense to me. Here's the wording I came up with that reflects what I *thought* it was trying to say:

    In C<quotemeta> or its inline equivalent C<\Q>, no characters with
    code points above 127 are quoted in UTF-8 encoded strings, but in
    byte encoded strings, code points between 128 and 255 are always
    quoted.

Except that that is not true. :(

I've played with blead, including one compiled afresh this morning, on both Darwin and Linux, and I still can't figure out what is supposed to happen, because the behavior doesn't match either of those paragraphs above. From looking at Devel::Peek, I think that things aren't being properly utf8'd.

According to what I think that paragraph should be saying, this should not be happening:

    % blead -CS -M-feature=unicode_strings -le '$a = "\x{e9}"; print quotemeta($a)'
    \é

    % blead -CS -Mfeature=unicode_strings -le '$a = "\x{e9}"; print quotemeta($a)'
    \é

This happens on both Darwin and Linux, and I don't understand why, with -E or unicode_strings, I have a non-Unicode string!

    % blead -CS -MDevel::Peek -E '$a = "\x{e9}"; say "\Q$a"'
    \é

    % blead -CS -MDevel::Peek -E '$a = "\x{e9}"; Dump "\Q$a"'
    SV = PV(0x8010d8) at 0x80ed20
      REFCNT = 1
      FLAGS = (PADTMP,POK,pPOK)
      PV = 0x203dc0 "\\\351"\0
      CUR = 2
      LEN = 16

    % blead -CS -MDevel::Peek -Mfeature=unicode_strings -le '$a = "\x{e9}"; Dump($a)'
    SV = PV(0x801038) at 0x80ed60
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x201380 "\351"\0
      CUR = 1
      LEN = 16

    % blead -CS -MDevel::Peek -Mfeature=unicode_strings -le '$a = "\x{e9}"; Dump("\Q$a")'
    SV = PV(0x8010e8) at 0x80ed30
      REFCNT = 1
      FLAGS = (PADTMP,POK,pPOK)
      PV = 0x203e00 "\\\351"\0
      CUR = 2
      LEN = 16

But look!

    % blead -CS -MDevel::Peek -E '$a = "\x{e9}"; utf8::upgrade($a) ; say "\Q$a"'
    é

    % blead -CS -MDevel::Peek -E '$a = "\x{e9}"; utf8::upgrade($a) ; Dump "\Q$a"'
    SV = PV(0x8010d8) at 0x80f040
      REFCNT = 1
      FLAGS = (PADTMP,POK,pPOK,UTF8)
      PV = 0x203df0 "\303\251"\0 [UTF8 "\x{e9}"]
      CUR = 2
      LEN = 16

I thought the whole point was so I didn't have to *do* that anymore. :(

--tom