On Thu, Dec 9, 2010 at 1:35 AM, Jonathan Pool <pool@utilika.org> wrote:

> > Jonathan, you said that the encoding was utf8, but \x80 is not a legal
> > utf8-encoded character. But it should have warned that it was substituting
> > FFFD.
>
> The script reads a line from a UTF8-encoded file into a Perl scalar.

The file is being read in without issue. The problem is with the literals in the source file.

> It then operates on the scalar.
>
> In man perlunicode, one reads: "Unless explicitly stated, Perl operators
> use [...]

You explicitly stated you wanted different behaviour from the literal by using "use encoding".

perl -e'use encoding "utf8"; qr/[\x7F-\x80]/'

means

perl -e'qr/{{{decode("utf8", "[\x7F-\x80]")}}}/'

which becomes

perl -e'qr/[\x7F-\x{FFFD}]/'

The effect of "use encoding" on \x escapes in literals and the like is why some people avoid "use encoding".
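
The substitution described above can be reproduced without the pragma. This is only a minimal sketch: it assumes the Encode module's default handling of malformed input (replace with U+FFFD, no warning), and the variable names are made up for illustration. It is not a claim about how "use encoding" is implemented internally.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The bytes of the regex literal as they sit in the source file.
my $literal_bytes = "[\x7F-\x80]";

# Approximation of what the pragma does to the literal: decode it as UTF-8.
# A lone \x80 is a malformed sequence, so the default mode substitutes U+FFFD.
my $decoded = decode("utf8", $literal_bytes);

my $codepoints = join " ", map { sprintf "U+%04X", ord } split //, $decoded;
print "$codepoints\n";    # U+005B U+007F U+002D U+FFFD U+005D

# The range as written, and the range after the substitution.
my $plain   = qr/[\x7F-\x80]/;
my $widened = qr/[\x7F-\x{FFFD}]/;

# U+263A lies outside 0x7F..0x80 but inside 0x7F..0xFFFD.
my $ch = chr(0x263A);
printf "plain matches U+263A:   %s\n", ($ch =~ $plain   ? "yes" : "no");
printf "widened matches U+263A: %s\n", ($ch =~ $widened ? "yes" : "no");
```

The second half shows why the substitution matters in practice: the character class silently grows from two code points to almost the whole Basic Multilingual Plane.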