On Thu, Dec 9, 2010 at 11:58 AM, Jonathan Pool <pool@utilika.org> wrote:

> > The file is being read in without issue. The problem is with the
> > literals in the source file.
> >
> > You explicitly stated you wanted different behaviour from the literal
> > by using "use encoding".
> >
> >     perl -e'use encoding "utf8"; qr/[\x7F-\x80]/'
> >
> > means
> >
> >     perl -e'qr/{{{decode("utf8", "[\x7F-\x80]")}}}/'
> >
> > which becomes
> >
> >     perl -e'qr/[\x7F-\x{FFFD}]/'
> >
> > The effect of "use encoding" on \x escapes in literals and the like is
> > why some people avoid "use encoding".
>
> Thank you for this explanation.
>
> So, is it possible for the source code (in a UTF-8 file) to use \x80 (or
> any numeric \x escape) to represent the character U+0080?

C2 80 is the UTF-8 encoding of U+0080, so the following are equivalent:

    $x = "\x80";

and

    use encoding 'UTF-8';
    $x = "\xC2\x80";

(Except perhaps in how the UTF8 flag is set, but that's not supposed to
make a difference.)

- Eric
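
The equivalence can be checked without the pragma itself (which has since been deprecated). The following is only a sketch: it assumes, per the explanation quoted above, that the pragma's effect on a literal can be approximated by passing the literal's bytes through Encode's decode. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Without the pragma, "\x80" is the single character U+0080.
my $plain = "\x80";

# Assumption: approximate the pragma by decoding the literal's bytes as UTF-8.
# C2 80 is the UTF-8 encoding of U+0080.
my $two_bytes = "\xC2\x80";
my $decoded   = decode('UTF-8', $two_bytes);

# A lone 0x80 byte is not valid UTF-8; with the default CHECK it is
# replaced by U+FFFD, which is where [\x7F-\x{FFFD}] came from.
my $one_byte = "\x80";
my $lone     = decode('UTF-8', $one_byte);

printf "plain:   U+%04X\n", ord $plain;
printf "decoded: U+%04X\n", ord $decoded;
printf "lone:    U+%04X\n", ord $lone;
print  "plain and decoded compare equal\n" if $plain eq $decoded;
```

The two strings compare equal with eq even though only the decoded one carries the UTF8 flag, which matches the parenthetical remark above.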