> So, is it possible for the source code (in a UTF-8 file) to use \x80 (or any numeric \x escape) to represent the character U+0080?
>
> C2 80 is the UTF-8 encoding of U+0080, so the following are equivalent:
>
> $x = "\x80";
>
> and
>
> use encoding 'UTF-8';
> $x = "\xC2\x80";
>
> (Except perhaps in how the UTF8 flag is set, but that's not supposed to make a difference.)
>
> - Eric

Could the latter representation (\xC2\x80) appear in a regular-expression character class, too?
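
For reference, the byte-level claim quoted above can be checked with the core Encode module. This is only a minimal sketch, run without the encoding pragma; the variable names ($char, $bytes, $hex, $class) are made up for illustration. It also shows that, in that plain setting, a character class written as [\xC2\x80] holds two separate one-byte members rather than the single character U+0080. How the class behaves under `use encoding` is not covered here and would need to be confirmed separately.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character, U+0080, and its UTF-8 encoding as a byte string.
my $char  = "\x{80}";
my $bytes = encode('UTF-8', $char);

# Show the encoded bytes in hex.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
print "U+0080 encodes to: $hex\n";    # prints "U+0080 encodes to: C2 80"

# Assumption: no encoding pragma in scope, so this class has two members,
# the single characters \xC2 and \x80.
my $class = qr/^[\xC2\x80]\z/;
print "C2 alone matches\n"     if "\xC2" =~ $class;
print "80 alone matches\n"     if "\x80" =~ $class;
print "C2 80 does not match\n" if $bytes !~ $class;
```

So, at least without the pragma, the two-byte form inside a class does not stand for one character.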