On Wed, Aug 3, 2022 at 11:59 AM Philippe Bruhat (BooK) <book@cpan.org> wrote:

> On Sat, Jun 18, 2022 at 11:13:18AM -0400, Ricardo Signes wrote:
> >
> >     my $str1 = do { use utf8; "🙂" };
> >     my $str2 = do { "🙂" };
>
> If I understand correctly, `use utf8` is less about the encoding of the
> file (Perl code is ascii anyway), and more about what the literal
> strings in the lexical scope actually contain (either a utf8 string, or
> a byte stream), right?

No, it is the encoding of the file. Perl code is not ASCII, it's bytes
(you can use non-ASCII bytes in symbols as long as they end up being
interpreted as word characters).

Without use utf8, the bytes are interpreted as codepoints with the
identical value (which, because the mapping is the identity, is
essentially a process of decoding them from ISO-8859-1). With use utf8,
the bytes are interpreted in Perl's internal upgraded encoding, which
essentially decodes them from UTF-8.

Perl has no concept of the meaning of what strings contain; they
logically contain a series of codepoints either way. use utf8 just
changes how those codepoints are derived from the bytes of the source
code, and that applies to the rest of the code as well as to string
literals.

-Dan
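
As a rough illustration of the point above, here is a minimal sketch that reuses the two quoted declarations and inspects the codepoints each string ends up with. It assumes the script itself is saved as UTF-8, so the emoji is stored on disk as the four bytes F0 9F 99 82. The printing code and the use of the core Encode module are additions for illustration, not part of the original thread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: this file is saved as UTF-8, so each emoji literal below
# is the byte sequence F0 9F 99 82 in the source.
my $str1 = do { use utf8; "🙂" };   # source bytes decoded as UTF-8
my $str2 = do { "🙂" };             # each source byte becomes one codepoint

# Print length and codepoints using ASCII-only output.
for my $pair ( [ 'with use utf8', $str1 ], [ 'without use utf8', $str2 ] ) {
    my ( $label, $str ) = @$pair;
    my @codepoints = map { sprintf( 'U+%04X', ord($_) ) } split //, $str;
    printf "%-17s length %d: %s\n", $label, length($str), "@codepoints";
}

# The byte-per-codepoint string is what you get by encoding the
# one-character string as UTF-8, and decoding it recovers that character.
print "decode matches\n" if decode( 'UTF-8', $str2 ) eq $str1;
print "encode matches\n" if encode( 'UTF-8', $str1 ) eq $str2;
```

Both strings are plain sequences of codepoints; the only difference is whether the four source bytes were read as one UTF-8 sequence (a single codepoint, U+1F642) or as four separate codepoints with the same values as the bytes.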