2021-11-22 6:00 Ricardo Signes <perl.p5p@rjbs.manxome.org> wrote:
>
> My take is this: The end state I'd like is that strings are in one of
> three states: declared text, declared bytes, unknown. Semantics exist for
> how to combine these and deal with I/O discipline. The source code is
> Unicode and string literals are assumed to be text. A new string literal
> syntax exists for byte strings, like qb"...".
>

I think a flag for text is needed, rather than the confusing and misused
utf8::is_utf8.

  if (is_text($text)) {
    say Encode::encode('UTF-8', $text);
  }

> For my money, a useful next step is that we encourage people to opt-in to
> "source code is unicode and string literals are text." This means that the
> programmer is then responsible for thinking about how this will affect
> their I/O. That concern is already there, we're just pushing around the
> complexity like a lump under the rug. I think this push is a good one. It
> lets us enable non-ASCII syntax, and it's pretty well understood. Also, we
> already have something for qb"...." in the form of "do { use bytes; qq{...}
> }" but we could probably add a qb, too, if we needed it.
>

I agree with this.

  use v5.40;

  # Text (a decoded string). The literal is interpreted as UTF-8.
  my $text = "abcde";

  # Bytes, if you need faster index-based access.
  my $bytes = qb"abcde";