> On Aug 2, 2021, at 1:11 AM, Yuki Kimoto <kimoto.yuki@gmail.com> wrote:
>
>
> 2021-8-1 9:54 Felipe Gasper <felipe@felipegasper.com> wrote:
>
> Another way to look at it: the content of the parsed strings actually differs between the two:
>
> my $x = do { no utf8; "éé" };
> my $y = do { use utf8; "éé" };
>
>
> Felipe
>
> I have a question.
>
> I think the problem is which is the better default in 2021 for general application users.
>
> The existing code is "no utf8" so it won't break.
>
> In the new code, the generally recommended way is
>
> use strict;
> use warnings;
> use utf8;

Recommended by whom? I generally don’t `use utf8`, and $work actually forbids it. The status quo’s consistency (i.e., everything’s a byte string until something explicitly decodes it) far outpaces whatever value I’d get from having `length "é"` return 1 rather than 2.

> If user needs old behavior, he need to write
>
> use v5.xx;
> no utf8;
>
> Are you clearly aware that this is a default change, not internal representation changes?

Yup, I know that this would only affect code that does `use 5.36`, or -E at the command line. The former would, by definition, be new code, and the latter is inherently unstable, so there’s no problem with the fact that it’s a behaviour change from default per se.

The problem is that the feature bundles, by definition, represent Perl at its ostensible best, its most modern. This particular proposal would make `perl -E'say "¡Hola, mundo!"'` print mojibake. That seems undesirable in the extreme; no other major language introduces that complexity for such a trivial task, and if one did, it would give some indication of what’s wrong rather than Perl’s “silent failure” approach.

This all said: if the desire is more to be able to use non-ASCII in identifier names (e.g., `sub épée { … }`), could a variant of utf8.pm be created that leaves string literals undecoded but just decodes sub names and the like? *That* would seem a reasonable improvement upon status quo.
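For anyone who wants to see the difference concretely, here is a minimal sketch of the byte-string versus character-string behaviour described above. It assumes the file is saved as UTF-8 with "é" stored as the single precomposed code point U+00E9; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: this source file is UTF-8 and "é" is precomposed (U+00E9).
my $bytes = do { no utf8;  "é" };   # literal left as raw UTF-8 octets
my $chars = do { use utf8; "é" };   # literal decoded to one character

printf "no utf8:  length=%d hex=%s\n", length $bytes, unpack("H*", $bytes);
# no utf8:  length=2 hex=c3a9
printf "use utf8: length=%d hex=%s\n", length $chars, unpack("H*", $chars);
# use utf8: length=1 hex=e9

# The two literals are different strings, even though the source text is identical.
print "literals differ\n" if $bytes ne $chars;

# Encoding the decoded literal explicitly recovers the original octets.
my $encoded = encode('UTF-8', $chars);
print "encode(chars) matches the no-utf8 literal\n" if $encoded eq $bytes;
```

Printed to a filehandle with no encoding layer, `$chars` goes out as the single Latin-1 byte 0xE9, with no warning, because every character fits in one byte. On a UTF-8 terminal that byte is not valid UTF-8, which is the silent mojibake case.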
-F