On Fri, Aug 6, 2021 at 11:23 AM Ricardo Signes <perl.p5p@rjbs.manxome.org> wrote:

> Porters,
>
> I recently posted the suggestion
> <http://markmail.org/message/wywgcbwhu2nhykxc> that "use v5.36.0" should
> imply "use utf8", which led to a pretty large thread in which Felipe Gasper
> repeatedly said "This is going to make things worse, not better." I spent
> a lot of time grumbling about this to myself, figuring out exactly how to
> rebut this, and then deciding that I tentatively, partly, agreed with him.
>
> We want each improvement to be a ratcheting up in language usability, when
> possible, rather than "we made things worse so we could make them better."
> At present, because we don't (and can't) know whether a string is text or
> bytes, we don't (and can't) automatically encode it when it hits a
> bytestream. We also don't know reliably whether a given output handle is
> already expecting to do that encoding for us.
>
> I am 100% certain that adding "use utf8" to the feature bundle would be
> better *for me*, but I already have a pretty strong grasp of the I/O
> model of Perl. I'm not sure it's better enough for everybody.
>
> At the PSC, we had a long talk about this, and another proposal was made:
>
> We introduce a new stricture, which I'll call "source_encoding". Under
> "use strict 'source_encoding'", the compiler will raise an exception when
> the source contains non-ASCII content unless the utf8 pragma is in effect.
> The error raised can drive the programmer to documentation explaining the
> various trade-offs. That is: you can turn on utf8 and deal with how this
> affects your I/O, or you can disable the stricture, or you can restate your
> non-ASCII content as ASCII by using escaping constructs.
>
> I'm not *sure* this is an improvement, but I think it is. This prevents
> the "I forgot to add utf8 and so only discovered after runtime that I have
> doubly-encoded my output" bug.

FWIW, this is roughly what was suggested by Zefram as part of his proposal
for utf8-by-default, phrased as "deprecate the presence of non-ASCII bytes
anywhere in a source file other than in the scope of "use utf8".".

https://www.nntp.perl.org/group/perl.perl5.porters/2017/10/msg246838.html

-Dan
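
For anyone following along, here is a minimal sketch of the "doubly-encoded output" bug the quoted message mentions. It assumes a source file saved as UTF-8 that contains a literal e-acute (U+00E9), and it uses Encode::encode as a stand-in for printing to a handle with an :encoding(UTF-8) layer. The variable names are illustrative only, and the escapes are there so the sketch itself stays pure ASCII.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: the source file is UTF-8 and contains a literal e-acute.
# Without "use utf8", perl sees the two UTF-8 bytes as two characters.
my $literal_without_utf8 = "\xC3\xA9";

# With "use utf8", perl decodes those same bytes into one character.
my $literal_with_utf8 = "\x{E9}";

# Stand-in for output through an :encoding(UTF-8) layer.
my $out_without = encode('UTF-8', $literal_without_utf8);
my $out_with    = encode('UTF-8', $literal_with_utf8);

# Hex dump of the bytes that would reach the stream.
print "without utf8: ", unpack('H*', $out_without), "\n";   # c383c2a9
print "with utf8:    ", unpack('H*', $out_with),    "\n";   # c3a9
```

The first line of output is the doubly-encoded form. As described in the quoted message, the proposed stricture would turn this case into a compile-time error instead of something noticed only at runtime.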