Re: "use v5.36.0" should imply ASCII source

From:
Dan Book
Date:
August 6, 2021 15:45
Subject:
Re: "use v5.36.0" should imply ASCII source
Message ID:
CABMkAVWzGZV=UggsvX2cgZmek49aSeDqpA90PPqKdjMCm=0ugA@mail.gmail.com
On Fri, Aug 6, 2021 at 11:23 AM Ricardo Signes <perl.p5p@rjbs.manxome.org>
wrote:

> Porters,
>
> I recently posted the suggestion
> <http://markmail.org/message/wywgcbwhu2nhykxc> that "use v5.36.0" should
> imply "use utf8", which led to a pretty large thread in which Felipe Gasper
> repeatedly said "This is going to make things worse, not better."  I spent
> a lot of time grumbling about this to myself, figuring out exactly how to
> rebut this, and then deciding that I tentatively, partly, agreed with him.
>
> We want each improvement to be a ratcheting up in language usability, when
> possible, rather than "we made things worse so we could make them better."
> At present, because we don't (and can't) know whether a string is text or
> bytes, we don't (and can't) automatically encode it when it hits a
> bytestream.  We also don't know reliably whether a given output handle is
> already expecting to do that encoding for us.
>
> I am 100% certain that adding "use utf8" to the feature bundle would be
> better *for me*, but I already have a pretty strong grasp of the I/O
> model of Perl.  I'm not sure it's better enough for everybody.
>
> At the PSC, we had a long talk about this, and another proposal was made:
>
> We introduce a new stricture, which I'll call "source_encoding".  Under
> "use strict 'source_encoding'", the compiler will raise an exception when
> the source contains non-ASCII content unless the utf8 pragma is in effect.
> The error raised can drive the programmer to documentation explaining the
> various trade-offs.  That is: you can turn on utf8 and deal with how this
> affects your I/O, or you can disable the stricture, or you can restate your
> non-ASCII content as ASCII by using escaping constructs.
>
> I'm not *sure* this is an improvement, but I think it is.  This prevents
> the "I forgot to add utf8 and so only discovered after runtime that I have
> doubly-encoded my output" bug.
>
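
To make the doubly-encoded output bug concrete, here is a minimal sketch. It
assumes the source file is saved as UTF-8, and it uses ASCII escapes to stand
in for what the parser would see for a literal e-acute with and without
"use utf8". The call to Encode::encode() stands in for an output handle with
an :encoding(UTF-8) layer; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Without "use utf8", a UTF-8 e-acute in the source is read as two
# separate one-byte characters (0xC3 and 0xA9).
my $without_utf8 = "\xC3\xA9";

# Under "use utf8", the same literal is decoded to one character, U+00E9.
my $with_utf8 = "\x{E9}";

# Stand-in for printing to a handle with an :encoding(UTF-8) layer.
my $out_without = encode('UTF-8', $without_utf8);
my $out_with    = encode('UTF-8', $with_utf8);

printf "without utf8: %d chars -> %d bytes on output\n",
    length($without_utf8), length($out_without);
printf "with utf8:    %d chars -> %d bytes on output\n",
    length($with_utf8), length($out_with);
```

The first case goes from 2 characters to 4 bytes because each already-encoded
byte is encoded a second time; the second goes from 1 character to the
expected 2 bytes.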

FWIW, this is roughly what was suggested by Zefram as part of his proposal
for utf8-by-default, phrased as
"deprecate the presence of non-ASCII bytes anywhere in a source file other
than in the scope of "use utf8"."
https://www.nntp.perl.org/group/perl.perl5.porters/2017/10/msg246838.html
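
As a rough illustration of the rule itself (non-ASCII bytes are only accepted
where "use utf8" is in effect), here is a naive line-based sketch. It is not
how perl would implement a stricture or a deprecation: the helper name
non_ascii_lines_outside_utf8 is hypothetical, and the scan ignores lexical
scoping, treating "use utf8" and "no utf8" as simple on/off switches.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line numbers that contain non-ASCII
# bytes while "use utf8" is not (naively) in effect.
sub non_ascii_lines_outside_utf8 {
    my ($source) = @_;
    my @offending;
    my $utf8_on = 0;
    my $line_no = 0;
    for my $line (split /\n/, $source) {
        $line_no++;
        $utf8_on = 1 if $line =~ /^\s*use\s+utf8\b/;
        $utf8_on = 0 if $line =~ /^\s*no\s+utf8\b/;
        next if $utf8_on;
        push @offending, $line_no if $line =~ /[^\x00-\x7F]/;
    }
    return @offending;
}

# Sample source built from ASCII escapes so this script is itself ASCII.
my $source = join "\n",
    'my $x = "caf' . "\xC3\xA9" . '";',     # line 1: non-ASCII, no utf8
    'use utf8;',                            # line 2
    'my $y = "na' . "\xC3\xAF" . 've";',    # line 3: non-ASCII, utf8 on
    'no utf8;',                             # line 4
    'my $z = "' . "\xC3\xBC" . '";';        # line 5: non-ASCII, utf8 off

my @offending_lines = non_ascii_lines_outside_utf8($source);
print "line $_: non-ASCII content outside \"use utf8\"\n" for @offending_lines;
```

On the sample above this reports lines 1 and 5 and accepts line 3.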

-Dan
