
Re: "use v5.36.0" should imply ASCII source

From: Felipe Gasper
Date: August 6, 2021 16:04
Subject: Re: "use v5.36.0" should imply ASCII source
Message ID: 52A1131E-AFE0-4304-939A-210FA222F0F2@felipegasper.com

> On Aug 6, 2021, at 11:22 AM, Ricardo Signes <perl.p5p@rjbs.manxome.org> wrote:
> 
> Porters,
> 
> I recently posted the suggestion that "use v5.36.0" should imply "use utf8", which led to a pretty large thread in which Felipe Gasper repeatedly said "This is going to make things worse, not better."  I spent a lot of time grumbling about this to myself, figuring out exactly how to rebut this, and then deciding that I tentatively, partly, agreed with him.
> 
> We want each improvement to be a ratcheting up in language usability, when possible, rather than "we made things worse so we could make them better."  At present, because we don't (and can't) know whether a string is text or bytes, we don't (and can't) automatically encode it when it hits a bytestream.  We also don't know reliably whether a given output handle is already expecting to do that encoding for us.
> 
> I am 100% certain that adding "use utf8" to the feature bundle would be better for me, but I already have a pretty strong grasp of the I/O model of Perl.  I'm not sure it's better enough for everybody.
> 
> At the PSC, we had a long talk about this, and another proposal was made:
> 
> We introduce a new stricture, which I'll call "source_encoding".  Under "use strict 'source_encoding'", the compiler will raise an exception when the source contains non-ASCII content unless the utf8 pragma is in effect.  The error raised can drive the programmer to documentation explaining the various trade-offs.  That is: you can turn on utf8 and deal with how this affects your I/O, or you can disable the stricture, or you can restate your non-ASCII content as ASCII by using escaping constructs.
> 
> I'm not sure this is an improvement, but I think it is.  This prevents the "I forgot to add utf8 and so only discovered after runtime that I have doubly-encoded my output" bug.

This seems reasonable. It encourages decoding of UTF-8 characters while still allowing `print "hello world"` to be correct in modern Perl.
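
As a rough illustration of the double-encoding bug described in the quoted text (a sketch, not code from the proposal): the variable names and the hex_bytes helper are made up for this example, the non-ASCII literal is spelled with escapes so the snippet itself stays ASCII, and the effect of a UTF-8 output layer is approximated by calling Encode::encode directly.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes { return uc join ' ', unpack('(H2)*', $_[0]) }

# Without "use utf8", an e-acute typed in a UTF-8 source file reaches
# Perl as two separate bytes. Escapes stand in for the literal here.
my $without_utf8 = "\xC3\xA9";

# With "use utf8", the same literal is one character, U+00E9.
my $with_utf8 = "\x{E9}";

# Roughly what a UTF-8 encoding layer on the output handle would do.
my $double = encode('UTF-8', $without_utf8);
my $single = encode('UTF-8', $with_utf8);

print "without utf8: ", hex_bytes($double), "\n";   # C3 83 C2 A9
print "with utf8:    ", hex_bytes($single), "\n";   # C3 A9

# ASCII-only text encodes to itself, so it is safe either way.
print "ascii unchanged\n" if encode('UTF-8', 'hello world') eq 'hello world';
```

The first string comes out as four bytes instead of two, which is the doubly-encoded output the proposed stricture is meant to catch at compile time. The ASCII-only string is byte-for-byte identical before and after encoding.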

-FG