perl.perl5.porters | Postings from August 2021

Re: "use v5.36.0" should imply UTF-8 encoded source

From: Leon Timmermans
Date: August 1, 2021 14:23
Subject: Re: "use v5.36.0" should imply UTF-8 encoded source
Message ID: CAHhgV8gHtx0eavsZt_kNXT-ij4c8W-hY5FWdQmvrWZVGyUugyg@mail.gmail.com

On Fri, Jul 30, 2021 at 8:46 PM Felipe Gasper <felipe@felipegasper.com>
wrote:

> FWIW I think it’s easier to think of the default I/O mode as “bytes” or
> “native 8-bit encoding” rather than “Latin-1”. In that light it’s easier
> to see the status quo as the more reasonable default: we parse the code as
> bytes, and we print as bytes.
>

Code is not binary, it is text. E.g.:

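use 5.010;
# Assumes this file is saved as UTF-8.
# Without "use utf8" the literal is four bytes, none of which is U+00E9:
{ no utf8;  say "éé" =~ /\N{LATIN SMALL LETTER E WITH ACUTE}/ ? "yes" : "no" };  # no
# With "use utf8" the literal is two characters, both U+00E9:
{ use utf8; say "éé" =~ /\N{LATIN SMALL LETTER E WITH ACUTE}/ ? "yes" : "no" };  # yes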
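
The same effect can be reproduced without depending on how the file itself is saved. This is only a sketch: the variable names are made up for illustration, and utf8::decode stands in for what "use utf8" does to a string literal, which is an approximation rather than a description of the parser's internals.

```perl
use strict;
use warnings;
use feature 'say';

# The UTF-8 encoding of "éé", written with escapes so this file is pure ASCII.
my $bytes = "\xC3\xA9\xC3\xA9";

# Roughly what "use utf8" does for a literal: interpret those bytes as UTF-8.
my $chars = $bytes;
utf8::decode($chars);

say length $bytes;                           # 4
say length $chars;                           # 2
say $bytes =~ /\N{U+00E9}/ ? "yes" : "no";   # no
say $chars =~ /\N{U+00E9}/ ? "yes" : "no";   # yes
```

Treated as bytes, the literal is four unrelated 8-bit values; treated as text, it is two copies of the character the regex is looking for.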

The status quo is only reasonable in that 95% of all code is actually
ASCII, so it usually doesn't matter.


> Changing it so that the (“modern”) default is to decode strings as UTF-8
> but still output them as bytes seems likely to introduce lots of confusion,
> which will either a) discourage adoption of “use v5.36”, or b) discourage
> use of Perl at all:
>
> Anti-Perler: Hey that new Perl script you wrote mangles our CEO’s name.
> Perler: That’s weird … I used the modern defaults … wonder where the bug
> is …
> Anti-Perler: Maybe you should just switch to $otherlang, where this stuff
> doesn’t happen.


TBH I expect the exact opposite to happen.

Leon
