Re: "use v5.36.0" should imply UTF-8 encoded source

From: Dan Book
Date: July 31, 2021 03:07
Subject: Re: "use v5.36.0" should imply UTF-8 encoded source
Message ID: CABMkAVXDOOb==8TjaSu9=SBe9c44czO5USmxoOso4inKvfCRZw@mail.gmail.com

On Fri, Jul 30, 2021 at 11:03 PM Eirik Berg Hanssen <
Eirik-Berg.Hanssen@allverden.no> wrote:

> On Sat, Jul 31, 2021 at 4:29 AM Dan Book <grinnz@gmail.com> wrote:
>
>> On Fri, Jul 30, 2021 at 10:15 PM Eirik Berg Hanssen <
>> Eirik-Berg.Hanssen@allverden.no> wrote:
>>
>>> On Fri, Jul 30, 2021 at 8:28 PM Leon Timmermans <fawaka@gmail.com>
>>> wrote:
>>>
>>>> On Fri, Jul 30, 2021 at 7:56 PM Felipe Gasper <felipe@felipegasper.com>
>>>> wrote:
>>>>
>>>>>
>>>>> It’ll downgrade it, but it won’t encode it, so you’ll get mojibake:
>>>>>
>>>>> > perl -Mutf8 -e'print "é"'
>>>>> �
>>>>>
>>>>
>>>> It will print mojibake as well if the script is latin-1 encoded. It's
>>>> mojibake because the terminal is utf-8, but the IO handle is latin1.
>>>>
>>>
>>>   In this case there is no "script" other than the command line, in the
>>> terminal.  Round-tripping characters from the terminal to the terminal,
>>> broken.  Sounds painful.
>>>
>>>   I'd expect the encoding to be the same for the code as for the
>>> standard handles, unless either is otherwise specified.  It would surprise
>>> me if a simple perl -E broke that.
>>>
>>>   I'm leaning towards thinking that, while there's no problem with
>>> lexical, explicit declarations of source encodings, the default source
>>> encoding is more of a global thing, and to avoid nasty surprises, ought to
>>> correspond to the default encoding of the standard handles.
>>>
>>
>> This isn't the "default", it's the entire function of "use utf8" and only
>> applies to that lexical scope.
>>
>
>   It is the "default" in the sense of "what you get in the absence of
> explicit declarations like use utf8 and no utf8".
>
>   (Is there a better word?)
>

In the absence of explicit declarations, the source code is bytes, the same
as the standard handles; so I'm not sure what your point is.

-Dan
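
A minimal sketch of the distinction discussed in this thread (an illustration, not part of the original message): assuming the file below is saved as UTF-8 and the core Encode module is available, the same literal is two bytes when no "use utf8" is in scope and a single character when it is. The names $bytes, $chars and hex_dump are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: this file is saved as UTF-8.

# No "use utf8" in scope: the literal is the two source bytes 0xC3 0xA9.
my $bytes = "é";

# "use utf8" is lexical, so it only affects this do-block:
# the literal is decoded to the single character U+00E9.
my $chars = do { use utf8; "é" };

# Hex dump of each string's elements, so nothing non-ASCII is printed.
sub hex_dump {
    my ($str) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $str;
}

printf "without use utf8: length %d (%s)\n", length($bytes), hex_dump($bytes);
printf "with use utf8:    length %d (%s)\n", length($chars), hex_dump($chars);

# Encoding the character string as UTF-8 gives back the original source bytes.
print "round trip ok\n" if encode('UTF-8', $chars) eq $bytes;
```

Under those assumptions the first line reports length 2 (C3 A9) and the second length 1 (E9). Printing $chars to a handle with no encoding layer would emit the lone byte E9, which a UTF-8 terminal cannot display, which is the mojibake shown in the quoted one-liner.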
