
Re: Deprecating downgrade of `use utf8`

From: Karl Williamson
Date: February 22, 2022 03:39
Subject: Re: Deprecating downgrade of `use utf8`
Message ID: c9903943-032a-b19c-920f-665af6ee37e8@khwilliamson.com

On 2/21/22 17:35, demerphq wrote:
> On Tue, 22 Feb 2022 at 00:34, Karl Williamson <public@khwilliamson.com> wrote:
> 
>     On 2/21/22 10:48, Ovid via perl5-porters wrote:
>      > On Monday, 21 February 2022, 18:32:34 CET, Paul "LeoNerd" Evans <leonerd@leonerd.org.uk> wrote:
>      >
>      >> On a similar note: How do people feel about code which turns off the
>      >> "my source encoding is UTF-8" pragma after having previously turned it
>      >> on? I.e.
>      >>
>      >>    use utf8;
>      >>    my $café = "Ĉu vi havas sandviĉojn?";
>      >>
>      >>    no utf8;
>      >>
>      >> If we disallow this kind of thing, we can remove further weird
>      >> corner cases from the parser, because a bunch of unlikely situations no
>      >> longer come up.
>      >>
>      >> As with VERSION: Are there any actually-valid use-cases for doing this
>      >> kind of thing?
>      >
>      > Conversely, do we know of areas where "no utf8" causes problems
>      > in the Perl language instead of the perl core?
>      >
>      > Also, what are the problems in the perl core? Are they causing grief?
>      >
>      > Best,
>      > Ovid
>      > --
> 
>     I don't see a compelling reason to change or remove 'no utf8'
> 
>     The bugs I know about came from my reading code; not from any reported
>     issues, and they involve switching encodings in mid-file
> 
>     I think all we need say is that the behavior is undefined if your file
>     contains multiple encodings in it.
> 
>     Since ASCII is a proper subset of UTF-8, it should be fine to have
>     sections of the file only in ASCII, and other sections allow complete
>     UTF-8.  The proposed use source::encoding 'ascii' pragma can be used to
>     demarcate such sections, if desired.
> 
>     The only other possibility of multiple encodings in modern perl is
>     Latin1 vs UTF-8.  Non-ASCII Latin1 characters are encoded differently
>     in UTF-8 than in Latin1 itself.  And I don't think it makes much
>     sense for a Perl source to have both encodings.  The bugs arise when
>     that happens.  If we say results are undefined for this behavior, we
>     don't have to worry about it.
> 
> 
> I don't really get it. Why would we leave something undefined when we can 
> ban it? Undefined just means bugs.
> 
> Yves

There are a bunch of harmless uses of it on CPAN; I hate to deprecate and 
force someone to change a harmless use.
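
For illustration only, here is a minimal sketch of the Latin1 vs UTF-8 point quoted above; it is not code from the perl core. It assumes the core Encode module is available, and hex_bytes is a hypothetical helper name introduced just for this example. It shows that ASCII text has the same bytes in both encodings, while a non-ASCII Latin1 character such as U+00E9 does not, which is why mixing the two encodings in one source file is ambiguous.

```perl
use strict;
use warnings;
use Encode qw(encode);

# hex_bytes is a hypothetical helper for this sketch: it encodes $str in
# $encoding and returns the resulting bytes as space-separated hex.
sub hex_bytes {
    my ($encoding, $str) = @_;
    my $bytes = encode($encoding, $str);
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $ascii_only = "cafe";
my $non_ascii  = "caf\x{E9}";   # U+00E9, escaped so this file stays pure ASCII

# ASCII is a subset of both encodings, so the bytes are identical.
print "ASCII text as Latin1: ", hex_bytes('iso-8859-1', $ascii_only), "\n";  # 63 61 66 65
print "ASCII text as UTF-8:  ", hex_bytes('UTF-8',      $ascii_only), "\n";  # 63 61 66 65

# A non-ASCII Latin1 character is one byte in Latin1 but two bytes in UTF-8.
print "U+00E9 text as Latin1: ", hex_bytes('iso-8859-1', $non_ascii), "\n";  # 63 61 66 E9
print "U+00E9 text as UTF-8:  ", hex_bytes('UTF-8',      $non_ascii), "\n";  # 63 61 66 C3 A9
```

A file that stays within ASCII reads the same whether or not `use utf8` is in effect; only the non-ASCII bytes depend on which encoding the parser assumes.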
