On 2/21/22 10:48, Ovid via perl5-porters wrote:
> On Monday, 21 February 2022, 18:32:34 CET, Paul "LeoNerd" Evans <leonerd@leonerd.org.uk> wrote:
>
>> On a similar note: How do people feel about code which turns off the
>> "my source encoding is UTF-8" pragma after having previously turned it
>> on? I.e.
>>
>>   use utf8;
>>   my $café = "Ĉu vi havas sandviĉojn?";
>>
>>   no utf8;
>>
>> If we disallow this kind of thing, we can remove further weird
>> cornercases from the parser, because a bunch of unlikely situations no
>> longer come up.
>>
>> As with VERSION: Are there any actually-valid use-cases for doing this
>> kind of thing?
>
> Conversely, do we know of areas where "no utf8" causes problems in the Perl language instead of the perl core?
>
> Also, what are the problems in the perl core? Are they causing grief?
>
> Best,
> Ovid
> --

I don't see a compelling reason to change or remove 'no utf8'.

The bugs I know about came from my reading the code, not from any reported issues, and they involve switching encodings in mid-file.

I think all we need to say is that the behavior is undefined if your file contains multiple encodings. Since ASCII is a proper subset of UTF-8, it should be fine to have some sections of the file in ASCII only and other sections that allow full UTF-8. The proposed

    use source::encoding 'ascii'

pragma can be used to demarcate such sections, if desired.

The only other possibility for multiple encodings in modern perl is Latin1 vs UTF-8. Non-ASCII Latin1 characters have a different byte representation in UTF-8 than they do in Latin1, and I don't think it makes much sense for a Perl source file to use both encodings. The bugs arise when that happens. If we say results are undefined in that case, we don't have to worry about it.
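
To make the Latin1 vs UTF-8 point concrete, here is a minimal sketch using the core Encode module. The sample character (U+00E9) and the helper name hex_bytes are chosen only for illustration and do not come from the thread. The script itself is pure ASCII, so it does not depend on any source-encoding pragma.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# U+00E9 (e with acute accent), written as an escape to keep this file ASCII.
my $char = "\x{E9}";

my $latin1 = encode('ISO-8859-1', $char);   # a single byte
my $utf8   = encode('UTF-8',      $char);   # a two-byte sequence

printf "Latin1: %s\n", hex_bytes($latin1);  # prints "Latin1: E9"
printf "UTF-8:  %s\n", hex_bytes($utf8);    # prints "UTF-8:  C3 A9"

# A pure-ASCII string produces identical bytes under both encodings,
# which is why ASCII-only sections can safely coexist with UTF-8 ones.
my $ascii = 'sandwich';
my $same  = encode('ISO-8859-1', $ascii) eq encode('UTF-8', $ascii);
print "ASCII bytes are identical: ", ($same ? "yes" : "no"), "\n";
```

The same character occupies one byte in Latin1 and two in UTF-8, so a file mixing both encodings cannot be decoded consistently, whereas ASCII-only text is byte-for-byte the same in either.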