On Fri, Jul 30, 2021 at 1:48 PM Leon Timmermans <fawaka@gmail.com> wrote:

> On Fri, Jul 30, 2021 at 6:56 PM Felipe Gasper <felipe@felipegasper.com>
> wrote:
>
>> FWIW, I think this will regress Perl’s usability.
>>
>> Probably the worst part about character encoding in Perl is that nothing
>> indicates when you’ve over-encoded or under-encoded. But, at the very least
>> everything right now is consistent by default: source code is parsed as
>> bytes (“Latin-1”), and I/O happens as bytes. Thus, a “minimal-effort”
>> approach to writing Perl will at least minimize the odds of encoding
>> mismatches: you only run into trouble if you explicitly decode/encode.
>>
>> If `use v5.36` is to disrupt that consistency by making source code
>> UTF-8-decoded but *leaving* I/O as bytes, this seems likely to add another
>> “shin-bumper” to use of Perl that doesn’t happen in languages that type
>> byte strings differently from text strings.
>>
>> So quick-and-simple things like `print "é"` will now, in “modern” Perl,
>> break, with no indication of where/why until a human being comes along,
>> notices the problem, and puts in the time to debug it.
>>
>
> It doesn't actually break. PerlIO will try to downgrade that for a
> non-:utf8 handle, or upgrade for a :utf8 handle.
>

Not that it will break in implementation, but in logic. It will print the
ISO-8859-1 bytes instead of how it currently would print the UTF-8 encoded
bytes, since it started as that. (But also string operations on that
UTF-8-encoded string within the code would be wrong, but that doesn't always
matter.)

-Dan
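To make the difference concrete, here is a minimal sketch, not taken from the thread. It assumes the source file is saved as UTF-8, so the literal "é" is the two bytes C3 A9 on disk, and it spells those bytes out as escapes so the script behaves the same with or without `use utf8`. The helper `bytes_written` is a hypothetical name, and the in-memory filehandle stands in for a handle with no `:utf8` layer.

```perl
use strict;
use warnings;
use Encode qw(decode);

# What the literal "é" holds when the source is parsed as bytes (no `use utf8`).
my $as_bytes = "\xC3\xA9";

# What the same literal holds when the source is UTF-8-decoded (`use utf8`).
my $as_text = decode('UTF-8', "\xC3\xA9");

# Hypothetical helper: print a string to a handle with no :utf8 layer
# and return the bytes that were written, as hex.
sub bytes_written {
    my ($string) = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";
    print {$fh} $string;
    close($fh);
    return join(' ', map { sprintf('%02X', ord($_)) } split(//, $buffer));
}

printf("source as bytes: length %d, writes %s\n",
    length($as_bytes), bytes_written($as_bytes));
printf("source as UTF-8: length %d, writes %s\n",
    length($as_text), bytes_written($as_text));
```

The first string has length 2 and goes out as C3 A9. The second has length 1 and is downgraded to the single byte E9, which a UTF-8 terminal will not display as "é".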