Re: source encoding
From: Kenichi Ishigaki
October 28, 2017 02:54
Message ID: CADp=7twfpbtaueTjQ9uez8MOky85ddkOeka2NjmSoxo+8u9L0Q@mail.gmail.com
2017-10-26 6:51 GMT+09:00 Zefram <firstname.lastname@example.org>:
> Father Chrysostomos wrote:
>>Before perl supported Unicode, that was the most obvious way to handle
>>utf8 correctly in perl:
> It made some sense back then, but keeping such pre-5.6 Unicode-using
> programs running totally unchanged isn't a compelling backcompat case.
> If you want to keep the program running without any semantic change,
> it's easy enough to recode the source in ASCII with \x escapes.
> (Which was the other most obvious way to handle UTF-8 prior to 5.6.)
Does this mean it will be impossible for me to run a one-liner that
replaces some Kanji characters typed in CP932 (the Japanese code page
for Windows) in a CP932 terminal, or a casual script written in CP932
that does the same replacement?
>>If this stops working, I might find it annoying enough to stick with
>>an old perl for 'real' work.
> If you did that, would you drop your opposition to new perls making
> "Wide character in print" fatal?
>>I would hope that we could solve this before we make any significant
>>changes in the way source encoding is handled,
> That would make the transitions a bit nicer, but we seem to be a lot
> further from having any solution to filenames.
>>Maybe this is an alternative solution to enforcing a uniform encoding
>>on all source code: Make =encoding affect the source code.
> That would be a significant improvement over "use utf8", ameliorating the
> buffer encoding issues. But it's quite a bit more complex; especially
> the deprecation cycle would have an awful lot of cases. It re-imports
> the "=rapbqvat ebg13" problem, which Perl code had otherwise got rid
> of with the deprecation of encoding.pm, but obviously in practice we're
> already living with it with POD. And we'd force part of the POD to be
> placed before the code, which for a lot of people would mean separating
> it from the rest of the POD. It makes me uneasy to tie the program and
> the POD together more closely than they already are.
> I'm not enamoured of this idea, but nor do I see a fatal fault with it.
> I'd rather go with the fixed encoding, but if the consensus is that we
> can't countenance forcing users into a single encoding then this would
> be worth more consideration.
>>Also, have you considered the existing support perl has for UTF-16?
> Oh wow, I didn't know that existed. Wart much? Autodetection ftl.
> I'd be inclined to simply deprecate it as part of a move to UTF-8-only.