develooper Front page | perl.perl5.porters | Postings from February 2012

Re: Either clear the Unicode air--or make a release-blocker? (was:Unicode cheatsheet for Perl)

From: Leon Timmermans
Date: February 26, 2012 07:26
Subject: Re: Either clear the Unicode air--or make a release-blocker? (was:Unicode cheatsheet for Perl)
Message ID: CAHhgV8gFbxaF8sqx0UetQ7TjgctiA4TYUFS8umLegYxZO9SPig@mail.gmail.com

On Sat, Feb 25, 2012 at 11:28 PM, Tom Christiansen <tchrist@perl.com> wrote:
> Some folks claim the only "safe" way to use Unicode in Perl is to always make
> explicit calls to encode/decode with a bonus FB_CROAK argument.  They claim
> that all nine of these perfectly reasonable and common-to-the-99th-percentile
> operations...
>
>    #1.   $ perl -C...
>    #2.   $ export PERL_UNICODE=...
>
>    #3.     use utf8;
>
>    #4.     use open qw[ :std :utf8            ];
>    #5.     use open qw[ :std :encoding(UTF-8) ];
>
>    #6.     binmode(FH, ":utf8");
>    #7.     binmode(FH, ":encoding(UTF-8)");
>
>    #8.     open(FH,  "< :utf8",            $path);
>    #9.     open(FH,  "< :encoding(UTF-8)", $path);
>
> ...are all of them flawed in their not raising exceptions on UTF-8
> encoding errors of one sort or another, and that somehow not even...
>
>    #0.     use warnings qw(FATAL utf8);
>
> ...is good enough to fix it.
>
> I do not know whether these claims are true.  My own tests suggest this may
> not be the whole story, because this behaves as I think it should:
>
>  darwin$ perl -C0 -E 'say for "caf\xE9", "stuff"' |
>          perl -CS -Mwarnings=FATAL,utf8 -pe 'print "$. "'
>  utf8 "\xE9" does not map to Unicode, <> line 1.
>  Exit 255

My impression is that only the high-level readline (e.g. not what
the parser uses) actually checks input for invalid characters. Most
other operations seem to check only for well-formedness, if they check
at all. I may be mistaken, though: I haven't tested this, just
read the source.
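
For comparison, the explicit decode-with-FB_CROAK approach mentioned in the
quoted text can be sketched roughly as below. This is only an illustration,
not a statement about what any of the nine layer-based operations do
internally. The helper name decode_strict is hypothetical; decode, FB_CROAK
and LEAVE_SRC are from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not from the thread): decode a byte string as
# strict UTF-8. Returns (text, undef) on success, (undef, error) on failure.
sub decode_strict {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input instead of
    # substituting a replacement character; LEAVE_SRC keeps $bytes intact.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $text ? ($text, undef) : (undef, $@);
}

# "caf\xC3\xA9" is the UTF-8 encoding of "café".
my ($ok) = decode_strict("caf\xC3\xA9");

# "caf\xE9" is the Latin-1 encoding, the same bytes the -C0 one-liner
# in the quoted text emits; it is not well-formed UTF-8.
my (undef, $err) = decode_strict("caf\xE9");

printf "decoded %d characters\n", length $ok;
print "rejected: $err" if defined $err;
```

The point of the sketch is only that the failure surfaces as an exception at
the decode call itself, independent of which I/O layer delivered the bytes.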

Leon
