Re: pack and ASCII
From: Leon Timmermans
Date: January 10, 2012 04:44
Subject: Re: pack and ASCII
Message ID: CAHhgV8jsLR03ztmhggq7EE-hZSuOeTeH+BWRvMTmZM-+jn_XAg@mail.gmail.com
On Tue, Jan 10, 2012 at 2:17 AM, Eric Brine <ikegami@adaelis.com> wrote:
> You can count on «pack "A1", $foo» never returning more than one byte.
>
> You can't count on the character returned by «pack "A1", $foo» to be a byte,
> though.
Only one of those two can be true, and it isn't the former:
    perl -E 'no bytes; use utf8; my $foo = pack "A1", "ţ"; say bytes::length($foo)'
    2
> I presume you're expecting «pack "A*", "ţ"» to die/warn with "Wide
> character"?
>
> Then I agree, if you have buggy code, that change would remove the need
> for an extra layer of validation to detect that bug. (The bug, of course, is
> most likely that you forgot to encode your text.)
>
> It comes down to the following question: Is it more useful for «pack "A*",
> $char» to work or for «pack "A*", $non_bytes» to throw/report an error?
No, the question is whether «pack "A", $byte» or «pack "A", $character»
should DWIM. The only sane way out of this mess would be to split this
into two different formats; the question then is which one gets the
letter 'A'. I think it should be the former.
My opinion on that should be obvious by now.
> C<< pack "A1" >> will never return more than one octet.
If only that were the case.
Leon