Re: pack and ASCII
From: Eric Brine
Date: January 10, 2012 12:17
Subject: Re: pack and ASCII
Message ID: CALJW-qEpbbtkWWmnhT=5+t7xASKV0KLE2rE89UP5Y3cmcfXeKA@mail.gmail.com
On Tue, Jan 10, 2012 at 7:43 AM, Leon Timmermans <fawaka@gmail.com> wrote:
> On Tue, Jan 10, 2012 at 2:17 AM, Eric Brine <ikegami@adaelis.com> wrote:
> > You can count on «pack "A1", $foo» never returning more than one byte.
> >
> > You can't count on the character returned by «pack "A1", $foo» to be a
> > byte, though.
>
> Only one of those two can be true, and it isn't the former:
>
> perl -E 'no bytes; use utf8; my $foo = pack "A1", "ţ"; say bytes::length($foo)'
> 2
>
> bytes::length does not return the number of bytes in $foo.
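What bytes::length reports is the size of perl's internal buffer, and that can change while the string itself stays the same. A minimal sketch, assuming a plain literal such as "\xE9" starts out in the non-UTF-8 internal format (the variable names are only for illustration):

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling byte semantics

my $s    = "\xE9";    # one character, code point 0xE9
my $copy = $s;        # untouched copy for comparison

my $before = bytes::length($s);    # size of the internal buffer
utf8::upgrade($s);                 # change the internal format only
my $after  = bytes::length($s);

print "same string: ", ( $s eq $copy ? "yes" : "no" ), "\n";
print "length: ", length($s), "\n";
print "bytes::length before/after upgrade: $before/$after\n";
```

The string compares equal to its copy and length() is still 1, yet bytes::length goes from 1 to 2, so it describes storage rather than the contents of the string.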
> > I presume you're expecting «pack "A*", "ţ"» to die/warn with "Wide
> > character"?
> >
> > Then I agree, if you have buggy code, that change would remove the need
> > for an extra layer of validation to detect that bug. (The bug, of course,
> > is most likely that you forgot to encode your text.)
> >
> > It comes down to the following question: Is it more useful for «pack "A*",
> > $char» to work or for «pack "A*", $non_bytes» to throw/report an error?
>
> No, the question is if «pack "A", $byte» should DWIM or «pack "A",
> $character».
Not true. Those aren't exclusive. Right now, both DWIM. The former returns
exactly one byte. The latter returns exactly one character.
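
A minimal sketch of both cases, assuming perl 5.10 or later (where the "A" format works on characters); the two inputs are arbitrary picks, one inside the byte range and one outside it:

```perl
use strict;
use warnings;

my $byte = pack "A1", "\x{E9}";     # input fits in a byte
my $char = pack "A1", "\x{163}";    # input is U+0163 (ţ), beyond the byte range

# length() and ord() look at the string, not at its internal storage.
printf "byte input: length %d, ord 0x%X\n", length($byte), ord($byte);
printf "char input: length %d, ord 0x%X\n", length($char), ord($char);
```

Both results have length 1. Only the second holds a character that is not a byte, which is the case an encoding step before pack would avoid.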
> The only sane way out of this mess would be to split this
> up in two different formats
You still didn't say what you think the two formats should do.
> C<< pack "A1" >> will never return more than one octet.
>
If only that were the case.
>
Then give an example where it doesn't.
- Eric