develooper Front page | perl.perl5.porters | Postings from January 2012

Re: pack and ASCII

From: Eric Brine
Date: January 11, 2012 11:38
Subject: Re: pack and ASCII
Message ID: CALJW-qG3dehYqa-HZmMfYpGhLSptha094LbqMdiC+vKEijhs0A@mail.gmail.com

On Wed, Jan 11, 2012 at 8:13 AM, John P. Linderman (jpl) <jpl@research.att.com> wrote:

> The byte/character/octet confusion is hurting my head, and the
> documentation isn't helping:
>
[...]

> The result of pack on a 32-bit integer is (what I would call) 4 octets
> long, but it (IMHO) should not be called 4 characters long, if we want to
> encourage thinking in Unicode terms.
>

"Character" is not a Unicode term. A character is an element of a string.
C<< pack "N" >> results in a string of four characters (string elements)
that are bytes/octets (8-bit integers). Just like the passage you quoted
says.
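
To make that concrete, here is a minimal sketch (the value 0x12345678 and the variable names are arbitrary, picked only for illustration):

```perl
use strict;
use warnings;

# Pack a 32-bit unsigned integer in big-endian ("network") order.
my $packed = pack "N", 0x12345678;

# The result is a string with four elements (characters) ...
my $len = length $packed;

# ... and each element is an integer in 0..255, i.e. a byte/octet.
my @elems = map { ord } split //, $packed;

printf "length=%d elements=%s\n",
    $len, join " ", map { sprintf "%02X", $_ } @elems;
# prints "length=4 elements=12 34 56 78"
```

Both descriptions hold at once: four characters, each of which happens to be an octet.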

> What I want pack/unpack to do is to allow me to pack a string into a
> predetermined (and presumably adequately large) number of octets as part of
> a record I will write, and recover, using unpack and the same format, when
> I read the record back in. In that context, using SvPV or SvPVbyte, is out
> of my control


I'm not sure if you're saying that the code you use to write out your
string suffers from The Unicode Bug, or if you're saying you have a problem
with C<print> simply warning (not dying) on bad inputs.

If the former, *I'm sorry to hear the code you use to write out your string
is buggy, but the solution isn't to break C<pack>.* Perl provides two
functions to help you deal with such modules: C<utf8::downgrade> and
C<utf8::upgrade>.
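
A rough sketch of how those two functions behave, assuming a sample string whose characters all fit in 8 bits and one that doesn't (the strings and variable names are made up for the example):

```perl
use strict;
use warnings;

my $text     = "caf\xE9";      # four characters, all below 0x100
my $upgraded = $text;
utf8::upgrade($upgraded);      # changes the internal storage only

# As far as Perl code is concerned, it is still the same string.
my $same     = $text eq $upgraded                 ? 1 : 0;
my $same_len = length($text) == length($upgraded) ? 1 : 0;

# Downgrade before handing the string to code that peeks at the
# internal buffer. The true second argument asks for a false return
# value instead of an exception when downgrading is impossible.
my $ok = utf8::downgrade($upgraded, 1) ? 1 : 0;

# A string with a character above 0xFF cannot be downgraded.
my $wide    = "\x{263A}";
my $wide_ok = utf8::downgrade($wide, 1) ? 1 : 0;

print "same=$same same_len=$same_len ok=$ok wide_ok=$wide_ok\n";
# prints "same=1 same_len=1 ok=1 wide_ok=0"
```

Neither call changes which characters the string contains; they only change how Perl stores them, which matters only to code that looks at the storage directly.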

If the latter, *I'm sorry to hear you're not happy with C<print>,
but the solution isn't to break C<pack>.*

Of course, the bug would only manifest itself if you feed C<pack> bad data
in the first place. Perhaps you should fix *that* bug instead of trying to
change the function of C<< pack "A" >>. C<< pack "A" >> is documented to
work on text, and that's what it does. Obviously, it's never been limited
to 7-bit ASCII text inputs, but it's not limited to 8-bit text inputs
either.
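
For illustration, a small sketch of C<< pack "A" >> on text containing a character above 0xFF (the field width of 5 and the sample text are arbitrary, and this assumes a perl of 5.10 or later):

```perl
use strict;
use warnings;

# Text with one character above 0xFF, followed by two ASCII letters.
my $text = "\x{263A}ab";

# "A5" produces a five-character field, padded with spaces.
my $field = pack "A5", $text;
my $len   = length $field;
my @codes = map { sprintf "%X", ord } split //, $field;

# unpack with the same format strips the padding again.
my ($back) = unpack "A5", $field;
my $round_trip = $back eq $text ? 1 : 0;

print "length=$len codes=@codes round_trip=$round_trip\n";
# prints "length=5 codes=263A 61 62 20 20 round_trip=1"
```

The field is measured in characters, so how many octets end up in a file depends on how the string is encoded on output, not on C<pack>.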

> it has to be something pack and unpack agree to do.


Again, I have no objection to downgrading when possible. pack and unpack
most definitely don't have to agree to start malfunctioning. Why are you
still suggesting they should?!

- Eric
