perl.perl5.porters
Re: pack and ASCII
January 11, 2012 05:13
Message ID: 4F0D8B04.email@example.com
On 01/11/12 07:10, Leon Timmermans wrote:
> On Wed, Jan 11, 2012 at 8:13 AM, Jesse Luehrs<firstname.lastname@example.org> wrote:
>>> If you have code that requires a UTF8=0 string specifically, it is buggy.
>>> Specifically, it suffers from the Unicode bug. You are probably using SvPV
>>> without looking at the SvUTF8. The solution is simple: Use SvPVbyte instead.
>> There has to be some point when code can assume that it has a byte
>> string. What Leon is saying is that it's a lot more useful for pack to
>> use SvPVbyte itself automatically, since pack is typically used for
>> things like binary protocols and file formats, which are usually defined
>> in terms of bytes, not characters.
> Yes, this.
The byte/character/octet confusion is hurting my head, and the
documentation isn't helping:
    Takes a LIST of values and converts it into a string using the
    rules given by the TEMPLATE. The resulting string is the
    concatenation of the converted values. Typically, each
    converted value looks like its machine-level representation.
    For example, on 32-bit machines an integer may be represented
    by a sequence of 4 bytes, which will in Perl be presented as a
    string that's 4 characters long.
The result of pack on a 32-bit integer is (what I would call) 4 octets
long, but it (IMHO) should not be called 4 characters long, if we want
to encourage thinking in Unicode terms.
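
To make the terminology concrete, here is a minimal sketch of what the documentation is describing. The value 0x01020304 and the "N" format are just illustrative choices; the point is that length() reports characters, and for the output of pack each of those characters happens to hold exactly one octet.

```perl
use strict;
use warnings;

# Pack one 32-bit big-endian unsigned integer.
my $packed = pack 'N', 0x01020304;

# length() counts characters. Every character in this string has an
# ordinal below 256, so each one stands for exactly one octet.
my $chars = length $packed;

# Hex dump of the packed string, two hex digits per octet.
my $hex = unpack 'H*', $packed;

print "length: $chars\n";   # length: 4
print "octets: $hex\n";     # octets: 01020304
```

So "4 characters long" and "4 octets long" coincide here, which is arguably why the documentation's wording invites the confusion.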
What I want pack/unpack to do is to allow me to pack a string into a
predetermined (and presumably adequately large) number of octets as part
of a record I will write, and recover, using unpack and the same format,
when I read the record back in. In that context, whether SvPV or
SvPVbyte is used is out of my control; it has to be something pack and unpack
agree to do. The "A" format item does what I want if I stay in the
ASCII world, but I'd like to break out. Maybe "A" cannot be made to do
what I requested, although I *think* what Leon is talking about would do it.
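
For comparison, here is a rough sketch of how the round trip described above can be had today if the caller does the encoding explicitly. This is not the change to pack that Leon is proposing; it is a workaround under stated assumptions. The record layout ("A16 N"), the choice of UTF-8, and the helper names pack_record and unpack_record are all hypothetical, picked only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical record layout: a 16-octet, space-padded name field
# followed by a 32-bit big-endian integer.
my $FORMAT = 'A16 N';

sub pack_record {
    my ($name, $id) = @_;
    # Turn characters into UTF-8 octets first, so that "A16" counts octets.
    my $octets = encode('UTF-8', $name);
    # Refuse oversized names rather than let pack truncate mid-character.
    die "name does not fit in 16 octets\n" if length($octets) > 16;
    return pack $FORMAT, $octets, $id;
}

sub unpack_record {
    my ($record) = @_;
    my ($octets, $id) = unpack $FORMAT, $record;
    # "A" strips the trailing padding on unpack; decode back to characters.
    return (decode('UTF-8', $octets), $id);
}

my $name   = "Zo\x{eb} \x{263a}";   # 5 characters, 8 UTF-8 octets
my $record = pack_record($name, 42);
my ($name_back, $id_back) = unpack_record($record);

printf "record is %d octets\n", length $record;   # record is 20 octets
print  "round trip ok\n" if $name_back eq $name && $id_back == 42;
```

One caveat of this sketch: because "A" strips trailing whitespace on unpack, a name that genuinely ends in spaces would not survive the round trip unchanged.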