[perl #42102] unpack use internal string representation (utf8)

From: powerman @ powerman . asdfGroup . com
Date: March 26, 2007 17:04
Subject: [perl #42102] unpack use internal string representation (utf8)
Message ID: rt-3.6.HEAD-30201-1174941277-750.42102-75-0@perl.org
# New Ticket Created by  powerman@powerman.asdfGroup.com 
# Please include the string:  [perl #42102]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=42102 >



This is a bug report for perl from powerman@powerman.asdfGroup.com,
generated with the help of perlbug 1.35 running under perl v5.8.8.


-----------------------------------------------------------------
[Please enter your report here]

pcalc> $s1 = $s = "\xAA\xBB\xCC"; utf8::upgrade $s1
pcalc> x map {sprintf "%x", $_} unpack "CCC", $s
$VAR1 = 'aa';
$VAR2 = 'bb';
$VAR3 = 'cc';
pcalc> x map {sprintf "%x", $_} unpack "CCC", $s1
$VAR1 = 'c2';
$VAR2 = 'aa';
$VAR3 = 'c2';
pcalc> x map {sprintf "%x", $_} unpack "n", $s
$VAR1 = 'aabb';
pcalc> x map {sprintf "%x", $_} unpack "n", $s1
$VAR1 = 'c2aa';
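
For anyone without pcalc, the session above can be sketched as a standalone script. This is only a minimal reproduction attempt: the variable names are kept from the session, and the final "match"/"differ" line is my own addition. On the perl v5.8.8 described in this report it should print "differ"; other perl versions may behave differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same three bytes as in the session above.
my $s  = "\xAA\xBB\xCC";
my $s1 = $s;
utf8::upgrade($s1);    # same characters, but stored internally as UTF-8

# The two strings compare equal at the Perl level...
print "equal: ", ($s eq $s1 ? "yes" : "no"), "\n";

# ...so one would expect unpack to return the same values for both.
my @plain    = map { sprintf "%x", $_ } unpack "CCC", $s;
my @upgraded = map { sprintf "%x", $_ } unpack "CCC", $s1;

print "plain:    @plain\n";
print "upgraded: @upgraded\n";
print "unpack results ", ("@plain" eq "@upgraded" ? "match" : "differ"), "\n";
```

The point of the comparison is that $s and $s1 are equal under eq, so any difference in the unpack output comes only from the internal representation.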

I actually ran into this issue while using JSON::XS and Compress::Zlib.
I received an HTTP reply from a web server, packed it into JSON, and
transferred it to another part of my application, which unpacked it from
JSON (at this point my bytes become marked 'UTF8', as Devel::Peek shows).
Compress::Zlib then fails to ungzip this HTTP reply because it uses:

    sub _removeGzipHeader
        ...
        unpack ('CCCCVCC', $$string);

The JSON::XS author says this is a bug in perl and asked me to send a bug report.

pcalc> $s = "\xAA\xBB\xCC"; $s1=from_json(to_json([$s]))->[0];
pcalc> Dump $s
SV = PV(0x1040cc40) at 0x1051c060
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x105e04d8 "\252\273\314"\0
  CUR = 3
  LEN = 4
pcalc> Dump $s1
SV = PVMG(0x1051c4e8) at 0x1051c084
  REFCNT = 1
  FLAGS = (SMG,POK,pPOK,UTF8)
  IV = 0
  NV = 0
  PV = 0x105e7820 "\302\252\302\273\303\214"\0 [UTF8 "\x{aa}\x{bb}\x{cc}"]
  CUR = 6
  LEN = 8
  MAGIC = 0x105e7910
    MG_VIRTUAL = &PL_vtbl_utf8
    MG_TYPE = PERL_MAGIC_utf8(w)
    MG_LEN = 3
pcalc> x map {sprintf "%o", $_} unpack "CCC", $s
$VAR1 = '252';
$VAR2 = '273';
$VAR3 = '314';
pcalc> x map {sprintf "%o", $_} unpack "CCC", $s1
$VAR1 = '302';
$VAR2 = '252';
$VAR3 = '302';
pcalc> utf8::downgrade $s1
pcalc> Dump $s1
SV = PV(0x10d389b0) at 0x10e47e24
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x10f12e90 "\252\273\314"\0
  CUR = 3
  LEN = 8
pcalc> x map {sprintf "%o", $_} unpack "CCC", $s1
$VAR1 = '252';
$VAR2 = '273';
$VAR3 = '314';
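
The same flag and length information that Devel::Peek shows above can also be checked from plain Perl. The following is a small sketch using only utf8::is_utf8, length and bytes::length; it builds the upgraded string with utf8::upgrade instead of a JSON round trip, which is an assumption made to keep the example free of non-core modules.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling the bytes pragma

my $s  = "\xAA\xBB\xCC";
my $s1 = $s;
utf8::upgrade($s1);

for my $pair ([ plain => $s ], [ upgraded => $s1 ]) {
    my ($name, $str) = @$pair;
    # length() counts characters; bytes::length() counts the bytes of the
    # internal buffer (the CUR value in the Dump output).
    printf "%-8s utf8 flag=%d chars=%d internal bytes=%d\n",
        $name,
        utf8::is_utf8($str) ? 1 : 0,
        length($str),
        bytes::length($str);
}
```

Both strings report 3 characters, but the internal buffer is 3 bytes for the plain string and 6 bytes for the upgraded one, matching CUR = 3 and CUR = 6 in the dumps.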

Right now I'm using utf8::downgrade as a workaround...
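
A minimal sketch of that workaround, wrapped in a helper so the caller's scalar is not modified. The name unpack_bytes is hypothetical (it is not part of perl or Compress::Zlib); only utf8::downgrade and unpack are real calls here.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack a string that is supposed to hold raw bytes,
# regardless of whether its internal UTF8 flag is set.
sub unpack_bytes {
    my ($template, $string) = @_;    # $string is a copy; the caller's scalar is untouched

    # With a true second argument utf8::downgrade returns false instead of
    # dying when the string contains characters above 0xFF.
    utf8::downgrade($string, 1)
        or die "unpack_bytes: string contains characters above 0xFF\n";

    return unpack $template, $string;
}

my $s1 = "\xAA\xBB\xCC";
utf8::upgrade($s1);

my @octets = map { sprintf "%o", $_ } unpack_bytes("CCC", $s1);
print "@octets\n";    # 252 273 314
```

Downgrading a copy keeps the original string as it was, and the explicit die makes it obvious when the data is not really binary (for example, when it contains characters above 0xFF).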

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=high
---
Site configuration information for perl v5.8.8:

Configured by Gentoo at Mon Oct 30 06:06:29 EET 2006.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.16-hardened-r11, archname=i686-linux
    uname='linux home 2.6.16-hardened-r11 #9 smp mon oct 30 04:43:33 eet 2006 i686 intel(r) core(tm)2 cpu 6600 @ 2.40ghz gnulinux '
    config_args='-des -Darchname=i686-linux -Dcccdlflags=-fPIC -Dccdlflags=-rdynamic -Dcc=i686-pc-linux-gnu-gcc -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr -Dlocincpth=  -Doptimize=-march=pentium-m -msse3 -O2 -pipe -Duselargefiles -Dd_semctl_semun -Dscriptdir=/usr/bin -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3pm -Dinc_version_list=5.8.0 5.8.0/i686-linux 5.8.2 5.8.2/i686-linux 5.8.4 5.8.4/i686-linux 5.8.5 5.8.5/i686-linux 5.8.6 5.8.6/i686-linux 5.8.7 5.8.7/i686-linux  -Dcf_by=Gentoo -Ud_csh -Dusenm -Di_ndbm -Di_gdbm -Di_db'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='i686-pc-linux-gnu-gcc', ccflags ='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-march=pentium-m -msse3 -O2 -pipe',
    cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement'
    ccversion='', gccversion='3.4.6 (Gentoo Hardened 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='i686-pc-linux-gnu-gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lpthread -lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    

---
@INC for perl v5.8.8:
    /etc/perl
    /usr/lib/perl5/vendor_perl/5.8.8/i686-linux
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/site_perl/5.8.8/i686-linux
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/5.8.8/i686-linux
    /usr/lib/perl5/5.8.8
    /usr/local/lib/site_perl
    .

---
Environment for perl v5.8.8:
    HOME=/home/powerman
    LANG=ru_RU.KOI8-R
    LANGUAGE (unset)
    LC_NUMERIC=POSIX
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/powerman/bin:/home/powerman/inferno-os/Linux/386/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/sbin:/usr/sbin:/usr/local/sbin:/usr/games/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.4.6:/opt/sun-jdk-1.4.2.13/bin:/opt/sun-jdk-1.4.2.13/jre/bin:/opt/sun-jdk-1.4.2.13/jre/javaws:/usr/kde/3.5/bin:/usr/qt/3/bin:/usr/games/bin:/opt/vmware/workstation/bin:/var/qmail/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash



