utf-8 and taint don't work together
From:
Stas Bekman
Date:
May 19, 2004 22:51
Subject:
utf-8 and taint don't work together
Message ID:
40AC475F.4010208@stason.org
A quick check before I file a proper bug report (or maybe Rob already did).
Rob Mueller reported a problem with utf-8 and taint on the modperl list last
night, and just now I was bitten by the same problem. Read Rob's report if you
prefer Perl code, or mine below if you prefer C.
/* some data obtained from untrusted source, could be UTF-8 or not, but for
this report think it's UTF-8 */
char *str = get_data()...
SV *buf = newSVpvn(str, len);
gives us:
SV = PV(0x93ca494) at 0x93ee348
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
PV = 0x93fc8c8 "utf8 data snipped"
CUR = 140
LEN = 141
But if we turn the taint flag on, the UTF8 flag disappears from FLAGS. Please observe:
buf = newSVpvn(str, len);
SvTAINTED_on(buf);
SV = PVMG(0x93ec4b0) at 0x93ee320
REFCNT = 1
FLAGS = (PADBUSY,PADMY,GMG,SMG,pPOK)
IV = 0
NV = 0
PV = 0x93fc938 "utf8 data snipped"
CUR = 140
LEN = 141
MAGIC = 0x9402ee0
MG_VIRTUAL = &PL_vtbl_taint
MG_TYPE = PERL_MAGIC_taint(t)
MG_LEN = 1
Calling SvUTF8_on(buf) afterwards brings the flag back, but we can't do that
unconditionally, since we don't know whether the data is utf-8 or not. What's
the fix? I also hope there is an easy workaround for older perls, because we
really need to have the taint flag on.
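
One possible stopgap, offered only as a sketch: check whether the bytes are
well-formed UTF-8 and re-assert the flag only in that case. The helper below is
hypothetical (looks_like_utf8 is a made-up name, not part of the Perl API) and
it is plain C with no Perl headers. It is also only a heuristic: well-formed
UTF-8 does not prove the sender meant the data as UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the Perl API.
 * Returns 1 if the first len bytes of s are well-formed UTF-8, else 0.
 * Rejects overlong encodings, UTF-16 surrogates and code points
 * above U+10FFFF. Pure ASCII input is reported as well-formed. */
static int looks_like_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after the lead */
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point legal for this length */
        size_t k;

        if (c < 0x80) {                 /* ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or bad lead */
        }

        if (len - i - 1 < need)
            return 0;                   /* sequence truncated by end of buffer */

        for (k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;                   /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogate half */
        if (cp > 0x10FFFF)
            return 0;                   /* beyond Unicode range */

        i += need + 1;
    }
    return 1;
}
```

In the code above one would then, after SvTAINTED_on(buf), call SvUTF8_on(buf)
only when looks_like_utf8((const unsigned char *)str, len) returns true. Perl's
own is_utf8_string() performs a comparable check and is probably the better
choice inside XS code; the standalone version is here just to make the idea
concrete.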
Thanks.
perl -V
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Platform:
osname=linux, osvers=2.6.3-9mdk, archname=i686-linux-thread-multi
uname='linux rabbit.stason.org 2.6.3-9mdk #1 fri apr 23 16:41:09 edt 2004
i686 unknown unknown gnulinux '
config_args='-des -Dprefix=/home/stas/perl/5.8.4-ithread -Dusethreads
-Doptimize=-g -Duseshrplib -Dusedevel -Accflags=-DDEBUG_LEAKING_SCALARS'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBUG_LEAKING_SCALARS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-g',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBUG_LEAKING_SCALARS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include
-I/usr/include/gdbm'
ccversion='', gccversion='3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.3.3.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.3'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E
-Wl,-rpath,/home/stas/perl/5.8.4-ithread/lib/5.8.4/i686-linux-thread-multi/CORE'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES
PERL_IMPLICIT_CONTEXT
Built under linux
Compiled at May 8 2004 23:38:53
%ENV:
PERLDOC_PAGER="less -R"
@INC:
/home/stas/perl/5.8.4-ithread/lib/5.8.4/i686-linux-thread-multi
/home/stas/perl/5.8.4-ithread/lib/5.8.4
/home/stas/perl/5.8.4-ithread/lib/site_perl/5.8.4/i686-linux-thread-multi
/home/stas/perl/5.8.4-ithread/lib/site_perl/5.8.4
/home/stas/perl/5.8.4-ithread/lib/site_perl
.
--
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com