develooper Front page | perl.perl5.porters | Postings from January 2001

Re: 8482 busted for $^V sprintf()s on OS/390

From: Hugo
Date: January 21, 2001 04:29
Subject: Re: 8482 busted for $^V sprintf()s on OS/390
Message ID: 200101211334.NAA18121@crypt.compulink.co.uk
In <20010120214851.W22573@chaos.wustl.edu>, Jarkko Hietaniemi writes:
:On Sat, Jan 20, 2001 at 11:34:16PM +0000, Simon Cozens wrote:
:> On Fri, Jan 19, 2001 at 03:27:05PM -0800, Peter Prymmer wrote:
:> > That too can be worked around by commenting out the mention of $^V.
:> 
:> Very strange. First, I thought this was an obvious result of putting
:> the EBCDIC<->Unicode tables in place, but I've realised we haven't put
:> them in place! Something's messing up v-strings pretty badly. If you
:> have a second, I'd appreciate it if you could try:
:> 
:> perl -le 'printf "%v", $^V'
:> perl -le 'print ord for split //, $^V'
:> perl -le 'use Devel::Peek; Dump($^V)'
:> perl -le 'use Devel::Peek; $a = sprintf "%v", $^V; Dump($a)'
:
:The last one even in a UNIX tells us:
:
:SV = PV(0x140001ec0) at 0x140001b40
:  REFCNT = 1
:  FLAGS = (POK,READONLY,pPOK,UTF8)
:  PV = 0x140014030 "128.256"\0
:  CUR = 7
:  LEN = 9
:
:Ho-hum, what's the UTF8 flag doing in there?  It's not wrong as
:such -- in ASCII, that is.  In EBCDIC, that's bad news since the
:digits are >127.

Presumably anything that scans digits will have problems on EBCDIC
platforms then. Consider this:
        while (isDIGIT(*s))
            count = count * 10 + (*s++ - '0');
.. which (I think) does the right thing with and without UTF8 under
ASCII, and with straight EBCDIC, but not with UTF8-encoded EBCDIC.
Similarly, sprintf emits digits without regard to UTF8-ness.
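
To make the concern above concrete, here is a minimal standalone sketch,
not taken from sv.c. The helper names scan_digits and encode_utf8_small
are hypothetical, and standard isdigit() stands in for perl's isDIGIT.
It assumes, as the message does, that a UTF8-flagged EBCDIC string would
hold its digits (0xF0..0xF9) as two-byte UTF-8 sequences.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from sv.c): parse a run of leading decimal
 * digits byte by byte, the same way the loop quoted above does. */
static unsigned long scan_digits(const unsigned char *s, size_t len)
{
    unsigned long count = 0;
    size_t i = 0;

    assert(s != NULL);
    /* C guarantees '0'..'9' are contiguous in the execution character
     * set, so (s[i] - '0') works for plain ASCII and plain EBCDIC. */
    while (i < len && isdigit(s[i])) {
        count = count * 10 + (unsigned long)(s[i] - '0');
        i++;
    }
    return count;
}

/* Hypothetical helper: UTF-8-encode one code point below 0x800.
 * Anything above 0x7F becomes two bytes, neither of which is the
 * original byte value. */
static size_t encode_utf8_small(unsigned int cp, unsigned char *out)
{
    assert(out != NULL && cp < 0x800);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}
```

Feeding the EBCDIC code for '1' (0xF1) to encode_utf8_small yields the
two bytes 0xC3 0xB1, so a byte-wise digit scanner would no longer see a
digit there, whereas an ASCII '1' (0x31) stays a single byte.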

The attached patch should at least avoid setting the UTF8 flag in the
above example. (Note that renaming 'utf' to 'vec_utf' is cosmetic
only.)

Hugo
--- sv.c.old	Fri Jan 19 14:41:38 2001
+++ sv.c	Sun Jan 21 13:25:56 2001
@@ -6780,7 +6780,7 @@
 	bool left = FALSE;
 	bool vectorize = FALSE;
 	bool vectorarg = FALSE;
-	bool utf = FALSE;
+	bool vec_utf = FALSE;
 	char fill = ' ';
 	char plus = 0;
 	char intsize = 0;
@@ -6918,19 +6918,17 @@
 	    if (args) {
 		vecsv = va_arg(*args, SV*);
 		vecstr = (U8*)SvPVx(vecsv,veclen);
-		utf = DO_UTF8(vecsv);
+		vec_utf = DO_UTF8(vecsv);
 	    }
 	    else if (efix ? efix <= svmax : svix < svmax) {
 		vecsv = svargs[efix ? efix-1 : svix++];
 		vecstr = (U8*)SvPVx(vecsv,veclen);
-		utf = DO_UTF8(vecsv);
+		vec_utf = DO_UTF8(vecsv);
 	    }
 	    else {
 		vecstr = (U8*)"";
 		veclen = 0;
 	    }
-	    if (DO_UTF8(vecsv))
-		is_utf = TRUE;
 	}
 
 	if (asterisk) {
@@ -7099,7 +7097,7 @@
 		STRLEN ulen;
 		if (!veclen)
 		    continue;
-		if (utf)
+		if (vec_utf)
 		    iv = (IV)utf8_to_uv(vecstr, veclen, &ulen, 0);
 		else {
 		    iv = *vecstr;
@@ -7179,7 +7177,7 @@
 	vector:
 		if (!veclen)
 		    continue;
-		if (utf)
+		if (vec_utf)
 		    uv = utf8_to_uv(vecstr, veclen, &ulen, 0);
 		else {
 		    uv = *vecstr;
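
The idea behind the patch can be sketched outside of perl roughly as
follows. This is a simplified illustration, not the sv.c code: the
function names next_ordinal and format_vector are hypothetical, the
decoder handles only one- and two-byte UTF-8 sequences without
validation, and snprintf stands in for perl's own formatting. The point
it tries to show is that vec_utf only controls how the vector argument
is decoded, while the formatted result is plain digits and dots and so
needs no UTF8 flag of its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified decoder: returns the next ordinal from the
 * vector string and reports how many bytes it consumed. With vec_utf
 * set, a two-byte UTF-8 sequence is decoded; otherwise one raw byte. */
static unsigned long next_ordinal(const unsigned char *s, size_t len,
                                  int vec_utf, size_t *used)
{
    if (vec_utf && len >= 2 && (s[0] & 0xE0) == 0xC0) {
        *used = 2;
        return ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    }
    *used = 1;
    return s[0];
}

/* Hypothetical stand-in for the vector formatting: join the ordinals
 * with '.', returning the number of characters written. */
static size_t format_vector(const unsigned char *vec, size_t veclen,
                            int vec_utf, char *out, size_t outsize)
{
    size_t pos = 0;

    assert(vec != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    while (veclen > 0) {
        size_t used;
        unsigned long ord = next_ordinal(vec, veclen, vec_utf, &used);
        int n = snprintf(out + pos, outsize - pos, "%s%lu",
                         pos ? "." : "", ord);
        if (n < 0 || (size_t)n >= outsize - pos) {
            out[pos] = '\0';    /* drop the truncated piece and stop */
            break;
        }
        pos += (size_t)n;
        vec += used;
        veclen -= used;
    }
    return pos;
}
```

With the bytes C2 80 C4 80 and vec_utf set, this produces the seven
characters "128.256", matching the CUR = 7 in the dump quoted earlier;
with vec_utf clear the same bytes give "194.128.196.128".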
