[PATCH perl@8269] scanning two hex-constants fails on EBCDIC environment (script length.t)

From: Roca, Ignasi
Date: January 4, 2001 08:52
Subject: [PATCH perl@8269] scanning two hex-constants fails on EBCDIC environment (script length.t)
Message ID: 5930DC161690D211966700902715754703738AA6@madt009a.siemens.es

On EBCDIC platforms, a string containing two or more hex constants, for
example "\x{100}\x{80}", fails while scanning the second hex constant.
The reason is that in toke.c the backslash of the second constant is
not handled by the "backslashes" block:
if (*s == '\\' && s+1 < send) {...}
This is because the EBCDIC backslash has the value 0xBC, which has the high
bit set, so the scanner branches instead to the block:
if (*s & 0x80 && (this_utf8 || has_utf8)) {...}

So I moved the "backslashes" block ahead of it to solve the problem. I do not
expect this to have any bad consequences.

The diff for this change follows. It fixes some of the failing
regression tests (op/concat, op/join, op/length, ...).


===================================================
--- toke.c.orig Sat Dec 30 17:51:08 2000
+++ toke.c      Thu Jan  4 17:27:03 2001
@@ -1311,29 +1311,6 @@
                break;          /* in regexp, $ might be tail anchor */
        }
        
-       /* (now in tr/// code again) */
-
-       if (*s & 0x80 && (this_utf8 || has_utf8)) {
-           STRLEN len = (STRLEN) -1;
-           UV uv;
-           if (this_utf8) {
-               uv = utf8_to_uv((U8*)s, send - s, &len, UTF8_CHECK_ONLY);
-           }
-           if (len == (STRLEN)-1) {
-               /* Illegal UTF8 (a high-bit byte), make it valid. */
-               char *old_pvx = SvPVX(sv);
-               /* need space for one extra char (NOTE: SvCUR() not set here) */
-               d = SvGROW(sv, SvLEN(sv) + 1) + (d - old_pvx);
-               d = (char*)uv_to_utf8((U8*)d, (U8)*s++);
-           }
-           else {
-               while (len--)
-                   *d++ = *s++;
-           }
-           has_utf8 = TRUE;
-           continue;
-       }
-
        /* backslashes */
        if (*s == '\\' && s+1 < send) {
            bool to_be_utf8 = FALSE;
@@ -1567,6 +1544,29 @@
            s++;
            continue;
        } /* end if (backslash) */
+
+       /* (now in tr/// code again) */
+
+       if (*s & 0x80 && (this_utf8 || has_utf8)) {
+           STRLEN len = (STRLEN) -1;
+           UV uv;
+           if (this_utf8) {
+               uv = utf8_to_uv((U8*)s, send - s, &len, UTF8_CHECK_ONLY);
+           }
+           if (len == (STRLEN)-1) {
+               /* Illegal UTF8 (a high-bit byte), make it valid. */
+               char *old_pvx = SvPVX(sv);
+               /* need space for one extra char (NOTE: SvCUR() not set here) */
+               d = SvGROW(sv, SvLEN(sv) + 1) + (d - old_pvx);
+               d = (char*)uv_to_utf8((U8*)d, (U8)*s++);
+           }
+           else {
+               while (len--)
+                   *d++ = *s++;
+           }
+           has_utf8 = TRUE;
+           continue;
+       }
 
        *d++ = *s++;
     } /* while loop to process each character */
=======================================================
