Re: [perl #33734] unpack fails on utf-8 strings

From: Nicholas Clark
Date: January 13, 2005 13:23
Subject: Re: [perl #33734] unpack fails on utf-8 strings
Message ID: 20050113212351.GK8659@plum.flirble.org

On Thu, Jan 13, 2005 at 01:56:33PM +0000, Nicholas Clark wrote:

> I didn't know, but looking at the pack implementation, it's 'U', and only 'U':

On a closer look, it seems to be both 'C' and 'U'.

> I'm about to test a hack that might make most things work.

After changing t/op/join.t to avoid using H* to probe the innards of UTF8
scalars, the appended diff does make all tests pass. However, I'm not
convinced that it's the way to go.

Nicholas Clark

==== //depot/perl/pp_pack.c#52 - /Users/nick/p4perl/perl/pp_pack.c ====
--- /tmp/tmp.5559.0     Thu Jan 13 21:21:50 2005
+++ /Users/nick/p4perl/perl/pp_pack.c   Thu Jan 13 16:26:17 2005
@@ -1869,7 +1869,13 @@ PP(pp_unpack)
      */
     register char *s = SvPVbyte(right, rlen);
 #else
-    register char *s = SvPV(right, rlen);
+    /* This is a hack. Only the "U" and "C" patterns require Unicode input,
+       so downgrade everything else. We're assuming that no-one is mad enough
+       to mix U patterns and regular packed data. This will, of course, be
+       wrong.
+    */
+    register char *s = (strchr (pat, 'U') || strchr (pat, 'C'))
+       ? SvPV(right, rlen) : SvPVbyte(right, rlen);
 #endif
     char *strend = s + rlen;
     register char *patend = pat + llen;
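
For anyone who wants to poke at the selection logic outside the interpreter, here is a minimal standalone sketch of the same test. The helper name template_wants_characters is hypothetical and is not part of pp_pack.c; it also assumes the template is passed with an explicit length, so it scans with memchr() rather than the strchr() used in the diff. Like the hack itself, it is deliberately crude: any 'U' or 'C' byte anywhere in the template counts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of pp_pack.c: reports whether an unpack
   template contains a 'U' or 'C' code within its first len bytes.
   memchr() keeps the scan bounded by len, so the template does not
   need to be NUL-terminated. */
static bool
template_wants_characters(const char *pat, size_t len)
{
    return memchr(pat, 'U', len) != NULL
        || memchr(pat, 'C', len) != NULL;
}
```

In the patched pp_unpack, a true result would correspond to taking the SvPV branch and a false result to the SvPVbyte (downgrading) branch.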
