develooper Front page | perl.perl5.porters | Postings from February 2003

Re: [perl #21395] rcatline doesn't grok utf8

From: Enache Adrian
Date: February 28, 2003 15:08
Subject: Re: [perl #21395] rcatline doesn't grok utf8
Message ID: 20030228230947.GB952@ratsnest.hole

On Fri, Feb 28, 2003 at 12:20:31PM +0000, Nicholas Clark wrote:
> $ perl5.8.0 -lwe '$_ = chr 128; binmode STDIN, ":utf8"; $_ .= <STDIN>; print ord $_' <testutf8 
> Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding start byte) in ord at -e line 1, <STDIN> line 1.
> 0
> $ ./perl -lwe '$_ = chr 128; binmode STDIN, ":utf8"; $_ .= <STDIN>; print ord $_' <testutf8 
> Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding start byte) in ord at -e line 1, <STDIN> line 1.
> 0

Sorry for the hasty patch. I just skipped the scalar-utf8/file-utf8 case.

> Is it as simple as putting a check on whether the file handle is flagged
> as UTF8, and if it is upgrading the existing scalar?

It looks like the 'append' position must be recomputed, too.
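To illustrate why the offset moves: a standalone C sketch, not the sv.c code. The helper names latin1_to_utf8 and char_to_byte_offset are hypothetical and only stand in, roughly, for what sv_utf8_upgrade_nomg and sv_pos_u2b do in the patch. It assumes the existing scalar holds single-byte (Latin-1) characters before the upgrade.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: re-encode len Latin-1 bytes as UTF-8.
 * Returns a malloc'd, NUL-terminated buffer and stores its byte
 * length in *out_len. Bytes >= 0x80 grow to two bytes each. */
static unsigned char *latin1_to_utf8(const unsigned char *src, size_t len,
                                     size_t *out_len)
{
    unsigned char *dst;
    size_t i, j = 0;

    assert(src != NULL && out_len != NULL);
    dst = malloc(2 * len + 1);          /* worst case: every byte doubles */
    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (src[i] < 0x80) {
            dst[j++] = src[i];
        } else {
            dst[j++] = (unsigned char)(0xC0 | (src[i] >> 6));
            dst[j++] = (unsigned char)(0x80 | (src[i] & 0x3F));
        }
    }
    dst[j] = '\0';
    *out_len = j;
    return dst;
}

/* Hypothetical helper: turn a character offset into a byte offset
 * in a valid UTF-8 buffer by skipping continuation bytes (10xxxxxx). */
static size_t char_to_byte_offset(const unsigned char *s, size_t len,
                                  size_t nchars)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len && nchars > 0) {
        i++;                             /* step over the start byte */
        while (i < len && (s[i] & 0xC0) == 0x80)
            i++;                         /* and over its continuation bytes */
        nchars--;
    }
    return i;
}
```

With the scalar from the test case (chr 128, one byte 0x80), the upgraded buffer is the two bytes 0xC2 0x80, so an append offset of 1 character becomes 2 bytes. Reading the new line at the old offset would overwrite the second byte of the upgraded character.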

If the patch below proves correct (it gets your tests + make test right),
I'll try to make the append/utf8 tests a bit smarter and more compact.

Regards

Adi

-----------------------------------------------------------------------
--- /arc/perl-current/sv.c	2003-02-26 04:50:55.000000000 +0200
+++ sv.c	2003-03-01 00:51:48.000000000 +0200
@@ -6247,7 +6247,16 @@ Perl_sv_gets(pTHX_ register SV *sv, regi
     (void)SvUPGRADE(sv, SVt_PV);
 
     SvSCREAM_off(sv);
-    SvPOK_only(sv);    /* Validate pointer */
+
+    append ? SvPOK_only_UTF8(sv) : SvPOK_only(sv);
+
+    if (PerlIO_isutf8(fp)) {
+	if (append) {
+	    sv_utf8_upgrade_nomg(sv);
+	    sv_pos_u2b(sv,&append,0);
+	} else
+	    SvUTF8_on(sv);
+    }
 
     if (PL_curcop == &PL_compiling) {
 	/* we always read code in line mode */
@@ -6290,7 +6299,7 @@ Perl_sv_gets(pTHX_ register SV *sv, regi
 #endif
       SvCUR_set(sv, bytesread += append);
       buffer[bytesread] = '\0';
-      goto check_utf8_and_return;
+      return (SvCUR(sv) - append) ? SvPVX(sv) : Nullch;
     }
     else if (RsPARA(PL_rs)) {
 	rsptr = "\n\n";
@@ -6543,12 +6552,6 @@ screamer2:
 	}
     }
 
-check_utf8_and_return:
-    if (PerlIO_isutf8(fp))
-	SvUTF8_on(sv);
-    else
-	SvUTF8_off(sv);
-
     return (SvCUR(sv) - append) ? SvPVX(sv) : Nullch;
 }
 
