UTF8 flag and sv_utf8_upgrade

Nick Ing-Simmons
December 12, 2000 04:30
Nick Ing-Simmons writes:
>B. When I fix A in Tk sources it core dumps in a manner which suggests
>   something has done heap-overrun.

Debugging this I discovered a "feature" of sv_utf8_upgrade:

Perl_sv_utf8_upgrade(pTHX_ register SV *sv)
{
    char *s, *t;
    bool hibit;

    if (!sv || !SvPOK(sv) || SvUTF8(sv))
	return;

    /* This function could be much more efficient if we had a FLAG in SVs
     * to signal if there are any hibit chars in the PV.
     */
    for (s = t = SvPVX(sv), hibit = FALSE; t < SvEND(sv) && !hibit; t++)
	if (*t & 0x80)
	    hibit = TRUE;

    if (hibit) {
	STRLEN len = SvCUR(sv) + 1; /* Plus the \0 */
	SvPVX(sv) = (char*)bytes_to_utf8((U8*)s, &len);
	SvCUR(sv) = len - 1;
	SvLEN(sv) = len; /* No longer know the real size. */
	Safefree(s); /* No longer using what was there before. */
    }
}

Tk wants UTF8, so every time it sees an SV without SvUTF8 set it calls
upgrade. The code above then scans the string, sees that it is all ASCII,
and exits without setting the flag.
A few C statements later we do it again, and again, and ...

Is there any reason NOT to turn on SvUTF8 once we have established
that it is valid UTF8 - even if only because it has no high bit chars?
Should perl do this or should Tk do it?
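
To make the cost concrete, here is a minimal sketch in plain C. It is a toy model and does not use Perl's API: ToyStr, toy_new, toy_upgrade and the flag_ascii parameter are all hypothetical names made up for illustration, and the scan counter exists only to show how often the buffer gets walked. It assumes the bytes are Latin-1 when they are not yet flagged as UTF8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SV: a byte buffer plus an "is UTF-8" flag. */
typedef struct {
    unsigned char *buf;  /* NUL-terminated bytes */
    size_t cur;          /* length, not counting the NUL */
    bool utf8;           /* plays the role of the SvUTF8 flag */
    unsigned scans;      /* how many times the buffer has been scanned */
} ToyStr;

/* Copy a C string into a fresh, unflagged ToyStr. */
static ToyStr toy_new(const char *bytes)
{
    ToyStr s;
    s.cur = strlen(bytes);
    s.buf = malloc(s.cur + 1);
    assert(s.buf != NULL);
    memcpy(s.buf, bytes, s.cur + 1);
    s.utf8 = false;
    s.scans = 0;
    return s;
}

/* Upgrade Latin-1 bytes to UTF-8 in place.
 * flag_ascii == false: leave the flag off for pure-ASCII input, so every
 *                      later call scans the buffer again.
 * flag_ascii == true:  set the flag for pure-ASCII input too, so later
 *                      calls return at once. */
static void toy_upgrade(ToyStr *s, bool flag_ascii)
{
    size_t i, extra = 0;
    unsigned char *out, *d;

    assert(s != NULL && s->buf != NULL);
    if (s->utf8)
        return;

    s->scans++;
    for (i = 0; i < s->cur; i++)
        if (s->buf[i] & 0x80)
            extra++;             /* each high-bit byte grows to two bytes */

    if (extra == 0) {
        if (flag_ascii)
            s->utf8 = true;      /* ASCII is already valid UTF-8 */
        return;
    }

    out = malloc(s->cur + extra + 1);
    assert(out != NULL);
    for (d = out, i = 0; i < s->cur; i++) {
        unsigned char c = s->buf[i];
        if (c & 0x80) {
            *d++ = (unsigned char)(0xC0 | (c >> 6));
            *d++ = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            *d++ = c;
        }
    }
    *d = '\0';
    free(s->buf);
    s->buf = out;
    s->cur += extra;
    s->utf8 = true;
}
```

Calling toy_upgrade three times on "hello" with flag_ascii false scans the buffer three times; with flag_ascii true it scans once and the later calls return immediately. Whether the real flag should be set by perl or by the caller is exactly the open question above, and the sketch does not settle it.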

Nick Ing-Simmons
Via, but not speaking for: Texas Instruments Ltd.
