Nick Ing-Simmons <nik@tiuk.ti.com> writes:
>B. When I fix A in Tk sources it core dumps in a manner which suggests
>   something has done heap-overrun.

Debugging this I discovered a "feature" of sv_utf8_upgrade:

    void
    Perl_sv_utf8_upgrade(pTHX_ register SV *sv)
    {
        char *s, *t;
        bool hibit;

        if (!sv || !SvPOK(sv) || SvUTF8(sv))
            return;

        /* This function could be much more efficient if we had a FLAG in SVs
         * to signal if there are any hibit chars in the PV.
         */
        for (s = t = SvPVX(sv), hibit = FALSE; t < SvEND(sv) && !hibit; t++)
            if (*t & 0x80)
                hibit = TRUE;

        if (hibit) {
            STRLEN len = SvCUR(sv) + 1; /* Plus the \0 */
            SvPVX(sv) = (char*)bytes_to_utf8((U8*)s, &len);
            SvCUR(sv) = len - 1;
            SvLEN(sv) = len; /* No longer know the real size. */
            SvUTF8_on(sv);
            Safefree(s); /* No longer using what was there before. */
        }
    }

Tk wants UTF8, so every time it sees an SV without SvUTF8 set it calls
upgrade. The code above then scans the string, sees that it is all ASCII,
and exits without setting the flag. A few C statements later we do it
again, and again, and ...

Is there any reason NOT to turn on SvUTF8 once we have established that
the string is valid UTF8, even if only because it has no high-bit chars?

Should perl do this, or should Tk do it?

-- 
Nick Ing-Simmons <nik@tiuk.ti.com>
Via, but not speaking for: Texas Instruments Ltd.
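
To make the question concrete, here is a minimal standalone sketch of the proposed behaviour. It does not use the Perl API: ToyString, toy_new, toy_upgrade, toy_free and the scans counter are all hypothetical stand-ins invented for illustration, and the input is assumed to be Latin-1 bytes. The only point it shows is that setting the flag after an all-ASCII scan lets later calls return without rescanning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SV: a byte buffer plus a "known UTF-8" flag. */
typedef struct {
    unsigned char *buf;   /* heap-allocated, NUL-terminated */
    size_t len;           /* bytes in use, excluding the NUL */
    int is_utf8;          /* plays the role of the SvUTF8 flag */
    unsigned scans;       /* counts full scans, for demonstration only */
} ToyString;

/* Build a ToyString from a NUL-terminated byte string (Latin-1 assumed). */
ToyString toy_new(const char *src)
{
    ToyString ts;
    assert(src != NULL);
    ts.len = strlen(src);
    ts.buf = malloc(ts.len + 1);
    assert(ts.buf != NULL);
    memcpy(ts.buf, src, ts.len + 1);
    ts.is_utf8 = 0;
    ts.scans = 0;
    return ts;
}

/* Upgrade to UTF-8, and mark the string as UTF-8 even if it was pure ASCII. */
void toy_upgrade(ToyString *ts)
{
    size_t i, high = 0;

    if (ts == NULL || ts->is_utf8)
        return;                       /* flag already set: no scan needed */

    ts->scans++;
    for (i = 0; i < ts->len; i++)     /* count bytes with the high bit set */
        if (ts->buf[i] & 0x80)
            high++;

    if (high > 0) {
        /* Each Latin-1 byte >= 0x80 becomes two UTF-8 bytes. */
        size_t newlen = ts->len + high;
        unsigned char *out = malloc(newlen + 1);
        size_t j = 0;
        assert(out != NULL);
        for (i = 0; i < ts->len; i++) {
            unsigned char c = ts->buf[i];
            if (c & 0x80) {
                out[j++] = (unsigned char)(0xC0 | (c >> 6));
                out[j++] = (unsigned char)(0x80 | (c & 0x3F));
            } else {
                out[j++] = c;
            }
        }
        out[j] = '\0';
        free(ts->buf);
        ts->buf = out;
        ts->len = newlen;
    }

    /* The proposed change: set the flag in the all-ASCII case too. */
    ts->is_utf8 = 1;
}

/* Release the buffer owned by a ToyString. */
void toy_free(ToyString *ts)
{
    free(ts->buf);
    ts->buf = NULL;
    ts->len = 0;
}
```

With this arrangement, calling toy_upgrade three times on an all-ASCII string scans it once. Whether real perl code elsewhere depends on SvUTF8 being off for ASCII-only strings is exactly the open question, and this sketch says nothing about that.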