
Re: UTF8 flag and sv_utf8_upgrade

From: Nicholas Clark
Date: December 12, 2000 04:54
Subject: Re: UTF8 flag and sv_utf8_upgrade
Message ID: 20001212125434.C91652@plum.flirble.org

On Tue, Dec 12, 2000 at 12:30:02PM +0000, Nick Ing-Simmons wrote:

>     /* This function could be much more efficient if we had a FLAG in SVs
>      * to signal if there are any hibit chars in the PV.
>      */

> Is there any reason NOT to turn on SvUTF8 once we have established 
> that it is valid UTF8 - even if only because it has no high bit chars?

I've not looked at this code, but from a general position I can't see
that it's "wrong" to flag known ASCII as UTF8 (because 7-bit ASCII is
valid UTF8). I guess that the loop is trying to use turning the UTF8 flag
off as a surrogate for a second flag saying that "this string is *known*
to be all 1 byte characters (and you may take advantage of this if it
suits you)".
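
To illustrate the point about 7-bit ASCII, here is a minimal sketch of the
kind of scan the quoted comment is talking about. It is not the perl source,
and has_hibit() is a made-up name for illustration only. If no byte has its
high bit set, the buffer reads the same whether it is treated as bytes or as
UTF8, so flagging it either way cannot change its meaning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the perl API: returns true if any
 * byte in buf[0..len) has its high bit set (value >= 0x80). */
static bool has_hibit(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            return true;   /* not pure 7-bit ASCII */
    }
    return false;          /* pure ASCII: identical as bytes and as UTF8 */
}
```

The cost is one pass over the string, which is exactly what a cached flag
would let later callers skip.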

Having a second flag may allow optimisation speedups elsewhere. But I feel
it ought to be a distinct flag, not an overloaded meaning on the UTF8 flag.
But flag space is scarce, isn't it?
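
A rough sketch of what I mean by a distinct flag, assuming a toy string
struct rather than a real SV. The names str_t, STRF_UTF8, STRF_ONEBYTE and
str_note_scan() are all invented for this example, and the bit values are
not perl's. The scan result gets its own bit, and the UTF8 bit is never
touched by it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not perl's SVf_* values. */
#define STRF_UTF8    0x1u  /* bytes are to be interpreted as UTF8 */
#define STRF_ONEBYTE 0x2u  /* scanned: every character is known to be 1 byte */

typedef struct {
    const unsigned char *pv;  /* string bytes */
    size_t len;               /* length in bytes */
    uint32_t flags;
} str_t;

/* Scan once and record the result in its own bit, leaving the
 * UTF8 bit exactly as the caller set it. */
static void str_note_scan(str_t *s)
{
    assert(s != NULL);
    s->flags |= STRF_ONEBYTE;
    for (size_t i = 0; i < s->len; i++) {
        if (s->pv[i] & 0x80) {
            s->flags &= ~STRF_ONEBYTE;  /* found a high-bit byte */
            break;
        }
    }
}
```

The price is one more bit out of the flags word per string, which is the
flag space concern above.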

Nicholas Clark


