perl.perl5.porters | Postings from December 2000

Re: UTF8 flag and sv_utf8_upgrade

Nicholas Clark
December 12, 2000 04:54
On Tue, Dec 12, 2000 at 12:30:02PM +0000, Nick Ing-Simmons wrote:

>     /* This function could be much more efficient if we had a FLAG in SVs
>      * to signal if there are any hibit chars in the PV.
>      */

> Is there any reason NOT to turn on SvUTF8 once we have established 
> that it is valid UTF8 - even if only because it has no high bit chars?

I've not looked at this code, but from a general position I can't see
that it's "wrong" to flag known ASCII as UTF8 (because 7-bit ASCII is
valid UTF8). I guess that the loop is trying to use turning the UTF8 flag
off as a surrogate for a second flag saying that "this string is *known* to
be all 1-byte characters (and you may take advantage of this if it suits you)".
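
To make the "no high bit chars" case concrete, here is a minimal sketch of the scan the quoted comment alludes to. It is not the actual Perl source; has_hibit is a hypothetical helper name used only for illustration. If it returns false, every byte is 7-bit ASCII, so the buffer is already valid UTF8 byte for byte.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API.
 * Returns true if any byte in buf[0..len) has its high bit (0x80) set. */
static bool has_hibit(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            return true;    /* found a non-ASCII byte */
    }
    return false;           /* pure 7-bit ASCII, also valid UTF8 as-is */
}
```

A dedicated flag would let callers skip this linear scan once the answer is known.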

Having a second flag may allow optimisation speedups elsewhere. But I feel
it ought to be a distinct flag, not an overloaded meaning on the UTF8 flag.
But flag space is scarce, isn't it?
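
As a rough sketch of what "a distinct flag" could look like: the struct, bit values, and function names below are all made up for illustration and are not Perl's real SV flag layout. The point is only that the two facts live in separate bits, at the cost of one more bit of flag space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only (not Perl's SV flags). */
#define DEMO_F_UTF8      0x01u  /* buffer is known to be valid UTF8 */
#define DEMO_F_ALL_1BYTE 0x02u  /* every character is known to be one byte */

typedef struct {
    uint32_t flags;
} demo_sv;

/* Record that the string was scanned and found to be pure 7-bit ASCII:
 * it is valid UTF8 *and* every character is a single byte. */
static void demo_mark_ascii(demo_sv *sv)
{
    sv->flags |= DEMO_F_UTF8 | DEMO_F_ALL_1BYTE;
}

/* Code that wants the fast path tests the second flag on its own,
 * without having to interpret the UTF8 flag being off. */
static bool demo_all_one_byte(const demo_sv *sv)
{
    return (sv->flags & DEMO_F_ALL_1BYTE) != 0;
}
```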

Nicholas Clark