Re: [perl #58182] Inconsistent and wrong handling of 8th bit set chars with no locale

From: demerphq
Date: December 2, 2010 06:34
Subject: Re: [perl #58182] Inconsistent and wrong handling of 8th bit set chars with no locale
Message ID: AANLkTin_iY2BJcd9iVPzF=pLGTa0H5uxCyq6GWR0wFgy@mail.gmail.com

On 23 September 2008 18:03, Dave Mitchell <davem@iabyn.com> wrote:
> On Mon, Sep 22, 2008 at 09:55:23PM +0200, Juerd Waalboer wrote:
>> It's a bug. A known and old bug, but it must be fixed some time.
>
> Here's a general suggestion related to fixing Unicode-related issues.
>
> A well-known issue is that the SVf_UTF8 flag means two different things:
>
>    1) whether the 'sequence of integers' are stored one per byte, or use
>    the variable-length utf-8 encoding scheme;
>
>    2) what semantics apply to that sequence of integers.
>
> We also have various bodges, such as attaching magic to cache utf8
> indexes.
>
> All this stems from the fact that there's no space in an SV to store all
> the information we want. So....
>
> How about we remove the SVf_UTF8 flag from SvFLAGS and replace it with an
> Extended String flag. This flag indicates that prepended to the SvPVX
> string is an auxiliary structure (cf the hv_aux struct) that contains all the
> extra needed unicodish info, such as encoding, charset, locale, cached
> indexes etc etc. This then both allows us to disambiguate the meaning of
> SVf_UTF8 (in the aux structure there would be two different flags for the
> two meanings), but would also provide room for future enhancements (eg
> space for a UTF32 flag should someone wish to implement that storage
> format).
>
> Just a thought...
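
To make the quoted proposal concrete, here is a minimal sketch in C of an
auxiliary header placed immediately before the string bytes, with the two
meanings of SVf_UTF8 split into separate flags. It is not code from the perl
core: every name in it (xstr_aux, XSTR_STORAGE_UTF8, XSTR_UNICODE_RULES,
xstr_new and so on) is hypothetical, and a real implementation would have to
fit into the existing SV body layout rather than use a bare malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none of them exist in the perl core. */
enum xstr_flags {
    XSTR_STORAGE_UTF8  = 1u << 0, /* meaning 1: buffer holds variable-length UTF-8 */
    XSTR_UNICODE_RULES = 1u << 1  /* meaning 2: Unicode semantics apply */
};

/* Auxiliary header that sits directly in front of the string bytes. */
struct xstr_aux {
    uint32_t flags;
    size_t   cached_char_index;  /* last character index that was looked up */
    size_t   cached_byte_offset; /* byte offset corresponding to that index */
};

/* Allocate header and string in one block; return a pointer to the string. */
char *xstr_new(const char *bytes, size_t len, uint32_t flags)
{
    struct xstr_aux *aux = malloc(sizeof *aux + len + 1);
    if (aux == NULL)
        return NULL;
    aux->flags = flags;
    aux->cached_char_index = 0;
    aux->cached_byte_offset = 0;
    char *pv = (char *)(aux + 1);
    memcpy(pv, bytes, len);
    pv[len] = '\0';
    return pv;
}

/* Step back from the string pointer to the header in front of it. */
struct xstr_aux *xstr_aux_of(char *pv)
{
    assert(pv != NULL);
    return (struct xstr_aux *)pv - 1;
}

void xstr_free(char *pv)
{
    if (pv != NULL)
        free(xstr_aux_of(pv));
}

/* Length of the 'sequence of integers'; depends only on the storage flag. */
size_t xstr_length(char *pv, size_t byte_len)
{
    if (!(xstr_aux_of(pv)->flags & XSTR_STORAGE_UTF8))
        return byte_len;            /* one integer per byte */
    size_t n = 0;
    for (size_t i = 0; i < byte_len; i++)
        if (((unsigned char)pv[i] & 0xC0) != 0x80)
            n++;                    /* count every non-continuation byte */
    return n;
}
```

With this layout the two bytes 0xC3 0xA9 count as two integers when
XSTR_STORAGE_UTF8 is clear and as one integer when it is set, while
XSTR_UNICODE_RULES can be set or cleared independently of the storage format.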

++

yves
-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
