
Re: RFC: Handling utf8 locales

From: Karl Williamson
Date: June 27, 2011 08:36
Subject: Re: RFC: Handling utf8 locales
Message ID: 4E08A359.10806@khwilliamson.com

On 06/27/2011 05:01 AM, Zefram wrote:
> Karl Williamson wrote:
>> Currently, under locale, the user is warranting that the strings are
>> correctly encoded in the specified locale.
> ...
>>                 under utf8 locales, which are currently documented as
>> not working, the regex engine and the casing functions would assume that
>> their strings were properly Unicode-encoded.
>
> So you're proposing that the meaning of "use locale", with respect to the
> expected encoding of strings, be completely different for UTF-8 locales
> from what it is for other locales.  I oppose this.  If the programmer is
> working with strings in native Unicode form, ey should declare this with
> "use feature 'unicode_strings'" or equivalent, not with "use locale".
>
> -zefram
>

It appears to me that you've got it completely backwards.  In all cases, 
the programmer is warranting that the string is correctly encoded in the 
specified locale.  It's just that UTF-8 locales ARE in native Unicode 
form.  The expected encoding for a UTF-8 locale is its own encoding, just 
as the expected encoding for any other locale is that locale's encoding.  
I think you're throwing red herrings at this proposal; I don't know how 
to explain it more clearly.

Right now the programmer has a choice: 1) to manipulate strings properly 
with those locales by using the unicode_strings feature; or 2) to get 
proper LC_TIME, etc. handling by using locale.  The programmer cannot 
currently get both.
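
A minimal sketch of what option 1 changes, assuming Perl 5.12 or later on 
an ASCII platform.  The variable names are illustrative only and are not 
part of the proposal; the sketch shows only the string-handling side and 
says nothing about LC_TIME:

```perl
use strict;
use warnings;

# "\xE9" is LATIN SMALL LETTER E WITH ACUTE, held here as a single
# native byte (the string is not UTF-8 flagged internally).
my $s = "\xE9";

my ($plain, $unicode);
{
    # Neither unicode_strings nor locale in effect: only ASCII
    # letters change case, so the character is left untouched.
    $plain = uc $s;
}
{
    use feature 'unicode_strings';
    # Unicode rules apply regardless of the internal representation.
    $unicode = uc $s;
}

printf "without feature: %02X\n", ord $plain;     # E9
printf "with feature:    %02X\n", ord $unicode;   # C9
```

The feature is lexically scoped, which is why each case sits in its own 
block.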

To give programmers the ability to do both, the simplest things to 
implement are the :locale layer, which converts all I/O so that 
internally things are native Unicode, and "use locale 'NO_CTYPE'", which 
divorces LC_CTYPE from the rest of locale handling, so that the other 
categories remain in effect but native Unicode is used for string 
operations.
