perl.perl5.porters | Postings from June 2011

Re: RFC: Handling utf8 locales

Karl Williamson
June 27, 2011 08:36
On 06/27/2011 05:01 AM, Zefram wrote:
> Karl Williamson wrote:
>> Currently, under locale, the user is warranting that the strings are
>> correctly encoded in the specified locale.
> ...
>>                 under utf8 locales, which are currently documented as
>> not working, the regex engine and the casing functions would assume that
>> their strings were properly Unicode-encoded.
> So you're proposing that the meaning of "use locale", with respect to the
> expected encoding of strings, be completely different for UTF-8 locales
> from what it is for other locales.  I oppose this.  If the programmer is
> working with strings in native Unicode form, ey should declare this with
> "use feature 'unicode_strings'" or equivalent, not with "use locale".
> -zefram

It appears to me that you've got it completely backwards.  In all cases,
the programmer is warranting that the string is correctly encoded in the
specified locale.  It's just that UTF-8 locales ARE in native Unicode
form.  The expected encoding for a UTF-8 locale is its own encoding, just
as the expected encoding for any other locale is its own encoding.  I
think you're throwing red herrings at this proposal; I don't know how to
explain it more clearly.

Right now the programmer has a choice: 1) to manipulate strings properly 
with those locales by using the unicode_strings feature; or 2) to get 
proper LC_TIME, etc. handling by using locale.  The programmer cannot 
currently get both.
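
To make choice 1 concrete, here is a minimal sketch of what the
unicode_strings feature changes for a string that is not stored
internally as UTF-8.  The variable names are only illustrative.  Choice 2
is not shown because its output depends on which locales are installed
on the machine.

```perl
use strict;
use warnings;

# U+00E9 (e with acute), held as a single byte with no UTF-8 flag set.
my $e_acute = "\xE9";

my ($without, $with);
{
    # Feature explicitly off: only ASCII characters have case mappings
    # for a string stored this way, so uc leaves it alone.
    no feature 'unicode_strings';
    $without = uc $e_acute;
}
{
    # Feature on: the same string gets Unicode semantics.
    use feature 'unicode_strings';
    $with = uc $e_acute;
}

printf "without unicode_strings: %02X\n", ord $without;   # E9 (unchanged)
printf "with unicode_strings:    %02X\n", ord $with;      # C9 (uppercased)
```

The feature is lexically scoped, which is why the two calls to uc can
sit side by side in one file and still behave differently.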

To give programmers the ability to do both, the simplest thing to
implement is a :locale layer, which converts all I/O so that internally
everything is native Unicode, together with "use locale 'NO_CTYPE'",
which divorces LC_CTYPE from the rest of locale handling, so that the
other categories remain in effect but native Unicode is used for string
operations.
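
The "native Unicode internally" half of that idea can be sketched with
the existing Encode module.  This is only an illustration under stated
assumptions: the encoding is hard-coded to UTF-8, whereas the proposed
layer would take it from the locale, and the NO_CTYPE spelling above is
a proposal that this code does not use.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Input as it would arrive from a UTF-8 locale: "ete" with acute accents
# on both e's, as 5 raw bytes.
my $bytes_in = "\xC3\xA9t\xC3\xA9";

# Decode at the boundary (assumed encoding: UTF-8), so that inside the
# program the string is 3 characters of native Unicode.
my $chars = decode('UTF-8', $bytes_in);

# String operations now follow Unicode rules, independent of LC_CTYPE.
my $upper = uc $chars;

# Encode again on the way out.
my $bytes_out = encode('UTF-8', $upper);

printf "%d bytes in, %d characters inside, %d bytes out\n",
    length $bytes_in, length $chars, length $bytes_out;
```

A layer would do the same decode and encode steps automatically on each
filehandle, so the program itself would never see the locale's bytes.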
