develooper Front page | perl.perl5.porters | Postings from February 2001

Re: Perl-Unicode fundamentals (was Re: IV preservation (was Re: [PATCH 5.7.0] compiling on OS/2))

From: Nick Ing-Simmons
Date: February 21, 2001 08:48
Subject: Re: Perl-Unicode fundamentals (was Re: IV preservation (was Re: [PATCH 5.7.0] compiling on OS/2))
Message ID: 200102211647.QAA28579@mikado.tiuk.ti.com

Ilya Zakharevich <ilya@math.ohio-state.edu> writes:
...
>Of course.
...
>So it "just
>works", ... 
...
>Exactly.  
...

So we seem to have cleared up a few things.

>> Our locale story is nowhere near as good as our Unicode story.
>> But that is mostly the fault of under-specified locale semantics 
>> at system level.
>
>No, the faults are at different places:
>
>  a) use locale is lexically scoped, so useless when modules are used;
>
>  b) there were no defined semantics for the interaction of locale and
>     Unicode [my proposal creates such semantics];
>
>> Switching on EBCDIC-ness is cleaner.
>
>There is no difference (as far as Perl is concerned; except for
>sorting) between EBCDIC-ness and locale.  If you feel otherwise,
>please give an example to unconfuse me.

EBCDIC-ness is C-compile-time (./Configure time even) knowable.
So it does not suffer from "lexical" issues as in your (a) above.
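
As a rough illustration of that point (a sketch, not code from this thread): the platform character set is a property of the perl binary, so it can be queried once and the answer holds in every scope and every module. The sketch relies on the `ebcdic` entry that Configure records in %Config; the helper names platform_is_ebcdic and ord_says_ebcdic are made up for the example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true when this perl was built for an EBCDIC platform.
# Configure records the answer in %Config when perl is built.
sub platform_is_ebcdic {
    return (defined $Config{ebcdic} && $Config{ebcdic} eq 'define') ? 1 : 0;
}

# Run-time cross-check: 'A' is 65 in ASCII/iso-8859-1
# and 193 in EBCDIC code pages such as ibm-1047.
sub ord_says_ebcdic {
    return ord('A') == 193 ? 1 : 0;
}

printf "ebcdic (Config): %d, ebcdic (ord): %d\n",
    platform_is_ebcdic(), ord_says_ebcdic();
```

Neither check depends on which pragmas the caller has in effect, which is the contrast with 'use locale'.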

So far I have avoided 'use locale' in all my descriptions.
So it seems we can document transparency and Unicode in the abstract
for iso-8859-1/Unicode or EBCDIC-ibm-1047/Unicode without using 
any "locale" analogies, assumptions etc.
This is a good thing.

When we have "transparent Unicode" in place, the brave and enthusiastic
can go look at what we should/could do to "use locale" in the new realm.
But let us put that part on one side for now and get the basics good
and solid - does that make sense to you?

>
>> use utf8;
>> 
>> still has the semantics that the script itself is assumed to come
>> from a UTF-8 encoded source file.
>
>use utf8 is a mastodon.  
   mastodon as in :
    A. Large
    B. Hairy
    C. Extinct ? ;-) 

>It is not needed for any other purpose, so
>let it be so.
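
For reference, a small sketch of what that one remaining job looks like in practice. It assumes the script file itself is saved as UTF-8; the variable names are only for illustration.

```perl
use strict;
use warnings;

my ($chars, $bytes);
{
    use utf8;              # literals in this scope are decoded from UTF-8
    $chars = length("é");  # one character, U+00E9
}
{
    no utf8;               # literals in this scope are taken as raw bytes
    $bytes = length("é");  # two bytes, 0xC3 0xA9
}
print "with utf8: $chars, without: $bytes\n";
```

The pragma changes only how the source text of the script is read; it says nothing about data the script reads or writes.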

>
>> big5 has other problems in that it is a multi-byte encoding
>
>Does not matter: I discuss character mapping here, not encoding.

Agreed - I said _other_ problems for that reason.

-- 
Nick Ing-Simmons <nik@tiuk.ti.com>
Via, but not speaking for: Texas Instruments Ltd.

