Re: The internal string API
From:
Dan Sugalski
Date:
June 29, 2001 07:49
Subject:
Re: The internal string API
Message ID:
5.1.0.14.0.20010629103110.02195df0@24.8.96.48
At 07:57 PM 6/28/2001 -0500, Jarkko Hietaniemi wrote:
>On Fri, Jun 29, 2001 at 02:52:03AM +0200, Bart Lateur wrote:
> > If I have a file in French, and a file in Chinese, I want one to
> > be treated as French, and the other as Chinese.
>
>And what do you do once you have a list of, say, employees with
>French, Chinese, and Spanish names, and you want to show them in
>some order, and how does your fellow Chinese or Hindi worker
>want to see the same list ordered...?
>
>Also, please don't confuse locales with 'languages'. To start with,
>there's no definition of 'language' that people can agree on. Usually
>the existing locale definitions try to work around this fuzziness by
>having (language,country) pairs, but that is just a partial solution.
We're going to split things out into pieces internally.
* String data will be tagged sufficiently to make the characters uniquely
identified. That means we'll see Unicode, Big5/Traditional, Shift-JIS, or
whatever. This will do nothing except make sure that we know what the heck
character 0x0455 is.
* String sorting order will be specifiable, overridable, and generally
pluggable. As has been pointed out, you sort German, French, and English
text a little differently from each other. (Not to mention things like
Arabic, Chinese, or Egyptian hieroglyphics.) That's not really bound to the
character encoding of the data, as the same data will be sorted differently
by different people.
* Formatting bits will be a separate thing entirely as well. How numbers
and dates and such are formatted varies even more than how data's sorted, it
seems.
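
To make the first two pieces concrete, here is a minimal sketch in Python (not Perl internals code, and not a proposed API). The names TaggedString, COLLATORS, register_collator, and sort_strings are all made up for illustration. It only shows the shape of the idea: the tag says which characters the bytes are, and the sort order is a separate, pluggable choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TaggedString:
    """Raw bytes plus a tag saying which character set they are in.

    Hypothetical type for illustration only.
    """
    data: bytes
    charset: str  # e.g. "utf-8", "big5", "shift_jis"

    def text(self) -> str:
        # The tag does nothing except identify the characters.
        return self.data.decode(self.charset)


# Sort orders live in their own registry, independent of the charset tag.
COLLATORS: Dict[str, Callable[[str], object]] = {}


def register_collator(name: str, key: Callable[[str], object]) -> None:
    """Plug in a new sort order under an assumed, made-up name."""
    COLLATORS[name] = key


register_collator("codepoint", lambda s: s)           # raw code point order
register_collator("caseless", lambda s: s.casefold())  # ignore case


def sort_strings(items: List[TaggedString], order: str = "codepoint") -> List[TaggedString]:
    """Sort tagged strings by the named order, whatever their encoding."""
    key = COLLATORS[order]
    return sorted(items, key=lambda t: key(t.text()))
```

The same byte 0xE9 tagged "latin-1" and tagged "cp1251" decodes to two different characters, which is the whole reason for the tag. Likewise the same list sorts differently under "codepoint" and "caseless" without the data changing at all.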
Now, we'll probably have some sort of locale specification that sets the
default encoding for unknown incoming data, default sorting order, and
default formats for things we format, but that'll all be overridable. How
it's overridable at the language level is Larry's issue ("sort as greek @foo"
maybe) but we'll definitely do it. And yes, I know we could just make
people hand-format and stick in sort subs, but bleah. Too much work on the
part of a perl programmer.
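
One way to picture the locale specification is as a bundle of three independent defaults, any of which can be swapped without touching the others. The sketch below is Python and purely illustrative: LocaleSpec, its field names, and format_number are assumptions made up for this example, not anything decided for perl.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LocaleSpec:
    """Hypothetical bundle of defaults; every field can be overridden."""
    encoding: str = "utf-8"        # assumed for unknown incoming data
    sort_order: str = "codepoint"  # default sorting order
    decimal_sep: str = "."         # one example of a formatting default


def format_number(value: float, spec: LocaleSpec, places: int = 2) -> str:
    """Format a number using only the formatting part of the spec."""
    return f"{value:.{places}f}".replace(".", spec.decimal_sep)


DEFAULT = LocaleSpec()

# Override just the number format; encoding and sort order stay as they were.
comma_numbers = replace(DEFAULT, decimal_sep=",")
```

With this, format_number(3.5, DEFAULT) uses a period and format_number(3.5, comma_numbers) uses a comma, while comma_numbers.encoding and comma_numbers.sort_order are still the defaults.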
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk