develooper Front page | perl.perl6.internals | Postings from June 2001

RE: The internal string API

From:
Hong Zhang
Date:
June 19, 2001 12:26
Subject:
RE: The internal string API
Message ID:
400CE9390E334A4393CEECDD6863120A289F23@ussccm003.corp.palm.com

> >What do you mean by character size if it does not support variable
> >length?
> 
> Well, if strings are to be treated relatively abstractly, and we still
> want to poke around through the string buffer, we need to know how big
> a character is.

I agree with this. I think support for variable-length encodings should
be included.
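
To make the "how big is a character" question concrete for a
variable-length encoding, here is a minimal sketch using UTF-8. It is
only an illustration, not a proposed Perl API: the function names
utf8_char_len and char_sizes are hypothetical, and the buffer is
assumed to hold well-formed UTF-8.

```python
def utf8_char_len(lead):
    """Return the byte length of a UTF-8 sequence given its lead byte.

    Hypothetical helper for illustration; assumes well-formed UTF-8.
    """
    if lead < 0x80:
        return 1          # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError("not a valid UTF-8 lead byte: 0x%02X" % lead)


def char_sizes(buf):
    """Walk a UTF-8 byte buffer and list the size of each character."""
    sizes = []
    pos = 0
    while pos < len(buf):
        n = utf8_char_len(buf[pos])
        sizes.append(n)
        pos += n          # skip over the continuation bytes
    return sizes
```

The point of the sketch is that "character size" is a per-character
answer here, not a single number for the whole string.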

> I'm thinking locale is, in some ways, like tainting where it's really a 
> property of the data rather than a property of the code region.

I think you misunderstand my point. It is not "a property of the code
region", but "a property of the context in which the code is running".
For example, Taiwanese readers use traditional Chinese characters, while
readers in the PRC use simplified Chinese. Even if we take the same data
and the same program (code), people just read it differently. As an end
user, I want to make that decision. It will drive me crazy if Perl
renders/displays a text file in traditional Chinese just because it was
tagged as "Big5".

> Yep, I fully agree. (Well, I'm not sure of the ASCII restriction on the 
> name, but I can live with that as a lowest-common-denominator 
> sort of thing)

> >The byte based is more useful. I have utf-8, and I want to substr it
> >to another utf-8. It is painful to convert it or linear search for
> >character position.
> 
> The pain is the reason for specifying it in the API. If we force the pain 
> to be local to the encoding then it means that we don't have 
> to embed it in the core.

If it is a common API, I would like to specify it in the core, so that
each encoding implementation can follow it strictly. I believe it is
common enough.
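
As a rough illustration of the trade-off discussed above, the sketch
below contrasts a byte-offset substr on a UTF-8 buffer with the linear
scan needed to turn a character index into a byte offset. The names
is_boundary, substr_bytes and char_to_byte_offset are hypothetical and
are not part of any actual Perl interface; the buffer is assumed to be
well-formed UTF-8.

```python
def is_boundary(buf, pos):
    """True if pos is the start of a UTF-8 sequence or the end of buf."""
    return pos == len(buf) or (buf[pos] & 0xC0) != 0x80


def substr_bytes(buf, start, length):
    """Byte-based substr: slice directly, no scan over the string.

    Rejects offsets that would split a multi-byte character.
    """
    end = start + length
    if not (0 <= start <= end <= len(buf)):
        raise IndexError("byte range out of bounds")
    if not (is_boundary(buf, start) and is_boundary(buf, end)):
        raise ValueError("byte range splits a character")
    return buf[start:end]


def char_to_byte_offset(buf, char_index):
    """Character-based addressing: linear scan from the buffer start."""
    pos = 0
    seen = 0
    while seen < char_index:
        if pos >= len(buf):
            raise IndexError("character index out of bounds")
        pos += 1
        # Skip continuation bytes (10xxxxxx) of the current character.
        while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
            pos += 1
        seen += 1
    return pos
```

With byte offsets the caller pays only for a boundary check; with
character offsets every lookup walks the buffer from the beginning.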

Hong
