perl.perl5.porters | Postings from February 2001

Re: The State of The Unicode

From: andrew
Date: February 19, 2001 18:02
Subject: Re: The State of The Unicode
Message ID: 20010219210151.M17705@pimlott.ne.mediaone.net

On Tue, Feb 20, 2001 at 01:39:17AM +0000, Simon Cozens wrote:
> On Mon, Feb 19, 2001 at 08:33:37PM -0500, Andrew Pimlott wrote:
> > Perl characters
> 
> What *on earth* is that meant to mean?

Nat gave you the primary definition, so here is the secondary one:

    1b.  A positive integer.  When interpreted as a character, it is
    taken to be a Unicode code point.  However, it may be used as
    "just a number".  Whenever we say, "this 'Perl character' is
    this 'character'", we mean that the numerical value of the "Perl
    character" is the Unicode code point of the character.
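
To make 1b concrete, here is a minimal sketch, assuming a current
Perl 5 where chr() and ord() accept code points above 255.  The
variable names are made up for illustration.  It only shows the same
integer being read once as a character and once as "just a number".

```perl
use strict;
use warnings;

# A "Perl character" viewed as a plain positive integer.
my $n = 0x263A;

# Interpreted as a character, the integer is taken as a Unicode code point.
my $c = chr($n);            # one-character string holding U+263A

# Going back recovers the same integer.
my $back = ord($c);

# The value can still be used arithmetically, as "just a number".
my $next = chr($back + 1);  # one-character string holding U+263B

printf "U+%04X U+%04X\n", $back, ord($next);
```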

A "character" (distinct from "Perl character") is the abstract
thing, defined at http://www.unicode.org/glossary/.

In pre-Unicode Perl, it was a C char, with its character
interpretation taken from the relevant locale.

(This is a working definition, ATM.)

But good grief--this Unicode thing is never going to work out if we
can't agree on an abstract model for a string (my definition: "a
list of Perl characters").  I firmly believe that.  The Unicode
string vs binary duality is way more strained than the C string
vs binary duality ever was.
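
As a rough sketch of the "list of Perl characters" model (not a
description of how perl stores strings internally), a string can be
taken apart into a list of integers and put back together.  The
helpers to_chars and from_chars are hypothetical names invented for
this example.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for illustration only.
# String -> list of "Perl characters" (integers).
sub to_chars   { return map { ord($_) } split //, $_[0] }
# List of "Perl characters" -> string.
sub from_chars { return join '', map { chr($_) } @_ }

# A string mixing ASCII, a Latin-1 character, and one above 255.
my $s = "na" . chr(0xEF) . "ve " . chr(0x263A);

my @chars = to_chars($s);

# Under this model the string's length is the number of list elements,
# whatever bytes are used to hold it.
printf "%d characters: %s\n", scalar(@chars),
    join ' ', map { sprintf 'U+%04X', $_ } @chars;

# The list carries everything needed to rebuild the string.
print "round trip ok\n" if from_chars(@chars) eq $s;
```

If every element of the list is below 256, the same list can equally
be read as binary data, which is where the duality shows up.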

Andrew


