
Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)

From: Juerd Waalboer
Date: March 30, 2007 18:08
Subject: Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)
Message ID: 20070331010813.GO31277@c4.convolution.nl

Marc Lehmann wrote 2007-03-31  2:29 (+0200):
> > Unicode is a character set, not a character encoding.
> As is latin1.

For all intents and purposes, latin1 is a character encoding as well as
a character set. If not officially, then certainly for Perl. It can be
used with the :encoding layer, with Encode::decode, et cetera. "Unicode"
cannot.
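
A minimal sketch of that distinction, assuming the core Encode module.
The sample string is made up for illustration, and the lookup of the
name "Unicode" is printed rather than assumed, since the result depends
on which aliases the installed Encode defines:

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# "latin1" names an encoding, so it can turn octets into characters.
my $octets = "caf\xE9";                   # 4 octets, e.g. read from a file
my $text   = decode('latin1', $octets);   # 4 characters, the last is U+00E9

# The same text encoded as UTF-8 needs two octets for U+00E9.
my $utf8 = encode('UTF-8', $text);

printf "latin1 octets: %d, characters: %d, UTF-8 octets: %d\n",
    length($octets), length($text), length($utf8);

# "Unicode" is a character set; ask Encode whether it knows an
# encoding by that name instead of assuming the answer.
my $unicode = find_encoding('Unicode');
print "Encode has an encoding named 'Unicode': ",
    (defined $unicode ? "yes" : "no"), "\n";
```

The same encoding name could be given to a PerlIO layer, as in
:encoding(latin1), which is the other use mentioned above.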

I don't know where your terminology comes from, but I try to stick to
whatever is common in Perl land. Sorry if that differs from other
communities.

> > Unicode is a superset of the latin1 character set, not the latin1
> > character encoding. We'd need bigger bytes for the latter :)
> Right. And Perl has those bigger bytes.

A byte, in Perl jargon at least, is an octet. An octet can hold any
single value in the range 0..255, and is exactly 8 bits in size. Every
byte is exactly as large as any other byte.
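
To illustrate, here is a small sketch, again assuming the core Encode
module. The code point U+263A is an arbitrary choice for the example. A
character with an ordinal above 255 does not fit in one byte, so
encoding it produces several bytes, each of them still in 0..255:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $byte = chr(255);       # the largest value a single octet can hold
my $wide = chr(0x263A);    # a character, not a byte: its ordinal is > 255

# Encoding turns the one character into a sequence of octets.
my $encoded = encode('UTF-8', $wide);
my @values  = unpack('C*', $encoded);   # numeric value of each octet

printf "character U+%04X becomes %d octets: %s\n",
    ord($wide), scalar(@values),
    join(' ', map { sprintf '%02X', $_ } @values);
```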
-- 
kind regards,

  juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
  convolution:     ict solutions and consultancy <sales@convolution.nl>

I don't trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.
