develooper Front page | perl.perl5.porters | Postings from March 2007

Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)

From: Juerd Waalboer
Date: March 30, 2007 18:21
Subject: Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)
Message ID: 20070331012106.GQ31277@c4.convolution.nl

Marc Lehmann wrote 2007-03-31  3:05 (+0200):
> Oh, maybe I know the reason for the confusion.
> I do talk about the *Perl* level, while you often talk about the
> *implementation*. When I say byte or octet string below, I mean on the
> Perl level. 

This is not the reason for confusion, because I also discuss the Perl
level. For my terminology, I use whatever is common in the Perl
reference documentation.

> For example, on the Perl level, upgrading a string does not
> change its semantics anywhere except w.r.t. bugs and unpack: It still
> stays an octet string if it was an octet string before.

s/octet string/character string/ and you're entirely right. "Octets" are
a bit harder, because of the definition of an octet:

    octet

        <jargon, networking> Eight bits. This term is used in
        networking, in preference to byte, because some systems use the
        term "byte" for things that are not 8 bits long.

There's no easy way to fit numbers greater than 255 into 8 bits without
sacrificing support for 0 through 255 inclusive. It may even be impossible.
Who knows. The person who invents a way of storing more than 256
distinct numbers in unique single octets will probably get famous very
quickly :)
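
As a rough illustration of both points, here is a minimal sketch. It
assumes a perl recent enough (5.8.1 or later) that the utf8:: functions
are built in; the variable names are made up for the example. It shows
that upgrading changes only the internal representation and not what
the string is on the Perl level, and that a character with ordinal 256
cannot be stored as a single octet.

```perl
use strict;
use warnings;

# A string whose characters all have ordinals below 256.
my $str  = "caf\xE9";    # 4 characters, the last one is U+00E9
my $copy = $str;

# Change only the internal representation of the copy.
utf8::upgrade($copy);

# On the Perl level nothing changed: same length, same characters.
my $same_length = length($str) == length($copy);
my $same_string = $str eq $copy;

# Only the internal flag differs between the two.
my $flag_differs = utf8::is_utf8($copy) && !utf8::is_utf8($str);

# An octet is 8 bits, so it has exactly 256 values: 0 through 255.
my $max_octet = 2**8 - 1;

# chr(256) has no single-octet representation. With FAIL_OK set,
# utf8::downgrade returns false instead of dying.
my $wide = chr(256);
my $fits_in_octets = utf8::downgrade(my $attempt = $wide, 1);

# Encoded as UTF-8, that one character takes two octets.
my $encoded = $wide;
utf8::encode($encoded);
my $encoded_length = length($encoded);

print "same string after upgrade: ", ($same_string ? "yes" : "no"), "\n";
print "chr(256) fits in octets: ", ($fits_in_octets ? "yes" : "no"), "\n";
print "octets needed as UTF-8: $encoded_length\n";
```

The copy is upgraded rather than the original so that the two
representations can be compared side by side with eq.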
-- 
kind regards,

  juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
  convolution:     ict solutions and consultancy <sales@convolution.nl>

I do not trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.
