develooper Front page | perl.perl5.porters | Postings from March 2007

Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)

From: Juerd Waalboer
Date: March 30, 2007 17:13
Subject: Re: the utf8 flag (was Re: [perl #41527] decode_utf8 sets utf8 flag on plain ascii strings)
Message ID: 20070331001329.GH31277@c4.convolution.nl

Marc Lehmann wrote 2007-03-31  1:53 (+0200):
> So you force people to know about the internal flag, lest they cannot avoid
> the die.

No, you don't have to know about the UTF8 flag. You only have to know
that Perl can't always tell whether your string is a text string, and
that it is there to help you when it can.
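
One way to see that the flag is an internal detail: two strings holding
the same characters compare equal whether or not the flag is set. A
minimal sketch, assuming a perl with the utf8:: functions built in
(5.8.1 or later); the variable names are only illustrative:

```perl
use strict;
use warnings;

# The same four characters, stored two different ways internally.
my $native   = "caf\xE9";    # no "use utf8" here, so no flag on this literal
my $upgraded = $native;
utf8::upgrade($upgraded);    # changes the internal representation only

my $native_flag   = utf8::is_utf8($native)   ? 1 : 0;
my $upgraded_flag = utf8::is_utf8($upgraded) ? 1 : 0;
my $same_text     = ($native eq $upgraded)   ? 1 : 0;

print "flag: $native_flag vs $upgraded_flag\n";
print "equal: $same_text\n";
print "length: ", length($native), " vs ", length($upgraded), "\n";
```

The flags differ, but eq and length() see the same text either way.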

> > Besides that, the "C" in Perl's pack() is documented as a single byte.
> "A C "char" is a byte".
> Your words.
> But here you say a byte is not a character. That's a contradiction.

"C char" ne "Perl character".
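
A small sketch of the difference, assuming the Encode module is
available (it ships with 5.8). The code point U+263A is just an
arbitrary example of a character that needs more than one byte:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $smiley = chr(0x263A);                               # one Perl character
my @octets = unpack("C*", encode("UTF-8", $smiley));    # its UTF-8 bytes, as C chars

printf "%d Perl character, %d C chars: %s\n",
    length($smiley),
    scalar(@octets),
    join(" ", map { sprintf "%02X", $_ } @octets);
```

This should report one Perl character and three C chars (E2 98 BA).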

> No, I asked for UTF-8 encoded characters. Again, read the documentation:
>           *       If the pattern begins with a "U", the resulting string will
>           *       be treated as UTF-8-encoded Unicode.

Resulting string, not input string.

The word "internally" is missing here. I will do my best to correct
that.

> that's for pack, unfortunately.
>           U   A Unicode character number.  Encodes to UTF-8
>           internally
> uh, that internal thing again. So how many characters will pack "U", 200
> give me? According to the documentation, 2, as UTF-8 requires that. 

One character. Note again that "character" isn't the same as a "C char".
We in Perl land and the people over in Unicode land sometimes use
different words.

Most of the time, a Perl "character" means a code point.
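
To make the pack "U", 200 case concrete, here is a minimal sketch. It
assumes the Encode module for the explicit encoding step; the variable
names are illustrative:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $s = pack("U", 200);                      # one code point, U+00C8

my $chars = length($s);                      # counted in Perl characters
my $bytes = length(encode("UTF-8", $s));     # counted after explicit encoding

printf "characters: %d, UTF-8 bytes: %d, code point: U+%04X\n",
    $chars, $bytes, ord($s);
```

length() counts one character. The two bytes only show up once the
string is explicitly encoded to UTF-8.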

> > > Right, while the documentation on unpack "U" disagrees with it, as it talks
> > > about UTF-8.
> > That would be a bug, but I can't find it in my copy (5.8.8). It only
> > says "Encodes to UTF-8 internally" for pack(), which as far as I can
> > tell, is true.
> So it talks about using UTF-8, so, according to you, it is a bug. Fine
> with me.

This was for pack; you were talking about unpack. Also, the word
"internally" was probably not added without reason.
-- 
kind regards,

  juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
  convolution:     ict solutions and consultancy <sales@convolution.nl>

I don't trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.
