
Re: should "use byte" be "use bytes"?

From: Larry Wall
Date: February 10, 2000 13:00
Subject: Re: should "use byte" be "use bytes"?
Message ID: 200002102056.MAA05377@kiev.wall.org

Tom Christiansen writes:
: [Very long explanation of prospective parsing approach for 5.6 elided]
: 
: >     3) Perl runs into a high bit in your script.  At that point it
: >        takes a look at what it has in its buffer.  If it looks like
: >        utf8, mark the script filehandle as utf8 and continue.  If not,
: >        mark the script filehandle as binary (equivalent to latin-1)
: >        and continue.
: 
: Does this mean that we'll be able to use, for example, %déjà_vu
: now, without any other special indications?

I think so.  At this point in my existence I don't think we need to
distinguish variable names from string literals, as far as recognition
of the binary/utf8 distinction goes.
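
For illustration only, the check quoted in step 3 could be sketched
roughly as below.  This is a language-neutral sketch in Python, not
Perl's actual parser code; the function name classify_script_buffer and
its three return labels are hypothetical.  It assumes that "looks like
utf8" means "every complete multi-byte sequence in the buffer is
well-formed UTF-8", and it tolerates a sequence cut off at the end of
the buffer.

```python
import codecs


def classify_script_buffer(buf: bytes) -> str:
    """Classify buffered script source as 'ascii', 'utf8', or 'binary'.

    Hypothetical helper: the name and return labels are assumptions made
    for this sketch, not anything taken from the Perl source.
    """
    # No byte with the high bit set: there is nothing to decide yet.
    if all(b < 0x80 for b in buf):
        return "ascii"

    # An incremental decoder with final=False accepts a multi-byte
    # sequence that is merely truncated at the end of the buffer, but
    # still raises on bytes that can never be valid UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        decoder.decode(buf, final=False)
    except UnicodeDecodeError:
        return "binary"  # treated as latin-1 in the scheme quoted above
    return "utf8"


if __name__ == "__main__":
    name = "%d\u00e9j\u00e0_vu"  # the %déjà_vu example from the question
    print(classify_script_buffer(b"my %h;"))
    print(classify_script_buffer(name.encode("utf-8")))
    print(classify_script_buffer(name.encode("latin-1")))
```

Under this assumption the same identifier is classified differently
depending on how the file was saved, which is why the decision can be
made once per script filehandle rather than separately for variable
names and string literals.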

: Or will some LC_* envariable need to be set?  Or a pragma?

You can use the bytes or charset pragmas I mentioned to force the issue
in the binary direction.  I believe the linux-utf8 mailing list folks
would assume that if LC_CTYPE is set to UTF-8, Perl should assume its
script is in UTF-8, though I don't know how universal that sentiment
will become.  The Linux folks are assuming they can just cut everything
over to UTF-8 at some point, and life tends to be a little more
complicated than that.

Larry


