
Re: C<use utf8> should never set $^U

From: Larry Wall
Date: February 4, 2000 08:13
Subject: Re: C<use utf8> should never set $^U
Message ID: 200002041610.IAA14649@kiev.wall.org

Chip Salzenberg writes:
: Having finally gotten around to reading perllocale, I find this
: welcome feature:
:     [...] if the C<$^U> global flag is set to C<1>, nearly all
:     operations will use character semantics by default.

Let's be clear about this.  According to my understanding, the purpose
of $^U is to make all *interfaces* use utf8 by default.  By default
operations are always polymorphic, and we only bend the default in the
direction of bytes.
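
As a rough illustration of "polymorphic by default, bent only toward
bytes", here is a minimal sketch using the bytes pragma. That pragma is
an assumption on my part about what the bending looks like in practice;
the message itself does not name it. The same string is measured under
both semantics.

```perl
use strict;
use warnings;

# One character above 0xFF, so Perl stores it internally as UTF-8.
my $smiley = "\x{263A}";

# Default semantics: length counts characters.
my $chars = length $smiley;

# Byte semantics, lexically scoped to the do block only.
my $bytes = do { use bytes; length $smiley };

print "chars=$chars bytes=$bytes\n";    # prints "chars=1 bytes=3"
```

The override is confined to the enclosing block; code outside it keeps
the default character view of the same string.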

: But I also find this *unwelcome* feature:
:     As an added convenience, if the C<utf8> pragma is used in the
:     C<main> package, C<$^U> is enabled automatically.
: 
: I consider this behavior less a convenience and more a bug.  I think
: it's a bad idea to give one pragma both lexical effects and global
: effects.  Please separate them.

My goal is to turn the utf8 pragma into a no-op, unless it's a lexically
scoped equivalent to local $^U = 1.  But in that case, it's an override,
like lexical warnings override $^W.  And it has nothing to do with
operations, just interfaces.
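
The $^W comparison can be sketched as follows. This only demonstrates
the existing warnings behavior that the analogy leans on, not the
proposed utf8 behavior; the variable names and the __WARN__ handler are
there purely to count what gets emitted.

```perl
use strict;

# Collect warnings instead of printing them, so they can be counted.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

$^W = 1;    # the global flag, as set by -w

my $undef;
{
    no warnings;    # lexical setting overrides $^W inside this block
    my $quiet = "a" . $undef;
}
{
    # No lexical warnings pragma in scope here, so $^W applies.
    my $noisy = "b" . $undef;
}

print scalar(@seen), " warning(s)\n";    # prints "1 warning(s)"
```

Only the second block warns: where a lexical setting is in scope it
wins, and the global flag is consulted only where none is.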

:     [XXX: Should there be a -C switch to enable $^U?]
: 
: Absolutely.  And then the intent of the global behavior of C<use utf8>
: can be met by putting "-C" on the shebang line.

There will certainly be many different ways of telling Perl that various
interfaces have switched to using utf8, if for no other reason than that
different operating systems will choose different ways to implement
those interfaces.  Linux, for instance, will (for good or ill) be setting
$ENV{LC_CTYPE} to indicate that your interfaces are utf8.  Windows
has a big switch inside your process that $^U will toggle.  Other systems
will have ways of selecting individual interface semantics.  We'll have
to work with them all.
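
For the environment-variable case, a minimal sketch of how a program
might guess whether its interfaces are utf8. The helper name
env_says_utf8 is hypothetical, and checking LC_ALL and LANG as well as
LC_CTYPE is my assumption based on the usual POSIX locale precedence;
the message mentions only LC_CTYPE.

```perl
use strict;
use warnings;

# Hypothetical helper: takes an environment hash (normally %ENV) and
# returns 1 if the first locale variable that is set names a UTF-8
# codeset, 0 otherwise.
sub env_says_utf8 {
    my (%env) = @_;
    for my $name (qw(LC_ALL LC_CTYPE LANG)) {    # assumed precedence
        my $value = $env{$name};
        next unless defined $value && length $value;
        return $value =~ /\.utf-?8\b/i ? 1 : 0;
    }
    return 0;    # nothing set: assume byte interfaces
}

print env_says_utf8(%ENV) ? "utf8 interfaces\n" : "byte interfaces\n";
```

This is only a string match on the variable's value; a real
implementation would more likely ask the C library about the current
codeset.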

But the core of Perl should not be affected once strings are classified
into one of those three states I mentioned earlier.

Larry


