[perl #117355] [lu]cfirst don't respect 'use bytes'

From: Father Chrysostomos via RT
Date: August 12, 2013 06:22
Subject: [perl #117355] [lu]cfirst don't respect 'use bytes'
Message ID: rt-3.6.HEAD-2552-1376288545-1785.117355-15-0@perl.org

On Sun Aug 11 19:56:53 2013, rjbs wrote:
> On Sun Jul 14 23:54:35 2013, sprout wrote:
> > > From the top of the pod in bytes.pm, added for 5.12.0:
> > >
> > > =head1 NOTICE
> > >
> > > This pragma reflects early attempts to incorporate Unicode into perl and
> > > has since been superseded. It breaks encapsulation (i.e. it exposes the
> > > innards of how the perl executable currently happens to store a string),
> > > and use of this module for anything other than debugging purposes is
> > > strongly discouraged. If you feel that the functions here within might be
> > > useful for your application, this possibly indicates a mismatch between
> > > your mental model of Perl Unicode and the current reality. In that case,
> > > you may wish to read some of the perl Unicode documentation:
> > > L<perluniintro>, L<perlunitut>, L<perlunifaq> and L<perlunicode>.
> >
> > What can we do to upgrade this to a deprecation?
> 
> I'm not sure.
> 
> The question is: do we propose to allow bytes.pm to become an external
> library? Can we do this usefully, since bytes currently works (as I
> understand it) by tweaking $^H and letting CORE sort out the rest? Can its
> behavior be reimplemented as something entirely without core support? I
> think so, by making copies and downgrading. (Four-arg substr won't be
> exactly that simple, but should be doable.)
>
> I haven't given this a lot of thought, but I think that if we can make
> bytes.pm ejectable, we should do so. It's okay if it gets slower, since
> we've been telling people for years that it's only a debugging tool, if
> that.
> 
> Thoughts? Objections?

How much of it would we reimplement?

If we want to keep its current behaviour, we would end up having to
override almost every op in what would become bytes.xs.  Just search for
uses of DO_UTF8 throughout the core.  DO_UTF8 means SvUTF8 unless bytes
is turned on, in which case we pretend the flag is not set.  That means
"\xff"."\xff" returns "\xff\xc3\xbf" if the rhs is in utf8.  So we have
to override concatenation via PL_check hooks, which gets messy.  It
seems like a lot of work for preserving broken behaviour.
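
For anyone who wants to see the concatenation case concretely, here is a minimal sketch. The helper hex_chars and the variable names are made up for illustration, and it assumes a perl where 'use bytes' behaves as described above. It builds the rhs with utf8::upgrade so that it is stored internally as UTF-8, then concatenates once normally and once under the pragma.

```perl
use strict;
use warnings;

# Hypothetical helper: render a string as space-separated hex,
# one entry per character.
sub hex_chars {
    my ($s) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $s;
}

my $lhs = "\xff";        # one character, UTF8 flag off
my $rhs = "\xff";
utf8::upgrade($rhs);     # same character, now stored internally as c3 bf

# Normal semantics: both operands are the single character U+00FF.
my $normal = $lhs . $rhs;

# Under 'use bytes' the UTF8 flag on $rhs is ignored, so its internal
# bytes are appended as-is. The message above reports "\xff\xc3\xbf" here.
my $bytes = do { use bytes; $lhs . $rhs };

print "normal: ", hex_chars($normal), "\n";
print "bytes:  ", hex_chars($bytes), "\n";
```

The two operands compare equal with eq outside the pragma, so the differing result under 'use bytes' comes purely from the internal representation. That is the encapsulation break the NOTICE in bytes.pm warns about.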

Are you suggesting just a subset of the behaviour?

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=117355
