Front page | perl.perl5.porters |
Postings from September 2011
Re: The future of POSIX in core
From:
Nicholas Clark
Date:
September 2, 2011 13:14
Subject:
Re: The future of POSIX in core
Message ID:
20110902201426.GJ23881@plum.flirble.org
On Fri, Sep 02, 2011 at 03:33:49PM -0400, David Golden wrote:
> On Fri, Sep 2, 2011 at 3:22 PM, Nicholas Clark <nick@ccl4.org> wrote:
> > Whereas for a new major release, if anyone upgrades without testing, and has
> > the chutzpah to send a bug report about something, my opinion is that most
> > likely it should be rejected on the basis of "you get to keep both pieces",
> > particularly if it was a documented change. Hence in a major release new
> > warnings are as tolerable as any other breakage. (i.e. not very tolerable)
>
> I don't think it's a problem to warn on stuff that is discovered to be
> demonstrably broken. The only question in my mind is whether to
> deprecate the functions or simply warn. I have zero qualms about
> deprecating broken things and making anyone who needs compatibility go
> load POSIX::broken for compatibility when we finally remove the
> functions from POSIX.
For the 11 is* functions
(is{alnum,alpha,cntrl,digit,graph,lower,print,punct,space,upper,xdigit}),
given that the core's current take on UTF-8 locales pretty much treats
Jarkko's view of "I can let you in to the ultimate locale secret: avoid"
as "just ignore them", I suspect that the least worst fix is to return
0 immediately if the passed-in scalar has any code point above 255,
otherwise convert UTF-8 to octets if necessary, and assume that the
locale-aware matching of is*() was what the programmer really wanted.
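
A minimal standalone C sketch of that proposal, not code from POSIX.xs. The names downgrade_utf8() and is_class() are hypothetical, the is_utf8 argument stands in for what SvUTF8() would report, and the input is assumed to be a plain octet buffer rather than a real SV. An empty string yields 1 here purely as a consequence of the loop.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: convert UTF-8 into one octet per code point.
 * Returns a malloc'd buffer and sets *out_len, or returns NULL if any
 * code point is above 255, the input is malformed, or malloc fails. */
static unsigned char *downgrade_utf8(const unsigned char *s, size_t len,
                                     size_t *out_len)
{
    unsigned char *out = malloc(len ? len : 1);
    size_t i = 0, n = 0;

    if (out == NULL)
        return NULL;
    while (i < len) {
        unsigned char c = s[i];
        if (c < 0x80) {
            /* ASCII is the same single octet in both representations */
            out[n++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len
                   && (s[i + 1] & 0xC0) == 0x80) {
            /* U+0080..U+00FF are encoded as exactly two octets */
            out[n++] = (unsigned char)(((c & 0x03) << 6) | (s[i + 1] & 0x3F));
            i += 2;
        } else {
            /* code point above 255, or malformed input */
            free(out);
            return NULL;
        }
    }
    *out_len = n;
    return out;
}

/* Sketch of the proposed is*() behaviour: return 0 at once if any code
 * point is above 255 (or the downgrade fails for any other reason);
 * otherwise apply the locale-aware <ctype.h> classifier to every octet. */
static int is_class(const unsigned char *s, size_t len, int is_utf8,
                    int (*classify)(int))
{
    unsigned char *tmp = NULL;
    size_t i;
    int ok = 1;

    assert(classify != NULL);
    if (is_utf8) {
        tmp = downgrade_utf8(s, len, &len);
        if (tmp == NULL)
            return 0;
        s = tmp;
    }
    for (i = 0; i < len; i++) {
        if (!classify(s[i])) {
            ok = 0;
            break;
        }
    }
    free(tmp);
    return ok;
}
```

With this shape, is_class(buf, len, flag, isalpha) gives the same answer for a string whichever way its octets happen to be stored, which is the point of downgrading first.
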
For the multibyte functions, I suspect that the least worst fix is to
fault in some fashion (return undef, warn or die, I'm not sure) on code
points above 255, and process anything entirely in the range 0-255.
[unlike the current implementation, which ignores SvUTF8() because it
predates it by over 5 years. This is also why I think Marc Lehmann is roughly
right in asserting that 5.6 breaks the XS API, by changing the meaning of
the macro SvPV(). Had 5.6 made SvPV() either return "only code points
0-255", or consistently return only the UTF-8 representation, then at least
there would be predictability, and the char * typemap in XS would not be
broken because it's now unpredictable. As it is, SvPVX(), SvPV() and the
char * typemap are all "place bets now": if your data happens to have code
points between 0 and 255, who knows what representation your octet sequence
is going to be in. Yes, you are now expected to also check SvUTF8().
Everyone get into their time machine and tell the authors of the code
written in 1994, such as POSIX.xs.]
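
To make the "place bets now" problem concrete, here is a small standalone C illustration. struct str_value is a hypothetical stand-in for a scalar's string buffer plus the flag SvUTF8() would report; it is not the real SV layout. In practice the two representations differ only for code points 128-255, so the example uses U+00E9.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Perl string: NUL-terminated octets plus
 * the flag that SvUTF8() would report. Not the real SV structure. */
struct str_value {
    const char *octets;
    int is_utf8;
};

/* The same four characters "caf" + U+00E9, stored both ways. */
static const struct str_value as_latin1 = { "caf\xE9", 0 };
static const struct str_value as_utf8 = { "caf\xC3\xA9", 1 };

/* What code that only sees a char * effectively does: trust the octets. */
static size_t naive_length(const struct str_value *v)
{
    return strlen(v->octets);
}

/* Flag-aware length in code points: when the flag is set, count every
 * octet that is not a UTF-8 continuation octet (10xxxxxx). */
static size_t char_length(const struct str_value *v)
{
    const unsigned char *p;
    size_t n = 0;

    assert(v != NULL && v->octets != NULL);
    if (!v->is_utf8)
        return strlen(v->octets);
    for (p = (const unsigned char *)v->octets; *p; p++)
        if ((*p & 0xC0) != 0x80)
            n++;
    return n;
}
```

naive_length() reports 4 for as_latin1 and 5 for as_utf8, although both hold the same four characters; char_length() reports 4 for both, but only because it consults the flag.
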
> Now that we have an explicit policy around bugward-compatibility, we
> could also do something tricky depending on whether "use v5.16" is in
> effect or not. With it, you get correct behavior and/or warnings.
> Without it, you get the same buggy POSIX mess you had before.
Treating all bugs as features doesn't scale. One would end up with (likely)
linear growth in code size for constant features/modules, given that each
release would add some number of conditional bug fixes, but no release would
ever take them away.
Attempt to add features or modules, and that code growth is above linear.
One isn't allowed to *remove* modules, as that's not bugward-compatible,
so that can't be used to balance it out, and get back down to linear.
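
A back-of-envelope version of that growth argument, under assumptions that are mine rather than measured: S_0 is the initial code size, m the number of modules, b the amount of conditional bug-fix code added per module per release, a the number of modules added per release, c the size of a new module, and n the number of releases. All of these symbols are hypothetical.

```latex
% Constant module count m: each release adds b*m of never-removed fixes.
S_n = S_0 + \sum_{k=1}^{n} b\,m = S_0 + b\,m\,n
  \quad\text{(linear in } n\text{)}

% Module count growing as m_k = m_0 + a\,k, each new module of size c:
S_n = S_0 + c\,a\,n + \sum_{k=1}^{n} b\,(m_0 + a\,k)
    = S_0 + (c\,a + b\,m_0)\,n + \frac{a\,b}{2}\,n(n+1)
  \quad\text{(quadratic in } n\text{)}
```

The quadratic term comes entirely from never removing a conditional fix while the set of modules needing such fixes keeps growing.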
Nicholas Clark