
[perl #133695] [PATCH] Range Operator inconsistency

From: Hauke D via RT
Date: November 30, 2018 14:09
Subject: [perl #133695] [PATCH] Range Operator inconsistency
Message ID: rt-4.0.24-30151-1543586947-1880.133695-15-0@perl.org

Hi,

Thanks for looking into this!

The code comment in the code you showed [1] mentions #18165 [2] which references #18114 [3] where a reply by Slaven Rezic makes sense to me: 'There is a special handling for numeric strings beginning with a "0". This is to allow things like "01".."31" to preserve the leading zero for one-digit numbers.' The basic behavior appears to go all the way back to 5.000 [4].

  [1] https://perl5.git.perl.org/perl.git/blob/23665de87341f4f3452009759d4fc95ce30b8ced:/pp_ctl.c#l1179
  [2] https://rt.perl.org/Public/Bug/Display.html?id=18165
  [3] https://rt.perl.org/Public/Bug/Display.html?id=18114
  [4] https://perl5.git.perl.org/perl.git/blob/refs/tags/perl-5.000:/pp_ctl.c#l694

So my interpretation of the rules is this: If the left and right operands are strings, check whether both pass looks_like_number. If they do, treat them as integers. However, make an exception when the left-hand side begins with "0", for the reason stated above.

The key word here is *begins* with zero; the condition *SvPVX_const(left)!='0' causes this inconsistency:

  -3..-1 and "-3".."-1" are (-3,-2,-1)
  -2..-1 and "-2".."-1" are (-2,-1)
  -1..-1 and "-1".."-1" are (-1)
   1..-1 and  "1".."-1" are ()
  however:
  0..-1 is ()  but  "0".."-1" is (0..99)

The latter behavior may be in line with "01".."-1", which is ("01","02","03",...), but IMO it's still surprising, and in any case the fact that strings that look like numbers are treated as numbers appears to be undocumented.
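
To see why the string path yields "0" through "99" here, the following C sketch mimics the string increment for digit-only strings together with the "stop once the value is longer than the right operand" rule quoted from perlop. It is an illustration under assumptions, not the code in pp_ctl.c: digit_string_increment, string_range_count and BUF_SIZE are hypothetical names, and only left operands made of ASCII digits are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 32  /* hypothetical fixed buffer size for this sketch */

/* Increment a string of ASCII digits in place, preserving leading
   zeros ("01" -> "02") and growing by one digit on overflow
   ("99" -> "100"). */
static void digit_string_increment(char *s, size_t cap)
{
    size_t len = strlen(s);
    size_t i = len;

    while (i > 0) {
        i--;
        if (s[i] != '9') {
            s[i]++;
            return;
        }
        s[i] = '0';             /* carry into the next digit to the left */
    }
    assert(len + 2 <= cap);     /* room for the extra digit and the NUL */
    memmove(s + 1, s, len + 1);
    s[0] = '1';
}

/* Count the elements of left..right under string semantics and copy
   the final element into last (empty if the range is empty). */
static size_t string_range_count(const char *left, const char *right,
                                 char last[BUF_SIZE])
{
    char cur[BUF_SIZE];
    size_t max_len = strlen(right);
    size_t count = 0;

    assert(strlen(left) > 0 && strlen(left) + 2 <= BUF_SIZE);
    assert(max_len + 2 <= BUF_SIZE);
    assert(strspn(left, "0123456789") == strlen(left));

    last[0] = '\0';
    strcpy(cur, left);
    while (strlen(cur) <= max_len) {
        count++;
        strcpy(last, cur);
        if (strcmp(cur, right) == 0)
            break;              /* reached the right operand exactly */
        digit_string_increment(cur, sizeof cur);
    }
    return count;
}
```

With right operand "-1" (length 2), a left operand of "0" produces 100 elements ending in "99", and "01" produces 99 elements ending in "99", because the sequence never equals "-1" and only stops when the next value would be three characters long.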

I have two alternative proposals: (A) leave the behavior as-is, but document it, or (B) change the behavior so that the above condition is 'if the LHS is a string that begins with 0, except for the string "0" itself' (and document it) - this would cause the "01".."31" case to still work, but also cause "0".."-1" to act like 0..-1.
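
As a rough C model of what option B changes for two defined string operands (an illustration only, not the attached patch): the length check below plays the role that SvCUR would play in the real macro, and simple_looks_like_number and numeric_under_proposal_b are hypothetical names. The stand-in accepts only an optional sign followed by digits, which is narrower than Perl's looks_like_number.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Perl's looks_like_number():
   an optional sign followed by one or more ASCII digits. */
static bool simple_looks_like_number(const char *s)
{
    assert(s != NULL);
    if (*s == '+' || *s == '-')
        s++;
    if (!isdigit((unsigned char)*s))
        return false;
    while (isdigit((unsigned char)*s))
        s++;
    return *s == '\0';
}

/* Option B as modeled here: a left operand that begins with '0' still
   forces string semantics, unless it is exactly the string "0". */
static bool numeric_under_proposal_b(const char *left, const char *right)
{
    bool zero_padded = left[0] == '0' && strlen(left) > 1;

    return simple_looks_like_number(left)
        && !zero_padded
        && simple_looks_like_number(right);
}
```

In this model numeric_under_proposal_b("0", "-1") is true, so "0".."-1" would behave like 0..-1, while numeric_under_proposal_b("01", "31") stays false and the zero-padded range keeps working.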

Patches for both A (just document) and B (change behavior) are attached, with tests included (a full build passes all tests on my end). My internals knowledge is quite limited, so I hope my use of SvCUR in the second patch is correct.

My personal preference is option B, since it gets rid of the above inconsistency, but I understand that if there are worries about backwards compatibility, option A may be better in that respect. The way I've worded the documentation pretty much nails down the behavior and wouldn't allow for future changes. A third option might be to word the documentation more loosely and leave the door open for future changes.

Thanks, Regards,
-- Hauke D

P.S. The attachment "rt133695.pl" in my previous message contains an off-by-one error, but in an unused branch of code, so the output and conclusions produced by the script are still correct (as long as $inseq is always false, which it currently is).


On Thu, 29 Nov 2018 04:05:27 -0800, davem wrote:
> On Wed, Nov 28, 2018 at 07:56:34AM -0800, Hauke D via RT wrote:
> > > As first reported on PerlMonks in this thread:
> > > https://www.perlmonks.org/?node_id=1226434
> > >
> > > perlop says: "The range operator (in list context) makes use of the
> > > magical auto-increment algorithm if the operands are strings. ...
> > > If the final value specified is not in the sequence that the magical
> > > increment would produce, the sequence goes until the next value would
> > > be longer than the final value specified."
> > >
> > > And yet there are some really strange inconsistencies with respect to
> > > the produced ranges, sometimes the strings appear to be treated as
> > > integers, sometimes they don't. In particular, compare "0".."-1",
> > > which produces "0" through "99", to "1".."-1", which produces the
> > > empty list.
> 
> Perl internally tries very hard to treat the range args as numeric where
> possible, and has a special exception for the string "0". The relevant
> macro from pp_ctl.c (reformatted for clarity) is:
> 
> /* This code tries to decide if "$left .. $right" should use the
>    magical string increment, or if the range is numeric (we make
>    an exception for .."0" [#18165]). AMS 20021031. */
> 
> #define RANGE_IS_NUMERIC(left,right) (
>        SvNIOKp(left)
>    || (SvOK(left) && !SvPOKp(left))
>    || SvNIOKp(right)
>    || (SvOK(right) && !SvPOKp(right))
>    || (
>                (
>                       (!SvOK(left) && SvOK(right))
>                    || (
>                           (!SvOK(left) || looks_like_number(left))
>                        && SvPOKp(left)
>                        && *SvPVX_const(left) != '0')
>                )
>             && (!SvOK(right) || looks_like_number(right))
>         )
>    )
> 
> Frankly I don't understand all those conditions; they are a lot more
> specific than the docs.


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=133695
