perl.perl5.porters | Postings from June 2008

Re: Empty regex (behavior of standalone C<$empty=qr//> is not a bug)

From: David Nicol
Date: June 19, 2008 10:11
Subject: Re: Empty regex (behavior of standalone C<$empty=qr//> is not a bug)
Message ID: 934f64a20806191010v5002f3a0s64d2fa21693f2304@mail.gmail.com
>> ok("foo" =~ /foo/ && "bar" =~ /$x$x/);
>> ok("foo" =~ /foo/ && "bar" =~ /$x/);
>> ok 1
>> not ok 2
>
> is the first of the two oks wrong because $x$x should get optimized
> down to empty, or is the second wrong because /(?-xism:)/ should not?
>
> $ perl -le '321 =~ /321/ and print 1; 4321 =~ // and print 2; 4321 =~
> /(?-xism:)/ and print 3; 4321 =~ /(?-xism:)(?-xism:)/ and print 4'
> 1
> 2
> 3
> 4

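Sorry, that didn't tell us anything: 4321 contains 321, so it matches
even when the empty pattern falls back to /321/.  Trying again with 432: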

$ perl -le '321 =~ /321/ and print 1; 432 =~ // and print 2; 432 =~
/(?-xism:)/ and print 3; 321 =~ /321/ and 432 =~ /(?-xism:)(?-xism:)/
and print 4'
1
3
4

$ perl -le '$r=qr//; 321 =~ /321/ and print 1; print $r; 432 =~ /$r/
and print 2; 432 =~ /$r$r/ and print 3'
1
(?-xism:)
3


Before pasting YST's example into TODO tests, what is the desired state?

The current state appears to be:
  * a pattern consisting of a single interpolated qr// triggers the
    empty regex case
  * a pattern consisting of several interpolated qr// results does not
  * a regex consisting of nothing but empty, non-capturing clusters
    doesn't trigger the empty regex case.
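
The observations above could be collected into one script, roughly as
sketched here.  This is only a sketch: the %matched hash and the probe
labels are my own names, and the /$r/ result is printed rather than
assumed, since the behavior reported here is from 5.8.8 and other perls
may differ.

```perl
use strict;
use warnings;

my $r = qr//;
my %matched;

# Before each probe, make /321/ the last successfully matched pattern.
# A probe that is treated as the empty pattern reuses /321/ and therefore
# fails against "432"; any other probe here matches the empty string.

"321" =~ /321/ or die "setup match failed";
$matched{'//'} = ("432" =~ //) ? 1 : 0;

"321" =~ /321/ or die "setup match failed";
$matched{'/(?:)/'} = ("432" =~ /(?:)/) ? 1 : 0;

"321" =~ /321/ or die "setup match failed";
$matched{'/$r/'} = ("432" =~ /$r/) ? 1 : 0;    # result depends on the perl version

"321" =~ /321/ or die "setup match failed";
$matched{'/$r$r/'} = ("432" =~ /$r$r/) ? 1 : 0;

printf "%-8s %d\n", $_, $matched{$_} for sort keys %matched;
```

A 0 means the probe was treated as the empty pattern, a 1 means it was
compiled and matched as written.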


YST appears to desire a state in which the empty regex, when
interpolated, will not trigger the empty-regex case.

The current state allows the empty-regex case to be introduced in a
regex containing one interpolated qr variable by setting the variable
to qr//, but not when the regex contains anything beyond that one qr
result.  Fixing the alleged bug would remove this feature.

The optimization appears to be that the match operator, when looking at
a regex consisting entirely of a single qr result, will skip any
additional parsing of that result.  This is documented as follows in
perlop:

         The result may be used as a subpattern in a match:

             $re = qr/$pattern/;
             $string =~ /foo${re}bar/;   # can be interpolated in other patterns
             $string =~ $re;             # or used standalone
             $string =~ /$re/;           # or this way

which says that simply wrapping the qr result in slashes is equivalent
to using it standalone, which is why  /$empty/ works like // but
/(?:)/ does not, and neither does /$empty$empty/, as /$empty$empty/ is
not standalone.
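
For contrast, with a non-empty qr result the three forms from perlop
agree with each other.  A minimal sketch, where the pattern and the
string are illustrative values of mine rather than anything from the
thread:

```perl
use strict;
use warnings;

my $re     = qr/o+/;      # illustrative pattern, not from the thread
my $string = "foobar";    # illustrative subject string

my $interpolated = $string =~ /f${re}b/ ? 1 : 0;   # part of a larger pattern
my $standalone   = $string =~ $re       ? 1 : 0;   # qr result used directly
my $wrapped      = $string =~ /$re/     ? 1 : 0;   # sole content of the slashes

print "interpolated=$interpolated standalone=$standalone wrapped=$wrapped\n";
```

All three match here; it is only when the qr result is empty that the
standalone and wrapped forms can diverge from the interpolated one.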

Here's another empty regex edge case that may or may not be correct:

$ perl -le ' 321 =~ /321/ and print 1; 432 =~ //x and print 2; 432 =~ /


/x and print 3'
1
3

(Examples produced with Perl 5.8.8 on Cygwin.)


