develooper Front page | perl.perl5.porters | Postings from July 2001

Re: bug? $& temporarily modified in look-ahead

From: Rafael Garcia-Suarez
Date: July 27, 2001 13:57
Subject: Re: bug? $& temporarily modified in look-ahead
Message ID: 20010727225831.A819@rafael

On 2001.07.27 17:14 Jeff 'japhy/Marillion' Pinyan wrote:
> This is forwarded from a conversation I had with Jeff Friedl.
> 
> ==========================================================================
> Jeff 'japhy/Marillion' Pinyan <jeffp@crusoe.net> wrote:
> 
> |> The only problem is (understandable) that $&, while updated in a
> |> look-ahead, is NOT "down"dated in a look-behind:
> |> 
> |>   "japhy" =~ m{
> |>     ..
> |>     (?{ print "<$`><$&><$'>" })
> |>     (?=
> |>       ..
> |>       (?{ print "<$`><$&><$'>" })
> |>     )
> |>     (?<=
> |>       ..
> |>       (?{ print "<$`><$&><$'>" })
> |>     )
> |>     (?{ print "<$`><$&><$'>" })
> |>   }x;
> |> 
> |> The output is:
> |> 
> |>   <><ja><phy>
> |>   <><japh><y>
> |>   <><ja><phy>
> |>   <><ja><phy>
> 
> The 2nd one looks like a bug to me. The lookahead is just that, lookahead,
> and so I feel that it should never (ever) be part of $&. The other ones
> seem as I would expect them.
> ==========================================================================
> 
> I said it didn't seem like a bug, because:
> 
>   1. look-ahead is a regex, and the modifications done to $& are localized

So $&, inside the regexp, is the value being constructed so far:
#!/opt/perl/bin/perl5.7.2 -l
"abc" =~ m{
  (?:
    . (?{ print "<$`><$&><$'>" })
  ){3}
}x;

This outputs:
<><a><bc>
<><ab><c>
<><abc><>

Shouldn't this be documented in perlre or perlvar?

>   2. if $& wasn't touched, there'd be little to no information you could
>      get out of it, apart from capturing (which then bumps up your
>      following captures)

Note that capturing would give "ja" and "ph", not "japh".
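
A minimal sketch of that capturing behaviour, using the same "japhy" string. The variable names $consumed and $ahead are made up for this illustration:

```perl
use strict;
use warnings;

# $consumed and $ahead are hypothetical names for this sketch.
# The first group is the text the pattern consumes; the second
# is the text the look-ahead inspects without consuming it.
my ($consumed, $ahead);
if ("japhy" =~ m{ (..) (?= (..) ) }x) {
    ($consumed, $ahead) = ($1, $2);
    print "<$consumed><$ahead>\n";   # prints <ja><ph>
}
```

Neither group holds "japh"; getting that string from captures would mean joining the two.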

>   3. look-behind isn't much of a "regex", and so I don't think it gets
>      treated the same way internally

You've said it: internally. If you specify what value $& should have
inside an assertion, you make an assumption about how the regexp engine
is implemented or, in other words, you put a constraint on the
implementation of this engine. For example, $& could have been "ph"
instead of "japh" if the assertion were verified by another instance of
the engine. That's why I agree with Jeff Friedl: his point of view would
make it easier to plug in alternate regexp engines or to modify the
default one. Alternatively, you could say that the value of $& in
assertions is implementation-dependent.
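
One way to avoid relying on $& inside assertions might be to record match positions instead. The following is only a sketch: it assumes that pos() inside a (?{ }) block reports the engine's current position in the target string, as perlre describes, and @seen is a name made up for this illustration.

```perl
use strict;
use warnings;

# @seen is a hypothetical helper array. A package variable is used
# so the code blocks do not close over a lexical.
our @seen;

"japhy" =~ m{
    ..        (?{ push @seen, pos() })     # after consuming two characters
    (?=  ..   (?{ push @seen, pos() }) )   # inside the look-ahead
    (?<= ..   (?{ push @seen, pos() }) )   # inside the look-behind
              (?{ push @seen, pos() })     # end of the pattern
}x;

print "positions: @seen\n";
```

The recorded offsets say where the engine currently is, not what the final match will be, so they carry the same information as the $& values above without saying what $& should contain.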
