Re: [perl #33908] $ behaves inconsistent with \G

From: Ronald J Kimball
Date: January 24, 2005 08:22
Subject: Re: [perl #33908] $ behaves inconsistent with \G
Message ID: 20050124162214.GB61984@penkwe.pair.com

On Mon, Jan 24, 2005 at 01:23:06AM -0000, Marc Lehmann wrote:

> While writing a parser, I found that /\G.*/ sometimes doesn't match, which
> I think it should do always, regardless of the content and history of $_.
> 
> It turned out that this match, run before, triggers this:
> 
>    /\G$/gc;
> 
> After that, \G will no longer match.
> 
> I think this is a bug because the following two segments behave differently:
> 
>    $_ = "test";
>    /.*$/gc or die "-";       # this line differs
>    /\G$/gc or die "A";
>    /\G/gc or die "B";
> 
> The above dies at "B", while this:
> 
>    $_ = "test";
>    /$/gc or die "-";         # this line differs
>    /\G$/gc or die "A";
>    /\G/gc or die "B";
> 
> Dies at "A", so the "$" in the first match has different behaviour
> depending on whether there is a .* or not in front of it. I think it should
> behave the same in both cases. More specifically, I would prefer it to
> not die ever (I am not sure how to write my parser if I cannot check for
> end-of-string more than once), but one could argue that "$" eats the
> physical or virtual line-end.
> 
> (The actual problem is that, when implementing the posix shell grammar,
> "$" is being treated both as end-of-command and end-of-input, at different
> places in my recursive descent parser.  As /\G$/ doesn't match the second
> time, the parser assumes a parse error because something is following).

I believe this behavior falls under the section "Repeated Patterns
Matching a Zero-length Substring" in perlre.  /$/gc matches a zero-length
substring, so the next regex is prevented from matching a zero-length
substring at the same position.

I don't actually know whether it's supposed to behave that way in this
specific case though.
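
To make the rule concrete, here is a small sketch that records the result of
each match instead of dying.  The variable names are mine, chosen only for
illustration; the patterns are the ones from the report.

```perl
use strict;
use warnings;

my $s = "test";

# A zero-length match at the end of the string; /c keeps pos on failure.
my $first = ($s =~ /$/gc) ? 1 : 0;
my $pos_after_first = pos($s);

# Another zero-length match anchored at the same position is refused.
my $second = ($s =~ /\G$/gc) ? 1 : 0;

# A match that consumes at least one character first is not zero-length,
# so the following \G$ is allowed (this is the /.*$/ case from the report).
my $t = "test";
my $third  = ($t =~ /.*$/gc) ? 1 : 0;
my $fourth = ($t =~ /\G$/gc) ? 1 : 0;

print "first=$first second=$second third=$third fourth=$fourth\n";
```

If the rule applies as I read it, only the second match should fail, which
lines up with the two snippets dying at "A" and "B" respectively.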

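Anyway, you may be able to work around this problem by setting pos after
each zero-length match, to clear the 'matched a zero-length substring' flag:

$_ = "test";
/$/gc or die "-";
pos($_) = pos($_);      # assigning to pos clears the zero-length flag
/\G$/gc or die "A";
pos($_) = pos($_);      # /\G$/ also matched zero-length, so clear it again
/\G/gc or die "B";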
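
For a parser that needs to test for end-of-string in several places, the same
trick could be wrapped in a helper.  This is only a sketch: at_eos is a
hypothetical name, not anything from Perl itself, and it assumes the parser
keeps its input in a scalar and passes a reference to it.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the current match position of the string
# referenced by $ref is at end-of-string.  Assigning pos to itself before
# and after the match clears the zero-length flag, so the check can be
# repeated and does not block a later zero-length \G match.
sub at_eos {
    my ($ref) = @_;
    pos($$ref) = pos($$ref);    # clear a flag left by an earlier match
    my $ok = ($$ref =~ /\G$/gc) ? 1 : 0;
    pos($$ref) = pos($$ref);    # clear the flag this match may have set
    return $ok;
}

my $input = "test";
$input =~ /\G\w+/gc;            # consume the word; pos is now 4

my @checks;
for my $attempt (1 .. 3) {
    push @checks, at_eos(\$input);   # ask the same question three times
}
my $empty_ok = ($input =~ /\G/gc) ? 1 : 0;   # an empty \G match afterwards

my $fresh = "ab";
my $not_at_end = at_eos(\$fresh);   # pos unset, so \G anchors at offset 0

print "checks=@checks empty_ok=$empty_ok not_at_end=$not_at_end\n";
```

Note that $ also matches before a trailing newline, so a parser that must
distinguish the two cases may want \z in place of $ inside the helper.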


Ronald
