
5.6.0 != 5.004_04, don't know why.

From: Matthew Persico
Date: March 25, 2000 19:11
Subject: 5.6.0 != 5.004_04, don't know why.
Message ID: 38DD80BB.5621A507@acedsl.com

I have a situation where regular expression evaluation differs between
5.6.0 and 5.004_04. The original thread starts in
comp.lang.perl.moderated, message ID 38DC19F5.631F21A7@acedsl.com. I have
reproduced it here for your convenience. In short, there appears to be
a problem with a literal space following a {n} repeat count such as {2}.

I originally posted this:

> Look at these two debugger sessions. The first is 5.004_04:
> 
> DB<2> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:[0-5][0-9]:[0-5][0-9]: /) { print "yep" }
> yep
> DB<3> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2}\s/) { print 'yep' }
> yep
> DB<4> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2} /) { print 'yep' }
> yep
> DB<5>
> 
> The second is 5.6.0, solaris 2.6
> 
> DB<35> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:[0-5][0-9]:[0-5][0-9]: /) { print "yep" }
> yep
> DB<36> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2}\s/) { print 'yep' }
> yep
> DB<37> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2} /) { print 'yep' }
> 
> DB<38>
> 
> See what happened at DB<37>? The space at the end of the RE doesn't match the space at the end of the string, as it did in 5.004_04. But it matches in DB<35>. And I just tested ActiveState (5.6.0, RC1, I think) and it exhibits the same behavior.
> 

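For anyone who wants to compare interpreters without typing into the debugger, here is a minimal sketch that runs the same three patterns against the same string. The verdict() helper and the case labels are my own names, not anything from the sessions above, and the script deliberately avoids qr// and statement-modifier foreach so that it should also run on older perls. I make no claim about what it prints on any particular version beyond what the sessions already show.

```perl
#!/usr/bin/perl -w
use strict;

# The subject string from the debugger sessions above.
my $stamp = '20000324.16:10:19: ';

# The three patterns from the sessions, kept as plain strings
# (single quotes leave the backslashes untouched).
my @cases = (
    [ 'spelled out, literal space', '[0-9]{6}\.[0-2][0-9]:[0-5][0-9]:[0-5][0-9]: ' ],
    [ '{2} followed by \s',         '[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2}\s' ],
    [ '{2} followed by a space',    '[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2} ' ],
);

# Hypothetical helper: returns "yep" on a match and "nope" otherwise.
sub verdict {
    my ($string, $pattern) = @_;
    return $string =~ /$pattern/ ? 'yep' : 'nope';
}

printf "perl %s\n", $];
foreach my $case (@cases) {
    printf "%-28s %s\n", $case->[0], verdict($stamp, $case->[1]);
}
```

Running it under each installed perl in turn gives one line per pattern, which makes a difference between versions easy to spot.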


There were two followup posts:

Uri Guttman in x7g0tf8tzl.fsf@home.sysarch.com
> >>>>> "MP" == Matthew Persico <persicom@acedsl.com> writes:
> 
>   MP> DB<37> if( '20000324.16:10:19: ' =~ /[0-9]{6}\.[0-2][0-9]:([0-5][0-9]:){2} /) { print 'yep' }
> 
> 
> hmm, this is under 5.005_03 with similar results. i also reduced the
> problem and found cases where it does work.
> 
> 
>  DB<2> print 'yep'if('20000324.16:10:19: '=~/\d{6}\.[0-2]\d:([0-5]\d:){2} /);  
> 
>   DB<3> print 'yep'if('20000324.16:10:19: '=~/\d{6}\.[0-2]\d:([0-5]\d:){2}./);
> yep
>   DB<5> print 'yep'if('20000324.16:10:19: '=~/\d{6}\.[0-2]\d:([0-5]\d:){2}\s/);
> yep
>  
> the beginning part doesn't have to be there. it seems to be something to
> do with a space after {2}
> 
>  DB<9> print 'yep' if('10:19: '=~/([0-5]\d:){2} /);  
> 
>   DB<10> print 'yep' if('10:19: '=~/([0-5]\d:){2}./);
> yep
> 
> and this fixes it too.
> 
>   DB<13> print 'yep' if('10:19: '=~/([0-5]\d:){1,2} /);
> yep
> 
> 
> and this fails:
> 
> DB<17> print 'yep' if('10:19: '=~/(\d\d:){2} /);  
> 
> non-grabbing doesn't help:
> 
> DB<18> print 'yep' if('10:19: '=~/(?:\d\d:){2} /);
> 
> 
> 
> so i would call this a real bug in the regex engine and not a 5.6 bug as
> this is failing under 5.005_03.
> 
> send it to p5p
> 
> uri  
> 
and Rick Delaney in 38DD2CA7.E7E7B739@home.com

> Uri Guttman wrote:
> > 
> > DB<18> print 'yep' if('10:19: '=~/(?:\d\d:){2} /);
> > 
> > so i would call this a real bug in the regex engine and not a 5.6 bug as
> > this is failing under 5.005_03.
> 
> Not in the REx engine, but the optimizer it seems.
> 
> 5.5.670 $ ./perl -Dr -le 'print "yep" if("10:19: "=~/(\d\d:){2} /);'
> Compiling REx `(\d\d:){2} '
> size 15 first at 5
> rarest char : at 0
>    1: CURLYM[1] {2,2}(13)
>    5:   DIGIT(6)
>    6:   DIGIT(7)
>    7:   EXACT <:>(11)
>   11:   SUCCEED(0)
>   12:   NOTHING(13)
>   13: EXACT < >(15)
>   15: END(0)
> anchored `: ' at 2 (checking anchored) stclass `DIGIT' minlen 7
> Omitting $` $& $' support.
>  
> EXECUTING...
>  
> Guessing start of match, REx `(\d\d:){2} ' against `10:19: '...
> Did not find anchored substr `: '...
> Match rejected by optimizer
> Freeing REx: `(\d\d:){2} '
> 
> -- 
> Rick Delaney
> rick.delaney@home.com

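Rick's dump says the optimizer wants the anchored substring `: ' at offset 2 and then fails to find it. A small sketch to check where that substring really sits in the test string follows. The variable names are mine. My reading, which is an assumption and not something confirmed in the thread, is that offset 2 is where `: ' would start if the group were counted once rather than twice.

```perl
#!/usr/bin/perl -w
use strict;

my $string = '10:19: ';
my $needle = ': ';

# Where the substring actually first occurs in the string.
my $actual = index($string, $needle);

# What sits at offset 2, the position reported in the -Dr output.
my $at_two = substr($string, 2, length $needle);

printf "needle first occurs at offset %d\n", $actual;
printf "offset 2 holds '%s'\n", $at_two;
```

This reports offset 5 for the substring and ':1' at offset 2, which is consistent with the "Did not find anchored substr" line in the dump.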
-- 
Matthew O. Persico
    
"If you were supposed to understand it,
we wouldn't call it code."


