Re: regexp iteration limits

From: Zefram
Date: February 12, 2009 00:35
Subject: Re: regexp iteration limits
Message ID: 20090212083425.GF2263@fysh.org

Bram wrote:
>Why use (?>[a-z])* ? Is it any different than  ([a-z])* ?

/\A([a-z])*\z/ and /\A(?:[a-z])*\z/ don't tickle the bug.  Apparently the
regexp engine is clever enough to recognise that the subpattern has a fixed
length of 1, and quantifies it in a different way.  /\A(?>[a-z])*\z/
makes the subexpression sufficiently complex to take the heavyweight path
and encounter the bug.

Another way to make the expression sufficiently complicated is
/\A(?:[a-z][a-z])*\z/.  You have to double the length of the string,
of course, but then it shows the bug in the same way as the (?>) form:

$ for pver in 5.{6.2,8.{0,8,9},9.4,10.0}; do echo $pver $(~/usr/perl/util/perlver $pver-i32-f52 perl -lwe '$SIG{__WARN__}=sub{print"(warn) "}; $a="xyzt"x20000; print $a =~ /\A(?:[a-z][a-z])*\z/ ? "ok" : "bug"'); done
5.6.2 ok
5.8.0 ok
5.8.8 ok
5.8.9 ok
5.9.4 bug
5.10.0 bug
$ for pver in 5.{8.{0,8,9},9.4,10.0}; do echo $pver $(~/usr/perl/util/perlver $pver-i32-f52 perl -lwe '$SIG{__WARN__}=sub{print"(warn) "}; $a="xyzt"x20000; utf8::upgrade($a); print $a =~ /\A(?:[a-z][a-z])*\z/ ? "ok" : "bug"'); done 
5.8.0 bug
5.8.8 bug
5.8.9 ok
5.9.4 bug
5.10.0 bug
$

You can also make the subexpression sufficiently complicated by adding
something that can match multiple ways.  /\A(?:X?[a-z])*\z/ triggers
the bug in a different manner, encountering the "recursion limit" warning.

-zefram
