FYI Regex synthetic start class has been mostly disabled for 3 months

Karl Williamson
December 31, 2013 16:01
I just pushed this commit to blead:

commit 749e076fceedeb708a624933726e7989f2302f6a
  Author: Karl Williamson <>
  Date:   Sat Dec 28 21:48:57 2013 -0700

regcomp.c: Reinstate use of synthetic start class

This effectively reverts commit 
a74bca75951b6a3b0ad03ba07eb31e2ca1227308, although the syntax has 
changed.  That commit inadvertently caused a synthetic start class (SSC) 
not to be generated in many cases where it previously was.  The regex 
optimizer generates the SSC in the hope of speeding up the search for 
where to start matching the target string against the regex pattern.  I 
don't know if this is a valid data point, but in the 3 months that the 
change was in blead, there were no complaints of a slowdown that could 
be attributed to it.

The commit that caused this was made after discussing it with Yves 
Orton.  It just seemed (and still seems) wrong that an operation the 
code presents as a logical OR should actually restrict the 
possibilities.  The change essentially caused the result of OR'ing 
together the matches of two nodes, one of which nominally could match a 
zero-length substring, to also match a zero-length substring.  Before 
that commit, and again after this new one, the result of the OR 
excludes a zero-length string, as if it were an AND.
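The difference between the two behaviours can be sketched with a toy 
model in Python.  This is not the regcomp.c code: the ToySSC type, its 
field names, and both function names are made up for illustration, and 
treating the zero-length property as a single boolean flag is an 
assumption about how to read the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySSC:
    """Hypothetical stand-in for a start class, not Perl's structure."""
    chars: frozenset        # characters a match could start with
    matches_empty: bool     # True if a zero-length match is possible


def or_plain(a, b):
    """OR as the name suggests: the result can match empty if either can.

    This models the behaviour of the commit being reverted.
    """
    return ToySSC(a.chars | b.chars, a.matches_empty or b.matches_empty)


def or_restricting(a, b):
    """OR on the characters, but AND-like on the zero-length property.

    This models the behaviour before that commit and after the revert.
    """
    return ToySSC(a.chars | b.chars, a.matches_empty and b.matches_empty)


a = ToySSC(frozenset("ab"), matches_empty=True)
b = ToySSC(frozenset("c"), matches_empty=False)

print(or_plain(a, b).matches_empty)        # can match a zero-length string
print(or_restricting(a, b).matches_empty)  # cannot
```

Both versions produce the same set of starting characters; they differ 
only in whether the combined class is allowed to match nothing.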

The end result of the change was that the SSC could match a zero-length 
string, and thus was discarded as not being useful.  Yves and I knew 
that the change would not cause bugs; just potentially create more false 
positives.  And we were right.
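To illustrate why losing the SSC costs time but not correctness, here is 
a rough Python sketch.  It uses Python's re module as a stand-in for the 
regex engine, and the first_match function and the hand-written start 
class are hypothetical; Perl's optimizer works differently in detail.  
Passing start_class=None models an SSC that was discarded because it 
could match a zero-length string.

```python
import re


def first_match(pattern, target, start_class=None):
    """Return (offset, attempts) for the first match of pattern in target.

    With a start class, the full match is attempted only at offsets whose
    character is in the class.  With start_class=None every offset is
    tried, which models a discarded SSC.
    """
    compiled = re.compile(pattern)
    attempts = 0
    for i, ch in enumerate(target):
        if start_class is not None and ch not in start_class:
            continue  # ruled out cheaply, no full match attempt
        attempts += 1
        if compiled.match(target, i):
            return i, attempts
    return None, attempts


pattern = r"(?:foo|bar)\d+"
target = "xxxxbar12"

with_ssc = first_match(pattern, target, start_class={"f", "b"})
without_ssc = first_match(pattern, target)

print(with_ssc)     # same offset, fewer attempts
print(without_ssc)  # same offset, more attempts
```

The extra attempts in the second call are the false positives: offsets 
the matcher had to try in full before rejecting them.  The match that is 
eventually found is the same either way.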

I believe that this is further indication that the optimizer could 
benefit greatly from an overhaul.  It's clear from looking at the code 
and commits that other people have been similarly fooled.
