regrepeat()

From: Karl Williamson
Date: December 31, 2017 21:36
Subject: regrepeat()
Message ID: 4b43681d-bf5f-c1a7-1c66-6ec352fb92eb@khwilliamson.com

This function is called during regular expression pattern matching for 
things like

  (foo)+

to match as many 'foo's as there are.  There is special code to handle
the case where foo is a single byte, such as in

  a+

It turns out that these cases can be sped up dramatically if what we are
matching is a long string of 'a's in a row.  We simply load a word with
4 or 8 a's (depending on the word size) and look at the string a
word-at-a-time, which takes roughly 1/4 or 1/8 the number of
instructions.  By using a mask, this can be extended to work for

  [aA]+

as well.  The code in each case is just over 20 lines of C.
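
For anyone who wants to picture the idea, here is a minimal sketch.  It
is not the code I am describing above, just an illustration under some
assumptions: the function name count_masked_run() is made up, it
hard-codes a 64-bit word, and the [aA] mask relies on ASCII, where 'a'
and 'A' differ only in the 0x20 bit.  A byte b counts as a match when
(b & mask) == target, so a mask of 0xFF gives the plain a+ case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a function in the perl core.
 * Returns how many leading bytes of s[0..len) satisfy
 * (byte & mask) == target, checking eight bytes per comparison
 * while at least a full word remains. */
static size_t
count_masked_run(const unsigned char *s, size_t len,
                 unsigned char mask, unsigned char target)
{
    /* Replicate the mask and target bytes into every byte of a word. */
    const uint64_t ones        = UINT64_C(0x0101010101010101);
    const uint64_t mask_word   = ones * mask;
    const uint64_t target_word = ones * target;
    size_t i = 0;

    /* Word-at-a-time loop: one load, one AND, one compare per 8 bytes. */
    while (len - i >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word);  /* safe for unaligned input */
        if ((word & mask_word) != target_word)
            break;              /* some byte in this word doesn't match */
        i += sizeof word;
    }

    /* Byte-at-a-time loop: finds the exact end of the run, and handles
     * the tail that is shorter than a word. */
    while (i < len && (s[i] & mask) == target)
        i++;

    return i;
}
```

With this sketch, a+ would be count_masked_run(s, len, 0xFF, 'a') and
[aA]+ would be count_masked_run(s, len, 0xDF, 'A').  The word loop only
pays off when the run is at least a word long; shorter runs fall
straight through to the byte loop.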

My question is, does this happen often enough in real life to justify 
the extra code?

Leon pointed out that in DNA, there may be longish strings of 'A's.
