develooper Front page | perl.perl5.porters | Postings from January 2018

Another idea for speeding up /i matching on ASCII characters

Thread Next
Karl Williamson
January 25, 2018 20:18
Another idea for speeding up /i matching on ASCII characters
Message ID:
This however is not space efficient.

If we have an EXACTFish node consisting entirely of ASCII characters 
that don't have folds outside of ASCII (so Kk are out because they match 
the KELVIN SIGN), it would be possible to create a parallel string to be 
used as a mask, specially constructed so that each byte is the 
appropriate mask for /i on the corresponding byte in the main string.

When doing the match, you load a words-worth of the string, and the 
corresponding portion of the mask, AND them together and compare.  If 
then fail or go on to the next word full.

Thus you wouldn't have to look up the fold for every byte as we do now. 
If you've ever watched an EXACTFish node execute with backtracking, it's 
depressing how often we have to do this over and over.

This could also work on many UTF-8 strings as well.

It essentially doubles the length of the pattern, but my guess is that 
it's around an order of magnitude faster on a 64-bit machine.

Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About