develooper Front page | perl.perl5.porters | Postings from January 2018

Re: Another idea for speeding up /i matching on ASCII characters

Thread Previous | Thread Next
From:
Father Chrysostomos
Date:
January 25, 2018 20:50
Subject:
Re: Another idea for speeding up /i matching on ASCII characters
Message ID:
20180125204955.22957.qmail@lists-nntp.develooper.com
Karl Williamson wrote:
> This however is not space efficient.
> 
> If we have an EXACTFish node consisting entirely of ASCII characters
> that don't have folds outside of ASCII (so Kk are out because they match
> the KELVIN SIGN), it would be possible to create a parallel string to be
> used as a mask, specially constructed so that each byte is the
> appropriate mask for /i on the corresponding byte in the main string.
> 
> When doing the match, you load a words-worth

I wandered lonely as a cloud that floats on high o'er vales and hills,
And all at once I saw a crowd....

Sorry, I got sidetracked.

> of the string, and the
> corresponding portion of the mask, AND them together and compare.  If
> then fail or go on to the next word full.
> 
> Thus you wouldn't have to look up the fold for every byte as we do now.
> If you've ever watched an EXACTFish node execute with backtracking, it's
> depressing how often we have to do this over and over.
> 
> This could also work on many UTF-8 strings as well.
> 
> It essentially doubles the length of the pattern, but my guess is that
> it's around an order of magnitude faster on a 64-bit machine.

Maybe we could actually make 'use less "memory"' do something.

How much faster is it on 32-bit machines.  (I am no longer in a posi-
tion to provide 32-bit benchmarks.  Machines bitten 32 times seem to
be getting rarer these days.)

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About