develooper Front page | perl.perl6.language | Postings from August 2000

RFC: extend "study" to produce a fast grep through many regexes

Thread Next
From:
David L. Nicol
Date:
August 21, 2000 14:43
Subject:
RFC: extend "study" to produce a fast grep through many regexes
Message ID:
39A1A264.F25771C0@kasey.umkc.edu


title: study a list of regexes
David Nicol.
Aug 21
version 1
perl6-language@perl.org


Sometimes I have a group of regexen, and I would like to know
which ones will match.

Current practice is to "study" $situation and then grep them:


example a:

	study $situation;
	@matches = @rules{ grep {$situation =~ m/$_/} keys %rules};


What if there were a faster way to do this, a way to C<study> a
group of regexes and have the computer determine which ones had
parts in common, so that if $situation =~ m/^foo/ is true, the
fifty rules that start ^bar don't waste any time.  At all.

example b:

	$matchcode = study @regexen;

will generate an anonymous subroutine (it's called $matchcode in
the example line) with a tree based on required parts of the
regexes, to minimize the number of match attempts to determine
which regexes will match.  The subroutine will take an array
argument and will return the subset of the rules (as stated in
the original array, either string or compiled qr// references)
that match on all the arguments.

The code in example a could then be replaced

	$matchcode = study keys %rules;
	@matches = @rules{ &$matchcode $situation }


This ability could speed "dirty matching" which currently cannot
take advantage of constant-time hash lookups without standardizing
the dirty parts.




-- 
                          David Nicol 816.235.1187 nicold@umkc.edu
                      Does despair.com sell a discordian calendar?

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About