
Re: recursion now removed from the regex engine

From: Dave Mitchell
Date: April 2, 2006 09:37
Subject: Re: recursion now removed from the regex engine
Message ID: 20060402163932.GJ4571@iabyn.com

On Sat, Mar 25, 2006 at 01:59:35PM +0000, Nicholas Clark wrote:
> Is it worth replacing the local variables with a single struct regmatch_state
> created on entry to S_regmatch?

Short answer: yes; see change 27679.

This change:

1) removes most of regmatch()'s local vars and replaces them with a pointer
st to a state structure; so throughout the code, stuff like ln++ becomes
st->ln++.

2) caches the four most-commonly accessed members of this struct in local
vars.

3) adds two new per-thread vars, PL_regmatch_slab and PL_regmatch_state,
which provide the state struct allocation mechanism.

State structs are now allocated in 4K slabs. Each slab has next and prev
pointers, so it's a chain of slabs - which acts as a segmented stack.
During execution of regmatch(), extra slabs are allocated as necessary,
but are not freed, as regmatch "recurses" its way up and down. On final
exit from regmatch(), any slabs allocated since entry to regmatch() are
freed. However, the first slab in the chain is never freed until
perl_destruct, to speed up entry to regmatch().

PL_regmatch_state points to the current live state struct, while
PL_regmatch_slab points to the current slab of states that contains
PL_regmatch_state.

4) various structure definitions have been moved to regexp.h to allow for
the fact that assorted pointer types contained in PL_regmatch_state must
now be visible outside of regexec.c. I've also moved there some other
regmatch-specific structs that were hanging out in perl.h and regcomp.h.
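
As a rough illustration of points 1) and 3), here is a minimal, standalone
sketch of a slab chain used as a segmented stack. It is not the code from
change 27679: every demo_* name, the two-field state struct and the
single-context (non-reentrant) cleanup are assumptions made for the example.
Only the 4K slab size, the prev/next links, the "keep the first slab" rule
and the st->ln style of access come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, much-reduced stand-in for the real state struct. */
typedef struct demo_state {
    int ln;      /* a former local variable, now reached as st->ln */
    int resume;  /* where to continue after a simulated "return" */
} demo_state;

#define DEMO_SLAB_BYTES 4096
#define DEMO_SLAB_SLOTS \
    ((DEMO_SLAB_BYTES - 2 * sizeof(void *)) / sizeof(demo_state))

typedef struct demo_slab {
    struct demo_slab *prev, *next;       /* the chain of slabs */
    demo_state states[DEMO_SLAB_SLOTS];
} demo_slab;

typedef struct demo_ctx {
    demo_slab  *first;   /* never freed by demo_leave() */
    demo_slab  *slab;    /* plays the role of PL_regmatch_slab */
    demo_state *state;   /* plays the role of PL_regmatch_state */
    size_t      slabs_allocated;
} demo_ctx;

static demo_slab *demo_new_slab(demo_ctx *ctx, demo_slab *prev) {
    demo_slab *s = malloc(sizeof *s);
    if (!s) { perror("malloc"); exit(EXIT_FAILURE); }
    s->prev = prev;
    s->next = NULL;
    if (prev) prev->next = s;
    ctx->slabs_allocated++;
    return s;
}

/* Entry: reuse the first slab if an earlier call already created it. */
static void demo_enter(demo_ctx *ctx) {
    if (!ctx->first) ctx->first = demo_new_slab(ctx, NULL);
    ctx->slab  = ctx->first;
    ctx->state = &ctx->slab->states[0];
}

/* "Recurse": move to the next state, adding a slab only when needed. */
static demo_state *demo_push(demo_ctx *ctx) {
    if (ctx->state == &ctx->slab->states[DEMO_SLAB_SLOTS - 1]) {
        if (!ctx->slab->next) demo_new_slab(ctx, ctx->slab);
        ctx->slab  = ctx->slab->next;
        ctx->state = &ctx->slab->states[0];
    } else {
        ctx->state++;
    }
    return ctx->state;
}

/* "Return": step back one state; slabs are kept, not freed, here. */
static demo_state *demo_pop(demo_ctx *ctx) {
    if (ctx->state == &ctx->slab->states[0]) {
        assert(ctx->slab->prev != NULL);  /* must not pop the base state */
        ctx->slab  = ctx->slab->prev;
        ctx->state = &ctx->slab->states[DEMO_SLAB_SLOTS - 1];
    } else {
        ctx->state--;
    }
    return ctx->state;
}

/* Final exit: free every slab except the first one. */
static void demo_leave(demo_ctx *ctx) {
    demo_slab *s = ctx->first->next;
    while (s) {
        demo_slab *next = s->next;
        free(s);
        ctx->slabs_allocated--;
        s = next;
    }
    ctx->first->next = NULL;
    ctx->slab  = ctx->first;
    ctx->state = &ctx->first->states[0];
}

/* Push `depth` states, pop them again; returns the peak slab count. */
static size_t demo_run(demo_ctx *ctx, size_t depth) {
    size_t i, peak;
    demo_enter(ctx);
    for (i = 0; i < depth; i++) {
        demo_state *st = demo_push(ctx);
        st->ln = 0;
        st->ln++;                         /* was: ln++ */
    }
    peak = ctx->slabs_allocated;
    for (i = 0; i < depth; i++) demo_pop(ctx);
    demo_leave(ctx);
    return peak;
}
```

Calling demo_run() on a zero-initialised demo_ctx with a small depth stays
inside the first slab; a depth of two slabs' worth of states grows the chain
to three slabs, and demo_leave() then trims it back to one.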

This code runs t/op/pat.t and t/op/regexp*.t at about the same
speed as before I made regmatch non-recursive; i.e. I've clawed back the
original slowdown.

I still have quite a bit more messing planned for regmatch(); I still
haven't made the slab/state pointers be automatically restored on die, so
something like /(?{ die })/ will probably leak. I also intend to reduce the
size of the state struct by throwing in some unions; then I want to
integrate some or all of the other assorted state-saving mechanisms within
regmatch(), e.g. CURCUR and unwind.
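
To show why unions would shrink the state struct, here is a small sketch
under stated assumptions: the field groups and all demo_* names are invented
for illustration and do not match the real members. The idea is that fields
used by only one kind of regex op never need to be live at the same time, so
they can share storage, and more states then fit in each 4K slab.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-op field groups; only one group is in use per state. */
struct demo_star   { int min, max, count; };
struct demo_branch { const char *next_alt; int saved_paren; };
struct demo_ref    { const char *start; size_t len; };

/* Flat layout: every group takes space in every state. */
typedef struct demo_flat_state {
    int resume_point;
    struct demo_star   star;
    struct demo_branch branch;
    struct demo_ref    ref;
} demo_flat_state;

/* Packed layout: the groups overlap in a union. */
typedef struct demo_packed_state {
    int resume_point;
    union {
        struct demo_star   star;
        struct demo_branch branch;
        struct demo_ref    ref;
    } u;
} demo_packed_state;

/* How many states of a given size fit in one slab (headers ignored). */
static size_t demo_states_per_slab(size_t slab_bytes, size_t state_bytes) {
    assert(state_bytes > 0);
    return slab_bytes / state_bytes;
}
```

Comparing sizeof(demo_packed_state) with sizeof(demo_flat_state), and the
resulting demo_states_per_slab(4096, ...) values, gives a feel for the saving;
the exact numbers depend on the platform's pointer size and alignment.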

Dave



-- 
print+qq&$}$"$/$s$,$*${d}$g$s$@$.$q$,$:$.$q$^$,$@$*$~$;$.$q$m&if+map{m,^\d{0\,},,${$::{$'}}=chr($"+=$&||1)}q&10m22,42}6:17*2~2.3@3;^2dg3q/s"&=~m*\d\*.*g
