perl.perl5.porters | Postings from May 2003

Re: [perl #22182] regular expression bug (design limitation?)

From: Edward Peschko
Date: May 18, 2003 01:33
Subject: Re: [perl #22182] regular expression bug (design limitation?)
Message ID: 20030517200616.A6342@mdssirds.comp.pge.com

On Thu, May 15, 2003 at 08:15:07AM -0000, Rafael Garcia-Suarez wrote:
> Edward Peschko wrote:

> > my $_parens = \&_parens;
> > 
> > if ($string =~ m"($_parens)stuff")
> > {
> > }
> > 
> > and have the regular expression engine:
> > 
> >     a) recognize that $_parens is a code reference
> 
> The problem is that the regexp engine won't see the $_parens variable, but
> something like "(CODE(0xdeadbeef))stuff".
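
To make the stringification concrete, here is a minimal sketch. The body of _parens is a hypothetical stand-in (the original sub was not posted), and the hex address will differ on every run:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real _parens; its body is irrelevant here.
sub _parens { return 1 }

my $_parens = \&_parens;

# Interpolating a code reference into a string (or a pattern) stringifies it,
# so the regex engine only ever sees the text CODE(0x...).
my $pattern = "($_parens)stuff";
print "$pattern\n";    # something like (CODE(0x55d0c8a3e2b0))stuff
```

By the time the pattern is compiled, the link back to the subroutine is gone, which is what the three ideas below try to work around.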

I can think of at least three ways around this problem:

    1) When a coderef stringifies, perl remembers how to translate that string back
       into the subroutine (it keeps a hash of coderef-string to coderef pairs).

    2) The special version of "" interpolation that the regex engine uses is changed
       so that it doesn't stringify $_coderef, and instead gets a callback which it
       calls upon entering scope, as discussed.

    3) A special (?...) syntax is put around $_coderef to tell the regex engine that
       it is dealing with a coderef.
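
A rough user-level sketch of ideas 1 and 3, under some loud assumptions: remember(), expand() and %coderef_for are hypothetical names, the callback here returns a pattern rather than doing the matching itself, and none of this is how the engine works today. The second half uses perl's existing (??{ ... }) postponed-subexpression construct, which is probably the closest thing to idea 3 that already exists:

```perl
use strict;
use warnings;

# Idea 1 (hypothetical helper names): remember which string each coderef
# stringifies to, so the string can be mapped back to the sub later.
my %coderef_for;

sub remember {
    my ($code) = @_;
    $coderef_for{"$code"} = $code;
    return $code;
}

# Hypothetical callback: returns a pattern for one non-nested paren group.
sub _parens { return qr/\([^()]*\)/ }

my $_parens = remember(\&_parens);

# Stand-in for what the engine would have to do: spot CODE(0x...) in the
# pattern text, look it up, and splice in whatever the callback returns.
sub expand {
    my ($pattern) = @_;
    $pattern =~ s{(CODE\(0x[0-9a-fA-F]+\))}
                 {exists $coderef_for{$1} ? $coderef_for{$1}->() : quotemeta($1)}ge;
    return $pattern;
}

my $string = "call(a, b)stuff";
my $re     = expand("($_parens)stuff");

if ($string =~ /$re/) {
    print "matched via registry: $1\n";
}

# Idea 3, approximated with syntax perl already has: (??{ ... }) runs the
# code at match time and uses its return value as a sub-pattern.
if ($string =~ m/((??{ $_parens->() }))stuff/) {
    print "matched via postponed subexpression: $1\n";
}
```

Neither variant hands control of the match itself to the callback, which is what linking in a C parser would really need; they only show that the coderef can survive the trip into the pattern.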

This is a big one for me: right now I'm using perl to parse large amounts of C++ code, and
the regex engine is showing the strain.  It'd be really cool to link parts of the gcc
parser into the regex engine via Inline::C and get a syntax tree 'from the horse's mouth',
so to speak.

Either that, or simply get the speed benefits that C would offer in complicated text 
searching. 

Ed

(ps - who is the main maintainer of the regex engine now, anyways? Or is it unclaimed 
territory?)
