
Re: [perl #29650] Perl 5.8.3/5.8.4 - ithreads - solaris 5.8 -libpthread - reg.exp seg. fault.

From:
Dave Mitchell
Date:
September 12, 2004 17:01
Subject:
Re: [perl #29650] Perl 5.8.3/5.8.4 - ithreads - solaris 5.8 -libpthread - reg.exp seg. fault.
Message ID:
20040913000207.GC2621@iabyn.com
On Mon, May 17, 2004 at 10:53:25AM -0000, Rafiq Ismail (ADMIN) wrote:
> I discovered a bug whilst trying to use template toolkit under a threaded
> mod_perl environment.  This seg fault didn't occur when tested against a
> perl without ithreads.  I managed to trace the flaw down to one regular
> expression.  On the suggestion of Perrin Harkins off the modperl list, I'm
> submitting this as a bug.  Not sure if it is, but it appears so.  I've
> attached a quick sample script which simulates what happens in the
> template module where it seg. faults.  With this there is an example
> template which is simply read into a variable and matched against the
> expression.
> 
> The syntax for running this would be perl -w ./thtest.pl ./index.html; you
> can rewrite the paths for yourselves.  I find that it only seg. faults on
> files where it seems to match.
> 
> Tested with both perl 5.8.3 and 5.8.4, although I'll just include the
> output from 5.8.4 which I built with -DDEBUGGING:

Sorry about this report being ignored - there was some confusion over
whether you had included a sample index.html - it hadn't made it to the
list, but when I checked on the RT website, it was sitting there!

The segfault is being caused by deep recursion in S_regmatch:

    ...
    #2373 0x08160eb6 in S_regmatch (my_perl=0x829c938, prog=0x8317010) at regexec.c:3291
    #2374 0x08160666 in S_regmatch (my_perl=0x829c938, prog=0x8317054) at regexec.c:3178
    #2375 0x08160472 in S_regmatch (my_perl=0x829c938, prog=0x8316ffc) at regexec.c:3123
    #2376 0x0815d083 in S_regtry (my_perl=0x829c938, prog=0x8316fb8, startpos=0x82fb338 "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<html>\n<head>\n  <title>GMDB Main Page</title>\n", ' ' <repeats 72 times>, "\n", ' ' <repeats 15 times>, "\n "...) at regexec.c:2204

This is, unfortunately, the normal outcome for a regex that recurses
too deeply - it is due to the way the current perl regex engine is
implemented, and it can't be fixed without a major rewrite.

My only vague handwaving advice would be to rewrite the regex in such a
way that it avoids the deep back-tracking and recursion. In particular,
I suspect that if you could make the '3000' a lot smaller in

    ( (?: \\. | [^\$] ){1,3000} ) 

the problem might go away.
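
As a rough illustration of the kind of rewrite I mean, here is an untested
sketch (untested against your template and against 5.8.x, that is). It
assumes the fragment is used under /x and anchors it with \A purely for the
demo; the names $original, $unrolled and leading_text are made up for this
example. The idea is to swallow each run of ordinary characters with a
single character-class quantifier and go round the group only once per
backslash escape, in the hope that the engine then has far fewer levels to
recurse through.

```perl
use strict;
use warnings;

# The fragment from the report: one trip round the group per character.
my $original = qr/\A ( (?: \\. | [^\$] ){1,3000} )/x;

# Hypothetical rewrite: a run of ordinary characters is consumed in one
# step, and the group repeats only once per backslash escape.
my $unrolled = qr/\A ( [^\\\$]* (?: \\. [^\\\$]* )* )/x;

# Return the text captured at the start of $text, or undef on no match.
sub leading_text {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

# Single quotes, so the sample contains a literal backslash and dollars.
my $sample = 'abc \$ def $foo';
for my $re ($original, $unrolled) {
    my $got = leading_text($re, $sample);
    print defined $got ? "[$got]\n" : "no match\n";
}
```

Both patterns capture 'abc \$ def ' from that sample. They are not exact
equivalents, though: the rewrite can match the empty string, has no 3000
character cap, and stops before a backslash that has nothing after it, so
treat it as a starting point rather than a drop-in replacement.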

Dave.

-- 
Red sky at night - gerroff my land!
Red sky at morning - gerroff my land!
    -- old farmers' sayings #14
