
Re: [perl #66110] Perl debugger runs out of memory, hangs or segfaults on XML::Parser::Lite

From: Craig A. Berry
Date: May 30, 2009 07:39
Subject: Re: [perl #66110] Perl debugger runs out of memory, hangs or segfaults on XML::Parser::Lite
Message ID: c9ab31fc0905300739y530e4423k21d070a803c6b2ae@mail.gmail.com

On Thu, May 28, 2009 at 10:14 AM, Nicholas Clark
<perlbug-followup@perl.org> wrote:
> # New Ticket Created by  Nicholas Clark
> # Please include the string:  [perl #66110]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=66110 >
>
>
> Avar mailed p5p in 51dd1af80807190107h30b8626ct6d4d0a825abe4b3b@mail.gmail.com
> http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-07/msg00382.html
>
> perl 5.10 and blead will do various combinations of running out of
> memory, hanging or segfaulting when running on a program using
> XML::Parser::Lite, attached is a stripped down version of X::P::L
> which demonstrates the problem:
>
>
> Dave notes:
>
> possibly a 5.10.0 regression
>
>

In Perl_regexec_flags (called from Perl_pp_match), after the got_it:
label, we end up calling savepvn repeatedly with a similarly large
value for the length.  When we hit the code below, i has rather large
values, like 3,938,416 or 3,943,248 (it seems to keep growing).

        if (flags & REXEC_COPY_STR) {
            const I32 i = PL_regeol - startpos + (stringarg - strbeg);

            /* ifdef snipped */
            {
                RX_MATCH_COPIED_on(rx);
                s = savepvn(strbeg, i);
                prog->subbeg = s;
            }
            prog->sublen = i;
        }

Some values of interest, including those that make up i, are as follows:

REGEXEC\Perl_regexec_flags\my_perl->Ireg_state.re_state_regeol: 15380736
REGEXEC\Perl_regexec_flags\startpos:    11437488
REGEXEC\Perl_regexec_flags\stringarg:   11437488
REGEXEC\Perl_regexec_flags\strbeg:      11437488
REGEXEC\Perl_regexec_flags\strend:      11437502

The string we are currently matching is:

*PP_HOT\Perl_pp_match\s:        "<foo>bar</foo>"

To me that looks just a tad less than 3.9 million bytes :-), but it is
in fact the string that ends at the current value of strend.  The
string that ends at the current value of PL_regeol (aka
my_perl->Ireg_state.re_state_regeol) is "CODE(0xb18940)".  If I set a
watchpoint for PL_regeol, it keeps toggling back and forth between the
ends of these two different strings.  So it really looks as though two
different regex operations are going on at once in interleaved fashion
and keep hijacking the value of PL_regeol from each other.

I think that's about as far as I'm going to get with this but thought
I'd pass along my observations.
