On Thu, May 28, 2009 at 10:14 AM, Nicholas Clark <perlbug-followup@perl.org> wrote:
> # New Ticket Created by Nicholas Clark
> # Please include the string: [perl #66110]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=66110 >
>
>
> Avar mailed p5p in 51dd1af80807190107h30b8626ct6d4d0a825abe4b3b@mail.gmail.com
> http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-07/msg00382.html
>
> perl 5.10 and blead will do various combinations of running out of
> memory, hanging or segfaulting when running on a program using
> XML::Parser::Lite, attached is a stripped down version of X::P::L
> which demonstrates the problem:
>
>
> Dave notes:
>
> possibly a 5.10.0 regression
>
>

In Perl_regexec_flags (called from Perl_pp_match), after the got_it: label, we end up calling savepvn repeatedly with the same large value for the length. By the time we hit the code below, i has a rather large value, like 3,938,416 or 3,943,248 (it seems to keep growing).

    if (flags & REXEC_COPY_STR) {
        const I32 i = PL_regeol - startpos + (stringarg - strbeg);
        <ifdef snipped>
        {
            RX_MATCH_COPIED_on(rx);
            s = savepvn(strbeg, i);
            prog->subbeg = s;
        }
        prog->sublen = i;
    }

Some values of interest, including those that make up i, are as follows:

    REGEXEC\Perl_regexec_flags\my_perl->Ireg_state.re_state_regeol: 15380736
    REGEXEC\Perl_regexec_flags\startpos: 11437488
    REGEXEC\Perl_regexec_flags\stringarg: 11437488
    REGEXEC\Perl_regexec_flags\strbeg: 11437488
    REGEXEC\Perl_regexec_flags\strend: 11437502

The string we are currently matching is:

    *PP_HOT\Perl_pp_match\s: "<foo>bar</foo>"

To me that looks just a tad less than 3.9 million bytes :-), but it is in fact the string that ends at the current value of strend. The string that ends at the current value of PL_regeol (aka my_perl->Ireg_state.re_state_regeol) is "CODE(0xb18940)".
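To make the arithmetic concrete, here is a small sketch that plugs the debugger values above into the same expression used for i. The function name copy_length is made up for illustration and is not part of the perl source; pointers are modelled as plain integers.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not in perl): evaluates the same expression as
 *   const I32 i = PL_regeol - startpos + (stringarg - strbeg);
 * with addresses modelled as plain integers.
 */
long copy_length(long regeol, long startpos, long stringarg, long strbeg)
{
    return regeol - startpos + (stringarg - strbeg);
}

/* Prints the length actually requested next to the length that would be
 * requested if PL_regeol pointed at strend, as one would expect. */
void report(long regeol, long strend, long startpos, long stringarg, long strbeg)
{
    printf("with reported PL_regeol: %ld\n",
           copy_length(regeol, startpos, stringarg, strbeg));
    printf("with PL_regeol == strend: %ld\n",
           copy_length(strend, startpos, stringarg, strbeg));
}
```

With the reported values, copy_length(15380736, 11437488, 11437488, 11437488) comes to 3943248, which matches one of the values seen for i. Substituting strend (11437502) for PL_regeol gives 14, the length of "<foo>bar</foo>".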
If I set a watchpoint for PL_regeol, it keeps toggling back and forth between the ends of these two different strings. So it really looks as though two different regex operations are going on at once in interleaved fashion and keep hijacking the value of PL_regeol from each other.

I think that's about as far as I'm going to get with this but thought I'd pass along my observations.
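As a rough illustration of that hypothesis only (this is not perl code and not a proposed fix), the sketch below shows how one shared "end of string" variable, standing in for PL_regeol, can be overwritten by a nested operation so that the outer one computes a bogus length. All names here are invented, and addresses are modelled as plain integers.

```c
/* Hypothetical stand-in for an interpreter-wide variable such as PL_regeol. */
static long shared_end;

/* A string modelled as a pair of integer addresses. */
typedef struct {
    long beg;
    long end;
} span;

/* A nested operation that points the shared variable at its own string. */
static void nested_match(span s)
{
    shared_end = s.end;
}

/* Outer operation that trusts shared_end after the nested call. */
long length_unsafe(span outer, span inner)
{
    shared_end = outer.end;
    nested_match(inner);            /* overwrites shared_end */
    return shared_end - outer.beg;  /* measured against the wrong end */
}

/* Same operation, but saving and restoring the shared variable. */
long length_saved(span outer, span inner)
{
    shared_end = outer.end;
    const long saved = shared_end;
    nested_match(inner);
    shared_end = saved;             /* undo the nested call's change */
    return shared_end - outer.beg;
}
```

Taking the outer span as {11437488, 11437502} and assuming the 14-character "CODE(0xb18940)" string occupies {15380722, 15380736} (its start address is my assumption, derived from its length), length_unsafe returns 3943248 while length_saved returns 14.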