develooper Front page | perl.perl5.porters | Postings from September 2014

Re: [perl #122283] Possible regexp memory explosion in 5.20.0

From:
demerphq
Date:
September 25, 2014 07:42
Subject:
Re: [perl #122283] Possible regexp memory explosion in 5.20.0
Message ID:
CANgJU+UHTvAoqsiyu1enocsC2nRRjQcsBBc95uwELNonxUJuBQ@mail.gmail.com
On 13 July 2014 16:27, Hugo van der Sanden <perlbug-followup@perl.org>
wrote:

> # New Ticket Created by  Hugo van der Sanden
> # Please include the string:  [perl #122283]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt.perl.org/Ticket/Display.html?id=122283 >
>
>
>
> This is a bug report for perl from hv@crypt.org,
> generated with the help of perlbug 1.40 running under perl 5.20.0.
>
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> I've been experimenting with an attempt to take a SQL grammar expressed
> in BNF and convert it (programmatically) into something that can parse
> SQL with it as a Regexp::Grammars (v1.035) grammar.
>
> The code below is (60%) cut down from an interim stage in that process;
> this reaches about 10MB process size under perl-5.16.3; under perl-5.20.0
> it grows to over 1GB. Cutting down the grammar rule by rule does gradually
> reduce the memory use, but it remains a high multiple of the memory use
> under perl-5.16.3, and I've not yet found any smoking gun; I've included
> the full 200-odd lines here rather than risk eliding something important.
>
> Damian and I are looking into it, but he suggested I perlbug it as a
> heads-up of a possible problem in 5.20, likely of interest to davem
> as potentially relating to regexp engine changes.
>
> zen% ulimit -v # I've set a 1GB process-size limit
> 1000000
> zen% /usr/bin/time /opt/perl-5.16.3/bin/perl ./t0 # top(1) shows peak 10MB
> VIRT
> ok
> 8.52user 0.01system 0:08.54elapsed 99%CPU (0avgtext+0avgdata
> 34816maxresident)k
> 0inputs+0outputs (0major+2331minor)pagefaults 0swaps
> zen% /usr/bin/time /opt/perl-5.20.0/bin/perl ./t0
> Out of memory!
> Command exited with non-zero status 1
> 41.59user 2.10system 0:43.83elapsed 99%CPU (0avgtext+0avgdata
> 3641344maxresident)k
> 0inputs+0outputs (0major+228082minor)pagefaults 0swaps
> zen% cat t0
> #!/opt/perl-5.20.0/bin/perl
> use strict;
> use warnings;
> use Regexp::Grammars;
>
> my $g = qr{
> ^ <query_specification> $
>
> <rule: simple_Latin_letter> <simple_Latin_upper_case_letter> |
> <simple_Latin_lower_case_letter>
> <token: simple_Latin_upper_case_letter> A | B | C | D | E | F | G | H | I
> | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
> <token: simple_Latin_lower_case_letter> a | b | c | d | e | f | g | h | i
> | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
> <token: digit> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
>

You really should use character classes here, rather than regex subs, for
insertable literals. IOW, (?&digit) should be replaced with $digit, which
would be defined as:

my $digit = "[0-9]";

Likewise for (?&ws) and similar patterns.
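
To illustrate the difference, here is a minimal sketch in plain Perl. It
does not use Regexp::Grammars; the (?(DEFINE)...) block, the "number" rule
and the variable names are assumptions made up for the example. The first
pattern calls a named subpattern for every digit, the second interpolates
a character class into the pattern text before it is compiled.

```perl
use strict;
use warnings;

# Style 1: each digit is matched by a call into a named subpattern.
my $with_subs = qr{
    \A (?&number) \z
    (?(DEFINE)
        (?<digit>  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 )
        (?<number> (?&digit)+ )
    )
}x;

# Style 2: the literal is a plain string holding a character class,
# interpolated into the pattern so no subpattern call is left to analyse.
my $digit        = "[0-9]";
my $interpolated = qr{ \A $digit+ \z }x;

for my $input ("2014", "12a") {
    my $by_subs  = $input =~ $with_subs    ? 1 : 0;
    my $by_class = $input =~ $interpolated ? 1 : 0;
    print "$input: subs=$by_subs interpolated=$by_class\n";
}
```

Both patterns accept and reject the same strings; the interpolated form
simply gives the optimizer nothing to recurse into for the literals.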

Anyway, I have pushed the following commit which should fix this. Please
test.

commit a51d618a82a7057c3aabb600a7a8691d27f44a34
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Sep 19 19:57:34 2014 +0200

    rt 122283 - do not recurse into GOSUB/GOSTART when not SCF_DO_SUBSTR

    See also comments in patch. A complex regex "grammar" like that in
    RT 122283 causes perl to take literally forever, and exhaust all
    memory during the pattern optimization phase.

    Unfortunately I could not track down exactly why this occurred, but
    it was very clear that the excessive recursion was unnecessary.
    By simply eliminating the unnecessary recursion performance
    goes back to being acceptable.

    I have not thought of a good way to test this change, so this patch
    does not include any tests. Perhaps we can test it using alarm, but
    I will follow up on that later.

Ticket closers: please don't close the ticket until I have reported that I
have applied tests for this.
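
One possible shape for an alarm-based test, as a rough sketch only and not
the test that was eventually applied: compile the pattern in a forked
child that arms alarm() without installing a Perl-level handler. Perl
defers its own signal handlers until the current opcode completes, so a
handler would not interrupt a runaway compile, whereas the default SIGALRM
action terminates the child outright. The helper name compiles_within and
the exit codes are assumptions for the example, and fork() is assumed to
be available.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: true if compiling $source finishes within $seconds.
sub compiles_within {
    my ($source, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: make sure SIGALRM has its default (fatal) action, then
        # arm the timer and attempt the compile.
        $SIG{ALRM} = 'DEFAULT';
        alarm $seconds;
        my $ok = eval { my $re = qr/$source/x; 1 };
        # _exit avoids running END blocks or flushing inherited buffers.
        POSIX::_exit($ok ? 0 : 2);
    }

    # Parent: a zero wait status means the child compiled the pattern
    # and exited normally before the alarm fired.
    waitpid($pid, 0);
    return $? == 0 ? 1 : 0;
}

print compiles_within('(?:a|b)+c', 5) ? "ok\n" : "not ok\n";
```

A real test would pass the grammar from the ticket as $source and pick a
time limit generous enough for slow smoke-test machines.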

cheers,
Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
