perl.perl5.porters | Postings from November 2014

[Fwd: Re: [perl #123198] Memory leak in regex appears in 5.20.1]

From: arocker
Date: November 15, 2014 04:27
Subject: [Fwd: Re: [perl #123198] Memory leak in regex appears in 5.20.1]
Message ID: 535d795719b835787a36ffccfebfb0cb.squirrel@mail.vex.net
---------------------------- Original Message ----------------------------
From:    "Dave Mitchell via RT" <perlbug-followup@perl.org>
Date:    Fri, November 14, 2014 7:03 am
To:      arocker@vex.net
--------------------------------------------------------------------------

The poster in LinkedIn's "Perl" discussion group is Nimrod Shlomo Chotzen.

Initial message:

I switched this week to perl-5.20.1 and tried to run an old (but
working) script on it. In this loop, which sits inside another loop in one
function and iterates around 50000 times,

foreach my $key_word (grep {/\w+/} @keywords) {
    if ($string_to_check =~
        m/^\Q$key_word\E$|^\Q$key_word\E[\W]|[\W]\Q$key_word\E[s\W]|[\W]\Q$key_word\E$|^\Q$key_word\Eies|[\W]\Q$key_word\Eies|^\Q$key_word\Ees|[\W]\Q$key_word\Ees/i) {
        $found_key_words{lc($key_word)} = 1;
    }
}

the memory usage increases very fast. On the next call of that function it
continues to increase, until it just freezes my machine.
-----------------------------------------------------------------------
Second:

I simplified the regexp to this:
$string_to_check =~ m/(?<=\W)\Q$key_word\E(?=\W|(s|es|ies\W))/i
and now it's working great, though I still wonder when the next
eruption of memory will occur :)
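
For anyone comparing the two patterns: they do not appear to be
equivalent. The simplified one requires a non-word character before the
keyword, so it cannot match at the start of the string, and its lookahead
needs at least one more character after the keyword. A minimal sketch for
checking this follows; the sub names original_re and simplified_re and the
sample strings are made up for illustration and are not from the poster's
script.

```perl
use strict;
use warnings;

# Both patterns are copied from the messages above; only the wrapping
# subs (hypothetical names) are new.
sub original_re {
    my ($key_word) = @_;
    return qr/^\Q$key_word\E$|^\Q$key_word\E[\W]|[\W]\Q$key_word\E[s\W]|[\W]\Q$key_word\E$|^\Q$key_word\Eies|[\W]\Q$key_word\Eies|^\Q$key_word\Ees|[\W]\Q$key_word\Ees/i;
}

sub simplified_re {
    my ($key_word) = @_;
    return qr/(?<=\W)\Q$key_word\E(?=\W|(s|es|ies\W))/i;
}

# Made-up sample strings: keyword alone, mid-string, and at the end.
my $key_word = 'movie';
for my $string_to_check ('movie', 'bad movie reviews', 'a movie') {
    my $orig = $string_to_check =~ original_re($key_word)   ? 'yes' : 'no';
    my $simp = $string_to_check =~ simplified_re($key_word) ? 'yes' : 'no';
    printf "%-20s original=%-3s simplified=%s\n",
        "'$string_to_check'", $orig, $simp;
}
```

If the script relied on matches at the very start or end of
$string_to_check, the simplified pattern may silently drop them.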
------------------------------------------------------------------------
Response to a request for sample data:

There are not two different datasets, just two different perls: 5.18.2 and
5.20.1 (where the memory leak occurs). The dataset is very simple: about
50000 terms of one, two or three words each.

Something like: "lantern,green lantern,green lantern movie,movie,reviews,
bad reviews,bad movie,bad movie reviews"... I can't give the real set...

-------------------------------------------------------------------------

It looks like a failure to free memory in the handling of regex patterns.
If that area was changed between 5.18.2 and 5.20, that is probably where
the leak is.
If this isn't sufficient, I'll try to elicit more clues.
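
In case it helps with reproduction, here is a rough sketch built only from
the description above. The keyword list is synthetic (the real set was not
provided), the sub names find_key_words and rss_kb are invented, and the
memory reading assumes a Linux /proc filesystem. Whether this actually
triggers the growth on 5.20.1 has not been verified.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's function: returns a hash ref of
# the lower-cased keywords found in $string_to_check.
sub find_key_words {
    my ($string_to_check, $keywords) = @_;
    my %found_key_words;
    foreach my $key_word (grep {/\w+/} @$keywords) {
        if ($string_to_check =~
            m/^\Q$key_word\E$|^\Q$key_word\E[\W]|[\W]\Q$key_word\E[s\W]|[\W]\Q$key_word\E$|^\Q$key_word\Eies|[\W]\Q$key_word\Eies|^\Q$key_word\Ees|[\W]\Q$key_word\Ees/i) {
            $found_key_words{lc($key_word)} = 1;
        }
    }
    return \%found_key_words;
}

# Resident set size in kB, read from /proc (Linux only); undef elsewhere.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Synthetic terms of one to three words. The report mentions about
# 50000 terms; a smaller count keeps a first run quick.
my @words = qw(lantern green movie reviews bad);
my $count = 2000;
my @keywords;
for my $i (1 .. $count) {
    my $n = 1 + $i % 3;
    push @keywords,
        join ' ', map { $words[($i + $_) % @words] . $i } 1 .. $n;
}

# Call the function several times and watch whether RSS keeps growing.
for my $call (1 .. 3) {
    my $found = find_key_words('bad movie reviews for green lantern',
                               \@keywords);
    my $rss = rss_kb();
    printf "call %d: %d keywords found, RSS %s kB\n",
        $call, scalar(keys %$found), defined $rss ? $rss : 'n/a';
}
```

Running the same file under 5.18.2 and 5.20.1 and comparing the RSS
figures across calls should show whether the difference is in perl itself.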

