
[perl #112790] Regexp engine cannot match >2GB strings

From: David Leadbeater
Date: May 6, 2012 11:20
Subject: [perl #112790] Regexp engine cannot match >2GB strings
Message ID: rt-3.6.HEAD-4610-1336328393-36.112790-75-0@perl.org
# New Ticket Created by  David Leadbeater 
# Please include the string:  [perl #112790]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=112790 >


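Matching unexpectedly fails when the string's length no longer fits in an
I32, because pp_match casts the length to I32 before comparing it against
the pattern's minimum length. The following patch fixes that check, but I
see a lot of I32 in the regexp engine itself, so this might be masking
other issues (see also RT #72784).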

diff --git a/pp_hot.c b/pp_hot.c
index 89165d9..662b908 100644
--- a/pp_hot.c
+++ b/pp_hot.c
@@ -1303,7 +1303,7 @@ PP(pp_match)
        rx = PM_GETRE(pm);
     }

-    if (RX_MINLEN(rx) > (I32)len)
+    if ((STRLEN)RX_MINLEN(rx) > len)
        goto failure;

     truebase = t = s;

Reproduce with:

$ perl -Mre=debug -le'$a="x"x 1048576; $b.=$a for 1 .. 2047; $b.="y"; print
length $b; print $b =~ /y/ ? "Matched" : "No match"'
Compiling REx "y"
Final program:
   1: EXACT <y> (3)
   3: END (0)
anchored "y" at 0 (checking anchored isall) minlen 1
2146435073
Guessing start of match in sv for REx "y" against
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"...
Found anchored substr "y" at offset 2146435072...
Starting position does not contradict /^/m...
Guessed: match at offset 2146435072
Matched
Freeing REx: "y"

$ perl -Mre=debugcolor -le'$a="x"x 1048576; $b.=$a for 1 .. 2048; $b.="y";
print length $b; print $b =~ /y/ ? "Matched" : "No match"'
Compiling REx "y"
Final program:
   1: EXACT <y> (3)
   3: END (0)
anchored "y" at 0 (checking anchored isall) minlen 1
2147483649
No match
Freeing REx: "y"
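
For illustration, the arithmetic behind the two runs above can be sketched in
standalone C. This is not code from the perl source: demo_I32 and demo_STRLEN
are hypothetical stand-ins for I32 and STRLEN (assumed here to be a 32-bit
signed integer and size_t), and the two helper functions only mirror the
comparison before and after the patch. Narrowing an out-of-range length to a
signed 32-bit type is implementation-defined in C; the sketch assumes the
usual wrap-around behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for perl's I32 and STRLEN typedefs. */
typedef int32_t demo_I32;
typedef size_t  demo_STRLEN;

/* Comparison as written before the patch: the string length is narrowed
 * to 32 bits first. Returns nonzero when the match is rejected early. */
static int rejects_before_patch(demo_I32 minlen, demo_STRLEN len)
{
    /* For len > INT32_MAX the narrowed value typically wraps negative,
     * so even minlen == 1 compares as "too long for this string". */
    return minlen > (demo_I32)len;
}

/* Comparison as written after the patch: minlen is widened instead,
 * so the length is never truncated. */
static int rejects_after_patch(demo_I32 minlen, demo_STRLEN len)
{
    /* Assumption of this sketch: minlen is non-negative. A negative
     * value would wrap to a huge unsigned number when widened. */
    assert(minlen >= 0);
    return (demo_STRLEN)minlen > len;
}
```

Called with minlen 1 and the two lengths printed above (2146435073 and
2147483649), the pre-patch helper would be expected to accept the first and
reject the second on such a platform, while the patched helper accepts both.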


