> [nicholas - Tue Dec 27 12:41:58 2005]:
>
> On Tue, Dec 27, 2005 at 08:43:10PM +0800, Ivan, Wong Yat Cheung wrote:
> > John W. Krahn via RT wrote:
> > >Hint: The alternation (?:\\"|[^"]) backtracks for every position that does
> > >not match the pattern '\"' which in your case is a LOT of backtracking. Put
> > >the alternation that will match the most first, for example:
> > >(?:[^\\"]|\\").
> > Actually that regex is not written by me, I was just told to find why it
> > doesn't work. I am now using /msgid\s+((?:".*(?<!\\\\)"\s*)+)/ and it
> > works. However, I think I need to report this since no matter how bad a
> > regex is, it should NOT segfault. Anyway, for our case, (?:[^\\"]|\\")
>
> You are correct that it should not segfault. It's a known bug of the current
> perl regexp implementation that it can segfault because the C code recurses
> too deeply and exhausts the system stack. Fixing this is not trivial, which is
> why it's not been done yet.
>
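This has now been done in perl-current, and it resolves this problem: the perl on my PATH still segfaults on the test script, while the freshly built ./perl runs it to completion.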
steve@kirk:~/smoke/perl-current$ perl rt_38031.pl
Segmentation fault
steve@kirk:~/smoke/perl-current$ ./perl rt_38031.pl
Got one
steve@kirk:~/smoke/perl-current$ cat rt_38031.pl
#!/usr/bin/perl -w
my $many = 150;
my $line1 = <DATA>;
my $line2 = <DATA>;
my $lines = join( "", $line1, $line2 x $many );
while ($lines =~ /(msgid\s+(?:"(?:\\"|[^"])*"\s*)+)/gxm) {
print "Got one\n";
}
__DATA__
msgid
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
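
For anyone who wants to follow John's hint and cut the backtracking in the pattern itself, here is a minimal sketch of one way to write the quoted-string part. It is not taken from the original report: the helper name find_msgids is made up for illustration, and the possessive quantifiers (*+) assume Perl 5.10 or later. Note also that it treats a backslash followed by any character as an escape, which is slightly broader than the \\" alternative in the original pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original report): returns every
# msgid entry found in $text.
sub find_msgids {
    my ($text) = @_;

    # One double-quoted string, written as an "unrolled loop":
    #   [^"\\]*+            a run of ordinary characters
    #   (?:\\.[^"\\]*+)*+   then any number of (escape, ordinary run) pairs
    # The possessive quantifiers (Perl 5.10+) never give characters back,
    # so the engine keeps no backtracking state for each character.
    my $string = qr/"[^"\\]*+(?:\\.[^"\\]*+)*+"/s;

    my @found;
    while ($text =~ /(msgid\s+(?:$string\s*)+)/g) {
        push @found, $1;
    }
    return @found;
}

# Same shape of input as rt_38031.pl: one msgid followed by many long strings.
my $line = '"' . ('a' x 90) . '"' . "\n";
my $text = "msgid\n" . ($line x 150);

my @hits = find_msgids($text);
print "Got ", scalar(@hits), "\n";
```

On the test input above this finds a single msgid entry spanning all 150 strings. It only illustrates the pattern rewrite; whether it avoids the stack exhaustion on an unpatched perl is untested here.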