
[perl #6844] core dump using a Perl regular expression

From: Steve Peters via RT
Date: March 29, 2006 10:31
Subject: [perl #6844] core dump using a Perl regular expression
Message ID: rt-3.0.11-6844-131669.4.82659517176621@perl.org

> [spadkins@officevision.com - Sat Apr 21 01:30:54 2001]:
> 
> 
> -----------------------------------------------------------------
> [Please enter your report here]
> 
> Hi,
> 
> I sent this message to the Template Toolkit (from CPAN) mailing
> list, but it is really a problem independent of that module in
> the core perl interpreter.
> 
> Stephen
> _____________________________
> 
> Hi,
> 
> Great work on the Template Toolkit.
> I have been using it for a short time and have been very successful
> with it.  However, I found that processing a certain file (with no
> variables or directives!!) was causing perl to core dump !?!
> Now I know that no perl code should *ever* cause the perl interpreter
> to core dump, so I suspected a bug in Perl or in the way that Perl
> was compiled on my machine.
> 
> I was using Perl 5.6.0 on a VALinux box (running RedHat 6.0, enhanced).
> 
>    uname -a
>    Linux shark 2.2.14-VA.5.1smp #1 SMP Tue Sep 12 18:02:03 PDT 2000 i686 unknown
> 
> The perl was compiled to use the libperl.so shared library.
> 
> The first thing I did was upgrade to TT 2.01. ... same core dump.
> 
> The second thing I did was upgrade Perl to 5.6.1, and I built it with all
> of the default options (hands off configuration), which resulted in a
> statically linked perl binary. ... same core dump.
> 
> The third thing I did was trace into the Template Toolkit modules to see
> exactly where it was core dumping.  In Template::Parser.pm, in the
> interpolate_text() method, there is a regular expression used (line 371)
> which is causing the core dump.
> 
>     while ($text =~
>            /
>            ( (?: \\. | [^\$] )+ )   # escaped or non-'$' character [$1]
>            |
>            ( \$ (?:                 # embedded variable [$2]
>              (?: \{ ([^\}]*) \} )   # ${ ... } [$3]
>              |
>              ([\w\.]+)              # $word [$4]
>              )
>            )
>         /gx) {
> 
> This code is simply parsing through the text of the template that it has
> been given, dividing it into chunks of plain text separated by "$var" or
> "${var}" variable references.
> 
> I was still doubting that this problem was anything other than my own
> fault, so I continued to diagnose.
> 
> The fourth thing I did was to create a perl script which would recreate
> this error independently of the Template Toolkit, using the offending
> expression.  I created a script called "ptest", which begins as follows.
> 
> ::::::::::::::
> ptest
> ::::::::::::::
> #!/usr/local/bin/perl
> 
> $text = join("",<DATA>);
> 
>     while ($text =~
>            /
>            ( (?: \\. | [^\$] )+ )   # escaped or non-'$' character [$1]
>            |
>            ( \$ (?:                 # embedded variable [$2]
>              (?: \{ ([^\}]*) \} )   # ${ ... } [$3]
>              |
>              ([\w\.]+)              # $word [$4]
>              )
>            )
>         /gx) {
> 
>         ($pre, $var, $dir) = ($1, $3 || $4, $2);
>         print "pre=$pre var=$var dir=$dir\n";
> 
>     }
> 
> __DATA__
> 
> So I put the offending input data after the __DATA__, and I was able
> to recreate the core dump.  I also tried other sets of data after the
> __DATA__ and got normal behavior.  Now I had a simple, isolated test
> case which I could email to someone for diagnosis!  This "ptest"
> script is attached at the end of this email in uuencoded form.
> 
> The fifth thing I did was to transfer this script to a Sun box,
> and sure enough, it core dumped there too!  This led me to a
> strong suspicion that there was in fact a bug in the Perl regular
> expression library which, unfortunately, causes the Template Toolkit
> to fail on occasion.  Details of the Sun box and its perl
> interpreter are as follows.
> 
>    www:/home/spadkins> uname -a
>    SunOS www 5.7 Generic_106541-14 sun4u sparc SUNW,Ultra-60
>    www:/home/spadkins> perl --version
>    This is perl, v5.6.0 built for sun4-solaris
> 
> The sixth thing I did was to try to find out if I could modify
> the regular expression so that it did not cause the error
> (i.e. isolate the problem, even if the result would not work
> for the purposes of the Template Toolkit).
> 
> So I changed the following Reg Exp phrase from
> 
>            ( (?: \\. | [^\$] )+ )   # escaped or non-'$' character [$1]
> to
>            ( (?: [^\$] )+ )   # escaped or non-'$' character [$1]
> 
> and the core dump went away.  I know that this will not solve anything
> for the Template Toolkit, because we need that expression to allow us
> to escape "$" signs ("\$") in the text.  But it might help us figure
> out a work-around.
> 
> Anyway, this is where I ran out of steam, because I am not such a
> regular expression guru that I could proceed and find a work-around.
> 
> That's when I turned to this mailing list.
> It seems to me that two things could happen.
> 
>  * Someone might be able to find a work-around so that the
>    Template Toolkit does not need to use the buggy Perl syntax
>    (That would be people on this list.)
> 
>  * Someone who works with Perl could investigate why the seemingly
>    legal regular expression is causing a core dump.
>    (Does someone on this list know whom I should forward this email
>    to for that?)
> 
> Again, thanks for all the good work.
> 
> Stephen

This problem appears to have been resolved with change #27598.
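
For anyone stuck on a perl that predates that change, here is one possible
work-around, offered only as a sketch. It is not taken from change #27598 or
from the Template Toolkit. It assumes the crash depends on how many times the
(?: \\. | [^\$] )+ group repeats, which the sixth step in the report suggests
but does not prove. The idea is to let each pass through the group swallow a
whole run of ordinary characters, so the group repeats far less often on long
text. The sub name split_template is made up for illustration; the loop body
mirrors the "ptest" script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Template Toolkit): split template text
# into [pre, var, dir] triples, like the loop in the "ptest" script.
sub split_template {
    my ($text) = @_;
    my @chunks;
    while ($text =~
           /
           ( (?: [^\$\\]+           # run of plain characters
               | \\.                # backslash plus the escaped character
               | \\                 # lone backslash (before newline or end)
             )+ )                   # capture 1: literal text
           |
           ( \$ (?:                 # capture 2: embedded variable
             (?: \{ ([^\}]*) \} )   # capture 3: braced form
             |
             ([\w\.]+)              # capture 4: bare word form
             )
           )
        /gx) {
        my ($pre, $var, $dir) = ($1, $3 || $4, $2);
        push @chunks, [$pre, $var, $dir];
    }
    return @chunks;
}

# Small demonstration: print each chunk the way "ptest" does.
for my $chunk (split_template('Hello $name, pay \$5 to ${user.id}!')) {
    my ($pre, $var, $dir) = map { defined $_ ? $_ : '' } @$chunk;
    print "pre=$pre var=$var dir=$dir\n";
}
```

Because nothing follows the first alternative, the greedy match should cover
the same text as the original expression, including "\$" escapes. Whether it
actually avoids the core dump on 5.6.x has not been verified here.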




