perl.perl5.porters | Postings from February 2003

Re: [perl #20667] unicode regex vs non-unicode regex

From: Nicholas Clark
Date: February 3, 2003 08:07
Subject: Re: [perl #20667] unicode regex vs non-unicode regex
Message ID: 20030203160648.U83537@plum.flirble.org

On Mon, Feb 03, 2003 at 11:15:33AM -0000, Jean-Paul Jorda wrote:

> UNDER LINUX, perl 5.6 (see config below)
> Running the following program leads to a 'Segmentation fault' :
> 
> 
> #! /usr/bin/perl -w
> 
> 
> $toto = 'Hello';
> $toto =~ /\w/; # this line provokes the problem!
> 
> 
> $name = 'A B';
> if ($name =~ /(\p{IsUpper}) (\p{IsUpper})/){
>     print "It's good! : $1 $2  \n";
> } else {
>     print "It's not good...\n";
> }
> # end of the program
> 
> 
> UNDER SunOS, with perl 5.8
> 
> This leads to "It's not good...",
> instead of the expected "It's good! : A B". If I remove
> $toto =~ /\w/;
> I get the right result.

With 5.6.1 on x86 Linux, valgrind says:

==15740== Invalid read of size 4
==15740==    at 0x80D3826: S_find_byclass (in /usr/bin/perl5.6.1)
==15740==    by 0x80D36A7: Perl_re_intuit_start (in /usr/bin/perl5.6.1)
==15740==    by 0x809B74A: Perl_pp_match (in /usr/bin/perl5.6.1)
==15740==    by 0x809915F: Perl_runops_standard (in /usr/bin/perl5.6.1)
==15740==    Address 0x8 is not stack'd, malloc'd or free'd

The regexp bug provoked by the \w is still present at patch 18644
(i.e. current as of today), but the coredump is absent from 5.8.0
onwards on Linux.

Nicholas Clark
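
For anyone wanting to check a given perl against this report, the reported sequence can be wrapped in a small helper. This is only a sketch: the sub name upper_pair_after_word_match is hypothetical and does not come from the original report, and moving the two matches into a sub may not exercise exactly the same code path as the top-level script quoted above. On a perl where the bug is fixed, the captures should come back as "A" and "B".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original report): perform the
# plain \w match first, then the \p{IsUpper} match, and return the two
# captures, or an empty list if the second match fails.
sub upper_pair_after_word_match {
    my ($warmup, $name) = @_;

    # The non-Unicode-property match that the report says provokes the bug.
    $warmup =~ /\w/;

    # The Unicode-property match that was reported to fail afterwards.
    if ($name =~ /(\p{IsUpper}) (\p{IsUpper})/) {
        return ($1, $2);
    }
    return;
}

my @got = upper_pair_after_word_match('Hello', 'A B');
print @got ? "matched: @got\n" : "no match\n";
```

An empty return list (or a crash) with the inputs 'Hello' and 'A B' would suggest the perl under test is affected.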
