
[perl #122747] Assertion failed in Perl_reg_numbered_buff_fetch, file regcomp.c, line 7459

From: Father Chrysostomos via RT
Date: September 11, 2014 15:30
Subject: [perl #122747] Assertion failed in Perl_reg_numbered_buff_fetch, file regcomp.c, line 7459
Message ID: rt-4.0.18-10359-1410449440-1071.122747-15-0@perl.org

On Thu Sep 11 06:07:03 2014, mmartinec wrote:
> Got it down to this small test program:
> 
> #!/usr/bin/perl
> 
> use strict;
> use re 'taint';
> 
> my(@body) = (
>    "<mailto:xxxx.xxxx\@outlook.com>",
>    "A\x{B9}ker\x{E8}eva xxxx.xxxx\@outlook.com \x{201D}",
> );
> 
> for (@body) {
>    s{ <? (?<!mailto:) \b ( [a-z0-9.]+ \@ \S+ ) \b
>       (?: > | \s{1,10} (?!phone) [a-z]{2,11} : ) }{ }xgi;
> }
> 
> 
> perl 5.20.{0,1} :
> Assertion failed: ((STRLEN)rx->sublen >= (STRLEN)((s - rx->subbeg) + 
> i)), function Perl_reg_numbered_buff_fetch, file regcomp.c, line 7455.
> Abort trap
> 

I think what’s happening is that the kludge to localise $1, etc. is executed while the regexp is in an inconsistent state.  rx->subbeg still refers to the string from the previous match ('<mailto:xxxx.xxxx@outlook.com>'), but the offsets for $1 belong to the current match, so they extend beyond the end of that 30-character string (end = 33):

(gdb) p rx->offs[1]
$8 = {
  start = 12, 
  end = 33, 
  start_tmp = 12
}
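
To make the arithmetic concrete, here is a minimal C sketch of the failing check, not the real code from regcomp.c. It assumes that in the assertion `s` points at the start of the capture inside the saved buffer and `i` is the capture length, so that `(s - rx->subbeg) + i` reduces to the capture's end offset. The types `paren_pair` and `mini_rx` and both functions are hypothetical stand-ins invented for illustration; only the field names subbeg, sublen and offs and the numbers (start = 12, end = 33, a 30-character previous string) come from the report above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the fields named in the
   assertion; this is NOT the real struct regexp from perl's regexp.h. */
typedef struct {
    ptrdiff_t start;
    ptrdiff_t end;
} paren_pair;

typedef struct {
    const char *subbeg;   /* saved copy of the string that was matched */
    size_t      sublen;   /* length of that saved copy */
    paren_pair  offs[2];  /* offs[1] describes $1 */
} mini_rx;

/* Returns 1 if capture group n fits inside the saved buffer, else 0.
   Assumption: s = subbeg + start and i = end - start, so the asserted
   condition sublen >= (s - subbeg) + i becomes sublen >= start + i. */
static int capture_in_bounds(const mini_rx *rx, int n)
{
    ptrdiff_t start = rx->offs[n].start;
    ptrdiff_t end   = rx->offs[n].end;

    if (start < 0 || end < start)
        return 0;                         /* group unset or malformed */

    size_t i = (size_t)(end - start);     /* capture length */
    return rx->sublen >= (size_t)start + i;
}

/* Rebuilds the inconsistent state described in the report: the buffer
   of the previous match combined with the $1 offsets shown by gdb. */
static mini_rx stale_state_from_report(void)
{
    static const char prev[] = "<mailto:xxxx.xxxx@outlook.com>";
    mini_rx rx;

    rx.subbeg = prev;
    rx.sublen = sizeof prev - 1;          /* length without the NUL */
    assert(rx.sublen == 30);

    rx.offs[0].start = -1;                /* not shown in the report */
    rx.offs[0].end   = -1;
    rx.offs[1].start = 12;                /* from the gdb output */
    rx.offs[1].end   = 33;
    return rx;
}
```

Calling capture_in_bounds(&rx, 1) on the value returned by stale_state_from_report() yields 0, because 12 + 21 = 33 exceeds the 30 bytes available. Under the stated assumption, that is the condition the assertion rejects.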

A watchpoint on rx->offs shows that it gets swapped out here in regexec.c:

2706	        swap = prog->offs;
2707	        /* do we need a save destructor here for eval dies? */
2708	        Newxz(prog->offs, (prog->nparens + 1), regexp_paren_pair);
2709		DEBUG_BUFFERS_r(PerlIO_printf(Perl_debug_log,
2710		    "rex=0x%"UVxf" saving  offs: orig=0x%"UVxf" new=0x%"UVxf"\n"

when the backtrace is like this:

#0  Perl_regexec_flags (my_perl=0x100803200, rx=0x10082fdf8, stringarg=0x10060b658 "A¹kerèeva xxxx.xxxx@outlook.com ”", strend=0x10060b67d "", strbeg=0x10060b658 "A¹kerèeva xxxx.xxxx@outlook.com ”", minend=0, sv=0x1008063e8, data=0x0, flags=1) at regexec.c:2709
#1  0x0000000100247f3f in Perl_pp_subst (my_perl=0x100803200) at pp_hot.c:2120
#2  0x00000001001b847c in Perl_runops_debug (my_perl=0x100803200) at dump.c:2231
#3  0x000000010000a8ea in S_run_body (my_perl=0x100803200, oldscope=1) at perl.c:2416
#4  0x0000000100009905 in perl_run (my_perl=0x100803200) at perl.c:2339
#5  0x0000000100072698 in main (argc=3, argv=0x7fff5fbffa78, env=0x7fff5fbffa98) at miniperlmain.c:120

So the ordering of some of this stuff needs to be rethought.

A git bisect points me to this commit:

commit 44a2ac759eaf811ea851bdf9177a51bf9b95b5ce
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Dec 29 22:45:51 2006 +0100

    Re: [PATCH] Change implementation of %+ to use a proper tied hash interface and add support for %-
    Message-ID: <9b18b3110612291245q792fe91cu69422d2b81bb4f0b@mail.gmail.com>

But I think it’s a false positive.

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=122747