develooper Front page | perl.perl5.porters | Postings from October 2017

[perl #129158] null ptr deref, segfault in Perl_pp_split () atpp.c:5738

Thread Next
From:
James E Keenan via RT
Date:
October 28, 2017 13:30
Subject:
[perl #129158] null ptr deref, segfault in Perl_pp_split () atpp.c:5738
Message ID:
rt-4.0.24-13181-1509197431-1190.129158-15-0@perl.org
On Mon, 26 Sep 2016 11:58:56 GMT, davem wrote:
> On Mon, Sep 12, 2016 at 03:23:07PM +0100, Dave Mitchell wrote:
> > On Sun, Sep 11, 2016 at 10:17:39PM -0700, Father Chrysostomos via RT
> > wrote:
> > > On Thu Sep 01 01:49:12 2016, brian.carpenter@gmail.com wrote:
> > > > Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A
> > > > non-
> > > > instrumented build of
> > > > v5.25.4-5-g92d73bf returns the valgrind output at the end.
> > >
> > > I can reproduce it on dromedary, but not locally.  On dromedary I
> > > don’t have a functional gdb, so it’s a little hard to debug.
> > >
> > > I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems
> > > like a red herring.
> > >
> > > I managed to reduce it to this:
> > >
> > > $ cat foo
> > > map{s///o > split 0,split /0/>0}<DATA>__END__
> >
> > It's only an issue on non-threaded builds. It started failing
> > sometime
> > between 5.16.0 and 5.18.0.
> > I'm looking into it.
> 
> It turns out that this is another variant of RT #124368: on non-
> threaded
> builds, the s///o combines: modifying the op tree under /o to avoid
> recompilation on subsequent iterations; and m// using the last
> successful
> pattern. The combination of the two makes /o end up modifying some
> random
> other part of the op tree.
> 
> I haven't fixed this yet, but while investigating what was going on, I
> decided I would finally sort out the horrible OP_SPLIT/OP_PUSHRE mess,
> which I have now done and am smoking as smoke-me/davem/pushre2.
> 
> The most significant commit from that branch is the following, which
> I intend to merge into blead in few days time. Note this it is *not* a
> fix
> for this ticket.
> 

Dave, Father C, et al.,

Could we get an update on the status of this ticket?

> commit e505c6fa5fa71fde220a84dd42edbcb3b3c82566
> Author:     David Mitchell <davem@iabyn.com>
> AuthorDate: Thu Sep 15 10:59:37 2016 +0100
> Commit:     David Mitchell <davem@iabyn.com>
> CommitDate: Sun Sep 25 07:57:49 2016 +0100
> 
> make OP_SPLIT a PMOP, and eliminate OP_PUSHRE
> 
> Most ops that execute a regex, such as match and subst, are of type
> PMOP.
> A PMOP allows the actual regex to be attached directly to that op, due
> to its extra fields.
> 
> OP_SPLIT is different; it is just a plain LISTOP, but it always has an
> OP_PUSHRE as its first child, which *is* a PMOP and which has the
> regex
> attached.
> 
> At runtime, pp_pushre()'s only job is to push itself (i.e. the current
> PL_op) onto the stack. Later pp_split() pops this to get access to the
> regex it wants to execute.
> 
> This is a bit unpleasant, because we're pushing an OP* onto the stack,
> which is supposed to be an array of SV*'s. As a bit of a hack, on
> DEBUGGING builds we push a PVLV with the PL_op address embedded
> instead,
> but this still isn't very satisfactory.
> 
> Now that regexes are first-class SVs, we could push a REGEXP onto the
> stack rather than PL_op. However, there is an optimisation of @array =
> split which eliminates the assign and embeds the array's GV/padix
> directly
> in the PUSHRE op. So split still needs access to that op. But the
> pushre
> op will always be splitop->op_first anyway, so one possibility is to
> just
> skip executing the pushre altogether, and make pp_split just directly
> access op_first instead to get the regex and @array info.
> 
> But if we're doing that, then why not just go the full hog and make
> OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely: with
> the
> data that was spread across the two ops now combined into just the one
> split op.
> 
> That is exactly what this commit does.
> 
> For a simple compile-time pattern like  split(/foo/, $s, 1), the
> optree
> looks like:
> 
> before:
>     <@> split[t2] lK
>        </> pushre(/"foo"/) s/RTIME
>        <0> padsv[$s:1,2] s
>        <$> const(IV 1) s
> 
> after:
>     </> split(/"foo"/)[t2] lK/RTIME
>        <0> padsv[$s:1,2] s
>        <$> const[IV 1] s
> 
> while for a run-time expression like split(/$pat/, $s, 1),
> 
> before:
>     <@> split[t3] lK
>        </> pushre() sK/RTIME
>           <|> regcomp(other->8) sK
>              <0> padsv[$pat:2,3] s
>        <0> padsv[$s:1,3] s
>        <$> const(IV 1)s
> 
> after:
>     </> split()[t3] lK/RTIME
>        <|> regcomp(other->8) sK
>           <0> padsv[$pat:2,3] s
>        <0> padsv[$s:1,3] s
>        <$> const[IV 1] s
> 
> This makes the code faster and simpler.
> 
> At the same time, two new private flags have been added for OP_SPLIT -
> OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
> assign op has been optimised away, and if so, whether the array is
> lexical.
> 
> Also, deparsing of split has been improved, to the extent that
> 
> perl TEST -deparse op/split.t
> 
> now passes.
> 
> Also, a couple of panic messages in pp_split() have been replaced with
> asserts().


-- 
James E Keenan (jkeenan@cpan.org)

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=129158

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About