
Re: 5.30.2 soon

From: Steve Hay via perl5-porters
Date: February 7, 2020 08:10
Subject: Re: 5.30.2 soon
Message ID: CADED=K6xF1X3xZKJ=FGiFYOD9mU7O3GJE4j3MWSGmRTTZa8zcQ@mail.gmail.com

On Thu, 6 Feb 2020 at 17:13, Nicholas Clark <nick@ccl4.org> wrote:

> On Thu, Feb 06, 2020 at 01:41:21PM +0000, Steve Hay via perl5-porters
> wrote:
>
> > https://github.com/Perl/perl5/blob/maint-votes/votes-5.30.xml
> >
> > (This renders nicely when viewed in Firefox and some other browsers, but is
> > also readable as a text file.)
> >
> > If there are any more changes you think should be included that match the
> > criteria for back-porting set out in perlpolicy.pod then please let me know.
>
> At work we upgraded to 5.30.1 and hit the bug that was fixed by
>
> commit 3b2e5620ed4a6b341f97ffd1d4b6466cc2c4bc5b
> Author: Karl Williamson <khw@cpan.org>
> Date:   Fri Aug 23 12:40:24 2019 -0600
>
>     PATCH: [perl #134329] Use after free in regcomp.c
>
>     A compiled regex is composed of nodes, forming a linked list, with
>     normally a maximum of 16 bits used to specify the offset of the next
>     link.  For patterns that require more space than this, the nodes that
>     jump around are replaced with ones that have wider offsets.  Most nodes
>     are unaffected, as they just contain the offset of the next node, and
>     that number is always small.  The jump nodes are the ones affected.
>
>     When compiling a pattern, the 16 bit mechanism is used, until it
>     overflows, at which point the pattern is recompiled with the long jumps
>     instead.
>
>     When I rewrote the compiler last year to make it generally one pass, I
>     noticed a lot of the cases where a node was added didn't check if the
>     result overflowed (the function that does this returns FALSE in that
>     case).  I presumed the prior authors knew better, and did not change
>     things, except to put in a bogus value in the link (offset) field that
>     should cause a crash if it were used.  That's what's happening in this
>     ticket.
>
>     But seeing this example, it's clear that the return value should be
>     checked every time, because you can reach the limit at any time.  This
>     commit changes to do that, and to require the function's return value to
>     not be ignored, to guard against future changes.
>
>     My guess is that the reason it generally worked when there were multiple
>     passes is that the first pass didn't do anything except count space, and
>     that at some point before the end of the pass the return value did get
>     checked, so by the time the nodes were allocated for real, it knew
>     enough to use the long jumps.
>
>  MANIFEST                 |   1 +
>  embed.fnc                |   4 +-
>  proto.h                  |   8 +++-
>  regcomp.c                | 109 ++++++++++++++++++++++++++++++++++-------------
>  t/re/bigfuzzy_not_utf8.t | Bin 0 -> 36399 bytes
>  5 files changed, 88 insertions(+), 34 deletions(-)
>
>
> I think that that meets the criteria in perlpolicy.pod [without me editing
> that file to fit :-)] but I haven't figured out if it really fits (or
> works).
>
> I've mitigated our problem (super-large machine generated regex, now 33%
> smaller*) so that we don't *need* this (currently), but it might be useful
> to go in, before others hit it, who don't have such an easy work around.
>
>
>
Thanks. It merges back with only a trivial conflict, which I don't think is a
problem, and all tests pass here. I'm not sure why I skipped over that one
when I trawled through commits looking for candidates. Maybe a change in
proto.h put me off, but I believe it's fine for back-porting, so I've added
it to the voting file.
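
For anyone following along, the mechanism described in the quoted commit
message can be sketched roughly as below. This is only an illustration under
stated assumptions, not the code in regcomp.c: the names node16, set_link,
MUST_CHECK and compile_jumps are hypothetical, and the "recompile with long
jumps" step is reduced to a return value. It shows a link setter that reports
overflow of a 16-bit offset, a compiler attribute (GCC and Clang's
warn_unused_result) that flags callers who ignore that report, and a caller
that checks every time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node with a 16-bit link field; not Perl's real regnode. */
typedef struct {
    uint16_t next_off;   /* offset of the next node */
} node16;

/* Ask the compiler to warn when the return value is ignored. */
#if defined(__GNUC__) || defined(__clang__)
#  define MUST_CHECK __attribute__((warn_unused_result))
#else
#  define MUST_CHECK
#endif

/* Store the link; return false if the offset does not fit in 16 bits. */
MUST_CHECK static bool set_link(node16 *n, size_t offset)
{
    assert(n != NULL);
    if (offset > UINT16_MAX)
        return false;            /* caller must switch to wide offsets */
    n->next_off = (uint16_t)offset;
    return true;
}

typedef enum { COMPILED_SHORT, COMPILED_LONG } compile_result;

/* Try 16-bit links first; the limit can be reached at any node, so the
 * result of every set_link call is checked. Overflow is reported as
 * COMPILED_LONG, standing in for "recompile with long jumps". */
static compile_result compile_jumps(const size_t *offsets, size_t count)
{
    node16 scratch;
    for (size_t i = 0; i < count; i++) {
        if (!set_link(&scratch, offsets[i]))
            return COMPILED_LONG;
    }
    return COMPILED_SHORT;
}
```

With offsets {10, 200, 65535} the sketch stays on short links; with
{10, 70000} it reports that the long form is needed.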
