develooper Front page | perl.perl5.porters | Postings from March 2016

Re: FYI - PM posted about a horrible List::Util::first bug

Thread Previous | Thread Next
From:
Dave Mitchell
Date:
March 20, 2016 17:40
Subject:
Re: FYI - PM posted about a horrible List::Util::first bug
Message ID:
20160320174000.GK29332@iabyn.com
On Thu, May 28, 2015 at 11:09:09AM +0100, Dave Mitchell wrote:
> On Wed, May 27, 2015 at 09:30:22PM +0100, Paul "LeoNerd" Evans wrote:
> > What's the conclusion here? Do I have to fix any code in ListUtils.xs
> > then?
> 
> Well I think the provisional conclusion is that lc() et al are at fault.
> As to whether List::Util is doing anything wrong,  the two differences
> between the first() implementation and perl's grep() implementation are:
> 
> 1) grep temporarily increases the ref count of each SV it aliases to $_,
> via:
> 
>     # define DEFSV_set(sv) \
>         (SvREFCNT_dec(GvSV(PL_defgv)), GvSV(PL_defgv) = SvREFCNT_inc(sv))
> 
> 2) grep makes a mortal copy of each PADTMP it encounters.
> 
> Either of those would make List::Util work with the bad lc/uc/etc in 5.20.0
> and 5.22.0.
> 
> The second of those was to fix
> 
>     #78194: Referencing a PADTMP twice produces two copies
> 
> (which I don't fully understand from a cursory read of the ticket).
> 
> The first, if nothing else, avoids crashes if the thing being iterated
> is freed elsewhere, e.g.
> 
>     use List::Util qw{ first };
>     use Devel::Peek;
>     my $a = "A";
>     @a = (1);
>     @b = first { shift @a; Dump $_ } @a;
> 
> which outputs:
> 
>     SV = UNKNOWN(0xff) (0x1b19038) at 0x1af8f88
>       REFCNT = 0
>       FLAGS = ()

I've just pushed a fix as smoke-me/davem/lc_fix. I propose that this
gets merged before 5.24.0 (despite the code freeze).

Here's the commit message:

commit 8603ce1e371b120cca063739d86827130f272354
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Sun Mar 20 17:12:13 2016 +0000
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Sun Mar 20 17:34:08 2016 +0000

    stop lc() etc accidentally modifying in-place.
    
    As an optimisation, [ul]c() and [ul]cfirst() sometimes modify their
    argument in-place rather than returning a modified copy.
    
    This should only be done when there is no possibility that the arg is
    going to be reused. However, this fails:
    
        use List::Util qw{ first };
        my %hash = ( ASD => 1, ZXC => 2, QWE => 3, TYU => 4);
        print first { lc $_ eq 'qwe' } keys %hash;
    
    which prints "qwe" rather than "QWE".
    
    Bascally everything in perl that sets $_ or $a/$b and calls a code block
    or function, such as map, grep, for and, sort, either copies any PADTMPs,
    turns off SvTEMP, and/or bumps the reference count.
    
    List::Util doesn't do this, and it is likely that other CPAN modules
    which do "set $_ and call a block" don't either.
    
    This has been failing since 5.20.0: perl has been in-placing if the arg is
    (SvTEMP && RC==1 && !mg) (due to v5.19.7-112-g5cd5e2d).
    
    Make the optimisation critera stricter by always copying SvTEMPs.
    It still allows the optimisation if the arg is a PADTMP - I don't know
    whether this is unsafe too.
    
    Perhaps we can think of something better after 5.24?


-- 
A power surge on the Bridge is rapidly and correctly diagnosed as a faulty
capacitor by the highly-trained and competent engineering staff.
    -- Things That Never Happen in "Star Trek" #9

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About