develooper Front page | perl.perl5.porters | Postings from February 2018

Re: [perl #132595] multiconcat reading and fetching wrong order

Thread Previous | Thread Next
From:
Dave Mitchell
Date:
February 20, 2018 09:21
Subject:
Re: [perl #132595] multiconcat reading and fetching wrong order
Message ID:
20180220092117.GF3216@iabyn.com
On Sun, Dec 17, 2017 at 05:29:41AM -0800, Zefram wrote:
> The multiconcat op is stated to replicate the order of magic handling
> and so on that was exhibited by the unoptimised concat op.  But here's
> a case where the replication is not exact:
> 
> $ perl5.27.5 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
> b1c1b1c2
> $ perl5.27.6 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
> ac1b2c2
> 
> Blead behaves the same as 5.27.6.  I wouldn't consider the new
> behaviour wrong in itself; the issue is just that it's not replicating
> the unoptimised behaviour, and so multiconcat is failing in its aim of
> being a very pure optimisation.
> 
> FWIW, the behaviour of the same test case has changed before:
> 
> 5.6.0 to 5.7.1: ac1b1c2
> 5.8.0 to 5.8.8, 5.9.0 to 5.9.2: b1c1b1c2
> 5.8.9, 5.9.3 to 5.13.1: ac1b1c2
> 5.13.2 to 5.27.5: b1c1b1c2

Now fixed with:

commit af39014264c90cfc5a35e4f6e39ba038a8fb0c29
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Sat Feb 10 15:27:59 2018 +0000
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Mon Feb 19 22:06:49 2018 +0000

    redo magic/overload handing in pp_multiconcat
    
    The way pp_multiconcat handles things like tieing and overloading
    doesn't work very well at the moment. There's a lot of code to handle
    edge cases, and there are still open bugs.
    
    The basic algorithm in pp_multiconcat is to first stringify (i.e. call
    SvPV() on) *all* args, then use the obtained values to calculate the total
    length and utf8ness required, then do a single SvGROW and copy all the
    bytes from all the args.
    
    This ordering is wrong when variables with visible side effects, such as
    tie/overload, are encountered. The current approach is to stringify args
    up until such an arg is encountered, concat all args up until that one
    together via the normal fast route, then jump to a special block of code
    which concats any remaining args one by one the "hard" way, handling
    overload etc.
    
    This is problematic because we sometimes need to go back in time. For
    example in ($undef . $overloaded), we're supposed to call
        $overloaded->concat($undef, reverse=1)
    so to speak, but by the time of the method call, we've already tried to
    stringify $undef and emitted a spurious 'uninit var' warning.
    
    The new approach taken in this commit is to:
    
    1) Bail out of the stringify loop under a greater range of problematical
    variable classes - namely we stop when encountering *anything* which
    might cause external effects, so in addition to tied and overloaded vars,
    we now stop for any sort of get magic, or any undefined value where
    warnings are in scope.
    
    2) If we bail out, we throw away any stringification results so far,
    and concatenate *all* args the slow way, even ones we're already
    stringified. This solves the "going back in time" problem mentioned above.
    It's safe because the only vars that get processed twice are ones for which
    the first stringification could have no side effects.
    
    The slow concat loop now uses S_do_concat(), which is a new static inline
    function which implements the main body of pp_concat() - so they share
    identical code.
    
    An intentional side-effect of this commit is to fix three tickets:
    
        RT #132783
        RT #132827
        RT #132595
    
    so tests for them are included in this commit.
    
    One effect of this commit is that string concatenation of magic or
    undefined vars will now be slower than before, e.g.
    
        "pid=$$"
        "value=$undef"
    
    but they will probably still be faster than before pp_multiconcat was
    introduced.


-- 
Please note that ash-trays are provided for the use of smokers,
whereas the floor is provided for the use of all patrons.
    -- Bill Royston

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About