develooper Front page | perl.perl5.porters | Postings from August 2016

Re: [perl #39358] sort with custom subname and prototype ($$)segfaults intermittently

Thread Previous | Thread Next
From:
Dave Mitchell
Date:
August 10, 2016 15:36
Subject:
Re: [perl #39358] sort with custom subname and prototype ($$)segfaults intermittently
Message ID:
20160810153639.GA3173@iabyn.com
On Sun, Sep 26, 2010 at 02:39:49PM -0700, Father Chrysostomos via RT wrote:
> On Thu Jun 08 21:23:40 2006, johnh@isi.edu wrote:
> > #!/usr/bin/perl -w
> > 
> > @scores = (
> >   '0000.0000.00039:0000.0008.0031.',
> >   '0000.0000.00032:0000.0008.0024.',
> >   '0002.0002.00033:0002.0011.0020.',
> >   '0028.0028.00190:0077.0085.0028.');
> > 
> > @match_indices = (0,1,2,3);
> > sub sort_by_index($$) {
> >     my($A,$B) = @_;
> >     return $scores[$match_indices[$A]] cmp
> >    $scores[$match_indices[$B]];
> > }
> > @match_indices = sort sort_by_index @match_indices;
> > ----------------------------------------
> 
> This gives me:
> Bizarre copy of UNKNOWN in aelem at - line 12.
> in commit 42bd538f820268331bc55a66fa9df0673de69250
> 
> d963bf01c4c4db296760b1148f98bf668efcaf58, three commits later (the
> intervening commits affecting only documentation), gives 
> Use of uninitialized value $match_indices[2] in array element at - line 12.
> Use of uninitialized value $match_indices[2] in array element at - line 12.
> Use of uninitialized value $match_indices[0] in array element at - line 12.
> 
> but I can’t see how it fixes it.

Fixed now in blead by the following. The issue was as the bit below about
mergesort leaving random pointers lying about.

commit 84721d614eb7d9835d9a09505b0001c7be40a865
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Wed Aug 10 15:12:56 2016 +0100
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Wed Aug 10 16:34:04 2016 +0100

    Partially pessimise in-place sorting
    
    There's currently an optimisation that converts at compile-time
    
        @a = sort { .... } @a
    
    into (approximately)
    
        sort { ... } \@a
    
    Then at run time, rather than passing an svp pointer to the appropriate
    sort routine which points to a list of SV*'s on the stack, pp_sort()
    passes a pointer to @a's AvARRAY. This allows the array to be sorted
    in-place, which is more efficient.
    
    However, it has some issues. First, the @a visible to the sort routine
    will be varying, whereas logically it should still hold the original list
    of values until after the '@a = ...' assignment.
    
    Secondly, the mergesort algorithm cureently used internally, when in
    mid-sort, temporarily stores pointers in the array which aren't pointers
    to SVs - this means that if @a elements are accessed mid-sort, it can
    crash.
    
    The solution to both these problems is for pp_sort() to push the elements
    of @a onto the stack at the beginning, sort the stack (like normal sorts
    do), then copy back to @a at the end. This is less efficient than before,
    but is still a lot more efficient than executing separate padav and
    aassign ops.
    
    Here are benchmark results in raw instruction counts etc (lower is better)
    for the sort line in this code block:
    
        my (@a, @b);
        @a = reverse 1..10;
        @b = sort { $a <=> $b } @a;
    
    A is for a non-in-place sort, i.e. @b = sort ... @a;
    B and C are for an inline sort, i.e. as above, but  @a = sort ... @a;
    where B is blead before this commit and C is this commit.
    
                     A       B       C
                ------  ------  ------
            Ir  5238.0  2324.0  2772.0
            Dr  1464.0   649.0   765.0
            Dw   919.0   298.0   370.0
          COND   782.0   320.0   405.0
           IND    25.0    25.0    26.0
    
        COND_m    14.9    13.0    17.0
         IND_m     8.0     5.0     5.0
    
    As can be seen, this partial pessimisation slows down in-place sorting by
    round 20%, but overall in-place is still nearly twice the speed as without
    the optimisation.
    
    These are the figures for a plain numeric sort (which is optimised to use
    a C comparison function); for other types of sort, the cost of the
    comparator dominates, and so the slowdown is much less marked.


-- 
"But Sidley Park is already a picture, and a most amiable picture too.
The slopes are green and gentle. The trees are companionably grouped at
intervals that show them to advantage. The rill is a serpentine ribbon
unwound from the lake peaceably contained by meadows on which the right
amount of sheep are tastefully arranged." -- Lady Croom, "Arcadia"

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About