develooper Front page | perl.perl5.porters | Postings from August 2016

Re: [perl #39358] sort with custom subname and prototype ($$)segfaults intermittently

Thread Previous | Thread Next
Dave Mitchell
August 11, 2016 08:11
Re: [perl #39358] sort with custom subname and prototype ($$)segfaults intermittently
Message ID:
On Wed, Aug 10, 2016 at 05:03:28PM -0400, John P. Linderman wrote:
> #!/usr/bin/perl -w
> @scores = (
>   '0000.0000.00039:0000.0008.0031.',
>   '0000.0000.00032:0000.0008.0024.',
>   '0002.0002.00033:0002.0011.0020.',
>   '0028.0028.00190:0077.0085.0028.');
> @match_indices = (0,1,2,3);
> sub sort_by_index($$) {
>     my($A,$B) = @_;
>     return $scores[$match_indices[$A]] cmp
>    $scores[$match_indices[$B]];
> }
> @match_indices = sort sort_by_index @match_indices;
> This could be "fixed" at the expense of allocating *two* extra arrays
> instead of just one (which is already more than quicksort, which sorts "in
> place").

I don't understand any of the above sentence. Its just been fixed, so why
is it still "could be"?  No extra arrays are created, and I don't
understand why creating an extra array would be of benefit.

> But I wonder why people think that the contents of any array in
> the process of being sorted can participate in any meaningful way in a
> comparison. Must all sort implementations guarantee that the intermediate
> contents of the array to be sorted are some permutation of the original
> array?

In this code:

    @a = sort {...} @a

at the logical perl level, a list is created from the contents of @a,
sort returns a sorted version of that list, then that list is assigned
to @a. If @a gets modified during the sort itself (and before the
assignment), then that's internal optimisation details leaking out.

The internal quicksort and mergesort implementations sort a (C) array
of SV pointers - usually that array is just a chunk of perl's argument
stack. Before my commit, they someones also worked directly on an AvARRAY
array, which turned out to be a bad idea.

> I don't like the idea of penalizing *all* users for the dubious
> intent of *some* users. Segfaults are evil, of course. A better solution
> would be to somehow mark an array as "off limits" while a sort is in
> progress, but that's way beyond my pay grade.

The previous code temporarily marked the array as readonly, which stopped
modification, but didn't stop access. The only way to stop reading would
be to temporarily tie the array. But this wuld be semantically wrong:
there's nothing to imply in the expression

    something = sort { ... } @a

that accessing @a during the sort would be an error.

"But Sidley Park is already a picture, and a most amiable picture too.
The slopes are green and gentle. The trees are companionably grouped at
intervals that show them to advantage. The rill is a serpentine ribbon
unwound from the lake peaceably contained by meadows on which the right
amount of sheep are tastefully arranged." -- Lady Croom, "Arcadia"

Thread Previous | Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About