Front page | perl.perl5.porters |
Postings from June 2004
Short circuiting sort
From:
David L. Nicol
Date:
June 20, 2004 01:31
Subject:
Short circuiting sort
Message ID:
40D50DD9.3000803@umkc.edu
Nick Ing-Simmons wrote:
>(I would still like sort in scalar context to return min or max value
> - don't care which because one can change cmp function to get the other.)
>
Currently
($first) = sort @ary;
gives the first of the sorted list, after doing a mess of extra work.
If C<sort> were to
take a hint indicating how many results are going to be used, similar to
the fourth
argument of C<split> but determined by parsing appropriate l-values,
sort algorithms
that ignore the rest of the array could be used. Worst case would be
l*n comparisons
when l is how many values you want returned and n is array size, and it
could be
turned into n log l by using a binary system to locate a earlier value
within the set of
values that will be returned.
Since the above code works, a sensible thing for sort in scalar to do
would be to
treat the l-value as if it was a singleton array, making
$first = sort @ary;
and
($first) = sort @ary;
equivalent.
The scalar version could do a simple short-circuit minimum function (n-1
comparisons)
until the mechanisms for communicating L-value array size are hammered out.
>What can one use your sortedness value for?
>
* you can sort arrays based on how sorted they are, for instance when
deciding what to do next
* you can determine an sortedness threshold above which you consider
your data "sorted enough" to not bother with an expensive sort
* if the sortedness pass cached its result as the first pass of the
merge sort,
code such as
@ary = sort @ary if sort( @ary ) < 0.95;
would take n-1 comparisons except when a sort is required.
* It'll make Randall Schwartz call my C coding "Amazing" which I can use on
the back covcer of my paperback