develooper Front page | perl.perl5.porters | Postings from June 2004

Short circuiting sort

David L. Nicol
June 20, 2004 01:31
Short circuiting sort
Message ID:

Nick Ing-Simmons wrote:

>(I would still like sort in scalar context to return min or max value
> - don't care which because one can change cmp function to get the other.)


       ($first) = sort @ary;

gives the first of the sorted list, after doing a mess of extra work.  
If C<sort> were to
take a hint indicating how many results are going to be used, similar to 
the fourth
argument of C<split> but determined by parsing appropriate l-values, 
sort algorithms
that ignore the rest of the array could be used.  Worst case would be 
l*n comparisons
when l is how many values you want returned and n is array size, and it 
could be
turned into n log l by using a binary system to locate a earlier value 
within the set of
values that will be returned.

Since the above code works, a sensible thing for sort in scalar to do 
would be to
treat the l-value as if it was a singleton array, making

     $first = sort @ary;


    ($first) = sort @ary;


The scalar version could do a simple short-circuit minimum function (n-1 
until the mechanisms for communicating L-value array size are hammered out.

>What can one use your sortedness value for?
* you can sort arrays based on how sorted they are, for instance when
deciding what to do next

* you can determine an sortedness threshold above which you consider
your data "sorted enough" to not bother with an expensive sort

* if the sortedness pass cached its result as the first pass of the 
merge sort,
code such as

       @ary = sort @ary if sort( @ary ) < 0.95;

would take n-1 comparisons except when a sort is required.

* It'll make Randall Schwartz call my C coding "Amazing" which I can use on
the back covcer of my paperback Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About