develooper Front page | perl.perl5.porters | Postings from June 2004

Re: optimizing Short circuiting sort method based on [M]

June 30, 2004 09:25
Re: optimizing Short circuiting sort method based on [M]
Message ID:
On Thu, Jul 01, 2004 at 12:55:56AM +1000, Paul Fenwick wrote:
> G'day Ton/P5P,
> Ton Hospel wrote:
> >Mergesort works well too if you cut off runs at M.
> You're absolutely right.  Quicksort throws away unnecessary elements at 
> the start (except in pathological cases).  Mergesort throws them away at 
> the end (once runs start to become bigger than M).  Both of them become 
> much faster by having M < N.
> In the end, I would expect the complexity to be the same.

There is a linear algorithm possible, although that will not sort the M
elements. It is possible to find the Mth element of a set of N elements in
O (N) time (albeit with a heavy constant). Given the Mth element, finding
the first M elements requires another, single, pass over the entire set.

In practise however, I'd use a heap with M elements. Pass once over the
entire set, and for each element, if it's less than the largest element
in the heap, swap it with the largest, and heapify the heap. This will
be O (N log M) worst case, and O (N) if the set is already sorted.

Abigail Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About