develooper Front page | perl.perl5.porters | Postings from June 2004

Re: optimizing Short circuiting sort method based on [M]

From:
Glenn Linderman
Date:
June 30, 2004 11:42
Subject:
Re: optimizing Short circuiting sort method based on [M]
Message ID:
40E309A4.6010204@nevcal.com
On approximately 6/30/2004 6:11 AM, came the following characters from
the keyboard of Ton Hospel:

> In article <40E2774F.1070106@perltraining.com.au>,
> 	Paul Fenwick <pjf@perltraining.com.au> writes:
> 
>>I haven't thought about any other sorting algorithms yet.  I agree that 
>>heapsort is definitely worth examining as an efficient way to gathering 
>>only the first sorted M elements, as at each stage the next element is 
>>known (the root), and there are only two choices for the next element 
>>(the immediate children of the root).
> 
> Mergesort works well too if you cut off runs at M.
> you start with N/M groups of M elements, each sort in O(MlogM), so that
> gives (NlogM).
> Then you need to combine N/M runs, which is (N/M-1) combines, in which each
> can stop after collecting M elements => O(N).
> So the sum is still O(NlogM)
> 
> And since the plain sort already is mergesort, i think that would be 
> the best choice since it will e.g. have the same stability properties,
> and the partial sort smoothly extends to the full sort.

Thanks for describing how one would do this with a modified merge sort. 
  I agree that there would be some benefits to using merge sort, because 
of perl already using mergesort, but I'm uncertain that any of the code 
could be reused, without slowing down the "full sort", which wouldn't be 
a brilliant thing to do.

Although I recall that O(N lg N) heapsort had a high coefficient, which 
is what makes quicksort generally outperform heapsort, and IIRC 
mergesort also generally outperforms heapsort for doing complete sorts, 
the description above sounds like it would also have a fairly high 
coefficient...especially for extremely small M.

For M == 1, it is pretty clear that a selection sort couldn't be 
outperformed.

I'm not sure how big M has to get, before M passes of bubble sort 
outperform selection sort.

And I have no clue how big M has to get before M passes of bubble sort 
are less efficient than either heapsort or mergesort, or which would be 
the next contender.

And I also have no clue how big M has to get before sorting the whole 
array is more efficient than some number of passes heapsort or mergesort 
(due to extra work to determine early stopping points).

-- 
Glenn -- http://nevcal.com/
===========================
The best part about procrastination is that you are never bored,
because you have all kinds of things that you should be doing.




nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About