develooper Front page | perl.perl5.porters | Postings from July 2004

Re: optimizing Short circuiting sort method based on [M]

Glenn Linderman
July 9, 2004 11:46
Re: optimizing Short circuiting sort method based on [M]
Message ID:
On approximately 7/9/2004 7:19 AM, came the following characters from
the keyboard of Ton Hospel:
> In article <>,
> 	david nicol <> writes:
>>Does not the combining involve more comparisons?
> Sure, but I accounted for them. N/M combines each collecting M elements
> from 2 runs of M => M compares per combine => O(N) compares, which then
> gets subsumed in the O(N log M) we already had from the first step

I appreciated as interesting the idea that a "top N sort" might 
appropriately be different than the "incremental sort of unknown result 

To get a feeling for how different, it would be interesting to compare 
the costs of obtaining each element in turn for the incremental sort, 
vs. the costs of obtaining the top N.

And we've only seen a discussion of discarding elements (and 
corresponding costs) applied to merge sort, not to other possible 
sorts... perhaps the others don't lend themselves to discarding elements 
as well as merge sort does.

But then unless the total cost of obtaining the top N elements in 
significantly less for a top N sort than for the incremental sort, 
perhaps only the incremental sort need be implemented.

I look forward to David's Sort::Lazy module; perhaps it would be a 
suitable framework for investigating some of these sort characteristics 
empirically?  And also providing a framework for implementing and 
measuring both the theoretical costs O(N,M) and actual costs (for 
particular data sets).

Glenn --
The best part about procrastination is that you are never bored,
because you have all kinds of things that you should be doing. Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About