On approximately 7/9/2004 7:19 AM, came the following characters from the keyboard of Ton Hospel: > In article <1089335013.1041.182.camel@plaza.davidnicol.com>, > david nicol <davidnicol@pay2send.com> writes: > >>Does not the combining involve more comparisons? > > > Sure, but I accounted for them. N/M combines each collecting M elements > from 2 runs of M => M compares per combine => O(N) compares, which then > gets subsumed in the O(N log M) we already had from the first step I appreciated as interesting the idea that a "top N sort" might appropriately be different than the "incremental sort of unknown result elements". To get a feeling for how different, it would be interesting to compare the costs of obtaining each element in turn for the incremental sort, vs. the costs of obtaining the top N. And we've only seen a discussion of discarding elements (and corresponding costs) applied to merge sort, not to other possible sorts... perhaps the others don't lend themselves to discarding elements as well as merge sort does. But then unless the total cost of obtaining the top N elements in significantly less for a top N sort than for the incremental sort, perhaps only the incremental sort need be implemented. I look forward to David's Sort::Lazy module; perhaps it would be a suitable framework for investigating some of these sort characteristics empirically? And also providing a framework for implementing and measuring both the theoretical costs O(N,M) and actual costs (for particular data sets). -- Glenn -- http://nevcal.com/ =========================== The best part about procrastination is that you are never bored, because you have all kinds of things that you should be doing.