From:

Date:

July 8, 2004 19:14Subject:

Re: optimizing Short circuiting sort method based on [M]Message ID:

1089339265.1040.316.camel@plaza.davidnicol.comOn Wed, 2004-07-07 at 09:39, John P. Linderman wrote: > I meant to reply to the porters at large earlier, but succeeded in only > contacting a couple people. Heap sort is going to double the number > of comparisons realtive to a binary search, since it has to do two > compares at each level of the log M tree. It saves moves, but moves > are cheap, and comparisons are dear, so I don't see heap sort ever > being the winner. You can get N log M comparisons from the binary > search you have at > http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2004-06/msg00592.html > and N * M moves, which will beat the wholesale sorts for small enough M. > But the wholesale merge sort does a nice job on orderly data, and will > beat the specialized sort if there are fewer than log M runs, for example. > I can imagine keeping track of where insertions happen, and biasing the > initial guess-point for the binary search when it appears the data are > non-random. That could make comparisons look more like N than like N log M > for orderly data. -- jpl Right. The questions are, given N and M, when do we want to collect runs? I imagine run-collecting as the first step in "lazy sort" which might be a special case of another kind of sort, I'm not sure. "Nicol's Lazy Sort" (so I'm bold) goes like this: 1: gather runs, in a stable way. Takes up to 2N comparisons: sub Lazy::Sort::TIEARRAY{ my $pack = shift; my $compare = shift; my @Ary = @_; my @Runs; my $Run; $Runs[$Run=0] = [0]; for $index (1..$#Ary){ if ( &$compare( $Ary[$index], $Ary[$$Runs[$Run][0]]) < 1){ unshift @{$Runs[$Run]}, $index; next; }else{ if ( &$compare( $Ary[$index], $Ary[$$Runs[$Run][$#{$Runs[$Run]}]]) >= 0){ push @{$Runs[$Run]}, $index; }else{ $Runs[++$Run] = [$index]; } } bless [$compare,\@Ary, \@Runs], 'Lazy::Sort'; }; more comparisons if we keep stability by bubbling equal elements in from the front, so we don't. A run of equals will get pushed. 2: compare leftmost of all runs to find leftmost and shift it off, takes R (number of runs remaining) comparisons: sub Lazy::Sort::SHIFT{ wantarray and eval{Games::Roguelike::Nethack::play()}; my ($compare,$Ary,$Runs) = @_[0,1,2]; # empty run arrays are prevented by splicing out empty # runs as they occur # while(@{$Runs->[0]} == 0) { # shift @{$Runs}; # @{$Runs} or return undef; #}; @{$Runs} or return undef; my $best = $Runs->[0]->[0]; my $bestrun = 0; for my $run (1..$#{$Runs}){ if (&$compare( $Ary->[$best], $Ary->[$Runs->[$run]->[0]] ) < 0){ $best = $Runs->[$run]->[0]; $bestrun = $run; } } shift @{$Runs->[$bestrun]}; if (@{$Runs->[$bestrun]} == 0){ splice @{$runs},$bestrun,1; } return $Ary->[$best] } I believe I have just described a crippled merge-sort, that finds runs once and then does R comparisons to pull out each element. Ton's insight that if we know M in advance we can throw away elements in a run past M is good. By finding runs and then pulling elements off the runs until we have M of them, we get worst-case of 2N comparisons to find the runs and a worst case of R=N/2 runs for a contrived sequence such as @Ary = map { ($_ | 1) ? $_ : - $_ } (1000..1) After finding runs, we can find the M best with M*R more comparisons. Which may or may not be an improvement over a NlogM single pass. How big do the numbers have to be before it's worth the expected 1.5*N comparisons to find out how small R is, given that after determining R we can still fall back to the single pass when M*R is larger than NlogM? Finding the hints is the other half of the problem. I've seen several identifiable cases so far: @result = (sort @Ary)[0..CONSTANT]; ($first, $second, $third) = sort @Ary; Would it be possible for the hint to be of the form of the pre-sized assigned-to array, so the short-sort could assign directly into it instead of using the stack? -- david nicol "cat and buttered toast"Thread Previous | Thread Next

- RE: optimizing Short circuiting sort method based on [M] by Orton, Yves
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by John P. Linderman
- Re: optimizing Short circuiting sort method based on [M] by John P. Linderman
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- more 5.9 sort tests by david nicol
- Re: more 5.9 sort tests by Rafael Garcia-Suarez
- Re: more 5.9 sort tests by david nicol
- Re: more 5.9 sort tests by Nicholas Clark
- Re: more 5.9 sort tests by Rafael Garcia-Suarez
- Re: more 5.9 sort tests by Ronald J Kimball
- Re: more 5.9 sort tests (second draft) by david nicol
- Re: more 5.9 sort tests (second draft) by Rafael Garcia-Suarez
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Dave Mitchell
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Paul Fenwick
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by John P. Linderman
**Re: optimizing Short circuiting sort method based on [M]**by david nicol- Re: optimizing Short circuiting sort method based on [M] by Paul Fenwick
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- sort in scalar context, but broken due to inexperience. by David Nicol
- Re: sort in scalar context, but broken due to inexperience. by Dave Mitchell
- Re: sort in scalar context, but broken due to inexperience. by hv
- Re: sort in scalar context, but broken due to inexperience. by david nicol
- Re: sort in scalar context, but broken due to inexperience. by merlyn
- Re: sort in scalar context, but broken due to inexperience. by Rafael Garcia-Suarez
- [PATCH] sort plays nethack in a scalar context (was: sort in scalarcontext, but broken due to inexperience.) by Paul Fenwick
- Re: [PATCH] sort plays nethack in a scalar context (was: sort inscalar context, but broken due to inexperience.) by Rafael Garcia-Suarez
- Re: [PATCH] sort plays nethack in a scalar context by Paul Fenwick
- Re: [PATCH] sort plays nethack in a scalar context by Nicholas Clark
- Re: [PATCH] sort plays nethack in a scalar context by H.Merijn Brand
- Re: [PATCH] sort plays nethack in a scalar context by Nicholas Clark
- Re: [PATCH] sort plays nethack in a scalar context (was: sort in scalar context, but broken due to inexperience.) by merlyn
- Re: sort in scalar context, but broken due to inexperience. by Graham Barr
- Re: sort in scalar context, but broken due to inexperience. by Dave Mitchell
- Re: sort in scalar context, but broken due to inexperience. by Nick Ing-Simmons
- Short circuiting sort by David L. Nicol
- Re: Short circuiting sort by Glenn Linderman
- Re: Short circuiting sort by david nicol
- Re: sort in scalar context, but broken due to inexperience. by david nicol
- Re: sort in scalar context, but broken due to inexperience. by Graham Barr
- RE: [PATCH] sort plays nethack in a scalar context by Orton, Yves
- Re: [PATCH] sort plays nethack in a scalar context by Abigail
- Re: [PATCH] sort plays nethack in a scalar context by Rafael Garcia-Suarez
- Re: [PATCH] sort plays nethack in a scalar context by Ovid
- Re: [PATCH] sort plays nethack in a scalar context by Nicholas Clark
- Re: [PATCH] sort plays nethack in a scalar context by david nicol
- Re: Short circuiting sort by John P. Linderman
- Re: Short circuiting sort by Glenn Linderman
- Re: Short circuiting sort by david nicol
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Paul Fenwick
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Paul Fenwick
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by perl5-porters
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by Nicholas Clark
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Dave Mitchell
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by Dave Mitchell
- Re: optimizing Short circuiting sort method based on [M] by Paul Fenwick
- Re: optimizing Short circuiting sort method based on [M] by david nicol
- Re: optimizing Short circuiting sort method based on [M] by perl5-porters
- Re: optimizing Short circuiting sort method based on [M] by Glenn Linderman

nntp.perl.org: Perl Programming lists via nntp and http.

Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About