develooper Front page | perl.perl5.porters | Postings from June 2004

Re: Short circuiting sort

From:
david nicol
Date:
June 28, 2004 13:31
Subject:
Re: Short circuiting sort
Message ID:
1088454684.1795.78.camel@plaza.davidnicol.com
On Fri, 2004-06-25 at 10:55, Glenn Linderman wrote:
> 
> So in terms of N items in the array and M items desired...
> 
> One trivial way to take advantage of the hint is to recall that the 
> lowly N-squared bubble sort (and variations) produces one high (or low) 
> value per pass, and that for small M and large N, that M*N is smaller 
> than N lg N.

And after we have the bubble sort implementation working, optimizing 
to N lg M is easy enough to do by using a binary search to find the
insertion point in the short list.  The short list is sorted.

Algorithm:

        slice off first M elements into @Short[0..M-1]:
	@Short[0..M-1] = @Long[0..M-1];

        sort @Short in place and stable (insertion? M is small)

        For each $Elm (@Long[M..N-1]) {

#ifdef SIMPLE_BUBBLE

           # this approach is easy to write and understand
           # and for small M there are few additional comparisons
           my ShortPosition = M-1
           while(comparison($Elm, $Short[ShortPosition]) < 0){
              $Short[ShortPosition + 1] = $Short[ShortPosition];
              $Short[ShortPosition--] = $Elm;           
           }

#elsif
           # we do fewer comparisions to locate where
           # the Elm goes, and then slide the rest of
           # short down.

           next unless comparison($Elm, $Short[M-1]) < 0;
           LowerBound = 0
           UpperBound = M-2
           # we want to find the highest index in Short that
	   # gives us a negative comparison against the
           # new element, and insert the new element before it.

	   while(LowerBound < UpperBound){
	      Guess = LowerBound + floor((UpperBound - LowerBound) / 2)
              if (comparison($Elm, $Short[Guess]) < 0){
                 # Element goes before Guess
                 UpperBound = Guess
              }else{
                 # Element goes on or after Guess
		 LowerBound = Guess
              };
           }
           # Guess, LB, and UB are all the same now. 
           # (LB=0 is correct for M==1, FWTW )
           # and we want to insert the new element before
           # the determined location by shifting everything
           # down and then assigning.

           Pos = M ;
           While (Pos-- > LowerBound){
		Short[Pos] = Short[Pos - 1]
           }

           Short[LowerBound] = $Elm
				


#endif


	}


The worst cases for both, a descending list, are N*M or N * Lg M
comparisons and N*M element shiftings.  One might be able to
save a few element shiftings by optimizing a mergesort to throw
away results we don't care about, but in situations other than
descending lists, knowing what the bar for entry into the short 
list is, as it lowers, in a single pass, will (I expect), save 
many comparisons: with a butchered mergesort, we're going to
be throwing away unnecessarily sorted sequences.

Assigning to $Short[M] can be easily optimized out too.




-- 
david nicol
this message was sent from a filtering e-mail address.
Until I add you to my friends list, to get by Big Emet
the cowboy doorman your reply must include a listed
"magic phrase" such as "cat and buttered toast" 




nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About