develooper Front page | perl.perl5.porters | Postings from February 2008

Slowdown of "sort keys %hash" under "use locale"

Thread Next
From:
Marcus Holland-Moritz
Date:
February 15, 2008 08:00
Subject:
Slowdown of "sort keys %hash" under "use locale"
Message ID:
20080215170015.72f6160d@r2d2
So, I was just presented an interesting slowdown in perl-5.8 that
is still present in blead.

The slowdown happens when sorting the keys of a hash and when the
code is run under 'use locale'.

  mhx@r2d2 $ cat sorttest.pl 
  $code = shift;
  %words = map { chomp; ($_ => 1) } <>;
  ($t0) = times;
  eval "\@s = $code \%words";
  ($t1) = times;
  printf "sorting %d words using [ $code ] took %.2f seconds\n",
         scalar keys %words, $t1 - $t0;

With 5.6, everything's fine:

  mhx@r2d2 $ perl5.6.2 sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 1.07 seconds

  mhx@r2d2 $ perl5.6.2 -Mlocale sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 1.35 seconds

With 5.8, there's a slowdown...

  mhx@r2d2 $ perl5.8.8 sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 1.33 seconds

  mhx@r2d2 $ perl5.8.8 -Mlocale sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 8.64 seconds

...that vanishes if you run the list of words through map:

  mhx@r2d2 $ perl5.8.8 -Mlocale sorttest.pl 'sort map $_, keys' </usr/share/dict/words 
  sorting 234937 words using [ sort map $_, keys ] took 1.62 seconds

In bleadperl, this is still present (and doesn't vanish with map):

  mhx@r2d2 $ bleadperl sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 0.79 seconds

  mhx@r2d2 $ bleadperl -Mlocale sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 9.17 seconds

  mhx@r2d2 $ bleadperl -Mlocale sorttest.pl 'sort map $_, keys' </usr/share/dict/words 
  sorting 234937 words using [ sort map $_, keys ] took 9.18 seconds

The reason for the slowdown is that the keys have become readonly
scalars, for which the results of strxfrm() isn't cached, so the
code spends most of the time doing strxfrm()'s.

I've a simple fix for that: Also enable the caching for readonly
scalars. However, I'm not fully sure what the implications of this
change would be. According to sv_magicext()'s docs, it's perfectly
legal to attach magic to readonly SVs. However, most of the core
code explicitly doesn't attach magic to readonly SVs.

The following patch removes the special treatment of readonly SVs
and restores the original speed.

--- perl-current-orig-1/sv.c    2008-02-15 00:40:47.000000000 +0100
+++ perl-current-mhx-1/sv.c     2008-02-15 01:21:30.000000000 +0100
@@ -6494,11 +6494,6 @@
            Safefree(mg->mg_ptr);
        s = SvPV_const(sv, len);
        if ((xf = mem_collxfrm(s, len, &xlen))) {
-           if (SvREADONLY(sv)) {
-               SAVEFREEPV(xf);
-               *nxp = xlen;
-               return xf + sizeof(PL_collation_ix);
-           }
            if (! mg) {
 #ifdef PERL_OLD_COPY_ON_WRITE
                if (SvIsCOW(sv))


  mhx@r2d2 $ ./perl sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 1.05 seconds

  mhx@r2d2 $ ./perl -Mlocale sorttest.pl 'sort keys' </usr/share/dict/words 
  sorting 234937 words using [ sort keys ] took 1.73 seconds

  mhx@r2d2 $ ./perl -Mlocale sorttest.pl 'sort map $_, keys' </usr/share/dict/words 
  sorting 234937 words using [ sort map $_, keys ] took 1.84 seconds

It also doesn't cause any failures in the test suite. So it looks
fine to me, but probably someone else can imagine what it'll break.

Of course, the new code would consume more memory, but the speedup
seems rather significant to me for such a common idiom.

Marcus

-- 
As far as the laws of mathematics refer to reality, they are not
certain, and as far as they are certain, they do not refer to reality.
		-- Albert Einstein

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About