develooper Front page | perl.perl5.porters | Postings from August 2021

Re: Robin Hood Hashing for the perl core

Thread Previous | Thread Next
Dave Mitchell
August 4, 2021 13:45
Re: Robin Hood Hashing for the perl core
Message ID:
On Tue, Aug 03, 2021 at 02:27:42PM +0000, Nicholas Clark wrote:
> I'd like to suggest that we switch the core's hashtable implementation from
> the current array of linked-lists (which is not friendly with CPU caches) to
> "open addressing" hashing (which is). More specifically, to "Robin Hood
> Hashing", and most specifically to the implementation that I wrote for
> MoarVM (and Perl 5) - "A Better Hash".

[snip long description]

Maybe my eyes glazed over at the important bit, but I didn't see anything
explaining what the RHH does when there's a collision (i.e. the equivalent
of having more than one entry in the HE linked list in the old way).

> It's actually a bit better than that. It's not noticeably slower on
> anything. I'm not sure what makes a good benchmark. Code I've tried isn't
> bottlenecked on hashing. I've tried using cachegrind - we're using more CPU
> in the hash code, and more data references, but fewer last level misses.

A good start would be:

$ ./perl -Ilib Porting/ --tests='/^expr::hash/' -j 8 -w /tmp/results -v perlbinary1 perlbinary2 ...

which runs cachegrind on various empty and full loops containing the
snippets of code matched in t/perf/benchmarks and writes the results (with
startup and loop overhead factored out) into the file /tmp/results.
The -j 8 says run 8 benchmarks in parallel. The -v says show what's going on.

Then display the results in various formats, e.g. as percentages:

    $ ./perl -Ilib Porting/ -r /tmp/results

or as raw numbers:

    $ ./perl -Ilib Porting/ -r /tmp/results  --raw

Overhead, without any fuss, the stars were going out.
    -- Arthur C Clarke

Thread Previous | Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About