On Wed, Dec 07, 2016 at 12:30:35AM +0100, demerphq wrote:
> I just pushed a patch which optimises our hash function somewhat by
> using a hybrid approach, with an optimised OAATH for short keys (16
> bytes or less) and Siphash 1-3 for the rest.
>
> For longer keys, say 50 bytes or up, Siphash 1-3 makes a big
> difference. By the time you get to 1k keys or longer it makes a HUGE
> difference.
>
> For shorter keys I unrolled the OAATH logic so it doesn't have so much
> overhead. The new OAATH code is a touch different in that it hashes
> back to front.

I can't comment on the technical aspects of hash choice, but thanks.

> Here are the timings from the change using Porting/bench.pl for keys
> of length 1..24, 50, 100, and 1000. I believe most keys in a process
> are short, hence the bias in the benchmarks.

Just one thing - the tests in t/perf/benchmarks are currently intended
to be lightweight, e.g. a test might measure the cost of a single
assignment or hash lookup. Your tests each process a thousand keys,
which makes them completely dominate the results when averaged using
--raw.

Also, there is an existing test namespace, expr::hash::, which should
be reused.

I'd also suggest using exists(), as this does a hash generation and
lookup, but without the costs of creating or freeing hash elements.

Putting that all together, I'd suggest applying something like the
attached.

-- 
Hofstadter's Law: It always takes longer than you expect, even when
you take into account Hofstadter's Law.
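
[The attached patch itself is not reproduced here. To illustrate the
suggestion, below is a minimal sketch of the kind of lightweight entry
t/perf/benchmarks takes, reusing the expr::hash:: namespace and
exists(); the test name, key and description are hypothetical, not
taken from the actual attachment.]

    # Sketch of a lightweight benchmark entry (illustrative only):
    # each iteration hashes one short key and looks it up with exists(),
    # without creating or freeing hash elements.
    'expr::hash::exists_short_key' => {
        desc    => 'exists() on an 8-byte key in a small lexical hash',
        setup   => 'my %h = (abcdefgh => 1)',
        code    => 'exists $h{abcdefgh}',
    },

Because each iteration touches only a single key, an entry like this
stays comparable in weight to the other expr:: tests when bench.pl
averages results with --raw.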