develooper Front page | perl.perl5.porters | Postings from October 2003

5.8.2-RC1 and mp2

Thread Previous | Thread Next
From:
Scott A Crosby
Date:
October 30, 2003 06:29
Subject:
5.8.2-RC1 and mp2
Message ID:
oyd1xsuu7d8.fsf@bert.cs.rice.edu
Hello. I've been monitoring this thread and I thought all this stuff
about rehashing was only for the 5.8.x series and only to preserve
binary compatibility. Thus, you'd revert back to 5.8.1 style
randomized hashes for 5.9.0?

Also, in [1]

     (ignoring mod perl) I'm not convinced that it kicks in too way
     early.  MJD calculates that the average linked list size on an HV
     should be 1.6 entries. (ie there usually shouldn't be a linear
     search) It's kicking in at 4, *after* a hash split failed to
     partition the longest list below 4. So it's saving linear
     searches of 4 elements.

You must check for under-attack on each hash insert. Only checking
when the hash table doubles means opens up the attack I described in
the thread 'Perl 5.8.1, plan C, and algorithmic complexity attacks.'

Also, in a private reply on that thread, MJD said:

    If the hash has N buckets and K keys, then the probability that
    any given bucket contains exactly i keys is

          (N choose i) (N-1)^(K-i) / (N^K)

    I posted this on p5p last week sometime.

    This formula implies that in a sufficiently large hash, when K=N,
    the average nonempty bucket contains 1.6 keys.  'Sufficiently
    large' in the context of Perl is 'always'.

    I was not able to calculate the closed form for the variance.
    However, numeric calculation of the cases that will come up in
    practice (say, up to N = 2^30) suggest that 5 * log_2 (n) is
    extremely conservative.  It looks to me like it's more like

           5 + log(N)/2

    and even that is conservative.

Going by his advice, we should rehash in 5.8.2 when the chain length
is > 5+log(N)/2. This is assuming that there are $N$ buckets and at
most $N$ keys inserted. (IE, a load of 1.) 

Scott


[1] http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-10/msg01472.html


Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About