On Tue, Oct 21, 2003 at 04:26:27PM -0500, Scott A Crosby wrote: > AFAIK, you switch from the past predictable hash function to a new > randomized one only when you detect that you're under attack. I have a > couple of concerns: > > 1. How do you dectect 'under attack', and is it possible to > construct an attack while not triggering the rekey & rehash? Currently the criteria is that if the longest linked list in the hash contains >50% of the entries then the rehashing is triggered (strictly this is at a hash split, when the number of buckets is doubled, and it is skipped if the longest linked is < 8, or we're already on the alternative hashing function) The criteria is too conservative (MJD has done some maths and estimates that it should be longest linked list > 1.6) hence one could argue that a less "perfect" attack currently would still work. (eg one that split the values between exactly 3 buckets) > 2. When you rekey and rehash with a new random key, how do you avoid > potentially rehashing endlessly. Is it possible to engage in a new > attack by forcing endless rehashings instead of lots of collisions? There is only one rekeying done. I'm assuming that that works. There's a 1/(2**32) chance that it will be identical to the previous algorithm > 3. Does this rehashing logic apply univerally to all hash tables in > the system? For instance, can mod_perl or other internal hash tables > or be built that accidently avoid the detect-attack-and-rehash > logic? Yes it applies to all (except the shared string table) that call the functions in hv.c > 4. Are you still using keyed Jenkin's for the hash function? Yes (if that's what 5.8.0 uses) 5.8.0 uses it seeded with 0 every time 5.8.1 seeds it with a random 32 bit number, constant per perl run plan C uses a seed of 0 for the default hashing. (and keeps the random seed introduced by 5.8.1 as 0, hence compatibility) The rehashing uses the same algorithm but seeded with a random constant (one seed per process). Hence the rehashing is used the same strategy as 5.8.1 applied to all hashes. I hope this answers your questions. We almost need a FAQ :-) More maths or experimentation on what the "attack" criteria should be is welcome. ("attack" could be any pathological data. There are some "attacks" in the scripts run during the perl build process) Nicholas ClarkThread Previous | Thread Next