develooper Front page | perl.perl5.porters | Postings from October 2003

Hypothetical attack on 5.8.1 randomized hashes.

Thread Previous | Thread Next
From:
Scott A Crosby
Date:
October 31, 2003 06:50
Subject:
Hypothetical attack on 5.8.1 randomized hashes.
Message ID:
oydhe1p8nt5.fsf_-_@bert.cs.rice.edu
On Thu, 30 Oct 2003 20:48:31 +0000, Nicholas Clark <nick@ccl4.org> writes:

> On Thu, Oct 30, 2003 at 02:08:06PM -0600, Scott A Crosby wrote:
> > On Thu, 30 Oct 2003 19:25:17 +0000, Nicholas Clark <nick@ccl4.org> writes:
> 
> > Enlightenment. Thanks, looks good.
> 
> Thanks for over checking this for us.


Unfortunately, that may involve more work for you. 

Last night, talking with another student crystalized a hypothetical
attack that may work on even randomized hashes. Tim Peters first
suggested it to me and Solar Designer convinced me that it might be
practical in some cases, but I did not have an attack that I thought
would work. Now I do. The idea is roughly to exploit remote timing
analysis in order to determine the 'secret' hash seed, then use that
to generate many collisions. It is *only* hypothetical --- I won't
have time to practically test it for 4-6 weeks. I only bring it up now
because Stas said that he doesn't want to have to revisit this issue
again in a few weeks.

This attack only works in an online setting that uses persistent hash
keys. This is exactly something mod_perl has to face. It would even
break 5.8.1 with all randomized hashes. I bring it up because Stas
mentioned in this thread that he doesn't want to have to redo mod_perl
*again* in a few months, and this issue might force that. I'm also
sorry for not thinking it through earlier --- I've been focused on
other things.

Here is how it works.

Assume I have a remote server using perl that uses keyed hashing with
a 32-bit key. Also assume that the remote server keeps using the *same
key* 'for a while', say 24 or more hours. For now assume that the
server does not use 5.8.2's hash-function-switching technique. It uses
5.8.1 randomize-always.

Now assume I have an oracle that will tell me, given two inputs,
whether they collide in the remote hash table. (I'll tell you how this
gets implemented) Here is the algorithm that computes the hash key.

If there are 4 billion hash keys and 8 buckets in the hash table, then
EACH time my oracle reports a collision or non-collision, I learn
about the hash function. Say that the oracle says that A and B
collide, I can enumerate all 4 billion keys and throw out all of those
keys inconsistent with my evidence -- those keys where A and B did
*not* collide. This leaves 512 million left. Each time I find a
collision with the oracle, I can throw out another 87.5% of the
remaining keyspace. Each time I find two keys through the oracle that
do not collide, I can throw out 12.5% of the remaining keyspace.
Eventually, with about 40 queries of the oracle, there is only one key
left consistent with all of the evidence. Once the attacker has that,
a conventional hash table attack applies. This particular algorithm
would take about 40GHZ*hours for 1000-byte inputs.

The algorithm described above requires a hash-collision oracle. Does
such an oracle exist? How do we determine if a remote hash table
experiences a collision? 

Assume that we have two otherwise identical executions of a program,
one where there is a hash table collision, and one where there is
not. The execution with a hash table collision will take SLIGHTLY more
time to execute. If at the end of execution, the program sends a
response, then that differential latency, if observed by the attacker
*is* the oracle they need and the attacker can use the algorithm above
to find the secret key. That latency can be measured in
nanoseconds---If the attacker cannot observe something this small,
then the victim is safe.

Unfortunately for perl, that looks to not be the case. It is true that
a conventional hash table collision would be to fast for an attacker
to notice the latency difference. However, Perl allows variable-length
hash inputs. What if an attacker makes two 1010 character inputs A and
B which both share the same 1000-character prefix, they send this to
as part of a POST request handled by mod_perl. Then if there was a
hash collision, a strcmp() would be done. Benchmarking, that takes
5000ns. When can an attacker discriminate that 5000ns additional latency?

I'm not completely sure, but indications are that, at least over a few
hops on a local area network, this looks to be possible. If an
attacker cannot, they can always increase the length of the common
prefix to create a larger strcmp() latency difference. At some length
they will succeed.

In conclusion I have a hypothetical attack. I don't know and won't
know for at least a couple of months if it is practical to apply to
mod_perl websites. I also don't know if it is worth defending against.
Finally, I don't know if you or Stas want to deal with it now or wait
for further research.

Scott

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About