programming is terriblelessons learned from a life wasted

Hash Table Denial of Service Attacks, Revisited.

Back in 2011, Effective Denial of Service attacks against web application platforms was presented at 28c3—The annual Chaos Computer Club gathering in Germany. The essence of the attack was exploiting how hash tables are used in modern web frameworks.

A hash table is used to store a series of keys that relate to values— usually built using a hash function f, and some buckets, where the function maps a given key to a specific bucket. To lookup a value for a key, you compute the hash of the key, and search the matching bucket. When the function maps two different keys to the same bucket, you have a collision. To handle this, some use a different bucket, or allow buckets to contain multiple key, value pairs. Collisions make operations like lookups and removals much slower.

In the talk, they show that an attacker can craft colliding keys, submit them in a http request, and cause the webserver to dramatically slow down. To prevent this we might think to use a collision resistant hash (i.e a cryptographic hash,where collisions are hard to guess) but in practice they are far slower than the normal hash functions used, and still deterministic—If the attacker knows the hash, they can always brute force it, and brute force can be quite effective. Instead, implementations moved to randomized hash functions. You can thinking of it as a hash function that takes a secret key, and you should not be able to calculate a hash value or a collision without the secret.

This didn’t work out so well—At 29c3 the following year another talk was given Hash-Flooding DOS Reloaded: Attacks and defenses, demonstrating randomized hash functions can still be vulnerable to denial of service attacks. Although the idea was sound—a random value was picked, and used in the hash function, The implementation was poor. Hash functions like CityHash and MurmurHash were vulnerable to a form of differential cryptanalysis—watching the differences introduced by the random value, and derive a way to make these differences cancel each other out. Security, randomness, and cryptography are still very hard to get right.

The talk covers how they translated this analysis into working attacks on a variety of platforms. They also presented a solution too—SipHash. It’s a keyed hash function built around solid, researched cryptographic techniques, designed by experts, that works about as fast as a traditional weak hash function. The home page for SipHash has a wealth of talks, and supporting material.

If you still haven’t had enough crypto, the talk on real world RSA Factorization — FactHacks — makes an excellent followup