Today we introduce Cuckoo hashing, which is only briefly mentioned by our book on page 484 (and not described very accurately there). See instead the paper cuckoo-undergrad.pdf in the current directory, it describes the basic idea: We have a single table, and two hash functions h1 and h2. Each key x must be stored either in slot h1(x) or h2(x)). If we can achieve this, then pretty clearly we can do all searches and deletions in at most two probes. This is O(1) "worst case", which is better than the hashing gurantees we have seen so far, which were only "average case" (typical search had cost O(1), but worst case key search was closer to O(lg N)). But the tricky part will be insertions: we will propose a "cuckoo" procedure, and claim that while it may be expensive in the worst case, the average cost of insertion is in fact O(1), assuming we have a "random enough" hash functions h1 and h2. We'll compare its performance to chaining, using two Java implementations in today's directory. A flaw in our version of CuckooHash is that our two hash functions, h1(x) and h2(x), are really just two different "compressions" of x.hashCode(). So our code fails hard as soon as we have three keys with the same hashCode() value; but we'll see this does not happen often. And we could fix the problem for *all* keys, if we picked h1 and h2 from a large enough family of functions. Today's directory has two hashing implementations, where our main interest is comparing the search performance of cuckoo hashing with chaining (worst-case search should be its main advantage, after all). In either case the Driver populates a big table, and then calculates both the average and worst-case number of equality operations per search: Set.java -- interface (keys only, no values) ChainingHash.java -- hashing with lists (rebuild at alpha=4) CuckooHash.java -- cuckoo hashing (rebuild on a bad add) Driver.java -- test driver Note: the driver adds random Int objects in the range 0..2*N-1, until the set reaches size N. The default N is 500000.