
Space Efficient Hash Tables with Worst Case Constant Access Time



  1. Fotakis/Pagh/Sanders/Spirakis: d-ary Cuckoo Hashing. Space Efficient Hash Tables with Worst Case Constant Access Time. Dimitris Fotakis and Peter Sanders (MPII), Rasmus Pagh (IT U. Copenhagen), Paul Spirakis (CTI).

  2. Overview. The Problem and Related Work; Cuckoo Hashing; d-ary Cuckoo Hashing; Analysis; Relation to Bipartite Matching; Filter Hashing; Discussion.

  3. The Problem. Represent a set of n elements (with associated information) using space (1 + ε)n. Support the operations insert, delete, lookup, (doall) efficiently. Assume a truly random hash function h.

  4. Related Work. Uniform hashing: expected time ≈ 1/ε. Dynamic Perfect Hashing [Dietzfelbinger et al. 94]: worst case constant time for lookup, but ε is not small. Approaching the information theoretic lower bound: [Brodnik Munro 99, Raman Rao 02] space (1 + o(1)) × lower bound without associated information; [Pagh 01] static case.

  5. Cuckoo Hashing [Pagh Rodler 01]. Table of size (2 + ε)n. Two choices (h1, h2) for each element. Insert moves elements; rebuild if necessary. Very fast lookup and delete. Expected constant insertion time.
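The two-choice scheme on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `CuckooHashTable`, the `max_kicks` limit, the salted use of Python's built-in `hash` in place of a truly random hash function, and the choice to double the tables on rebuild (rather than hold the size at (2 + ε)n) are all assumptions made for the example.

```python
import random


class CuckooHashTable:
    """Two-choice cuckoo hashing sketch (hypothetical names, illustrative only)."""

    def __init__(self, capacity=16, max_kicks=32, seed=0):
        self._rng = random.Random(seed)
        self._max_kicks = max_kicks      # assumed eviction limit before a rebuild
        self._init_tables(capacity)

    def _init_tables(self, half):
        # Two tables of `half` cells each, with fresh salts standing in for h1, h2.
        self._half = half
        self._tables = [[None] * half, [None] * half]
        self._salts = [self._rng.getrandbits(64) for _ in range(2)]

    def _slot(self, i, key):
        return hash((self._salts[i], key)) % self._half

    def lookup(self, key):
        # At most two probes; returns None when the key is absent.
        for i in (0, 1):
            entry = self._tables[i][self._slot(i, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def delete(self, key):
        for i in (0, 1):
            s = self._slot(i, key)
            entry = self._tables[i][s]
            if entry is not None and entry[0] == key:
                self._tables[i][s] = None
                return True
        return False

    def insert(self, key, value):
        # Overwrite in place if the key is already stored.
        for i in (0, 1):
            s = self._slot(i, key)
            entry = self._tables[i][s]
            if entry is not None and entry[0] == key:
                self._tables[i][s] = (key, value)
                return
        item, i = (key, value), 0
        for _ in range(self._max_kicks):
            s = self._slot(i, item[0])
            # Place the item and pick up whatever occupied the cell.
            item, self._tables[i][s] = self._tables[i][s], item
            if item is None:
                return
            i = 1 - i                    # the evicted element tries its other table
        self._rebuild(item)

    def _rebuild(self, pending):
        # Collect everything, grow, draw new hash functions, and reinsert.
        items = [e for t in self._tables for e in t if e is not None]
        items.append(pending)
        self._init_tables(self._half * 2)
        for k, v in items:
            self.insert(k, v)
```

Lookup and delete touch at most two cells regardless of load, which is the property the slide highlights; only insertion can trigger a chain of moves or a rebuild.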

  6. d-ary Cuckoo Hashing. d choices (h1, h2, h3, ...) for each element. Worst case d probes for delete and lookup. Task: maintain an L-perfect matching in the bipartite graph (L = Elements, R = Cells, E = Choices), e.g., insert by BFS.
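One way to picture "insert by BFS" is as a shortest augmenting path search from the new element to a free cell. The sketch below is an assumption-laden illustration rather than the authors' code: the class name `DaryCuckooTable`, the salted built-in `hash` used in place of truly random hash functions, and the convention of returning `False` when no augmenting path exists (leaving any rebuild to the caller) are all hypothetical choices.

```python
import random
from collections import deque


class DaryCuckooTable:
    """d-ary cuckoo hashing sketch with BFS insertion (hypothetical names)."""

    def __init__(self, num_cells, d=4, seed=0):
        rng = random.Random(seed)
        self.cells = [None] * num_cells          # None marks a free cell
        self.salts = [rng.getrandbits(64) for _ in range(d)]

    def choices(self, key):
        # The d candidate cells of `key` (its edges in the bipartite graph).
        n = len(self.cells)
        return [hash((s, key)) % n for s in self.salts]

    def lookup(self, key):
        # At most d probes.
        return any(self.cells[c] == key for c in self.choices(key))

    def delete(self, key):
        for c in self.choices(key):
            if self.cells[c] == key:
                self.cells[c] = None
                return True
        return False

    def insert(self, key):
        if self.lookup(key):
            return True
        # parent[c] is the cell whose occupant would move into c;
        # None marks a cell that the new key itself may take.
        parent = {}
        queue = deque()
        for c in self.choices(key):
            if c not in parent:
                parent[c] = None
                queue.append(c)
        while queue:
            c = queue.popleft()
            if self.cells[c] is None:
                # Free cell found: shift occupants along the path toward it.
                while parent[c] is not None:
                    p = parent[c]
                    self.cells[c] = self.cells[p]
                    c = p
                self.cells[c] = key
                return True
            for nxt in self.choices(self.cells[c]):
                if nxt not in parent:
                    parent[nxt] = c
                    queue.append(nxt)
        return False  # no augmenting path; a rebuild would be needed
```

Because BFS visits cells in order of distance, the path it finds moves as few elements as possible, and every moved element still ends up in one of its own d choices, so the matching stays L-perfect.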

  7. Experiments. [Plot: ε × #probes for insert (y-axis, 0 to 5) versus space utilization (x-axis, 0 to 1), with one curve each for d = 2, 3, 4, 5.]

  8. Tradeoff: Space ↔ Lookup/Deletion Time. Lookup and Delete: d = O(log(1/ε)) probes. Proof outline: the bipartite graph (L, R, E) has an L-perfect matching ⇔ (Hall's Theorem) ∄ M ⊆ L : |neighbors(M)| < |M| ... Chernoff bounds ... true whp if d ≥ 2(1 + ε) ln(e/ε).
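To get a feel for the sufficient condition d ≥ 2(1 + ε) ln(e/ε), the small helper below evaluates it for a given ε. The function name `sufficient_choices` is hypothetical, and the result is only the value the stated bound requires; the slide does not claim the bound is tight.

```python
import math


def sufficient_choices(eps):
    """Smallest integer d with d >= 2 * (1 + eps) * ln(e / eps)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.ceil(2 * (1 + eps) * math.log(math.e / eps))


for eps in (0.5, 0.1, 0.05):
    print(eps, sufficient_choices(eps))
```

Since ln(e/ε) = 1 + ln(1/ε), the required d grows only logarithmically as ε shrinks, which matches the O(log(1/ε)) probe count stated on the slide.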

  9. Tradeoff: Space ↔ Insertion Time. Insert: (1/ε)^O(log log(1/ε)); experiments → O(1/ε)? Expansion property: half the nodes are within distance O(log(1/ε)) of a free node. Shrinking property: the number of far-away nodes shrinks geometrically with distance ⇒ short average augmenting path length.

  10. Average Case Analysis of Bipartite Matching. [Motwani 94]: a bipartite graph (L, R, E) with |L| = |R| = n and |E| > n ln n random edges has a perfect matching whp. Time O(|E| log|L| / log log|L|). Here: slight asymmetry, very sparse, linear time.

  11. Filter Hashing. O(log²(1/ε)) layers; shrinking geometrically; perfect hashing for the overflow table; realistic hash functions.

  12. Discussion. d-ary Cuckoo: fast, practical, very space efficient. Open questions: "real" hash functions; tighten insertion time; average case lookup time; average case max cardinality bipartite matching for sparse symmetric graphs.
