Randomness in Computing
LECTURE 15

Last time
• Poisson approximation
• Application: max load
• Application: Coupon Collector

Today
• Hashing

3/19/2020 Sofya Raskhodnikova; Randomness in Computing

Static dictionary problem

Motivating example: a password checker that prevents people from using common passwords.
• $T$ is the set of common passwords
• Universe: set $V$
• $T \subseteq V$ and $n = |T|$
• $n \ll |V|$

Goal: A data structure for storing $T$ that supports the search query "Is $x \in T$?" for all words $x \in V$.

Solutions

Deterministic solutions
• Store $T$ as a sorted array (or as a binary search tree). Search time: $O(\log n)$, Space: $O(n)$
• Store an array that, for each $x \in V$, has 1 if $x \in T$ and 0 otherwise. Search time: $O(1)$, Space: $O(|V|)$

A randomized solution
• Hashing
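
The two deterministic solutions can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the direct-address version assumes the universe $V$ has been identified with the integers $0, \ldots, |V|-1$.

```python
from bisect import bisect_left

def build_sorted(words):
    """Sorted-array dictionary: O(n) space."""
    return sorted(set(words))

def contains_sorted(table, x):
    """Binary search over the sorted array: O(log n) comparisons."""
    i = bisect_left(table, x)
    return i < len(table) and table[i] == x

def build_bitmap(keys, universe_size):
    """Direct-address table: one entry per universe element, O(|V|) space."""
    bits = bytearray(universe_size)
    for k in keys:
        bits[k] = 1
    return bits

def contains_bitmap(bits, x):
    """Single array lookup: O(1) time."""
    return bits[x] == 1
```

The bitmap answers queries in constant time but allocates one entry for every element of $V$, which is what makes it impractical when $n \ll |V|$.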

Chain Hashing
• Hash table: $o$ bins; words that fall in the same bin are chained into a linked list.
• Hash function: $h : V \to [o]$

To construct the table: hash all elements of $T$.
To search for word $x$: check if $x$ is in bin $h(x)$.

Desiderata for $h$:
• $O(1)$ evaluation time.
• $O(1)$ space to store $h$.

[Figure: elements of $T$ hashed into bins $1, 2, \ldots, o$]
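
A minimal sketch of chain hashing in Python follows. The class name and `toy_hash` are hypothetical and not from the lecture; `toy_hash` is a placeholder with no quality guarantees, Python lists stand in for linked lists, and bins are numbered $0, \ldots, o-1$ rather than $1, \ldots, o$.

```python
class ChainHashTable:
    """Static dictionary with chaining; h maps a word to a bin in range(o)."""

    def __init__(self, words, o, h):
        self.h = h
        self.bins = [[] for _ in range(o)]  # one chain per bin
        for w in words:
            chain = self.bins[h(w)]
            if w not in chain:              # store each word once
                chain.append(w)

    def __contains__(self, x):
        # Only the single bin h(x) is scanned.
        return x in self.bins[self.h(x)]


def toy_hash(o):
    # Placeholder hash (an assumption for this sketch): sum of bytes mod o.
    return lambda w: sum(w.encode()) % o


common = ["123456", "password", "qwerty", "letmein"]
table = ChainHashTable(common, o=len(common), h=toy_hash(len(common)))
```

A query such as `"password" in table` costs one hash evaluation plus a scan of one chain, so the search time is governed by the length of that chain.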

A random hash function
• Simplifying assumption: the hash function $h$ is selected at random: $\Pr[h(x) = k] = 1/o$ for all $x \in V$, $k \in [o]$.
• Once $h$ is chosen, every evaluation of $h$ yields the same answer.

Search time:
• If $x \notin T$, the expected number of words in bin $h(x)$ is $n/o$.
• If $x \in T$, the expected number of words in bin $h(x)$ is $1 + (n-1)/o$.

If we set $o = n$, then
• the expected search time is $O(1)$
• the maximum search time is the max load: w.p. close to 1, it is $\Theta(\ln n / \ln \ln n)$

Faster than a search tree, with space still $\Theta(n)$.
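
The two expectations can be derived by linearity of expectation. The sketch below assumes, in addition to what the slide states, that the values of $h$ on distinct words are independent (as for a uniformly random function), so that two distinct words collide with probability $1/o$. The indicator $X_t$ is notation introduced here.

```latex
% X_t = 1 if word t of T lands in the same bin as x, and 0 otherwise.
\begin{align*}
x \notin T:\quad
  & \mathbb{E}\Big[\sum_{t \in T} X_t\Big]
    = \sum_{t \in T} \Pr[h(t) = h(x)]
    = \frac{n}{o}, \\
x \in T:\quad
  & 1 + \sum_{t \in T \setminus \{x\}} \Pr[h(t) = h(x)]
    = 1 + \frac{n-1}{o}.
\end{align*}
% With o = n these are 1 and 2 - 1/n < 2, so the expected search time is O(1).
```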

Are we done?
• How many hash functions are there? There are $o^{|V|}$ functions from $V$ to $[o]$.
• How many bits do we need to store a description of a hash function? $\log_2 o^{|V|} = |V| \log_2 o$ bits.

This is prohibitively expensive!
• Idea: Choose from a smaller family of hash functions.
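
To get a feel for the cost, the short calculation below counts the bits needed to name one arbitrary function from $V$ to $[o]$. The sizes $|V| = 2^{32}$ and $o = 2^{20}$ are illustrative assumptions, not values from the lecture, and the function name is hypothetical.

```python
import math

def bits_to_describe_random_function(universe_size, o):
    """Bits needed to name one of the o**universe_size functions V -> [o]."""
    return math.ceil(universe_size * math.log2(o))

# Illustrative sizes (assumptions): a 32-bit universe and about a million bins.
bits = bits_to_describe_random_function(2**32, 2**20)
print(bits / 8 / 2**30, "GiB")
```

Under these assumed sizes the description of $h$ alone takes about 10 GiB, far more than the $\Theta(n)$ space used by the table itself.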

Universal hash family
• A set $\mathcal{H}$ of hash functions is universal if for every pair of distinct elements $x_1, x_2 \in V$ and for $h$ chosen uniformly from $\mathcal{H}$, $\Pr[h(x_1) = h(x_2)] \le 1/o$.

Constructing a universal hash family
• Fix a prime $q \ge |V|$ and think of the range as $\{0, 1, \ldots, o-1\}$.
• Define $h_{b,c}(y) = ((by + c) \bmod q) \bmod o$ and $\mathcal{H} = \{h_{b,c} \mid b \in [q-1],\ 0 \le c \le q-1\}$

Theorem: $\mathcal{H}$ is universal.
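
The family is small enough to sample from and, for tiny parameters, to check exhaustively. The sketch below assumes the universe is identified with the integers $0, \ldots, |V|-1$ and that the caller supplies a prime $q \ge |V|$; the function names are hypothetical.

```python
import random

def make_hash(q, o, rng=random):
    """Draw h_{b,c}(y) = ((b*y + c) mod q) mod o uniformly from the family."""
    b = rng.randrange(1, q)  # b in {1, ..., q-1}
    c = rng.randrange(0, q)  # c in {0, ..., q-1}
    return lambda y: ((b * y + c) % q) % o

def collision_fraction(q, o, y1, y2):
    """Exact fraction of pairs (b, c) for which y1 and y2 collide."""
    hits = sum(
        ((b * y1 + c) % q) % o == ((b * y2 + c) % q) % o
        for b in range(1, q)
        for c in range(q)
    )
    return hits / (q * (q - 1))
```

Storing $h_{b,c}$ only requires the two numbers $b$ and $c$, and evaluating it takes a constant number of arithmetic operations, which matches the desiderata for a hash function. Calling `collision_fraction` on every pair of distinct keys for a small prime such as $q = 17$ is a quick sanity check of the $1/o$ bound.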

Proof that ℋ is universal
• Define $h_{b,c}(y) = ((by + c) \bmod q) \bmod o$ and $\mathcal{H} = \{h_{b,c} \mid b \in [q-1],\ 0 \le c \le q-1\}$

Proof: Fix $y_1 \ne y_2$ from $V$.
• Idea: count the number of $h_{b,c}$ in $\mathcal{H}$ for which $y_1, y_2$ collide.
• We will show that
  – $y_1$ and $y_2$ can't collide after performing mod $q$.
  – So they must map to different values $w_1, w_2$ at this point.
  – Each pair $(w_1, w_2)$ corresponds to a unique pair $(b, c)$.
  – So it suffices to count the number of pairs $(w_1, w_2)$ with $w_1 \ne w_2$ but $w_1 \equiv w_2 \pmod{o}$.
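
One standard way to carry out the count outlined above is sketched here. It is not necessarily the argument given in lecture, and it assumes the elements of $V$ are identified with integers in $\{0, \ldots, q-1\}$.

```latex
% Step 1: no collision after reducing mod q.
\begin{align*}
w_1 &= (b y_1 + c) \bmod q, \qquad w_2 = (b y_2 + c) \bmod q, \\
w_1 - w_2 &\equiv b\,(y_1 - y_2) \not\equiv 0 \pmod{q}
  \quad \text{since $q$ is prime, } b \not\equiv 0,\ y_1 \not\equiv y_2 \pmod{q}.
\end{align*}
% Step 2: (b, c) can be recovered from (w_1, w_2), so the map is a bijection
% between the q(q-1) pairs (b, c) and the q(q-1) pairs (w_1, w_2) with w_1 != w_2.
\begin{align*}
b &= (w_1 - w_2)(y_1 - y_2)^{-1} \bmod q, \qquad c = (w_1 - b y_1) \bmod q.
\end{align*}
% Step 3: for each w_1, count the values w_2 != w_1 in {0, ..., q-1}
% that are congruent to w_1 mod o.
\begin{align*}
\#\{w_2\} &\le \left\lceil \frac{q}{o} \right\rceil - 1 \le \frac{q-1}{o}, \\
\Pr[h_{b,c}(y_1) = h_{b,c}(y_2)]
  &\le \frac{q \cdot (q-1)/o}{q(q-1)} = \frac{1}{o}.
\end{align*}
```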