


  1. CS 473: Algorithms Chandra Chekuri Ruta Mehta University of Illinois, Urbana-Champaign Fall 2016 Chandra & Ruta (UIUC) CS473 1 Fall 2016 1 / 22

  2. CS 473: Algorithms, Fall 2016. Fingerprinting. Lecture 11, September 28, 2016.


  5. Fingerprinting (source: Wikipedia)
     Fingerprinting is the process of mapping a large data item to a much shorter bit string, called its fingerprint. A fingerprint uniquely identifies the data for all practical purposes.
     It is typically used to avoid comparing and transmitting bulky data. E.g., a web browser can store and fetch file fingerprints to check whether a file has changed.
     As you may have guessed, fingerprint functions are hash functions.
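A minimal sketch of the browser example, assuming SHA-256 as the fingerprint function (the slides do not name one). The helper names `fingerprint` and `has_changed` are hypothetical and chosen only for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Map data of any size to a fixed-length 64-character hex string.

    SHA-256 is an assumption made for this sketch; any good hash
    function could play the same role.
    """
    return hashlib.sha256(data).hexdigest()


def has_changed(cached_fingerprint: str, current_data: bytes) -> bool:
    """Compare short fingerprints instead of the bulky data itself."""
    return fingerprint(current_data) != cached_fingerprint


page_v1 = b"<html>hello</html>" * 10_000
cached = fingerprint(page_v1)            # store only this short string
same = has_changed(cached, page_v1)      # False: content is identical
diff = has_changed(cached, page_v1 + b"!")  # True: content differs
```

Only the 64-character fingerprint has to be stored or transmitted, regardless of how large the file is.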


  7. Bloom Filters
     Hashing:
       1. To insert $x$ into the dictionary, store $x$ in the table at location $h(x)$.
       2. To look up $y$ in the dictionary, check the contents of location $h(y)$.
     Bloom filter: trade off space against false positives.
       1. Storing items in a dictionary is expensive in terms of memory, especially if the items are unwieldy objects such as long strings, images, etc., with non-uniform sizes.
       2. To insert $x$, set the bit at location $h(x)$ to 1 (initially all bits are set to 0).
       3. To look up $y$: if the bit at location $h(y)$ is 1, say yes; else say no.


  11. Bloom Filters
      Bloom filter: trade off space against false positives.
        1. To insert $x$, set the bit at location $h(x)$ to 1 (initially all bits are set to 0).
        2. To look up $y$: if the bit at location $h(y)$ is 1, say yes; else say no.
        3. There are no false negatives, but false positives are possible due to collisions.
      Reducing false positives:
        1. Pick $k$ hash functions $h_1, h_2, \dots, h_k$ independently.
        2. To insert $x$: for $1 \le i \le k$, set the bit at location $h_i(x)$ in table $i$ to 1.
        3. To look up $y$: compute $h_i(y)$ for $1 \le i \le k$ and say yes only if each bit at the corresponding location is 1; otherwise say no.
      If the probability of a false positive for one hash function is $\alpha < 1$, then with $k$ independent hash functions it is $\alpha^k$.
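The $k$-table scheme above can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: SHA-256 salted with the table index stands in for $k$ independently chosen hash functions, and each "bit" is stored as one byte of a `bytearray` for readability. The class and method names are hypothetical.

```python
import hashlib


class BloomFilter:
    """k separate tables of m bits, one table per hash function."""

    def __init__(self, table_size: int, num_hashes: int) -> None:
        self.m = table_size
        self.k = num_hashes
        # All bits start at 0 (one byte per bit, for simplicity).
        self.tables = [bytearray(table_size) for _ in range(num_hashes)]

    def _location(self, i: int, item: bytes) -> int:
        # Assumption: salting SHA-256 with i approximates h_i.
        digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: bytes) -> None:
        for i in range(self.k):
            self.tables[i][self._location(i, item)] = 1

    def lookup(self, item: bytes) -> bool:
        # Say yes only if every table has a 1 at its location.
        return all(
            self.tables[i][self._location(i, item)] for i in range(self.k)
        )


bf = BloomFilter(table_size=1024, num_hashes=3)
for word in (b"karp", b"rabin", b"bloom"):
    bf.insert(word)
```

Every inserted item is reported as present, since insertion sets exactly the bits that lookup checks. An item that was never inserted may still be reported as present if all $k$ of its locations happen to be set.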


  14. Outline
      Use of hash functions for designing fast algorithms.
      Problem: Given a text $T$ of length $m$ and a pattern $P$ of length $n$, with $m \gg n$, find all occurrences of $P$ in $T$.
      Karp-Rabin randomized algorithm:
        Sampling a prime
        String equality via mod $p$ arithmetic
        Rabin's fingerprinting scheme (rolling hash)
        Karp-Rabin pattern matching algorithm: $O(m + n)$ time
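As a preview of where the outline is heading, here is a rough sketch of rolling-hash pattern matching. Several choices are assumptions made for this sketch rather than details taken from the slides: strings are treated as base-256 numbers, the default modulus is the fixed prime $2^{61} - 1$ instead of a randomly sampled prime, and every fingerprint match is confirmed by a direct comparison so that no false match is ever reported.

```python
def karp_rabin(text: bytes, pattern: bytes, p: int = (1 << 61) - 1) -> list[int]:
    """Return all start positions of pattern in text."""
    m, n = len(text), len(pattern)
    if n == 0 or n > m:
        return []
    base = 256
    high = pow(base, n - 1, p)  # weight of the leading byte of a window

    # Fingerprints of the pattern and of the first window, mod p.
    hp = ht = 0
    for i in range(n):
        hp = (hp * base + pattern[i]) % p
        ht = (ht * base + text[i]) % p

    matches = []
    for s in range(m - n + 1):
        # Compare fingerprints first; verify only on a fingerprint match.
        if ht == hp and text[s:s + n] == pattern:
            matches.append(s)
        if s + n < m:
            # Roll the window: drop text[s], append text[s + n].
            ht = ((ht - text[s] * high) * base + text[s + n]) % p
    return matches


positions = karp_rabin(b"abracadabra", b"abra")
```

Each window's fingerprint is obtained from the previous one with a constant number of arithmetic operations, which is what makes the scan over the text fast.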


  17. Sampling a prime
      Problem: Given an integer $x > 0$, sample a prime uniformly at random from all the primes between 1 and $x$.
      Procedure:
        1. Sample a number $p$ uniformly at random from $\{1, \dots, x\}$.
        2. If $p$ is a prime, then output $p$. Else go to Step (1).
      Checking if $p$ is prime:
        Agrawal-Kayal-Saxena primality test: deterministic but slow.
        Miller-Rabin randomized primality test: fast but randomized; it outputs 'prime' for a number that is not prime with very low probability.
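The procedure above can be sketched with a textbook Miller-Rabin test. The function names, the number of rounds (20), and the small-prime shortcut are choices made for this sketch. The sketch also rejects $x < 2$ explicitly, since there is no prime to return in that case and the loop would never terminate.

```python
import random


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin: never rejects a prime; accepts a composite with
    probability at most 4**(-rounds)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        y = pow(a, d, n)
        if y in (1, n - 1):
            continue
        for _ in range(r - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def sample_prime(x: int) -> int:
    """Rejection sampling: repeat until the sampled number is prime."""
    if x < 2:
        raise ValueError("there is no prime in {1, ..., x} when x < 2")
    while True:
        p = random.randint(1, x)  # uniform over {1, ..., x}
        if is_probable_prime(p):
            return p


p = sample_prime(1000)
```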


  23. Sampling a Prime: Analysis
      Is the returned prime sampled uniformly at random?
      $\pi(x)$: number of primes in $\{1, \dots, x\}$.
      Lemma: For a fixed prime $p^* \le x$, $\Pr[\text{algorithm outputs } p^*] = 1/\pi(x)$.
      Proof.
        $A$: event that a prime is picked in a round. $\Pr[A] = \pi(x)/x$.
        $B$: event that the number (prime) $p^*$ is picked. $\Pr[B] = 1/x$. $B \subset A$.
        The algorithm outputs the number picked in the first round in which $A$ occurs, so it outputs $p^*$ with probability
        $$\Pr[B \mid A] = \frac{\Pr[A \cap B]}{\Pr[A]} = \frac{\Pr[B]}{\Pr[A]} = \frac{1/x}{\pi(x)/x} = \frac{1}{\pi(x)}.$$
      Running time in expectation:
        Q: How many samples in expectation before termination?
        A: $x/\pi(x)$. Exercise.
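One way to work out both claims directly is to sum over the round in which the algorithm stops. The shorthand $q = \pi(x)/x$ and the round index $t$ are introduced here for convenience and do not appear on the slide. Rounds are independent, so the algorithm reaches round $t$ with probability $(1 - q)^{t-1}$.

```latex
% q = \pi(x)/x is the probability that a single round picks a prime.
\begin{align*}
\Pr[\text{algorithm outputs } p^*]
  &= \sum_{t \ge 1} (1 - q)^{t-1} \cdot \frac{1}{x}
   = \frac{1}{x} \cdot \frac{1}{q}
   = \frac{1}{\pi(x)}, \\
\mathbb{E}[\text{number of samples}]
  &= \sum_{t \ge 1} t \, (1 - q)^{t-1} q
   = \frac{1}{q}
   = \frac{x}{\pi(x)}.
\end{align*}
```

The first line agrees with the lemma, and the second is the mean of a geometric random variable with success probability $q$.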


  26. How many primes between 0 and x
      $\pi(x)$: number of primes between 0 and $x$.
      Prime Number Theorem: $\lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} = 1$. Proved by Jacques Hadamard and Charles Jean de la Vallée-Poussin in 1896.
      Chebyshev (from 1848): $\pi(x) \ge \frac{7}{8} \cdot \frac{x}{\ln x} = (1.262..) \frac{x}{\lg x} > \frac{x}{\lg x}$.
      If $y \sim \{1, \dots, x\}$ u.a.r., then $y$ is a prime w.p. $\frac{\pi(x)}{x} > \frac{1}{\lg x}$.
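A small numerical check of the density bound, using a sieve of Eratosthenes to count primes exactly. The function name `prime_count` and the sample values of $x$ are choices made for this sketch. It checks the inequality only for the few moderately large values listed, not in general.

```python
import math


def prime_count(x: int) -> int:
    """Return pi(x), the number of primes in {1, ..., x}, via a sieve."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            # Cross out the multiples of i starting at i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(is_prime)


for x in (100, 10_000, 1_000_000):
    density = prime_count(x) / x      # Pr[a uniform sample is prime]
    bound = 1 / math.log2(x)          # the 1 / lg x lower bound
    print(x, prime_count(x), density > bound)
```

Combined with the expected sample count $x/\pi(x)$, a density above $1/\lg x$ means fewer than $\lg x$ samples are needed in expectation for these values of $x$.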
