

  1. Advanced Algorithms (III), Shanghai Jiao Tong University, Chihao Zhang, March 16th, 2020

  2. Balls-into-Bins

  3. Balls-into-Bins: Throw $m$ balls into $n$ bins uniformly at random.

  4. Balls-into-Bins: Throw $m$ balls into $n$ bins uniformly at random. • What is the chance that some bin contains more than one ball? (Birthday paradox)

  5. Balls-into-Bins: Throw $m$ balls into $n$ bins uniformly at random. • What is the chance that some bin contains more than one ball? (Birthday paradox) • How many balls are in the fullest bin? (Max load)

  6. Balls-into-Bins: Throw $m$ balls into $n$ bins uniformly at random. • What is the chance that some bin contains more than one ball? (Birthday paradox) • How many balls are in the fullest bin? (Max load) • How large must $m$ be to hit all bins? (Coupon collector)
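All three questions are easy to probe empirically before we analyze them. Below is a minimal Python sketch (ours, not from the slides; the function name is illustrative) that samples each quantity once.

```python
import random
from collections import Counter

def loads(m, n):
    """Throw m balls into n bins uniformly at random; return the n bin loads."""
    c = Counter(random.randrange(n) for _ in range(m))
    return [c[i] for i in range(n)]

n = 365
print("collision among 30 balls:", max(loads(30, n)) > 1)  # birthday paradox
print("max load with m = n:", max(loads(n, n)))            # fullest bin

# coupon collector: keep throwing until every bin is hit
seen, throws = set(), 0
while len(seen) < n:
    seen.add(random.randrange(n))
    throws += 1
print("throws needed to hit all bins:", throws)
```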

  7. Birthday Paradox

  8. Birthday Paradox: In a group of more than 30 people, there is a very high chance that two of them share a birthday.

  9. Birthday Paradox: In a group of more than 30 people, there is a very high chance that two of them share a birthday. $\Pr[\text{no same birthday}] = 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-m+1}{n} = \prod_{i=1}^{m-1}\left(1-\frac{i}{n}\right) \le \exp\left(-\sum_{i=1}^{m-1}\frac{i}{n}\right) = \exp\left(-\frac{m(m-1)}{2n}\right)$

  10. $\Pr[\text{no same birthday}] \le \exp\left(-\frac{m(m-1)}{2n}\right)$

  11. $\Pr[\text{no same birthday}] \le \exp\left(-\frac{m(m-1)}{2n}\right)$. For $m = 30$, $n = 365$, the probability is less than 0.304.

  12. $\Pr[\text{no same birthday}] \le \exp\left(-\frac{m(m-1)}{2n}\right)$. For $m = 30$, $n = 365$, the probability is less than 0.304. For $m = \omega(\sqrt{n})$, the probability can be made arbitrarily close to 0.
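For concreteness, one can compare the exact no-collision probability against the exponential upper bound; a small sketch (our own, not part of the deck):

```python
import math

def no_collision(m, n):
    """Exact Pr[all m balls land in distinct bins] = prod_{i=1}^{m-1} (1 - i/n)."""
    p = 1.0
    for i in range(1, m):
        p *= 1.0 - i / n
    return p

m, n = 30, 365
print("exact:", no_collision(m, n))                # ~0.294
print("bound:", math.exp(-m * (m - 1) / (2 * n)))  # ~0.304
```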

  13. Max Load

  14. Max Load: Let $X_i$ be the number of balls in the $i$-th bin.

  15. Max Load: Let $X_i$ be the number of balls in the $i$-th bin. What is $X = \max_{i \in [n]} X_i$? We analyze this when $m = n$.

  16. Max Load: Let $X_i$ be the number of balls in the $i$-th bin. What is $X = \max_{i \in [n]} X_i$? We analyze this when $m = n$. If we can argue that $X_1$ is less than $k$ with probability $1 - O(1/n^2)$, then by the union bound, $\Pr[X \ge k] = O(1/n)$.

  17. Again by the union bound, $\Pr[X_1 \ge k] \le \binom{n}{k} n^{-k} \le \frac{1}{k!}$

  18. Again by the union bound, $\Pr[X_1 \ge k] \le \binom{n}{k} n^{-k} \le \frac{1}{k!}$. We apply Stirling's formula $k! \approx \sqrt{2\pi k}\left(\frac{k}{e}\right)^k$.

  19. Again by the union bound, $\Pr[X_1 \ge k] \le \binom{n}{k} n^{-k} \le \frac{1}{k!}$. We apply Stirling's formula $k! \approx \sqrt{2\pi k}\left(\frac{k}{e}\right)^k$. Since $k! \ge \left(\frac{k}{e}\right)^k$, we get $\Pr[X_1 \ge k] \le \frac{1}{k!} \le \left(\frac{e}{k}\right)^k$.

  20. Again by the union bound, $\Pr[X_1 \ge k] \le \binom{n}{k} n^{-k} \le \frac{1}{k!}$. We apply Stirling's formula $k! \approx \sqrt{2\pi k}\left(\frac{k}{e}\right)^k$. Since $k! \ge \left(\frac{k}{e}\right)^k$, we get $\Pr[X_1 \ge k] \le \frac{1}{k!} \le \left(\frac{e}{k}\right)^k$. We want $\left(\frac{e}{k}\right)^k = O\left(\frac{1}{n^2}\right)$. Choose $k = O\left(\frac{\log n}{\log\log n}\right)$. (A sanity check for this choice follows below.)
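As a sanity check (our own arithmetic, not on the slides), a concrete choice such as $k = \frac{3\log n}{\log\log n}$ does meet the $O(1/n^2)$ target:

```latex
% With k = 3\log n/\log\log n, we have \log k = \log\log n - \log\log\log n + \log 3, so
\log\Bigl(\tfrac{e}{k}\Bigr)^{k}
  = k\,(1 - \log k)
  = \frac{3\log n}{\log\log n}\bigl(1 - \log 3 - \log\log n + \log\log\log n\bigr)
  = -(3 - o(1))\log n,
\qquad\text{hence}\qquad
\Bigl(\tfrac{e}{k}\Bigr)^{k} = n^{-3 + o(1)} \le n^{-2} \ \text{for large } n.
```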

  21. Concentration Bounds

  22. Concentration Bounds We shall develop general tools to obtain “with high probability” results…

  23. Concentration Bounds We shall develop general tools to obtain “with high probability” results… These results are critical for analyzing randomized algorithms

  24. Concentration Bounds We shall develop general tools to obtain “with high probability” results… These results are critical for analyzing randomized algorithms This is the main topic in the coming 4-5 weeks

  25. Markov Inequality

  26. Markov Inequality: For any nonnegative random variable $X$ and $a > 0$, $\Pr[X > a] \le \frac{E[X]}{a}$

  27. Markov Inequality: For any nonnegative random variable $X$ and $a > 0$, $\Pr[X > a] \le \frac{E[X]}{a}$. Proof. $E[X] = E[X \mid X > a] \cdot \Pr[X > a] + E[X \mid X \le a] \cdot \Pr[X \le a] \ge a \cdot \Pr[X > a]$

  28. Applications

  29. Applications • A Las Vegas randomized algorithm with expected running time $O(n)$ terminates in $O(n^2)$ time with probability $1 - O(1/n)$

  30. Applications • A Las Vegas randomized algorithm with expected running time $O(n)$ terminates in $O(n^2)$ time with probability $1 - O(1/n)$ • In the $n$-balls-into-$n$-bins problem, $E[X_i] = 1$. So $\Pr\left[X_1 > \frac{\log n}{\log\log n}\right] \le \frac{\log\log n}{\log n}$

  31. Applications • A Las Vegas randomized algorithm with expected running time $O(n)$ terminates in $O(n^2)$ time with probability $1 - O(1/n)$ • In the $n$-balls-into-$n$-bins problem, $E[X_i] = 1$. So $\Pr\left[X_1 > \frac{\log n}{\log\log n}\right] \le \frac{\log\log n}{\log n}$. This is far from the truth…
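To see how loose Markov is here, one can compute the exact tail of $X_1 \sim \mathrm{Binomial}(n, 1/n)$ and compare; a small sketch (our own, with $n = 1000$ so the exact sum stays cheap):

```python
import math

n = 1000
k = round(math.log(n) / math.log(math.log(n)))  # log n / log log n, ~4 here
p = 1.0 / n

# exact Pr[X_1 > k] for X_1 ~ Binomial(n, 1/n)
tail = sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1, n + 1))

print("Markov bound:", 1 / k)  # E[X_1]/k = log log n / log n
print("exact tail:  ", tail)   # ~3.7e-3, orders of magnitude smaller
```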

  32. Chebyshev’s Inequality

  33. Chebyshev's Inequality: A common trick to improve concentration is to consider $E[f(X)]$ instead of $E[X]$ for some non-decreasing $f : \mathbb{R} \to \mathbb{R}$

  34. Chebyshev's Inequality: A common trick to improve concentration is to consider $E[f(X)]$ instead of $E[X]$ for some non-decreasing $f : \mathbb{R} \to \mathbb{R}$. $\Pr[X \ge a] = \Pr[f(X) \ge f(a)] \le \frac{E[f(X)]}{f(a)}$

  35. Chebyshev's Inequality: A common trick to improve concentration is to consider $E[f(X)]$ instead of $E[X]$ for some non-decreasing $f : \mathbb{R} \to \mathbb{R}$. $\Pr[X \ge a] = \Pr[f(X) \ge f(a)] \le \frac{E[f(X)]}{f(a)}$. $f(x) = x^2$ gives Chebyshev's inequality

  36. Chebyshev's Inequality: A common trick to improve concentration is to consider $E[f(X)]$ instead of $E[X]$ for some non-decreasing $f : \mathbb{R} \to \mathbb{R}$. $\Pr[X \ge a] = \Pr[f(X) \ge f(a)] \le \frac{E[f(X)]}{f(a)}$. $f(x) = x^2$ gives Chebyshev's inequality: $\Pr[X \ge a] \le \frac{E[X^2]}{a^2}$, or $\Pr\left[\,|X - E[X]| \ge a\,\right] \le \frac{\mathrm{Var}[X]}{a^2}$
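A quick numeric comparison of the two bounds on a concrete tail (our own example, not from the slides):

```python
import math

# X ~ Binomial(100, 1/2): E[X] = 50, Var[X] = 25
n, mean, var, a = 100, 50.0, 25.0, 75
print("Markov:   ", mean / a)               # Pr[X >= 75] <= 2/3
print("Chebyshev:", var / (a - mean) ** 2)  # Pr[|X - 50| >= 25] <= 1/25
exact = sum(math.comb(n, j) for j in range(a, n + 1)) / 2.0**n
print("exact:    ", exact)                  # ~2.8e-7, far below both bounds
```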

  37. Coupon Collector

  38. Coupon Collector Recall the coupon collector problem is to ask

  39. Coupon Collector: Recall the coupon collector problem is to ask "How many balls does one need to throw so that none of the $n$ bins is empty?"

  40. Coupon Collector: Recall the coupon collector problem is to ask "How many balls does one need to throw so that none of the $n$ bins is empty?" We already established that $E[X] = nH_n \approx n(\log n + \gamma)$

  41. Coupon Collector: Recall the coupon collector problem is to ask "How many balls does one need to throw so that none of the $n$ bins is empty?" We already established that $E[X] = nH_n \approx n(\log n + \gamma)$. The Markov inequality only gives a very weak concentration bound…

  42. In order to apply Chebyshev's inequality, we need to compute $\mathrm{Var}[X] = E[X^2] - (E[X])^2$

  43. In order to apply Chebyshev's inequality, we need to compute $\mathrm{Var}[X] = E[X^2] - (E[X])^2$. Recall that $X = \sum_{i=0}^{n-1} X_i$ where each $X_i$ follows the geometric distribution with parameter $\frac{n-i}{n}$

  44. In order to apply Chebyshev's inequality, we need to compute $\mathrm{Var}[X] = E[X^2] - (E[X])^2$. Recall that $X = \sum_{i=0}^{n-1} X_i$ where each $X_i$ follows the geometric distribution with parameter $\frac{n-i}{n}$. $X_0, \ldots, X_{n-1}$ are independent, so

  45. In order to apply Chebyshev's inequality, we need to compute $\mathrm{Var}[X] = E[X^2] - (E[X])^2$. Recall that $X = \sum_{i=0}^{n-1} X_i$ where each $X_i$ follows the geometric distribution with parameter $\frac{n-i}{n}$. $X_0, \ldots, X_{n-1}$ are independent, so $\mathrm{Var}\left[\sum_{i=0}^{n-1} X_i\right] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i]$

  46. Variance of Geometric Variables: Assume $Y$ follows the geometric distribution with parameter $p$. $E[Y^2] = \sum_{i=1}^{\infty} i^2 (1-p)^{i-1} p = \frac{2-p}{p^2}$, so $\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2 = \frac{1-p}{p^2}$
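The sum can be evaluated by differentiating the geometric series twice; a short derivation (filling in a step the slide skips), with $q = 1 - p$:

```latex
E[Y(Y-1)] = \sum_{i \ge 1} i(i-1)\, q^{i-1} p
          = pq \sum_{i \ge 2} i(i-1)\, q^{i-2}
          = pq \cdot \frac{2}{(1-q)^3}
          = \frac{2q}{p^2},
```

so $E[Y^2] = E[Y(Y-1)] + E[Y] = \frac{2(1-p)}{p^2} + \frac{1}{p} = \frac{2-p}{p^2}$ and $\mathrm{Var}[Y] = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2}$.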

  47. $\mathrm{Var}[X] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i] = \sum_{i=0}^{n-1} \frac{n \cdot i}{(n-i)^2} \le n^2 \sum_{i=0}^{n-1} \frac{1}{(n-i)^2} \le n^2\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) = \frac{\pi^2 n^2}{6}$

  48. $\mathrm{Var}[X] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i] = \sum_{i=0}^{n-1} \frac{n \cdot i}{(n-i)^2} \le n^2 \sum_{i=0}^{n-1} \frac{1}{(n-i)^2} \le n^2\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) = \frac{\pi^2 n^2}{6}$. By Chebyshev's inequality, $\Pr[X \ge nH_n + cn] \le \frac{\pi^2}{6c^2}$

  49. $\mathrm{Var}[X] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i] = \sum_{i=0}^{n-1} \frac{n \cdot i}{(n-i)^2} \le n^2 \sum_{i=0}^{n-1} \frac{1}{(n-i)^2} \le n^2\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) = \frac{\pi^2 n^2}{6}$. By Chebyshev's inequality, $\Pr[X \ge nH_n + cn] \le \frac{\pi^2}{6c^2}$. The use of Chebyshev's inequality is often referred to as the "second-moment method"
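The bound is easy to test empirically; a minimal sketch (our own) comparing the simulated tail frequency against $\frac{\pi^2}{6c^2}$:

```python
import math
import random

def coupon_collector(n):
    """Number of uniform throws until every one of the n bins is nonempty."""
    seen, throws = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        throws += 1
    return throws

n, c, trials = 1000, 3, 200
threshold = n * sum(1.0 / j for j in range(1, n + 1)) + c * n  # nH_n + cn
freq = sum(coupon_collector(n) >= threshold for _ in range(trials)) / trials
print("empirical tail:", freq)                      # typically ~0.03
print("Chebyshev bound:", math.pi**2 / (6 * c**2))  # ~0.183
```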

  50. Random Graph

  51. Random Graph: Erdős–Rényi random graph $G(n, p)$

  52. Random Graph: Erdős–Rényi random graph $G(n, p)$: $n$ vertices; each edge appears with probability $p$ independently

  53. Random Graph: Erdős–Rényi random graph $G(n, p)$: $n$ vertices; each edge appears with probability $p$ independently. Given a graph property $P$, define its threshold function $r(n)$ as:

  54. Random Graph: Erdős–Rényi random graph $G(n, p)$: $n$ vertices; each edge appears with probability $p$ independently. Given a graph property $P$, define its threshold function $r(n)$ as: • if $p \ll r(n)$, $G \sim G(n, p)$ does not satisfy $P$ whp; • if $p \gg r(n)$, $G \sim G(n, p)$ satisfies $P$ whp.

  55. We will show that the property $P$ = "$G$ contains a 4-clique" has threshold function $n^{-2/3}$.

  56. We will show that the property $P$ = "$G$ contains a 4-clique" has threshold function $n^{-2/3}$. For every $S \in \binom{[n]}{4}$, let $X_S$ be the indicator that "$G[S]$ is a clique".
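One can watch the threshold behavior directly by sampling $G(n, p)$ at $p$ below, at, and above $n^{-2/3}$; a brute-force sketch (our own, feasible only for small $n$):

```python
import itertools
import random

def has_4_clique(n, p):
    """Sample G(n, p) and search for a 4-clique by brute force."""
    adj = [[False] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            adj[u][v] = adj[v][u] = random.random() < p
    return any(all(adj[a][b] for a, b in itertools.combinations(S, 2))
               for S in itertools.combinations(range(n), 4))

n, trials = 40, 50
r = n ** (-2 / 3)  # the claimed threshold
for p in (r / 4, r, 4 * r):
    freq = sum(has_4_clique(n, p) for _ in range(trials)) / trials
    print(f"p = {p:.4f}: empirical Pr[4-clique] = {freq:.2f}")
```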
