Advanced Algorithms (III)
Shanghai Jiao Tong University
Chihao Zhang
March 16th, 2020
Balls-into-Bins

Throw $m$ balls into $n$ bins uniformly at random.

• What is the chance that some bin contains more than one ball? (Birthday paradox)
• How many balls are in the fullest bin? (Max load)
• How large must $m$ be to hit all bins? (Coupon collector)
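A minimal simulation of the model may help fix ideas (my own sketch, not from the slides; the function name and parameter choices are illustrative):

```python
import random

def throw_balls(m, n):
    """Throw m balls into n bins uniformly at random; return the bin loads."""
    loads = [0] * n
    for _ in range(m):
        loads[random.randrange(n)] += 1
    return loads

# Birthday paradox: with 30 "balls" and 365 "bins", a collision is likely.
print("collision with m=30, n=365:", max(throw_balls(30, 365)) >= 2)

# Max load: with m = n, the fullest bin typically holds Theta(log n / log log n) balls.
print("max load with m = n = 1000:", max(throw_balls(1000, 1000)))
```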
Birthday Paradox

In a group of more than 30 people, there is a very high chance that two of them have the same birthday.

$$\Pr[\text{no same birthday}] = 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-m+1}{n} = \prod_{i=1}^{m-1}\left(1 - \frac{i}{n}\right) \le \exp\left(-\sum_{i=1}^{m-1}\frac{i}{n}\right) = \exp\left(-\frac{m(m-1)}{2n}\right)$$

For $m = 30$, $n = 365$, the probability is less than 0.304.

For $m = c\sqrt{n}$ with a large constant $c$, the probability can be made arbitrarily close to 0.
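As a sanity check on the bound (a sketch, not part of the slides), one can compare the exact product with the exponential upper bound for $m = 30$, $n = 365$:

```python
import math

def p_no_collision(m, n):
    """Exact Pr[no same birthday]: product of (1 - i/n) for i = 1..m-1."""
    p = 1.0
    for i in range(1, m):
        p *= 1 - i / n
    return p

m, n = 30, 365
print(p_no_collision(m, n))                 # ~0.294 (exact)
print(math.exp(-m * (m - 1) / (2 * n)))     # ~0.304 (upper bound)
```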
Max Load

Let $X_i$ be the number of balls in the $i$-th bin.

What is $X = \max_{i \in [n]} X_i$? We analyze this when $m = n$.

If we can argue that $X_1$ is less than $k$ with probability $1 - O(1/n^2)$, then by the union bound, $\Pr[X \ge k] = O(1/n)$.

Again by the union bound (over all sets of $k$ balls that could land in the first bin),

$$\Pr[X_1 \ge k] \le \binom{n}{k} n^{-k} \le \frac{1}{k!}.$$

We apply Stirling's formula $k! \approx \sqrt{2\pi k}\left(\frac{k}{e}\right)^k$, so $k! \ge \left(\frac{k}{e}\right)^k$ and

$$\Pr[X_1 \ge k] \le \left(\frac{e}{k}\right)^k.$$

We want $\left(\frac{e}{k}\right)^k = O\left(\frac{1}{n^2}\right)$. Choose $k = O\left(\frac{\log n}{\log \log n}\right)$.
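A quick empirical check (my own sketch; $n$ values and trial counts are illustrative) that the max load for $m = n$ indeed tracks $\log n / \log \log n$ up to constants:

```python
import math, random

def avg_max_load(n, trials=100):
    """Average max load over several runs of n balls into n bins."""
    total = 0
    for _ in range(trials):
        loads = [0] * n
        for _ in range(n):
            loads[random.randrange(n)] += 1
        total += max(loads)
    return total / trials

for n in [100, 1000, 10000]:
    print(n, avg_max_load(n), math.log(n) / math.log(math.log(n)))
```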
Concentration Bounds

We shall develop general tools to obtain "with high probability" results…

These results are critical for analyzing randomized algorithms.

This is the main topic of the coming 4-5 weeks.
Markov Inequality

For any nonnegative random variable $X$ and $a > 0$,

$$\Pr[X > a] \le \frac{\mathbb{E}[X]}{a}.$$

Proof. $\mathbb{E}[X] = \mathbb{E}[X \mid X > a] \cdot \Pr[X > a] + \mathbb{E}[X \mid X \le a] \cdot \Pr[X \le a] \ge a \cdot \Pr[X > a].$
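A numeric illustration (a sketch; the distribution, the sum of two fair dice, is my own arbitrary choice):

```python
import random

# X = sum of two fair dice, a nonnegative variable with E[X] = 7.
samples = [random.randint(1, 6) + random.randint(1, 6) for _ in range(100000)]
a = 10
empirical = sum(x > a for x in samples) / len(samples)
print(empirical, "<=", 7 / a)  # ~0.083 <= 0.7: the bound holds but is loose
```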
Applications

• A Las Vegas randomized algorithm with expected running time $O(n)$ terminates in $O(n^2)$ time with probability $1 - O(1/n)$.

• In the $n$-balls-into-$n$-bins problem, $\mathbb{E}[X_i] = 1$. So $\Pr\left[X_1 > \frac{\log n}{\log \log n}\right] \le \frac{\log \log n}{\log n}$.

This is far from the truth…
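The gap can be seen empirically (my own sketch; parameters are illustrative). For $n = 1000$, the true tail probability is far below the Markov bound:

```python
import math, random

def bin1_exceeds(n, k, trials=20000):
    """Empirical Pr[X_1 > k] when n balls are thrown into n bins."""
    hits = 0
    for _ in range(trials):
        load = sum(random.randrange(n) == 0 for _ in range(n))  # Binomial(n, 1/n)
        hits += load > k
    return hits / trials

n = 1000
k = math.log(n) / math.log(math.log(n))  # ~3.57
print(bin1_exceeds(n, k), "vs Markov bound", 1 / k)  # ~0.02 vs ~0.28
```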
Chebyshev's Inequality

A common trick to improve concentration is to consider $\mathbb{E}[f(X)]$ instead of $\mathbb{E}[X]$ for some non-decreasing $f : \mathbb{R} \to \mathbb{R}$:

$$\Pr[X \ge a] = \Pr[f(X) \ge f(a)] \le \frac{\mathbb{E}[f(X)]}{f(a)}.$$

$f(x) = x^2$ gives Chebyshev's inequality:

$$\Pr[X \ge a] \le \frac{\mathbb{E}[X^2]}{a^2} \quad \text{or} \quad \Pr\left[\,|X - \mathbb{E}[X]| \ge a\,\right] \le \frac{\mathrm{Var}[X]}{a^2}.$$
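A numeric illustration of Chebyshev's inequality (a sketch; the binomial example is my own choice):

```python
import random

# X = number of heads in 100 fair coin flips: E[X] = 50, Var[X] = 25.
samples = [sum(random.randint(0, 1) for _ in range(100)) for _ in range(50000)]
a = 10
empirical = sum(abs(x - 50) >= a for x in samples) / len(samples)
print(empirical, "<=", 25 / a**2)  # ~0.06 <= 0.25
```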
Coupon Collector

Recall that the coupon collector problem asks:

"How many balls does one need to throw so that none of the $n$ bins is empty?"

We already established that $\mathbb{E}[X] = nH_n \approx n(\log n + \gamma)$.

The Markov inequality only provides very weak concentration…
In order to apply Chebyshev's inequality, we need to compute $\mathrm{Var}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$.

Recall that $X = \sum_{i=0}^{n-1} X_i$, where each $X_i$ follows the geometric distribution with parameter $\frac{n-i}{n}$.

$X_0, \dots, X_{n-1}$ are independent, so $\mathrm{Var}[X] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i]$.
Variance of Geometric Variables

Assume $Y$ follows the geometric distribution with parameter $p$. Then

$$\mathbb{E}[Y^2] = \sum_{i=1}^{\infty} i^2 (1-p)^{i-1} p = \frac{2-p}{p^2},$$

$$\mathrm{Var}[Y] = \mathbb{E}[Y^2] - (\mathbb{E}[Y])^2 = \frac{1-p}{p^2}.$$
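A quick numeric check of this formula (a sketch; sampling by repeated Bernoulli trials):

```python
import random

def geometric(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    t = 1
    while random.random() >= p:
        t += 1
    return t

p, trials = 0.3, 200000
ys = [geometric(p) for _ in range(trials)]
mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials
print(var, "vs", (1 - p) / p**2)  # both ~7.78
```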
$$\mathrm{Var}[X] = \sum_{i=0}^{n-1} \mathrm{Var}[X_i] = \sum_{i=0}^{n-1} \frac{n \cdot i}{(n-i)^2} \le n^2 \sum_{i=0}^{n-1} \frac{1}{(n-i)^2} = n^2\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2}\right) \le \frac{\pi^2 n^2}{6}.$$

By Chebyshev's inequality, $\Pr[X \ge nH_n + cn] \le \frac{\pi^2}{6c^2}$.

The use of Chebyshev's inequality is often referred to as the "second-moment method".
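The bound can be checked by simulation (my own sketch; $n$, $c$, and the trial count are illustrative):

```python
import math, random

def coupon_collector(n):
    """Number of uniform throws until all n bins are non-empty."""
    seen, t = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        t += 1
    return t

n, c, trials = 500, 3, 2000
threshold = n * sum(1 / j for j in range(1, n + 1)) + c * n  # nH_n + cn
empirical = sum(coupon_collector(n) >= threshold for _ in range(trials)) / trials
print(empirical, "<=", math.pi**2 / (6 * c**2))  # ~0.03 <= ~0.18
```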
Random Graph

Erdős–Rényi random graph $G(n, p)$:

$n$ vertices; each possible edge appears with probability $p$, independently.

Given a graph property $P$, define its threshold function $r(n)$ as:

• if $p \ll r(n)$, then $G \sim G(n, p)$ does not satisfy $P$ whp;
• if $p \gg r(n)$, then $G \sim G(n, p)$ satisfies $P$ whp.
We will show that the property $P$ = “$G$ contains a $4$-clique” has threshold function $n^{-2/3}$.

For every $S \in \binom{[n]}{4}$, let $X_S$ be the indicator that “$G[S]$ is a clique”.
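A brute-force simulation (my own sketch; $n$ and the scaling factors are illustrative) shows the 4-clique probability jumping as $p$ crosses $n^{-2/3}$:

```python
import itertools, random

def gnp(n, p):
    """Sample G(n, p): each of the C(n,2) edges appears independently."""
    return {e for e in itertools.combinations(range(n), 2) if random.random() < p}

def has_4clique(n, edges):
    """Brute-force check for 4 pairwise-adjacent vertices."""
    return any(all(e in edges for e in itertools.combinations(S, 2))
               for S in itertools.combinations(range(n), 4))

n = 30
r = n ** (-2 / 3)
for factor in [0.2, 1.0, 5.0]:
    hits = sum(has_4clique(n, gnp(n, factor * r)) for _ in range(20))
    print(f"p = {factor} * n^(-2/3): 4-clique in {hits}/20 samples")
```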