CS 498ABD: Algorithms for Big Data, Spring 2019
Lecture 3: Probabilistic Inequalities and Examples
January 22, 2019
Chandra (UIUC)
Outline
- Probabilistic Inequalities
  - Markov’s Inequality
  - Chebyshev’s Inequality
  - Bernstein-Chernoff-Hoeffding bounds
- Some examples
Part I: Inequalities
Massive randomness... is not that random.
Consider flipping a fair coin n times independently, where a head gives 1 and a tail gives 0. How many 1s do we see? The count follows the binomial distribution:
\[ \Pr[\text{number of 1s} = k] = \binom{n}{k} \cdot \frac{1}{2^n}. \]
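To make the concentration concrete, here is a minimal Python sketch (my illustration, not from the lecture); the choice of n and the 2√n window are arbitrary illustrative parameters.

```python
import math
import random

def binomial_pmf(n: int, k: int) -> float:
    # Pr[exactly k ones in n fair flips] = C(n, k) / 2^n
    return math.comb(n, k) / 2**n

n = 1000
# Almost all of the probability mass sits within ~2*sqrt(n) of n/2.
w = int(2 * math.sqrt(n))
mass = sum(binomial_pmf(n, k) for k in range(n // 2 - w, n // 2 + w + 1))
print(f"Pr[|#ones - n/2| <= 2*sqrt(n)] ~= {mass:.4f}")  # close to 1

# One empirical run of n independent fair flips.
random.seed(0)
ones = sum(random.random() < 0.5 for _ in range(n))
print(f"sample run: {ones} ones out of {n}")
```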
Massive randomness... is not that random.
This phenomenon is known as concentration of mass. It is a very special case of the law of large numbers.
Side note: the law of large numbers (weakest form)
Informal statement: for n large enough, the middle portion of the binomial distribution looks like (converges to) the normal/Gaussian distribution.
Massive randomness... is not that random.
Intuitive conclusion: randomized algorithms are unpredictable at the tactical level, but very predictable at the strategic level. This is why well-known probabilistic inequalities are used in their analysis.
Randomized QuickSort: a possible analysis
Let the random variable Q be the number of comparisons made by randomized QuickSort on an array of n elements.
Suppose \( \Pr[Q \ge 10 n \lg n] \le c \). We also know that \( Q \le n^2 \) always. Then
\[ \mathrm{E}[Q] \le 10 n \lg n + (n^2 - 10 n \lg n)\, c. \]
Question: how do we find c, or in other words, bound \( \Pr[Q \ge 10 n \lg n] \)?
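As an illustration (not part of the lecture’s analysis), the following Python sketch counts comparisons of a randomized QuickSort and compares them against the \(10 n \lg n\) threshold; the input size, trial count, and seed are arbitrary choices.

```python
import math
import random

def quicksort_comparisons(arr):
    """Randomized QuickSort on distinct elements; returns the number of
    comparisons under the standard accounting (each non-pivot element is
    compared to the pivot once per partition step)."""
    if len(arr) <= 1:
        return 0
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    greater = [x for x in arr if x > pivot]
    comps = len(arr) - 1  # every non-pivot element vs. the pivot
    return comps + quicksort_comparisons(less) + quicksort_comparisons(greater)

random.seed(42)
n, trials = 10_000, 20
counts = [quicksort_comparisons(list(range(n))) for _ in range(trials)]
print(f"max comparisons over {trials} runs: {max(counts):,}")
print(f"10 n lg n threshold: {10 * n * math.log2(n):,.0f}")
# In practice Q stays near 2 n ln n, far below 10 n lg n, so c is tiny.
```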
Markov’s Inequality
Let X be a non-negative random variable over a probability space (Ω, Pr). For any a > 0,
\[ \Pr[X \ge a] \le \frac{\mathrm{E}[X]}{a}. \]
Equivalently, for any t > 0, \( \Pr[X \ge t\,\mathrm{E}[X]] \le 1/t \).

Proof (discrete case):
\[
\mathrm{E}[X] = \sum_{\omega \in \Omega} X(\omega)\Pr[\omega]
= \sum_{\omega:\, 0 \le X(\omega) < a} X(\omega)\Pr[\omega] + \sum_{\omega:\, X(\omega) \ge a} X(\omega)\Pr[\omega]
\ge \sum_{\omega:\, X(\omega) \ge a} X(\omega)\Pr[\omega]
\ge a \sum_{\omega:\, X(\omega) \ge a} \Pr[\omega]
= a\,\Pr[X \ge a].
\]
Proof (continuous case, X with density \(f_X\)):
\[
\mathrm{E}[X] = \int_0^\infty z\, f_X(z)\, dz
\ge \int_a^\infty z\, f_X(z)\, dz
\ge a \int_a^\infty f_X(z)\, dz
= a\,\Pr[X \ge a].
\]
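As a quick sanity check (not from the slides), here is a minimal Python sketch comparing Markov’s bound to the empirical tail of an Exponential(1) random variable; the distribution and thresholds are illustrative choices.

```python
import random

# X ~ Exponential(1), so E[X] = 1 and Markov gives Pr[X >= a] <= 1/a.
# (The true tail exp(-a) is far smaller; Markov is weak but universal.)
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
for a in (2.0, 5.0, 10.0):
    empirical = sum(x >= a for x in samples) / len(samples)
    print(f"a={a}: empirical tail {empirical:.5f} <= Markov bound {1/a:.3f}")
```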
Markov’s Inequality: proof by picture
[Figure from the slides not reproduced in this text.]
Chebyshev’s Inequality: variance
Given a random variable X over probability space (Ω, Pr), the variance of X measures how much X deviates from its mean. Formally,
\[ \mathrm{Var}(X) = \mathrm{E}\big[(X - \mathrm{E}[X])^2\big] = \mathrm{E}[X^2] - \mathrm{E}[X]^2. \]
Derivation: let \( Y = (X - \mathrm{E}[X])^2 = X^2 - 2 X\,\mathrm{E}[X] + \mathrm{E}[X]^2 \). Then
\[ \mathrm{Var}(X) = \mathrm{E}[Y] = \mathrm{E}[X^2] - 2\,\mathrm{E}[X]\,\mathrm{E}[X] + \mathrm{E}[X]^2 = \mathrm{E}[X^2] - \mathrm{E}[X]^2. \]
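A quick numerical check of the identity \( \mathrm{Var}(X) = \mathrm{E}[X^2] - \mathrm{E}[X]^2 \) (my illustration, not from the slides), using fair die rolls:

```python
import random

random.seed(1)
xs = [random.randint(1, 6) for _ in range(200_000)]  # fair die rolls

mean = sum(xs) / len(xs)
var_def = sum((x - mean) ** 2 for x in xs) / len(xs)    # E[(X - E[X])^2]
var_alt = sum(x * x for x in xs) / len(xs) - mean ** 2  # E[X^2] - E[X]^2
print(var_def, var_alt)  # both close to 35/12 ~= 2.9167 for a fair die
```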
Chebyshev’s Inequality: independence
Random variables X and Y are mutually independent if for all \( x, y \in \mathbb{R} \),
\[ \Pr[X = x \wedge Y = y] = \Pr[X = x] \cdot \Pr[Y = y]. \]
Lemma. If X and Y are independent random variables, then \( \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \).
Lemma. If X and Y are mutually independent, then \( \mathrm{E}[XY] = \mathrm{E}[X] \cdot \mathrm{E}[Y] \).
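An empirical check of both lemmas for two independent fair dice (an illustration, not from the slides; sample size and seed are arbitrary):

```python
import random

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

random.seed(2)
n = 200_000
xs = [random.randint(1, 6) for _ in range(n)]
ys = [random.randint(1, 6) for _ in range(n)]  # drawn independently of xs

# Var(X + Y) vs Var(X) + Var(Y): both should be close to 70/12 ~= 5.833.
print(var([x + y for x, y in zip(xs, ys)]), var(xs) + var(ys))
# E[XY] vs E[X] E[Y]: both should be close to 3.5^2 = 12.25.
print(mean([x * y for x, y in zip(xs, ys)]), mean(xs) * mean(ys))
```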
Chebyshev’s Inequality
If \( \mathrm{Var}(X) < \infty \), then for any a > 0,
\[ \Pr\big[\,|X - \mathrm{E}[X]| \ge a\,\big] \le \frac{\mathrm{Var}(X)}{a^2}. \]
Proof. \( Y = (X - \mathrm{E}[X])^2 \) is a non-negative random variable; apply Markov’s inequality to Y with threshold \( a^2 \):
\[ \Pr[Y \ge a^2] \le \mathrm{E}[Y]/a^2 \iff \Pr\big[(X - \mathrm{E}[X])^2 \ge a^2\big] \le \mathrm{Var}(X)/a^2 \iff \Pr\big[\,|X - \mathrm{E}[X]| \ge a\,\big] \le \mathrm{Var}(X)/a^2. \]
In particular, both one-sided tails are bounded:
\[ \Pr[X \le \mathrm{E}[X] - a] \le \mathrm{Var}(X)/a^2 \quad \text{and} \quad \Pr[X \ge \mathrm{E}[X] + a] \le \mathrm{Var}(X)/a^2. \]
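To see how loose the bound can be, here is a small sketch (my illustration; n, a, and the trial count are arbitrary) comparing Chebyshev’s bound with the empirical tail of the number of heads in n fair flips:

```python
import random

# X = number of heads in n fair flips: E[X] = n/2, Var(X) = n/4.
# Chebyshev: Pr[|X - n/2| >= a] <= n / (4 a^2).
random.seed(3)
n, trials, a = 400, 20_000, 30
devs = [abs(sum(random.random() < 0.5 for _ in range(n)) - n / 2)
        for _ in range(trials)]
empirical = sum(d >= a for d in devs) / trials
print(f"empirical tail {empirical:.4f} <= Chebyshev bound {n / (4 * a * a):.4f}")
```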