Bounding Deviation from Expectation


  1. Bounding Deviation from Expectation
Theorem [Markov Inequality]. For any non-negative random variable $X$ and all $a > 0$,
$$\Pr(X \ge a) \le \frac{E[X]}{a}.$$
Proof.
$$E[X] = \sum_i i \Pr(X = i) \ge \sum_{i \ge a} a \Pr(X = i) = a \Pr(X \ge a).$$
Example: the expected number of comparisons executed by the k-select algorithm was $9n$. The probability that it executes $18n$ comparisons or more is at most $\frac{9n}{18n} = \frac{1}{2}$.
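
A quick empirical check of Markov's bound, as a minimal Python sketch; the geometric test distribution and the values of `p`, `a`, and the trial count are illustrative choices, not from the slides:

```python
import random

random.seed(0)

def geometric(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

# Check Markov's inequality Pr(X >= a) <= E[X]/a on a geometric variable.
# The parameter p = 1/3 and the threshold a = 9 are illustrative choices.
p, a, trials = 1 / 3, 9, 200_000
samples = [geometric(p) for _ in range(trials)]
mean = sum(samples) / trials
tail = sum(s >= a for s in samples) / trials
print(f"empirical Pr(X >= {a}) = {tail:.4f}, Markov bound E[X]/a = {mean / a:.4f}")
```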

  2. Variance
Definition. The variance of a random variable $X$ is
$$\mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2.$$
Definition. The standard deviation of a random variable $X$ is
$$\sigma(X) = \sqrt{\mathrm{Var}[X]}.$$

  3. Chebyshev's Inequality
Theorem. For any random variable $X$ and any $a > 0$,
$$\Pr(|X - E[X]| \ge a) \le \frac{\mathrm{Var}[X]}{a^2}.$$
Proof.
$$\Pr(|X - E[X]| \ge a) = \Pr((X - E[X])^2 \ge a^2).$$
By Markov's inequality,
$$\Pr((X - E[X])^2 \ge a^2) \le \frac{E[(X - E[X])^2]}{a^2} = \frac{\mathrm{Var}[X]}{a^2}.$$
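
A minimal sketch comparing Chebyshev's bound with an empirical tail probability; the fair-die example and the sample size are our illustrative assumptions:

```python
import random

random.seed(1)

# Compare Chebyshev's bound Var[X]/a^2 with the empirical probability
# Pr(|X - E[X]| >= a) for a single fair die roll (an illustrative choice):
# E[X] = 3.5 and Var[X] = 35/12.
mu, var, a, trials = 3.5, 35 / 12, 2, 100_000
samples = [random.randint(1, 6) for _ in range(trials)]
tail = sum(abs(x - mu) >= a for x in samples) / trials
print(f"empirical tail = {tail:.3f}, Chebyshev bound = {var / a**2:.3f}")
# The bound (about 0.729) holds but is loose here; the true tail is 1/3.
```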

  4. Theorem. For any random variable $X$ and any $a > 0$:
$$\Pr(|X - E[X]| \ge a\,\sigma[X]) \le \frac{1}{a^2}.$$
Theorem. For any random variable $X$ and any $\varepsilon > 0$:
$$\Pr(|X - E[X]| \ge \varepsilon E[X]) \le \frac{\mathrm{Var}[X]}{\varepsilon^2 (E[X])^2}.$$

  5. Theorem. If $X$ and $Y$ are independent random variables, then
$$E[XY] = E[X] \cdot E[Y].$$
Proof.
$$E[XY] = \sum_i \sum_j ij \Pr((X = i) \cap (Y = j)) = \sum_i \sum_j ij \Pr(X = i) \Pr(Y = j) = \left( \sum_i i \Pr(X = i) \right) \left( \sum_j j \Pr(Y = j) \right).$$
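
A small numerical sanity check of the product rule, assuming two independent fair dice (our choice of example):

```python
import random

random.seed(2)

# Check E[XY] = E[X] * E[Y] for independent X and Y, using two fair dice
# (an illustrative choice). Both sides should be close to 3.5^2 = 12.25.
trials = 200_000
xs = [random.randint(1, 6) for _ in range(trials)]
ys = [random.randint(1, 6) for _ in range(trials)]
e_xy = sum(x * y for x, y in zip(xs, ys)) / trials
e_x, e_y = sum(xs) / trials, sum(ys) / trials
print(f"E[XY] = {e_xy:.3f}, E[X] * E[Y] = {e_x * e_y:.3f}")
```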

  6. Theorem. If $X$ and $Y$ are independent random variables, then
$$\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y].$$
Proof.
$$\mathrm{Var}[X + Y] = E[(X + Y - E[X] - E[Y])^2] = E[(X - E[X])^2 + (Y - E[Y])^2 + 2(X - E[X])(Y - E[Y])] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2E[X - E[X]]\,E[Y - E[Y]],$$
where the last step uses that the random variables $X - E[X]$ and $Y - E[Y]$ are independent. But $E[X - E[X]] = E[X] - E[X] = 0$.
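
A matching check for the additivity of variance, again with two independent fair dice as an illustrative choice:

```python
import random

random.seed(3)

def var(vals):
    """Empirical variance."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

# Check Var[X + Y] = Var[X] + Var[Y] for two independent fair dice
# (an illustrative choice). Each die has variance 35/12, so both sides
# should be close to 35/6, about 5.833.
trials = 200_000
xs = [random.randint(1, 6) for _ in range(trials)]
ys = [random.randint(1, 6) for _ in range(trials)]
print(f"Var[X+Y] = {var([x + y for x, y in zip(xs, ys)]):.3f}, "
      f"Var[X] + Var[Y] = {var(xs) + var(ys):.3f}")
```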

  7. Bernoulli Trial
Let $X$ be a 0-1 random variable such that $\Pr(X = 1) = p$ and $\Pr(X = 0) = 1 - p$. Then
$$E[X] = 1 \cdot p + 0 \cdot (1 - p) = p,$$
$$\mathrm{Var}[X] = p(1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p)(1 - p + p) = p(1 - p).$$

  8. A Binomial Random Variable
Consider a sequence of $n$ independent Bernoulli trials $X_1, \ldots, X_n$, and let
$$X = \sum_{i=1}^{n} X_i.$$
Then $X$ has a Binomial distribution, $X \sim B(n, p)$:
$$\Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad E[X] = np, \qquad \mathrm{Var}[X] = np(1 - p).$$
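
A minimal simulation of a binomial variable as a sum of Bernoulli trials, checking the stated mean and variance; `n`, `p`, and the trial count are illustrative:

```python
import random

random.seed(4)

# Simulate X = X_1 + ... + X_n as a sum of independent Bernoulli(p) trials
# and check E[X] = np and Var[X] = np(1-p). n and p are illustrative.
n, p, trials = 50, 0.3, 50_000
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(f"mean = {mean:.3f} (np = {n * p:.1f}), "
      f"var = {var:.3f} (np(1-p) = {n * p * (1 - p):.2f})")
```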

  9. The Geometric Distribution
• How many times do we need to perform a trial with success probability $p$ until we get the first success?
• How many times do we need to roll a die until we get the first 6?
Definition. A geometric random variable $X$ with parameter $p$ is given by the following probability distribution on $n = 1, 2, \ldots$:
$$\Pr(X = n) = (1 - p)^{n-1} p.$$

  10. Memoryless Distribution
Lemma. For a geometric random variable $X$ with parameter $p$ and $n > 0$,
$$\Pr(X = n + k \mid X > k) = \Pr(X = n).$$
Proof.
$$\Pr(X = n + k \mid X > k) = \frac{\Pr((X = n + k) \cap (X > k))}{\Pr(X > k)} = \frac{\Pr(X = n + k)}{\Pr(X > k)} = \frac{(1 - p)^{n+k-1} p}{\sum_{i=k}^{\infty} (1 - p)^i p} = \frac{(1 - p)^{n+k-1} p}{(1 - p)^k} = (1 - p)^{n-1} p = \Pr(X = n).$$
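
A sketch that checks memorylessness empirically; the values of `p`, `n`, `k`, and the sample size are illustrative assumptions:

```python
import random

random.seed(5)

def geometric(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

# Check memorylessness: Pr(X = n + k | X > k) should match Pr(X = n).
# The values of p, n, k, and the sample size are illustrative choices.
p, n, k, trials = 0.25, 3, 5, 500_000
samples = [geometric(p) for _ in range(trials)]
beyond_k = [x for x in samples if x > k]
lhs = sum(x == n + k for x in beyond_k) / len(beyond_k)
rhs = (1 - p) ** (n - 1) * p
print(f"Pr(X = {n + k} | X > {k}) = {lhs:.4f}, Pr(X = {n}) = {rhs:.4f}")
```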

  11. Conditional Expectation
Definition.
$$E[Y \mid Z = z] = \sum_y y \Pr(Y = y \mid Z = z),$$
where the summation is over all $y$ in the range of $Y$.

  12. Lemma. For any random variables $X$ and $Y$,
$$E[X] = E_Y[E_X[X \mid Y]] = \sum_y \Pr(Y = y) E[X \mid Y = y],$$
where the sum is over all values $y$ in the range of $Y$.
Proof.
$$\sum_y \Pr(Y = y) E[X \mid Y = y] = \sum_y \Pr(Y = y) \sum_x x \Pr(X = x \mid Y = y) = \sum_x x \sum_y \Pr(X = x \mid Y = y) \Pr(Y = y) = \sum_x x \sum_y \Pr((X = x) \cap (Y = y)) = \sum_x x \Pr(X = x) = E[X].$$

  13. Example
Consider a two-phase game:
• Phase I: roll one die. Let $X$ be the outcome.
• Phase II: flip $X$ fair coins, and let $Y$ be the number of HEADs.
• You receive a dollar for each HEAD.
$Y$ is distributed $B(X, \frac{1}{2})$, so $E[Y \mid X = a] = \frac{a}{2}$, and
$$E[Y] = \sum_{i=1}^{6} E[Y \mid X = i] \Pr(X = i) = \sum_{i=1}^{6} \frac{i}{2} \Pr(X = i) = \frac{7}{4}.$$
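
A minimal simulation of the two-phase game, checking that the empirical mean matches $E[Y] = 7/4$; the trial count is an illustrative choice:

```python
import random

random.seed(6)

# Simulate the two-phase game: roll a die to get X, flip X fair coins,
# and let Y count the HEADs. The expected payoff is E[Y] = 7/4.
trials = 200_000
total = 0
for _ in range(trials):
    x = random.randint(1, 6)                               # Phase I: die roll
    total += sum(random.random() < 0.5 for _ in range(x))  # Phase II: coins
print(f"empirical E[Y] = {total / trials:.4f} (exact: 7/4 = 1.75)")
```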

  14. Geometric Random Variable: Expectation
• Let $X$ be a geometric random variable with parameter $p$.
• Let $Y = 1$ if the first trial is a success, and $Y = 0$ otherwise.
• $E[X] = \Pr(Y = 0) E[X \mid Y = 0] + \Pr(Y = 1) E[X \mid Y = 1] = (1 - p) E[X \mid Y = 0] + p\,E[X \mid Y = 1]$.
• If $Y = 0$, let $Z$ be the number of trials after the first one.
• $E[X] = (1 - p) E[Z + 1] + p \cdot 1 = (1 - p) E[Z] + 1$.
• But $E[Z] = E[X]$, since by memorylessness $Z$ has the same distribution as $X$; this gives $E[X] = 1/p$.

  15. Variance of a Geometric Random Variable
• We use $\mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2$.
• To compute $E[X^2]$, let $Y = 1$ if the first trial is a success, and $Y = 0$ otherwise.
• $E[X^2] = \Pr(Y = 0) E[X^2 \mid Y = 0] + \Pr(Y = 1) E[X^2 \mid Y = 1] = (1 - p) E[X^2 \mid Y = 0] + p\,E[X^2 \mid Y = 1]$.
• If $Y = 0$, let $Z$ be the number of trials after the first one.
• $E[X^2] = (1 - p) E[(Z + 1)^2] + p \cdot 1 = (1 - p) E[Z^2] + 2(1 - p) E[Z] + 1$.

  16. • $E[Z] = 1/p$ and $E[Z^2] = E[X^2]$, since $Z$ has the same distribution as $X$.
• $E[X^2] = (1 - p) E[(Z + 1)^2] + p \cdot 1 = (1 - p) E[Z^2] + 2(1 - p) E[Z] + 1$.
• $E[X^2] = (1 - p) E[X^2] + 2(1 - p)/p + 1 = (1 - p) E[X^2] + (2 - p)/p$.
• Solving for $E[X^2]$ gives $E[X^2] = (2 - p)/p^2$.

  17. Variance of a Geometric Random Variable
$$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{2 - p}{p^2} - \frac{1}{p^2} = \frac{1 - p}{p^2}.$$
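
A quick simulation checking both $E[X] = 1/p$ and $\mathrm{Var}[X] = (1-p)/p^2$; the parameter $p = 0.2$ is an illustrative choice:

```python
import random

random.seed(7)

def geometric(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

# Check E[X] = 1/p and Var[X] = (1-p)/p^2; p = 0.2 is an illustrative choice.
p, trials = 0.2, 200_000
samples = [geometric(p) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(f"mean = {mean:.2f} (1/p = {1 / p:.1f}), "
      f"var = {var:.2f} ((1-p)/p^2 = {(1 - p) / p**2:.1f})")
```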

  18. Back to the k-select Algorithm
• Let $X$ be the total number of comparisons.
• Let $T_i$ be the number of iterations between the $i$-th successful call (included) and the $(i+1)$-th (excluded).
• $X \le \sum_{i=0}^{\log_{3/2} n} n (2/3)^i T_i$.
• $T_i \sim G(1/3)$, therefore $E[T_i] = 3$ and $\mathrm{Var}[T_i] = \frac{1 - 1/3}{(1/3)^2} = 6$.
• Expected number of comparisons: $E[X] \le \sum_{j=0}^{\log_{3/2} n} 3n (2/3)^j \le 9n$.
• Variance of the number of comparisons: $\mathrm{Var}[X] = \sum_{i=0}^{\log_{3/2} n} n^2 (2/3)^{2i} \mathrm{Var}[T_i] \le 11 n^2$.
• By Chebyshev's inequality,
$$\Pr(|X - E[X]| \ge \delta E[X]) \le \frac{\mathrm{Var}[X]}{\delta^2 (E[X])^2} \le \frac{11 n^2}{81 \delta^2 n^2} = \frac{11}{81 \delta^2}.$$

  19. Example: Coupon Collector's Problem
Suppose that each box of cereal contains a random coupon from a set of $n$ different coupons. How many boxes of cereal do you need to buy before you obtain at least one of every type of coupon?
Let $X$ be the number of boxes bought until at least one of every type of coupon is obtained. Let $X_i$ be the number of boxes bought while you had exactly $i - 1$ different coupons. Then
$$X = \sum_{i=1}^{n} X_i,$$
and $X_i$ is a geometric random variable with parameter
$$p_i = 1 - \frac{i - 1}{n}.$$

  20. $E[X_i] = \frac{1}{p_i} = \frac{n}{n - i + 1}$, so
$$E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{n}{n - i + 1} = n \sum_{i=1}^{n} \frac{1}{i} = n \ln n + \Theta(n).$$
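
A minimal coupon-collector simulation comparing the empirical mean with $n H_n$; `n` and the trial count are illustrative choices:

```python
import math
import random

random.seed(8)

def coupons_needed(n):
    """Boxes bought until all n coupon types have been seen."""
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        boxes += 1
    return boxes

# Compare the empirical mean with n*H_n = n ln n + Theta(n).
# n and the trial count are illustrative choices.
n, trials = 100, 2_000
mean = sum(coupons_needed(n) for _ in range(trials)) / trials
h_n = sum(1 / i for i in range(1, n + 1))
print(f"empirical E[X] = {mean:.1f}, nH_n = {n * h_n:.1f}, "
      f"n ln n = {n * math.log(n):.1f}")
```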

  21. Example: Coupon Collector's Problem
• We place balls independently and uniformly at random in $n$ boxes.
• Let $X$ be the number of balls placed until no box is empty.
• What is $E[X]$?

  22. • Let $X_i$ be the number of balls placed while there were exactly $i - 1$ non-empty boxes.
• $X = \sum_{i=1}^{n} X_i$.
• $X_i$ is a geometric random variable with parameter $p_i = 1 - \frac{i - 1}{n}$.
• $E[X_i] = \frac{1}{p_i} = \frac{n}{n - i + 1}$.
$$E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{n}{n - i + 1} = n \sum_{i=1}^{n} \frac{1}{i} = n \ln n + \Theta(n).$$

  23. Back to the Coupon Collector's Problem
• Suppose that each box of cereal contains a random coupon from a set of $n$ different coupons.
• Let $X$ be the number of boxes bought until at least one of every type of coupon is obtained.
• $E[X] = n H_n = n \ln n + \Theta(n)$.
• What is $\Pr(X \ge 2 E[X])$?
• Applying Markov's inequality: $\Pr(X \ge 2 n H_n) \le \frac{1}{2}$.
• Can we do better?

  24. • Let $X_i$ be the number of boxes bought while you had exactly $i - 1$ different coupons.
• $X = \sum_{i=1}^{n} X_i$.
• $X_i$ is a geometric random variable with parameter $p_i = 1 - \frac{i - 1}{n}$.
• $\mathrm{Var}[X_i] \le \frac{1}{p_i^2} = \left(\frac{n}{n - i + 1}\right)^2$.
• $$\mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[X_i] \le \sum_{i=1}^{n} \left(\frac{n}{n - i + 1}\right)^2 = n^2 \sum_{i=1}^{n} \frac{1}{i^2} \le \frac{\pi^2 n^2}{6}.$$
• By Chebyshev's inequality,
$$\Pr(|X - n H_n| \ge n H_n) \le \frac{n^2 \pi^2 / 6}{(n H_n)^2} = \frac{\pi^2}{6 H_n^2} = O\!\left(\frac{1}{\ln^2 n}\right).$$
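
A sketch comparing the empirical tail $\Pr(X \ge 2 n H_n)$ with the Markov and Chebyshev bounds; `n` and the trial count are illustrative assumptions:

```python
import math
import random

random.seed(9)

def coupons_needed(n):
    """Boxes bought until all n coupon types have been seen."""
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        boxes += 1
    return boxes

# Compare the empirical Pr(X >= 2nH_n) with the Markov bound (1/2) and the
# Chebyshev bound pi^2 / (6 H_n^2). n and the trial count are illustrative.
n, trials = 50, 5_000
h_n = sum(1 / i for i in range(1, n + 1))
tail = sum(coupons_needed(n) >= 2 * n * h_n for _ in range(trials)) / trials
cheb = math.pi ** 2 / (6 * h_n ** 2)
print(f"empirical tail = {tail:.4f}, Markov = 0.5, Chebyshev = {cheb:.4f}")
```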

  25. Direct Bound
• The probability of not obtaining the $i$-th coupon after $n \ln n + cn$ steps is
$$\left(1 - \frac{1}{n}\right)^{n(\ln n + c)} \le e^{-(\ln n + c)} = \frac{1}{e^c n}.$$
• By a union bound, the probability that some coupon has not been collected after $n \ln n + cn$ steps is at most $e^{-c}$.
• Taking $c = \ln n$: the probability that not all coupons have been collected after $2 n \ln n$ steps is at most $1/n$.
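
A minimal check of the direct bound, estimating the probability that some coupon is still missing after $2 n \ln n$ boxes; `n` and the trial count are illustrative:

```python
import math
import random

random.seed(10)

def all_collected(n, boxes):
    """True if all n coupon types appear among `boxes` random boxes."""
    return len({random.randrange(n) for _ in range(boxes)}) == n

# Estimate the probability that some coupon is still missing after
# 2n ln n boxes; the direct bound says it is at most 1/n.
# n and the trial count are illustrative choices.
n, trials = 100, 5_000
boxes = int(2 * n * math.log(n))
miss = sum(not all_collected(n, boxes) for _ in range(trials)) / trials
print(f"empirical miss probability = {miss:.4f}, bound 1/n = {1 / n}")
```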
