Probability and Statistics for Computer Science


1. Probability and Statistics for Computer Science
“The weak law of large numbers gives us a very valuable way of thinking about expectations.” ---Prof. Forsythe
Credit: wikipedia
Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 09.22.2020

2. Last time
✺ Random Variable
✺ Expected value
✺ Variance & covariance

  3. Last time

  4. Content

5. Content
✺ Random Variable
✺ Review with questions
✺ The weak law of large numbers
✺ Simulation & example of airline overbooking

6. Expected value
✺ The expected value (or expectation) of a random variable X is
E[X] = Σ_x x·P(x)
✺ The expected value is a weighted sum of all the values X can take
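The weighted-sum definition is easy to check in a few lines of Python. This is a minimal sketch; the pmf and the function name `expected_value` are made up for illustration, not from the slides:

```python
def expected_value(pmf):
    """E[X] = sum over x of x * P(x), with pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical pmf: a fair coin paying 0 or 1.
pmf = {0: 0.5, 1: 0.5}
print(expected_value(pmf))  # 0.5
```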

  7. Linearity of Expectation

  8. Expected value of a function of X

  9. Q: What is E[E[X]]? A. E[X] B. 0 C. Can’t be sure

10. Probability distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is E[2|X|+1]?
A. 0  B. 1  C. 2  D. 3  E. 5

11. Probability distribution
✺ Given the random variable S, the sum of two rolls of a fair 4-sided die, whose range is {2,3,4,5,6,7,8}, with the probability distribution p(s) shown on the slide (p(2) = 1/16). What is E[S]?
A. 4  B. 5  C. 6

12. A neater expression for variance
✺ The variance of a random variable X is defined as:
var[X] = E[(X − E[X])²]
✺ It’s the same as:
var[X] = E[X²] − E[X]²
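A quick numeric check that the two expressions agree, using an arbitrary made-up pmf (the helper `E` is mine, for illustration only):

```python
def E(f, pmf):
    """Expectation of f(X) under a pmf given as {value: probability}."""
    return sum(f(x) * p for x, p in pmf.items())

# Arbitrary example distribution.
pmf = {-1: 0.25, 0: 0.5, 2: 0.25}

mean = E(lambda x: x, pmf)
var_def = E(lambda x: (x - mean) ** 2, pmf)      # definition: E[(X - E[X])^2]
var_neat = E(lambda x: x * x, pmf) - mean ** 2   # neater form: E[X^2] - E[X]^2
assert abs(var_def - var_neat) < 1e-12
print(var_def)
```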

13. Probability distribution and cumulative distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X|+1]?
A. 0  B. 1  C. 2  D. 3  E. -1

14. Probability distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X|+1]?
Let Y = 2|X|+1; then P(Y = 3) = 1, so var[Y] = 0

15. Probability distribution
✺ Given the random variable S, the sum of two rolls of a fair 4-sided die, with range {2,3,4,5,6,7,8} and probability distribution p(s) as shown on the slide (p(2) = 1/16). What is var[S]?

16. Content
✺ Random Variable
✺ Review with questions
✺ The weak law of large numbers

17. Towards the weak law of large numbers
✺ The weak law says that if we repeat a random experiment many times, the average of the observations will “converge” to the expected value
✺ For example, if you repeat the profit example, the average earning will “converge” to E[X] = 20p − 10
✺ The weak law justifies using simulations (instead of calculation) to estimate the expected values of random variables
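A sketch of this convergence for the profit example. The payoff structure below is an assumption (the slide does not spell it out): a profit of +10 with probability p and −10 otherwise, which gives E[X] = 10p − 10(1 − p) = 20p − 10:

```python
import random

random.seed(0)
p = 0.7                   # assumed win probability
true_mean = 20 * p - 10   # E[X] = 20p - 10 = 4

def profit():
    # +10 with probability p, -10 otherwise (assumed payoff structure).
    return 10 if random.random() < p else -10

N = 100_000
avg = sum(profit() for _ in range(N)) / N
print(avg)  # close to true_mean = 4
```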

18. Markov’s inequality
✺ For any random variable X that only takes values x ≥ 0 and any constant a > 0:
P(X ≥ a) ≤ E[X]/a
✺ For example, if a = 10·E[X]:
P(X ≥ 10·E[X]) ≤ E[X]/(10·E[X]) = 0.1
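Markov's bound can be sanity-checked by simulation. The exponential distribution below is just a convenient nonnegative example, not from the slides:

```python
import random

random.seed(1)
N = 200_000
# Nonnegative samples: exponential with rate 1, so E[X] = 1.
samples = [random.expovariate(1.0) for _ in range(N)]

a = 3.0
freq = sum(x >= a for x in samples) / N   # empirical P(X >= a)
bound = (sum(samples) / N) / a            # E[X]/a, estimated from the samples
assert freq <= bound                      # Markov's inequality holds
print(freq, bound)
```

Note how loose the bound is here: the true tail probability is e⁻³ ≈ 0.05, while the bound is about 1/3.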

  19. Proof of Markov’s inequality

20. Chebyshev’s inequality
✺ For any random variable X and constant a > 0:
P(|X − E[X]| ≥ a) ≤ var[X]/a²
✺ If we let a = kσ, where σ = std[X]:
P(|X − E[X]| ≥ kσ) ≤ 1/k²
✺ In words, the probability that X is more than k standard deviations away from the mean is small
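A simulation check of Chebyshev's bound, using Gaussian samples as an arbitrary example distribution:

```python
import random
import statistics

random.seed(2)
N = 200_000
samples = [random.gauss(0, 1) for _ in range(N)]
mu = statistics.fmean(samples)
sigma = statistics.pstdev(samples)

freqs = {}
for k in (2, 3):
    # Empirical P(|X - E[X]| >= k*sigma), which Chebyshev bounds by 1/k^2.
    freq = sum(abs(x - mu) >= k * sigma for x in samples) / N
    freqs[k] = freq
    assert freq <= 1 / k**2
    print(k, freq, 1 / k**2)
```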

21. Proof of Chebyshev’s inequality
✺ Given Markov’s inequality, for a > 0 and X taking only values x ≥ 0:
P(X ≥ a) ≤ E[X]/a
✺ We can rewrite it for any random variable U and any w > 0 as:
P(|U| ≥ w) ≤ E[|U|]/w

22. Proof of Chebyshev’s inequality
✺ If U = (X − E[X])², then |U| = U, so:
P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w

23. Proof of Chebyshev’s inequality
✺ Apply Markov’s inequality to U = (X − E[X])²:
P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w = var[X]/w
✺ Substitute U = (X − E[X])² and w = a², assuming a > 0:
P((X − E[X])² ≥ a²) ≤ var[X]/a²
⇒ P(|X − E[X]| ≥ a) ≤ var[X]/a²

  24. Now we are closer to the law of large numbers

25. Sample mean and IID samples
✺ We define the sample mean X̄ to be the average of N random variables X₁, …, X_N
✺ If X₁, …, X_N are independent and have identical probability function P(x), then the numbers randomly generated from them are called IID samples
✺ The sample mean is a random variable

26. Sample mean and IID samples
✺ Assume we have a set of IID samples from N random variables X₁, …, X_N that have probability function P(x)
✺ We use X̄ to denote the sample mean of these IID samples:
X̄ = (Σ_{i=1}^N X_i) / N

27. Expected value of sample mean of IID random variables
✺ By linearity of expected value:
E[X̄] = E[(Σ_{i=1}^N X_i) / N] = (1/N) Σ_{i=1}^N E[X_i]

28. Expected value of sample mean of IID random variables
✺ By linearity of expected value:
E[X̄] = E[(Σ_{i=1}^N X_i) / N] = (1/N) Σ_{i=1}^N E[X_i]
✺ Given each X_i has identical P(x):
E[X̄] = (1/N) Σ_{i=1}^N E[X] = E[X]

29. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]

30. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]
✺ And by independence of these IID random variables:
var[X̄] = (1/N²) Σ_{i=1}^N var[X_i]

31. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]
✺ And by independence of these IID random variables:
var[X̄] = (1/N²) Σ_{i=1}^N var[X_i]
✺ Given each X_i has identical P(x), var[X_i] = var[X]:
var[X̄] = (1/N²) Σ_{i=1}^N var[X] = var[X]/N

32. Expected value and variance of sample mean of IID random variables
✺ The expected value of the sample mean is the same as the expected value of the distribution:
E[X̄] = E[X]
✺ The variance of the sample mean is the distribution’s variance divided by the sample size N:
var[X̄] = var[X]/N
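Both facts can be verified empirically; the uniform(0,1) distribution below (E[X] = 1/2, var[X] = 1/12) is an arbitrary choice for illustration:

```python
import random
import statistics

random.seed(3)

def sample_mean(n):
    """Mean of n IID uniform(0,1) draws."""
    return statistics.fmean(random.random() for _ in range(n))

trials = 20_000
results = {}
for n in (1, 4, 16):
    means = [sample_mean(n) for _ in range(trials)]
    # Empirical var[X̄]; should be close to var[X]/n = (1/12)/n.
    results[n] = statistics.pvariance(means)
    print(n, results[n], 1 / 12 / n)
```

Doubling the sample size twice (1 → 4 → 16) cuts the variance of the sample mean by a factor of 4 each time, as var[X]/N predicts.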

33. Weak law of large numbers
✺ Given a random variable X with finite variance, probability distribution function P(x), and the sample mean X̄ of size N
✺ For any positive number ε > 0:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0
✺ That is: when the sample size is very large, the mean of IID samples is, with high probability, very close to the expected value of the population

34. Proof of Weak law of large numbers
✺ Apply Chebyshev’s inequality:
P(|X̄ − E[X̄]| ≥ ε) ≤ var[X̄]/ε²

35. Proof of Weak law of large numbers
✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]

36. Proof of Weak law of large numbers
✺ This gives:
P(|X̄ − E[X]| ≥ ε) ≤ var[X]/(N·ε²)

37. Proof of Weak law of large numbers
✺ As N → ∞, the bound var[X]/(N·ε²) → 0

38. Proof of Weak law of large numbers
✺ Therefore:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0

  39. Applications of the Weak law of large numbers

40. Applications of the Weak law of large numbers
✺ The law of large numbers justifies using simulations (instead of calculation) to estimate the expected values of random variables:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0
✺ The law of large numbers also justifies using the histogram of large random samples to approximate the probability distribution function P(x); see proof on Pg. 353 of the textbook by DeGroot, et al.

41. Histogram of large random IID samples approximates the probability distribution
✺ The law of large numbers justifies using histograms to approximate the probability distribution. Given N IID random variables X₁, …, X_N, define the indicator Y_i = 1 if c₁ ≤ X_i < c₂ and 0 otherwise
✺ According to the law of large numbers:
Ȳ = (Σ_{i=1}^N Y_i) / N → E[Y_i] as N → ∞
✺ As we know, for the indicator function:
E[Y_i] = P(c₁ ≤ X_i < c₂) = P(c₁ ≤ X < c₂)
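A small sketch of this idea: the normalized histogram of many IID draws approaches the pmf. The three-point distribution is made up for illustration:

```python
import random
from collections import Counter

random.seed(4)
# Hypothetical discrete distribution (illustration only).
values, probs = [1, 2, 3], [0.2, 0.3, 0.5]

N = 100_000
samples = random.choices(values, weights=probs, k=N)
counts = Counter(samples)
# Normalized histogram: relative frequency of each value.
hist = {v: counts[v] / N for v in values}
print(hist)  # each entry close to the true probability
```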

42. Simulation of the sum of two dice
✺ http://www.randomservices.org/random/apps/DiceExperiment.html
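For readers without access to the applet, a quick stand-in simulation of the two-dice sum (a sketch, not the applet's actual implementation):

```python
import random
from collections import Counter

random.seed(5)
N = 100_000
# Sum of two fair six-sided dice per trial.
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]
freq = Counter(sums)

est = freq[7] / N   # true P(sum = 7) = 6/36 ≈ 0.1667
print(est)
```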

43. Probability using the property of independence: Airline overbooking
✺ An airline has a flight with s seats. They always sell t (t > s) tickets for this flight. If ticket holders show up independently with probability p, what is the probability that the flight is overbooked?
P(overbooked) = Σ_{u=s+1}^{t} C(t, u) p^u (1 − p)^{t−u}
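The formula can be evaluated directly with Python's `math.comb`. The function name `p_overbooked` is mine; s = 7 and t = 12 come from the next slide, while p = 0.8 is an assumed example value:

```python
from math import comb

def p_overbooked(s, t, p):
    """P(more than s of t independent ticket holders show up):
    sum of C(t, u) * p^u * (1-p)^(t-u) for u = s+1 .. t."""
    return sum(comb(t, u) * p**u * (1 - p) ** (t - u) for u in range(s + 1, t + 1))

print(p_overbooked(7, 12, 0.8))
```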

44. Simulation of airline overbooking
✺ An airline has a flight with 7 seats. They always sell 12 tickets for this flight. If ticket holders show up independently with probability p, estimate the following values:
✺ Expected value of the number of ticket holders who show up
✺ Probability that the flight is overbooked
✺ Expected value of the number of ticket holders who can’t fly because the flight is overbooked
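A Monte Carlo sketch of this exercise. The slide leaves p unspecified, so p = 0.6 below is an assumed value; for a Bernoulli show-up model the true E[show-ups] is 12 × 0.6 = 7.2:

```python
import random

random.seed(6)
SEATS, TICKETS, P_SHOW = 7, 12, 0.6   # P_SHOW is an assumed value
TRIALS = 100_000

show_total = over_count = bumped_total = 0
for _ in range(TRIALS):
    # Each of the 12 ticket holders shows up independently with probability p.
    shows = sum(random.random() < P_SHOW for _ in range(TICKETS))
    show_total += shows
    if shows > SEATS:
        over_count += 1
        bumped_total += shows - SEATS   # passengers who can't fly

print("E[show-ups]  ≈", show_total / TRIALS)    # true value: 12 * 0.6 = 7.2
print("P(overbooked) ≈", over_count / TRIALS)
print("E[bumped]    ≈", bumped_total / TRIALS)
```

By the weak law of large numbers, these sample averages converge to the exact values as TRIALS grows.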
