Probability and Statistics for Computer Science


1. Probability and Statistics for Computer Science
“The weak law of large numbers gives us a very valuable way of thinking about expectations.” ---Prof. Forsythe
Credit: wikipedia
Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 09.22.2020

2. Last time
✺ Random Variable
✺ Expected value
✺ Variance & covariance

  3. Last time

  4. Content

5. Content
✺ Random Variable
✺ Review with questions
✺ The weak law of large numbers
✺ Simulation & example of airline overbooking

6. Expected value
✺ The expected value (or expectation) of a random variable X is
E[X] = Σ_x x·P(x)
✺ The expected value is a weighted sum of all the values X can take
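The weighted-sum definition is easy to check in a few lines of Python. This is a minimal sketch; the pmf and the function name `expected_value` are made up for illustration, not from the slides:

```python
def expected_value(pmf):
    """E[X] = sum over x of x * P(x), with pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical pmf: a fair coin paying 0 or 1.
pmf = {0: 0.5, 1: 0.5}
print(expected_value(pmf))  # 0.5
```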

  7. Linearity of Expectation

  8. Expected value of a function of X

  9. Q: What is E[E[X]]? A. E[X] B. 0 C. Can’t be sure

10. Probability distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is E[2|X|+1]?
A. 0  B. 1  C. 2  D. 3  E. 5

11. Probability distribution
✺ Given the random variable S, the sum of two rolls of a fair 4-sided die, whose range is {2,3,4,5,6,7,8}, with the probability distribution p(s) shown on the slide (p(2) = 1/16). What is E[S]?
A. 4  B. 5  C. 6

12. A neater expression for variance
✺ The variance of a random variable X is defined as:
var[X] = E[(X − E[X])²]
✺ It’s the same as:
var[X] = E[X²] − E[X]²
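A quick numeric check that the two expressions agree, using an arbitrary made-up pmf (the helper `E` is mine, for illustration only):

```python
def E(f, pmf):
    """Expectation of f(X) under a pmf given as {value: probability}."""
    return sum(f(x) * p for x, p in pmf.items())

# Arbitrary example distribution.
pmf = {-1: 0.25, 0: 0.5, 2: 0.25}

mean = E(lambda x: x, pmf)
var_def = E(lambda x: (x - mean) ** 2, pmf)      # definition: E[(X - E[X])^2]
var_neat = E(lambda x: x * x, pmf) - mean ** 2   # neater form: E[X^2] - E[X]^2
assert abs(var_def - var_neat) < 1e-12
print(var_def)
```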

13. Probability distribution and cumulative distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X|+1]?
A. 0  B. 1  C. 2  D. 3  E. -1

14. Probability distribution
✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X|+1]?
Let Y = 2|X|+1; then P(Y = 3) = 1, so var[Y] = 0

15. Probability distribution
✺ Given the random variable S, the sum of two rolls of a fair 4-sided die, with range {2,3,4,5,6,7,8} and probability distribution p(s) as shown on the slide (p(2) = 1/16). What is var[S]?

16. Content
✺ Random Variable
✺ Review with questions
✺ The weak law of large numbers

17. Towards the weak law of large numbers
✺ The weak law says that if we repeat a random experiment many times, the average of the observations will “converge” to the expected value
✺ For example, if you repeat the profit example, the average earning will “converge” to E[X] = 20p − 10
✺ The weak law justifies using simulations (instead of calculation) to estimate the expected values of random variables
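A sketch of this convergence for the profit example. The payoff structure below is an assumption (the slide does not spell it out): a profit of +10 with probability p and −10 otherwise, which gives E[X] = 10p − 10(1 − p) = 20p − 10:

```python
import random

random.seed(0)
p = 0.7                   # assumed win probability
true_mean = 20 * p - 10   # E[X] = 20p - 10 = 4

def profit():
    # +10 with probability p, -10 otherwise (assumed payoff structure).
    return 10 if random.random() < p else -10

N = 100_000
avg = sum(profit() for _ in range(N)) / N
print(avg)  # close to true_mean = 4
```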

18. Markov’s inequality
✺ For any random variable X that only takes values x ≥ 0 and any constant a > 0:
P(X ≥ a) ≤ E[X]/a
✺ For example, if a = 10·E[X]:
P(X ≥ 10·E[X]) ≤ E[X]/(10·E[X]) = 0.1
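Markov's bound can be sanity-checked by simulation. The exponential distribution below is just a convenient nonnegative example, not from the slides:

```python
import random

random.seed(1)
N = 200_000
# Nonnegative samples: exponential with rate 1, so E[X] = 1.
samples = [random.expovariate(1.0) for _ in range(N)]

a = 3.0
freq = sum(x >= a for x in samples) / N   # empirical P(X >= a)
bound = (sum(samples) / N) / a            # E[X]/a, estimated from the samples
assert freq <= bound                      # Markov's inequality holds
print(freq, bound)
```

Note how loose the bound is here: the true tail probability is e⁻³ ≈ 0.05, while the bound is about 1/3.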

  19. Proof of Markov’s inequality

20. Chebyshev’s inequality
✺ For any random variable X and constant a > 0:
P(|X − E[X]| ≥ a) ≤ var[X]/a²
✺ If we let a = kσ, where σ = std[X]:
P(|X − E[X]| ≥ kσ) ≤ 1/k²
✺ In words, the probability that X is more than k standard deviations away from the mean is small
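A simulation check of Chebyshev's bound, using Gaussian samples as an arbitrary example distribution:

```python
import random
import statistics

random.seed(2)
N = 200_000
samples = [random.gauss(0, 1) for _ in range(N)]
mu = statistics.fmean(samples)
sigma = statistics.pstdev(samples)

freqs = {}
for k in (2, 3):
    # Empirical P(|X - E[X]| >= k*sigma), which Chebyshev bounds by 1/k^2.
    freq = sum(abs(x - mu) >= k * sigma for x in samples) / N
    freqs[k] = freq
    assert freq <= 1 / k**2
    print(k, freq, 1 / k**2)
```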

21. Proof of Chebyshev’s inequality
✺ Given Markov’s inequality, for a > 0 and X taking only values x ≥ 0:
P(X ≥ a) ≤ E[X]/a
✺ We can rewrite it for any random variable U and any w > 0 as:
P(|U| ≥ w) ≤ E[|U|]/w

22. Proof of Chebyshev’s inequality
✺ If U = (X − E[X])², then |U| = U, so:
P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w

23. Proof of Chebyshev’s inequality
✺ Apply Markov’s inequality to U = (X − E[X])²:
P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w = var[X]/w
✺ Substitute U = (X − E[X])² and w = a², assuming a > 0:
P((X − E[X])² ≥ a²) ≤ var[X]/a²
⇒ P(|X − E[X]| ≥ a) ≤ var[X]/a²

  24. Now we are closer to the law of large numbers

25. Sample mean and IID samples
✺ We define the sample mean X̄ to be the average of N random variables X₁, …, X_N
✺ If X₁, …, X_N are independent and have identical probability function P(x), then the numbers randomly generated from them are called IID samples
✺ The sample mean is a random variable

26. Sample mean and IID samples
✺ Assume we have a set of IID samples from N random variables X₁, …, X_N that have probability function P(x)
✺ We use X̄ to denote the sample mean of these IID samples:
X̄ = (Σ_{i=1}^N X_i) / N

27. Expected value of sample mean of IID random variables
✺ By linearity of expected value:
E[X̄] = E[(Σ_{i=1}^N X_i) / N] = (1/N) Σ_{i=1}^N E[X_i]

28. Expected value of sample mean of IID random variables
✺ By linearity of expected value:
E[X̄] = E[(Σ_{i=1}^N X_i) / N] = (1/N) Σ_{i=1}^N E[X_i]
✺ Given each X_i has identical P(x):
E[X̄] = (1/N) Σ_{i=1}^N E[X] = E[X]

29. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]

30. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]
✺ And by independence of these IID random variables:
var[X̄] = (1/N²) Σ_{i=1}^N var[X_i]

31. Variance of sample mean of IID random variables
✺ By the scaling property of variance:
var[X̄] = var[(1/N) Σ_{i=1}^N X_i] = (1/N²) var[Σ_{i=1}^N X_i]
✺ And by independence of these IID random variables:
var[X̄] = (1/N²) Σ_{i=1}^N var[X_i]
✺ Given each X_i has identical P(x), var[X_i] = var[X]:
var[X̄] = (1/N²) Σ_{i=1}^N var[X] = var[X]/N

32. Expected value and variance of sample mean of IID random variables
✺ The expected value of the sample mean is the same as the expected value of the distribution:
E[X̄] = E[X]
✺ The variance of the sample mean is the distribution’s variance divided by the sample size N:
var[X̄] = var[X]/N
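Both facts can be verified empirically; the uniform(0,1) distribution below (E[X] = 1/2, var[X] = 1/12) is an arbitrary choice for illustration:

```python
import random
import statistics

random.seed(3)

def sample_mean(n):
    """Mean of n IID uniform(0,1) draws."""
    return statistics.fmean(random.random() for _ in range(n))

trials = 20_000
results = {}
for n in (1, 4, 16):
    means = [sample_mean(n) for _ in range(trials)]
    # Empirical var[X̄]; should be close to var[X]/n = (1/12)/n.
    results[n] = statistics.pvariance(means)
    print(n, results[n], 1 / 12 / n)
```

Doubling the sample size twice (1 → 4 → 16) cuts the variance of the sample mean by a factor of 4 each time, as var[X]/N predicts.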

33. Weak law of large numbers
✺ Given a random variable X with finite variance, probability distribution function P(x), and the sample mean X̄ of size N
✺ For any positive number ε > 0:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0
✺ That is: when the sample size is very large, the mean of IID samples is, with high probability, very close to the expected value of the population

34. Proof of Weak law of large numbers
✺ Apply Chebyshev’s inequality:
P(|X̄ − E[X̄]| ≥ ε) ≤ var[X̄]/ε²

35. Proof of Weak law of large numbers
✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]

36. Proof of Weak law of large numbers
✺ This gives:
P(|X̄ − E[X]| ≥ ε) ≤ var[X]/(N·ε²)

37. Proof of Weak law of large numbers
✺ As N → ∞, the bound var[X]/(N·ε²) → 0

38. Proof of Weak law of large numbers
✺ Therefore:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0

  39. Applications of the Weak law of large numbers

40. Applications of the Weak law of large numbers
✺ The law of large numbers justifies using simulations (instead of calculation) to estimate the expected values of random variables:
lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0
✺ The law of large numbers also justifies using the histogram of large random samples to approximate the probability distribution function P(x); see proof on Pg. 353 of the textbook by DeGroot, et al.

41. Histogram of large random IID samples approximates the probability distribution
✺ The law of large numbers justifies using histograms to approximate the probability distribution. Given N IID random variables X₁, …, X_N, define the indicator Y_i = 1 if c₁ ≤ X_i < c₂ and 0 otherwise
✺ According to the law of large numbers:
Ȳ = (Σ_{i=1}^N Y_i) / N → E[Y_i] as N → ∞
✺ As we know, for the indicator function:
E[Y_i] = P(c₁ ≤ X_i < c₂) = P(c₁ ≤ X < c₂)
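A small sketch of this idea: the normalized histogram of many IID draws approaches the pmf. The three-point distribution is made up for illustration:

```python
import random
from collections import Counter

random.seed(4)
# Hypothetical discrete distribution (illustration only).
values, probs = [1, 2, 3], [0.2, 0.3, 0.5]

N = 100_000
samples = random.choices(values, weights=probs, k=N)
counts = Counter(samples)
# Normalized histogram: relative frequency of each value.
hist = {v: counts[v] / N for v in values}
print(hist)  # each entry close to the true probability
```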

42. Simulation of the sum of two dice
✺ http://www.randomservices.org/random/apps/DiceExperiment.html
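For readers without access to the applet, a quick stand-in simulation of the two-dice sum (a sketch, not the applet's actual implementation):

```python
import random
from collections import Counter

random.seed(5)
N = 100_000
# Sum of two fair six-sided dice per trial.
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]
freq = Counter(sums)

est = freq[7] / N   # true P(sum = 7) = 6/36 ≈ 0.1667
print(est)
```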

43. Probability using the property of independence: Airline overbooking
✺ An airline has a flight with s seats. They always sell t (t > s) tickets for this flight. If ticket holders show up independently with probability p, what is the probability that the flight is overbooked?
P(overbooked) = Σ_{u=s+1}^{t} C(t, u) p^u (1 − p)^{t−u}
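The formula can be evaluated directly with Python's `math.comb`. The function name `p_overbooked` is mine; s = 7 and t = 12 come from the next slide, while p = 0.8 is an assumed example value:

```python
from math import comb

def p_overbooked(s, t, p):
    """P(more than s of t independent ticket holders show up):
    sum of C(t, u) * p^u * (1-p)^(t-u) for u = s+1 .. t."""
    return sum(comb(t, u) * p**u * (1 - p) ** (t - u) for u in range(s + 1, t + 1))

print(p_overbooked(7, 12, 0.8))
```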

44. Simulation of airline overbooking
✺ An airline has a flight with 7 seats. They always sell 12 tickets for this flight. If ticket holders show up independently with probability p, estimate the following values:
✺ Expected value of the number of ticket holders who show up
✺ Probability that the flight is overbooked
✺ Expected value of the number of ticket holders who can’t fly because the flight is overbooked
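A Monte Carlo sketch of this exercise. The slide leaves p unspecified, so p = 0.6 below is an assumed value; for a Bernoulli show-up model the true E[show-ups] is 12 × 0.6 = 7.2:

```python
import random

random.seed(6)
SEATS, TICKETS, P_SHOW = 7, 12, 0.6   # P_SHOW is an assumed value
TRIALS = 100_000

show_total = over_count = bumped_total = 0
for _ in range(TRIALS):
    # Each of the 12 ticket holders shows up independently with probability p.
    shows = sum(random.random() < P_SHOW for _ in range(TICKETS))
    show_total += shows
    if shows > SEATS:
        over_count += 1
        bumped_total += shows - SEATS   # passengers who can't fly

print("E[show-ups]  ≈", show_total / TRIALS)    # true value: 12 * 0.6 = 7.2
print("P(overbooked) ≈", over_count / TRIALS)
print("E[bumped]    ≈", bumped_total / TRIALS)
```

By the weak law of large numbers, these sample averages converge to the exact values as TRIALS grows.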
