Probability Recap
David Dalpiaz
STAT 430, Fall 2017
Administration

Questions? Comments? Concerns?
Probability Rules

Complement Rule: $P[A^c] = 1 - P[A]$

Conditional Probability: $P[A \mid B] = \dfrac{P[A \cap B]}{P[B]}$
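Both rules can be checked by brute-force enumeration on a small sample space. The following is a minimal sketch using two fair dice as an assumed example (the dice are not from the slides):

```python
from fractions import Fraction

# Toy sample space: two fair six-sided dice, all 36 outcomes equally likely.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """P[event] under the uniform distribution on omega."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] + w[1] == 7   # sum of the dice is 7
B = lambda w: w[0] == 3          # first die shows 3

# Complement rule: P[A^c] = 1 - P[A]
assert prob(lambda w: not A(w)) == 1 - prob(A)

# Conditional probability: P[A | B] = P[A ∩ B] / P[B]
p_a_given_b = prob(lambda w: A(w) and B(w)) / prob(B)
print(p_a_given_b)  # 1/6
```

Exact fractions avoid floating-point noise, so the identities hold with `==` rather than an approximate comparison.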
Probability Rules

Multiplication Rule: $P[A \cap B] = P[B] \cdot P[A \mid B]$

For a number of events $E_1, E_2, \ldots, E_n$, the multiplication rule can be expanded into the chain rule:

$$P\left[\bigcap_{i=1}^{n} E_i\right] = P[E_1] \cdot P[E_2 \mid E_1] \cdot P[E_3 \mid E_1 \cap E_2] \cdots P\left[E_n \,\middle|\, \bigcap_{i=1}^{n-1} E_i\right]$$
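As a quick numeric illustration of the chain rule (the card-drawing scenario is an assumed example, not from the slides), the probability of drawing three aces in a row from a standard 52-card deck multiplies the conditional probabilities of each successive draw:

```python
from fractions import Fraction

# Chain rule: P[E1 ∩ E2 ∩ E3] = P[E1] · P[E2 | E1] · P[E3 | E1 ∩ E2].
# E_i = "i-th card drawn is an ace", drawing without replacement.
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)  # 1/5525
```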
Bayes’ Theorem

Define a partition of a sample space $\Omega$ to be $A_1, A_2, \ldots, A_n$ such that $A_i \cap A_j = \emptyset$ for all $i \neq j$, and

$$\bigcup_{i=1}^{n} A_i = \Omega.$$

Let $A_1, A_2, \ldots, A_n$ form a partition of some sample space. Then for an event $B$ we have Bayes’ Theorem:

$$P[A_i \mid B] = \frac{P[A_i] \, P[B \mid A_i]}{P[B]} = \frac{P[A_i] \, P[B \mid A_i]}{\sum_{j=1}^{n} P[A_j] \, P[B \mid A_j]}$$
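A numeric sketch of Bayes’ theorem with a two-event partition. The rates below are hypothetical numbers chosen for illustration, not values from the slides:

```python
# Partition: A1 = "has condition", A2 = "does not"; event B = "test positive".
p_a = [0.01, 0.99]          # prior probabilities P[A_i] (assumed values)
p_b_given_a = [0.95, 0.05]  # P[B | A_i] (assumed values)

# Denominator: P[B] = sum_j P[A_j] P[B | A_j] (law of total probability).
p_b = sum(pa * pb for pa, pb in zip(p_a, p_b_given_a))

# Posterior: P[A_i | B] = P[A_i] P[B | A_i] / P[B].
posterior = [pa * pb / p_b for pa, pb in zip(p_a, p_b_given_a)]
print(round(posterior[0], 3))  # 0.161
```

Even with a 95% true-positive rate, the low prior keeps the posterior probability of the condition modest, which is the classic takeaway of this calculation.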
Independence

Two events $A$ and $B$ are said to be independent if they satisfy

$$P[A \cap B] = P[A] \cdot P[B]$$

A collection of events $E_1, E_2, \ldots, E_n$ is said to be independent if

$$P\left[\bigcap_{i \in S} E_i\right] = \prod_{i \in S} P[E_i]$$

for every subset $S$ of $\{1, 2, \ldots, n\}$. If this is the case, then the chain rule is greatly simplified to:

$$P\left[\bigcap_{i=1}^{n} E_i\right] = \prod_{i=1}^{n} P[E_i]$$
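The defining product identity can be verified directly on a small sample space. This sketch uses two fair coin flips as an assumed example:

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, uniform over 4 outcomes.
omega = list(product("HT", repeat=2))

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == "H"  # first flip is heads
B = lambda w: w[1] == "H"  # second flip is heads

# Independence: P[A ∩ B] = P[A] · P[B]
assert prob(lambda w: A(w) and B(w)) == prob(A) * prob(B)
print("A and B are independent")
```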
Distributions

We often talk about the distribution of a random variable, which can be thought of as:

distribution = list of possible values + associated probabilities
Discrete Random Variables

The distribution of a discrete random variable $X$ is most often specified by a list of possible values and a probability mass function, $p(x)$. The mass function directly gives probabilities, that is,

$$p(x) = p_X(x) = P[X = x].$$
Binomial Distribution

$$p(x \mid n, p) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n, \quad n \in \mathbb{N}, \quad 0 < p < 1.$$

• The function $p(x \mid n, p)$ is the mass function.
• It is a function of $x$, the possible values of the random variable $X$.
• It is conditional on the parameters $n$ and $p$.
• $x = 0, 1, \ldots, n$ indicates the sample space.
• $n \in \mathbb{N}$ and $0 < p < 1$ specify the parameter space.

$$X \sim \text{bin}(n, p)$$
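The mass function above translates directly into code. A minimal sketch (the function name `dbinom` is borrowed from R’s naming convention, not from the slides), with a sanity check that the mass sums to one over the sample space:

```python
from math import comb

def dbinom(x, n, p):
    """Binomial mass function p(x | n, p): C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# The pmf must sum to 1 over the sample space x = 0, 1, ..., n.
n, p = 10, 0.3
assert abs(sum(dbinom(x, n, p) for x in range(n + 1)) - 1) < 1e-12

print(round(dbinom(3, n, p), 4))  # 0.2668
```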
Continuous Random Variables

The distribution of a continuous random variable $X$ is most often specified by a set of possible values and a probability density function, $f(x)$. The probability of the event $a < X < b$ is calculated as

$$P[a < X < b] = \int_a^b f(x) \, dx.$$

Note that densities are not probabilities.
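The integral can be approximated numerically for any density. A minimal sketch, using the Exp(1) density $f(x) = e^{-x}$ as an assumed example (this particular density is not from the slides):

```python
from math import exp

def density(x):
    """Exp(1) density, f(x) = exp(-x) for x > 0 (assumed example)."""
    return exp(-x)

def prob_interval(a, b, steps=100_000):
    """Midpoint-rule approximation of P[a < X < b] = integral of f over (a, b)."""
    h = (b - a) / steps
    return sum(density(a + (i + 0.5) * h) for i in range(steps)) * h

# Exact answer for comparison: exp(-1) - exp(-2) ≈ 0.2325
print(round(prob_interval(1.0, 2.0), 4))  # 0.2325
```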
Normal Distribution

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \cdot \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty, \quad -\infty < \mu < \infty, \quad \sigma > 0.$$

• The function $f(x \mid \mu, \sigma^2)$ is the density function.
• It is a function of $x$, the possible values of the random variable $X$.
• It is conditional on the parameters $\mu$ and $\sigma^2$.
• $-\infty < x < \infty$ indicates the sample space.
• $-\infty < \mu < \infty$ and $\sigma > 0$ specify the parameter space.

$$X \sim N(\mu, \sigma^2)$$
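The density formula above can likewise be coded directly (the name `dnorm` again borrows R’s convention; it is not defined in the slides):

```python
from math import exp, pi, sqrt

def dnorm(x, mu=0.0, sigma2=1.0):
    """Normal density f(x | mu, sigma^2) as written on the slide."""
    sigma = sqrt(sigma2)
    return (1 / (sigma * sqrt(2 * pi))) * exp(-0.5 * ((x - mu) / sigma) ** 2)

# At the mean of a N(0, 1), the density equals 1 / sqrt(2π) ≈ 0.3989,
# which is greater than zero but is NOT a probability.
print(round(dnorm(0.0), 4))  # 0.3989
```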
Expectations

For discrete random variables, we define the expectation of a function of a random variable $X$ as follows.

$$E[g(X)] \triangleq \sum_x g(x) \, p(x)$$

For continuous random variables we have a similar definition.

$$E[g(X)] \triangleq \int g(x) \, f(x) \, dx$$

For specific functions $g$, expectations are given names.
Mean

The mean of a random variable $X$ is given by

$$\mu_X = \text{mean}[X] \triangleq E[X].$$

So for a discrete random variable, we would have

$$\text{mean}[X] = \sum_x x \cdot p(x)$$

For a continuous random variable we would simply replace the sum by an integral.
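For a discrete random variable, the mean is just the probability-weighted sum. A minimal sketch using a fair six-sided die as an assumed example:

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 for x in {1, ..., 6}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# mean[X] = sum over x of x * p(x)
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 7/2
```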
Variance

The variance of a random variable $X$ is given by

$$\sigma_X^2 = \text{var}[X] \triangleq E[(X - E[X])^2] = E[X^2] - (E[X])^2.$$

The standard deviation of a random variable $X$ is given by

$$\sigma_X = \text{sd}[X] \triangleq \sqrt{\sigma_X^2} = \sqrt{\text{var}[X]}.$$

The covariance of random variables $X$ and $Y$ is given by

$$\text{cov}[X, Y] \triangleq E[(X - E[X])(Y - E[Y])] = E[XY] - E[X] \cdot E[Y].$$
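The shortcut formula $\text{var}[X] = E[X^2] - (E[X])^2$ can be checked numerically. Another sketch on the fair-die pmf (an assumed example, as before):

```python
from fractions import Fraction
from math import sqrt

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die

def expect(g):
    """E[g(X)] = sum over x of g(x) * p(x) for the discrete pmf above."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x)                 # E[X] = 7/2
var = expect(lambda x: x**2) - mean**2     # E[X^2] - (E[X])^2
sd = sqrt(var)                             # standard deviation
print(var)  # 35/12
```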
Likelihood

Consider $n$ iid random variables $X_1, X_2, \ldots, X_n$. We can then write their likelihood as

$$L(\theta \mid x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f(x_i; \theta)$$

where $f(x_i; \theta)$ is the density (or mass) function of random variable $X_i$ evaluated at $x_i$ with parameter $\theta$. Whereas a probability is a function of a possible observed value given a particular parameter value, a likelihood is the opposite: it is a function of a possible parameter value given observed data.
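A minimal sketch of this "function of the parameter" view, using iid Bernoulli($\theta$) data as an assumed example (the data and the grid search are illustrative, not from the slides):

```python
# Assumed observed data: iid Bernoulli(theta) draws, 7 successes in 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def likelihood(theta):
    """L(theta | x_1, ..., x_n) = prod of theta^x_i * (1 - theta)^(1 - x_i)."""
    out = 1.0
    for x in data:
        out *= theta**x * (1 - theta) ** (1 - x)
    return out

# The data are fixed; likelihood varies with theta. A crude grid search
# finds the maximizer, which here is the sample proportion 7/10.
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=likelihood)
print(best)  # 0.7
```

Viewing the same product as a function of $\theta$ rather than of the data is exactly the shift of perspective the slide describes, and maximizing it is the idea behind maximum likelihood estimation.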