Probability Review
Applied Bayesian Statistics
Dr. Earvin Balderama
Department of Mathematics & Statistics, Loyola University Chicago
August 31, 2017
Last edited September 8, 2017 by Earvin Balderama <ebalderama@luc.edu>
Random Variables

Mathematically, a random variable is a function that maps a sample space into the real numbers: X : S → R. Its range can be:
1. Countable (discrete).
2. Uncountable (continuous).

Example: 3 coin tosses.
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
We may want to define a random variable X as the number of tails, so that X ∈ {0, 1, 2, 3}.
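The coin-toss example can be checked directly in code: enumerate the sample space and map each outcome to X = number of tails. A minimal sketch (the variable names are illustrative):

```python
# Enumerate the sample space of 3 coin tosses and map each outcome
# to X = the number of tails, as in the example above.
from itertools import product

sample_space = ["".join(toss) for toss in product("HT", repeat=3)]
X = {outcome: outcome.count("T") for outcome in sample_space}

print(len(sample_space))        # 8 outcomes
print(sorted(set(X.values())))  # [0, 1, 2, 3]
```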
Probability

Mathematically, a probability function assigns numbers between 0 and 1 to subsets of a sample space: P : B → [0, 1], where B is a collection of subsets of S.

Two interpretations:
1. (Frequentist) Based on long-run relative frequencies of possible outcomes.
2. (Bayesian) Based on belief about how likely each possible outcome is.

Regardless of interpretation, the same basic probability laws apply, e.g.,
P(A) ≥ 0,  P(S) = 1,  P(A ∪ B) = P(A) + P(B) for mutually exclusive A and B.
Probability distributions

A probability distribution is a list of all possible values of a random variable and their corresponding probabilities.

1. Discrete random variable: probability mass function (PMF)
   PMF: f(x) = Prob(X = x) ≥ 0
   Mean: E(X) = ∑_x x f(x)
   Variance: V(X) = ∑_x [x − E(X)]² f(x)

2. Continuous random variable: probability density function (PDF)
   Prob(X = x) = 0 for all x
   PDF: f(x) ≥ 0,  Prob(X ∈ B) = ∫_B f(x) dx
   Mean: E(X) = ∫ x f(x) dx
   Variance: V(X) = ∫ [x − E(X)]² f(x) dx
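The discrete formulas above translate directly into code. A minimal sketch, using an illustrative PMF (number of heads in 2 fair tosses):

```python
# E(X) and V(X) computed directly from a discrete PMF,
# following the sums E(X) = sum x f(x) and V(X) = sum [x - E(X)]^2 f(x).
def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # e.g., number of heads in 2 fair tosses
print(mean(pmf))      # 1.0
print(variance(pmf))  # 0.5
```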
Parametric families of distributions

A statistical analysis typically proceeds by selecting a PMF (or PDF) that seems to match the distribution of a sample. We rarely know the PMF (or PDF) exactly, but we may assume it is from a parametric family of distributions, and estimate the parameters.

1. Discrete random variables
   Binomial (Bernoulli is a special case)
   Poisson
   NegativeBinomial

2. Continuous random variables
   Normal
   Gamma (Exponential and χ² are special cases)
   InverseGamma
   Beta (Uniform is a special case)
X ∼ Bernoulli(θ)

Only two outcomes (success/failure, 0/1, zero/nonzero, etc.), where θ is the probability of success.

X ∈ {0, 1}
PMF: f(x) = Prob(X = x) = 1 − θ if x = 0, and θ if x = 1.
Mean: E(X) = ∑_x x f(x) = 0(1 − θ) + 1·θ = θ
Variance: V(X) = ∑_x [x − θ]² f(x) = (0 − θ)²(1 − θ) + (1 − θ)²θ = θ(1 − θ)
X ∼ Binomial(n, θ)

X = number of “successes” in n independent “Bernoulli trials,” where θ is the probability of success on each trial.

X ∈ {0, 1, . . . , n}
PMF: f(x) = Prob(X = x) = (n choose x) θ^x (1 − θ)^(n−x)
Mean: E(X) = nθ
Variance: V(X) = nθ(1 − θ)
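The Binomial formulas can be verified numerically: the PMF sums to 1 over {0, …, n}, and the mean works out to nθ. A sketch with illustrative values of n and θ:

```python
# Numerical check of the Binomial PMF, mean, and variance.
from math import comb

def binom_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

n, theta = 10, 0.3
total = sum(binom_pmf(x, n, theta) for x in range(n + 1))
mean = sum(x * binom_pmf(x, n, theta) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, theta) for x in range(n + 1))
print(round(total, 10))  # 1.0
print(round(mean, 10))   # 3.0  = nθ
print(round(var, 10))    # 2.1  = nθ(1−θ)
```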
X ∼ Poisson(λ)

X = number of events that occur in a unit of time.

X ∈ {0, 1, . . .}
PMF: f(x) = Prob(X = x) = λ^x e^(−λ) / x!
Mean: E(X) = λ
Variance: V(X) = λ

Note: Can be parameterized with λ = nθ, where θ is the expected number of events per unit time and n is the number of time units observed, so that E(X) = V(X) = nθ.
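The defining Poisson property, E(X) = V(X) = λ, can be confirmed by summing the PMF (truncating the infinite sum where the tail is negligible). A sketch with an illustrative λ:

```python
# Numerical check that the Poisson mean and variance both equal lambda.
from math import exp, factorial

def pois_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)

lam = 4.0
# Truncate the infinite sums at x = 99; the remaining tail is negligible.
mean = sum(x * pois_pmf(x, lam) for x in range(100))
var = sum((x - lam) ** 2 * pois_pmf(x, lam) for x in range(100))
print(round(mean, 6))  # 4.0
print(round(var, 6))   # 4.0
```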
X ∼ NegativeBinomial(r, θ)

X = number of “failures” until r “successes” in a sequence of independent “Bernoulli trials,” where θ is the probability of success on each trial.

X ∈ {0, 1, . . .}
PMF: f(x) = Prob(X = x) = (x + r − 1 choose x) θ^r (1 − θ)^x
Mean: E(X) = r(1 − θ)/θ
Variance: V(X) = r(1 − θ)/θ²

Note: The geometric distribution is a special case: Geom(θ) = NB(1, θ).
Note: There are MANY different ways to specify the NB distribution. The important thing to note is that NB is a discrete count distribution that is a more flexible model than the Poisson.
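Both notes can be checked numerically: the NB(1, θ) PMF coincides with the geometric PMF (failures before the first success), and the truncated mean matches r(1 − θ)/θ. A sketch with illustrative r and θ:

```python
# Check Geom(theta) = NB(1, theta) and the Negative Binomial mean formula.
from math import comb

def nb_pmf(x, r, theta):
    return comb(x + r - 1, x) * theta**r * (1 - theta)**x

def geom_pmf(x, theta):  # number of failures before the first success
    return theta * (1 - theta)**x

theta = 0.4
# The two PMFs agree at every support point.
agree = all(abs(nb_pmf(x, 1, theta) - geom_pmf(x, theta)) < 1e-12 for x in range(50))
print(agree)  # True

# Mean matches r(1 - theta)/theta (sum truncated where the tail is negligible).
r = 3
mean = sum(x * nb_pmf(x, r, theta) for x in range(500))
print(round(mean, 6))  # 4.5 = 3(0.6)/0.4
```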
X ∼ Normal(µ, σ²)

X ∈ (−∞, ∞)
PDF: f(x) = (1 / (√(2π) σ)) exp(−(x − µ)² / (2σ²))
Mean: E(X) = µ
Variance: V(X) = σ²
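As a sanity check on the PDF as written, a simple Riemann sum over a wide grid shows the density integrates to 1 and has mean µ. A sketch with illustrative µ and σ:

```python
# Numerical check that the Normal PDF integrates to 1 with mean mu.
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

mu, sigma = 2.0, 1.5
dx = 0.001
# Grid covering mu +/- 10 sigma; mass outside is negligible.
grid = [mu - 10 * sigma + i * dx for i in range(int(20 * sigma / dx))]
total = sum(normal_pdf(x, mu, sigma) * dx for x in grid)
mean = sum(x * normal_pdf(x, mu, sigma) * dx for x in grid)
print(round(total, 4))  # 1.0
print(round(mean, 4))   # 2.0
```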
X ∼ Gamma(a, b)

X ∈ (0, ∞)
PDF: f(x) = (b^a / Γ(a)) x^(a−1) e^(−bx)
Mean: E(X) = a/b
Variance: V(X) = a/b²
Parameters: shape a > 0, rate b > 0.
X ∼ InverseGamma(a, b)

If Y ∼ Gamma(a, b), then X = 1/Y ∼ InverseGamma(a, b).

X ∈ (0, ∞)
PDF: f(x) = (b^a / Γ(a)) x^(−a−1) e^(−b/x)
Mean: E(X) = b/(a − 1), for a > 1.
Variance: V(X) = b² / [(a − 1)²(a − 2)], for a > 2.
Parameters: shape a > 0, rate b > 0.
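The reciprocal relationship above is easy to check by simulation: draw from Gamma(a, b), invert, and compare the sample mean to b/(a − 1). Note that Python's `random.gammavariate` takes a shape and a *scale* parameter, so the scale is 1/b for rate b. A sketch with illustrative a and b:

```python
# Simulate X = 1/Y with Y ~ Gamma(a, b) and compare to the
# InverseGamma mean b/(a - 1).
import random

random.seed(42)
a, b = 5.0, 2.0
# gammavariate(shape, scale): scale = 1/rate.
xs = [1.0 / random.gammavariate(a, 1.0 / b) for _ in range(200_000)]
sample_mean = sum(xs) / len(xs)
print(round(sample_mean, 2))  # ≈ b/(a-1) = 0.5
```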
X ∼ Beta(a, b)

X ∈ [0, 1]
PDF: f(x) = (Γ(a + b) / (Γ(a)Γ(b))) x^(a−1) (1 − x)^(b−1)
Mean: E(X) = a/(a + b)
Variance: V(X) = ab / [(a + b)²(a + b + 1)]
Parameters: a > 0, b > 0.
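The Beta normalizing constant and mean can be verified with a midpoint-rule integral over (0, 1). A sketch with illustrative a and b:

```python
# Numerical check of the Beta PDF normalization and mean a/(a+b).
from math import gamma

def beta_pdf(x, a, b):
    return gamma(a + b) / (gamma(a) * gamma(b)) * x ** (a - 1) * (1 - x) ** (b - 1)

a, b = 2.0, 3.0
n = 10_000
dx = 1.0 / n
grid = [(i + 0.5) * dx for i in range(n)]  # midpoint rule on (0, 1)
total = sum(beta_pdf(x, a, b) * dx for x in grid)
mean = sum(x * beta_pdf(x, a, b) * dx for x in grid)
print(round(total, 4))  # 1.0
print(round(mean, 4))   # 0.4 = a/(a+b)
```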
Joint distributions

A random vector of p random variables: X = (X₁, X₂, . . . , X_p).
For now, suppose we have just p = 2 random variables, X and Y.
(X, Y) can be discrete or continuous.
Joint distributions

1. Discrete (X, Y)
   joint PMF: f(x, y) = Prob(X = x, Y = y)
   marginal PMF for X: f_X(x) = Prob(X = x) = ∑_y f(x, y)
   marginal PMF for Y: f_Y(y) = Prob(Y = y) = ∑_x f(x, y)

2. Continuous (X, Y)
   joint PDF: f(x, y), with Prob[(X, Y) ∈ B] = ∫∫_B f(x, y) dx dy
   marginal PDF for X: f_X(x) = ∫ f(x, y) dy
   marginal PDF for Y: f_Y(y) = ∫ f(x, y) dx
Discrete random variables

Example: Patients are randomly assigned a dose and followed to determine whether they develop a tumor. X ∈ {5, 10, 20} is the dose; Y ∈ {0, 1} is 1 if a tumor develops and 0 otherwise. The joint PMF is given by:

          X = 5   X = 10   X = 20
  Y = 0   0.469   0.124    0.049
  Y = 1   0.231   0.076    0.051
Discrete random variables

Example: Find the marginal PMFs of X and Y.

f_Y(0) = ∑_x f(x, 0) = 0.469 + 0.124 + 0.049 = 0.642
f_Y(1) = ∑_x f(x, 1) = 0.231 + 0.076 + 0.051 = 0.358
f_X(5) = 0.7,  f_X(10) = 0.2,  f_X(20) = 0.1

          X = 5   X = 10   X = 20   f_Y(y)
  Y = 0   0.469   0.124    0.049    0.642
  Y = 1   0.231   0.076    0.051    0.358
  f_X(x)  0.7     0.2      0.1      1
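The marginal calculation above is just summing rows and columns of the joint table, which a few lines of code make explicit:

```python
# Recover the marginal PMFs of X and Y from the joint PMF table
# by summing over the other variable.
joint = {
    (5, 0): 0.469, (10, 0): 0.124, (20, 0): 0.049,
    (5, 1): 0.231, (10, 1): 0.076, (20, 1): 0.051,
}

f_X = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (5, 10, 20)}
f_Y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

print({x: round(p, 3) for x, p in f_X.items()})  # {5: 0.7, 10: 0.2, 20: 0.1}
print({y: round(p, 3) for y, p in f_Y.items()})  # {0: 0.642, 1: 0.358}
```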
Discrete random variables

conditional PMF of Y given X:
f(y | x) = Prob(Y = y | X = x) = Prob(X = x, Y = y) / Prob(X = x) = f(x, y) / f_X(x)

That is, conditional = joint / marginal.
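Applying "conditional = joint / marginal" to the tumor-dose example gives, for instance, the probability of a tumor at dose 20:

```python
# Conditional PMF f(y | x) = f(x, y) / f_X(x), using the joint PMF
# from the tumor-dose example.
joint = {
    (5, 0): 0.469, (10, 0): 0.124, (20, 0): 0.049,
    (5, 1): 0.231, (10, 1): 0.076, (20, 1): 0.051,
}

def f_X(x):
    return sum(p for (xx, _), p in joint.items() if xx == x)

def cond_Y_given_X(y, x):
    return joint[(x, y)] / f_X(x)

# Probability a tumor develops given dose 20: 0.051 / 0.1
print(round(cond_Y_given_X(1, 20), 3))  # 0.51
```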