

  1. Conjugate Priors: Beta and Normal — 18.05 Spring 2014 — January 1, 2017

  2. Review: continuous priors, discrete data
  'Bent' coin: unknown probability θ of heads. Prior f(θ) = 2θ on [0,1]. Data: heads on one toss.
  Question: Find the posterior pdf for θ given this data.

  hypoth.   prior     likelihood   Bayes numerator       posterior
  θ         2θ dθ     θ            2θ² dθ                3θ² dθ
  Total     1                      ∫₀¹ 2θ² dθ = 2/3      T = 1

  Posterior pdf: f(θ|x) = 3θ².
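The update above can be checked numerically. The sketch below (not part of the slides) puts the prior 2θ times the likelihood θ on a grid and normalizes; the normalizing constant approximates ∫₀¹ 2θ² dθ = 2/3, and the normalized pdf matches 3θ².

```python
# Numerical check of the bent-coin update: prior f(theta) = 2*theta,
# likelihood of one head = theta, so the unnormalized posterior is
# 2*theta^2, which normalizes to 3*theta^2.
n = 100_000
d = 1.0 / n
grid = [(i + 0.5) * d for i in range(n)]      # midpoints of [0, 1]
unnorm = [2 * t * t for t in grid]            # prior * likelihood
total = sum(unnorm) * d                       # approximates 2/3
post = [u / total for u in unnorm]            # normalized posterior pdf
print(round(total, 4))                        # ≈ 0.6667
print(round(post[n // 2], 4))                 # ≈ 0.75 = 3 * (0.5)^2
```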

  3. Review: continuous priors, continuous data
  Bayesian update table:

  hypoth.   prior      likelihood   Bayes numerator            posterior
  θ         f(θ) dθ    f(x|θ)       f(x|θ) f(θ) dθ             f(θ|x) dθ = f(x|θ) f(θ) dθ / f(x)
  total     1                       f(x) = ∫ f(x|θ) f(θ) dθ    1

  Notice that we overuse the letter f. It is a generic symbol meaning 'whatever function is appropriate here'.

  4. Romeo and Juliet
  See class 14 slides.

  5. Updating with normal prior and normal likelihood
  A normal prior is conjugate to a normal likelihood with known σ.
  Data: x₁, x₂, ..., xₙ.
  Normal likelihood: x₁, x₂, ..., xₙ ∼ N(θ, σ²), where θ is our unknown parameter of interest and σ is known.
  Normal prior: θ ∼ N(µ_prior, σ²_prior).
  Normal posterior: θ ∼ N(µ_post, σ²_post).
  We have simple updating formulas that allow us to avoid complicated algebra or integrals (see next slide).

  hypoth.   prior                                            likelihood                          posterior
  θ         f(θ) = c₁ exp(−(θ − µ_prior)²/(2σ²_prior))       f(x|θ) = c₂ exp(−(x − θ)²/(2σ²))    f(θ|x) = c₃ exp(−(θ − µ_post)²/(2σ²_post))

  6. Board question: normal-normal updating formulas

  a = 1/σ²_prior,   b = n/σ²,   µ_post = (a·µ_prior + b·x̄)/(a + b),   σ²_post = 1/(a + b).

  Suppose we have one data point x = 2 drawn from N(θ, 3²), and θ is our parameter of interest with prior θ ∼ N(4, 2²).
  0. Identify µ_prior, σ_prior, σ, n, and x̄.
  1. Make a Bayesian update table, but leave the posterior as an unsimplified product.
  2. Use the updating formulas to find the posterior.
  3. Do enough of the algebra to see that the updating formulas follow from the update table.
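The updating formulas above are easy to apply mechanically. A minimal sketch (the function name is illustrative, not from the slides) with the board question's numbers, x = 2 from N(θ, 3²) and prior θ ∼ N(4, 2²):

```python
# Normal-normal updating formulas: a = 1/var_prior, b = n/var_lik,
# mu_post = (a*mu_prior + b*xbar)/(a + b), var_post = 1/(a + b).
def normal_update(mu_prior, var_prior, var_lik, xbar, n=1):
    a = 1.0 / var_prior
    b = n / var_lik
    mu_post = (a * mu_prior + b * xbar) / (a + b)
    var_post = 1.0 / (a + b)
    return mu_post, var_post

mu_post, var_post = normal_update(mu_prior=4, var_prior=4, var_lik=9, xbar=2)
print(round(mu_post, 4), round(var_post, 4))   # 44/13 ≈ 3.3846, 36/13 ≈ 2.7692
```

Note that the posterior mean 44/13 ≈ 3.38 lies between the prior mean 4 and the data 2, weighted by the precisions a and b.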

  7. Concept question: normal priors, normal likelihood
  [Figure: blue prior pdf together with five candidate pdfs labeled Plot 1–Plot 5 on the interval [0, 14]; vertical scale 0 to 0.8.]
  Blue graph = prior. Red lines = data in order: 3, 9, 12.
  (a) Which plot is the posterior to just the first data value? (Click on the plot number.)

  8. Concept question: normal priors, normal likelihood
  [Figure: same plot as the previous slide — blue prior and candidate pdfs Plot 1–Plot 5 on [0, 14].]
  Blue graph = prior. Red lines = data in order: 3, 9, 12.
  (b) Which graph is the posterior to all 3 data values? (Click on the plot number.)

  9. Board question: normal/normal

  For data x₁, ..., xₙ with data mean x̄ = (x₁ + ... + xₙ)/n:
  a = 1/σ²_prior,   b = n/σ²,   µ_post = (a·µ_prior + b·x̄)/(a + b),   σ²_post = 1/(a + b).

  Question. On a basketball team the average free-throw percentage over all players follows a N(75, 36) distribution. In a given year an individual player's free-throw percentage is N(θ, 16), where θ is their career average. This season Sophie Lie made 85 percent of her free throws. What is the posterior expected value of her career percentage θ?
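A quick sketch of the arithmetic for this board question, using the updating formulas with prior N(75, 36), one observation x = 85 from N(θ, 16):

```python
# Free-throw board question: n = 1, sigma_prior^2 = 36, sigma^2 = 16.
a = 1 / 36                              # 1 / sigma_prior^2
b = 1 / 16                              # n / sigma^2 with n = 1
mu_post = (a * 75 + b * 85) / (a + b)   # exact value is 1065/13
print(round(mu_post, 2))                # ≈ 81.92
```

The posterior mean shrinks the observed 85 toward the team-wide prior mean 75.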

  10. Conjugate priors
  A prior is conjugate to a likelihood if the posterior is the same type of distribution as the prior. Updating becomes algebra instead of calculus.

  Bernoulli/Beta — hypothesis θ ∈ [0, 1], data x:
      prior: beta(a, b) = c₁ θ^(a−1) (1−θ)^(b−1)
      likelihood: Bernoulli(θ): θ if x = 1, 1−θ if x = 0
      posterior: beta(a+1, b) = c₃ θ^a (1−θ)^(b−1) if x = 1, or beta(a, b+1) = c₃ θ^(a−1) (1−θ)^b if x = 0

  Binomial/Beta (fixed N) — θ ∈ [0, 1], data x:
      prior: beta(a, b) = c₁ θ^(a−1) (1−θ)^(b−1)
      likelihood: binomial(N, θ) = c₂ θ^x (1−θ)^(N−x)
      posterior: beta(a+x, b+N−x) = c₃ θ^(a+x−1) (1−θ)^(b+N−x−1)

  Geometric/Beta — θ ∈ [0, 1], data x:
      prior: beta(a, b) = c₁ θ^(a−1) (1−θ)^(b−1)
      likelihood: geometric(θ) = θ^x (1−θ)
      posterior: beta(a+x, b+1) = c₃ θ^(a+x−1) (1−θ)^b

  Normal/Normal (fixed σ²) — θ ∈ (−∞, ∞), data x:
      prior: N(µ_prior, σ²_prior) = c₁ exp(−(θ − µ_prior)²/(2σ²_prior))
      likelihood: N(θ, σ²) = c₂ exp(−(x − θ)²/(2σ²))
      posterior: N(µ_post, σ²_post) = c₃ exp(−(θ − µ_post)²/(2σ²_post))

  There are many other likelihood/conjugate prior pairs.
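The binomial/beta row of this table can be verified numerically. The sketch below (illustrative numbers, not from the slides) compares the posterior mean computed on a grid from prior × likelihood against the closed-form mean of beta(a+x, b+N−x):

```python
import math

# Prior beta(a, b), x successes in N binomial trials.
# Conjugacy says the posterior is beta(a + x, b + N - x).
a, b, N, x = 2, 3, 10, 7
B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)   # beta function B(a, b)

n = 50_000
d = 1.0 / n
grid = [(i + 0.5) * d for i in range(n)]
prior = [t ** (a - 1) * (1 - t) ** (b - 1) / B for t in grid]
lik = [t ** x * (1 - t) ** (N - x) for t in grid]
unnorm = [p * l for p, l in zip(prior, lik)]
total = sum(unnorm) * d
mean_grid = sum(t * u for t, u in zip(grid, unnorm)) * d / total
mean_conj = (a + x) / (a + b + N)   # mean of beta(a+x, b+N-x) = 9/15
print(round(mean_grid, 4), mean_conj)   # both ≈ 0.6
```

This is the sense in which updating becomes algebra: the grid integral is unnecessary once you know the posterior's family and parameters.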

  11. Concept question: conjugate priors
  Which are conjugate priors?

  a) Exponential/Normal — θ ∈ [0, ∞), data x:
      prior: N(µ_prior, σ²_prior) = c₁ exp(−(θ − µ_prior)²/(2σ²_prior))
      likelihood: exp(θ) = θ e^(−θx)
  b) Exponential/Gamma — θ ∈ [0, ∞), data x:
      prior: Gamma(a, b) = c₁ θ^(a−1) e^(−bθ)
      likelihood: exp(θ) = θ e^(−θx)
  c) Binomial/Normal (fixed N) — θ ∈ [0, 1], data x:
      prior: N(µ_prior, σ²_prior) = c₁ exp(−(θ − µ_prior)²/(2σ²_prior))
      likelihood: binomial(N, θ) = c₂ θ^x (1−θ)^(N−x)

  1. none   2. a   3. b   4. c   5. a,b   6. a,c   7. b,c   8. a,b,c

  12. Variance can increase
  Normal-normal: variance always decreases with data.
  Beta-binomial: variance usually decreases with data.
  [Figure: pdfs of beta(2,12), beta(21,19), beta(21,12), and beta(12,12) on [0, 1]; vertical scale 0 to 6.]
  The variance of beta(12,12) (magenta) is bigger than that of beta(2,12) (blue), yet beta(12,12) can be the posterior to a beta(2,12) prior — so data can increase the variance.
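A quick sketch of this point, using the standard formula Var(beta(a, b)) = ab / ((a+b)²(a+b+1)). Observing 10 heads in 10 binomial tosses updates the beta(2, 12) prior to a beta(12, 12) posterior, and the variance goes up:

```python
# Variance of a beta(a, b) distribution.
def beta_var(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

print(round(beta_var(2, 12), 5))    # 0.00816  (prior)
print(round(beta_var(12, 12), 5))   # 0.01     (posterior after 10 heads in 10 tosses)
```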

  13. Table discussion: likelihood principle
  Suppose the prior has been set. Let x₁ and x₂ be two sets of data. Which of the following are true?
  (a) If the likelihoods f(x₁|θ) and f(x₂|θ) are the same, then they result in the same posterior.
  (b) If x₁ and x₂ result in the same posterior, then their likelihood functions are the same.
  (c) If the likelihoods f(x₁|θ) and f(x₂|θ) are proportional, then they result in the same posterior.
  (d) If two likelihood functions are proportional, then they are equal.
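Statement (c) is worth seeing concretely: any constant factor in the likelihood cancels when the Bayes numerator is divided by the total. A small sketch (illustrative prior and likelihoods, not from the slides):

```python
# Two proportional likelihoods (L2 = 5 * L1) normalize to the same
# posterior, because the factor 5 appears in both numerator and total.
n = 1000
d = 1.0 / n
grid = [(i + 0.5) * d for i in range(n)]
prior = [2 * t for t in grid]                 # the bent-coin prior 2*theta
L1 = [t ** 3 * (1 - t) for t in grid]
L2 = [5 * v for v in L1]                      # proportional likelihood

def posterior(lik):
    unnorm = [p * l for p, l in zip(prior, lik)]
    total = sum(unnorm) * d
    return [u / total for u in unnorm]

p1, p2 = posterior(L1), posterior(L2)
print(max(abs(u - v) for u, v in zip(p1, p2)) < 1e-9)   # True
```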

  14. Concept question: strong priors
  Say we have a bent coin with unknown probability of heads θ. We are convinced that θ ≤ 0.7. Our prior is uniform on [0, 0.7] and 0 from 0.7 to 1. We flip the coin 65 times and get 60 heads.
  Which of the graphs below is the posterior pdf for θ?
  [Figure: six candidate pdfs labeled A–F on [0, 1]; vertical scale 0 to 80.]
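A sketch of what the posterior must look like here: on [0, 0.7] it is proportional to θ⁶⁰(1−θ)⁵, which is increasing all the way up to its unconstrained maximum at 60/65 ≈ 0.92, so the truncated posterior piles up against the cutoff at 0.7:

```python
# Uniform prior on [0, 0.7] times likelihood theta^60 (1-theta)^5:
# the posterior is the truncated likelihood, maximized at the cutoff.
n = 70_000
d = 0.7 / n
grid = [(i + 0.5) * d for i in range(n)]      # midpoints of [0, 0.7]
unnorm = [t ** 60 * (1 - t) ** 5 for t in grid]
mode = grid[max(range(n), key=lambda i: unnorm[i])]
print(round(mode, 3))   # ≈ 0.7: the posterior mode sits at the prior's cutoff
```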

  15. MIT OpenCourseWare https://ocw.mit.edu 18.05 Introduction to Probability and Statistics Spring 2014 For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
