Bayesian Linear Regression Seung-Hoon Na Chonbuk National - PowerPoint PPT Presentation


  1. Bayesian Linear Regression Seung-Hoon Na Chonbuk National University

  2. Bayesian Linear Regression • Compute the full posterior over $\mathbf{w}$ and $\sigma^2$ • Case 1) the noise variance $\sigma^2$ is known – Use a Gaussian prior • Case 2) the noise variance $\sigma^2$ is unknown – Use a normal inverse gamma (NIG) prior

  3. Posterior: $\sigma^2$ is known • The likelihood includes an offset $\mu$, handled by putting an improper prior on $\mu$. Further assume that the output is centered. • The conjugate prior: Gaussian
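The equations on this slide did not survive extraction. Assuming the slide follows the standard textbook treatment, with centered inputs and a flat prior $p(\mu) \propto 1$ that is integrated out, the likelihood and the conjugate prior would take the following form. The names $\mathbf{w}_0$ and $\mathbf{V}_0$ for the prior mean and covariance are assumptions, not taken from the slide.

```latex
% Likelihood with offset \mu
p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}, \mu, \sigma^2)
  = \mathcal{N}(\mathbf{y} \mid \mu \mathbf{1}_N + \mathbf{X}\mathbf{w},\; \sigma^2 \mathbf{I}_N)

% After integrating out \mu (centered inputs); \bar{y} = 0 when the output is centered
p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}, \sigma^2)
  \propto \exp\!\left(-\frac{1}{2\sigma^2}
     \lVert \mathbf{y} - \bar{y}\mathbf{1}_N - \mathbf{X}\mathbf{w} \rVert_2^2\right)

% Conjugate Gaussian prior (symbol names assumed)
p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_0, \mathbf{V}_0)
```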

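  4. Posterior: $\sigma^2$ is known • The posterior: – If the prior has zero mean and covariance $\tau^2 \mathbf{I}$, then the posterior mean reduces to the ridge estimate with regularization strength $\sigma^2/\tau^2$: $\mathbf{w}_N = \left(\frac{\sigma^2}{\tau^2}\mathbf{I} + \mathbf{X}^T\mathbf{X}\right)^{-1}\mathbf{X}^T\mathbf{y}$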
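A minimal sketch of the ridge-form posterior, assuming a two-parameter model $y = w_0 + w_1 x$ (design rows $[1, x]$) and a zero-mean prior with covariance $\tau^2 \mathbf{I}$. The function name `posterior_known_noise` and the sample data are hypothetical; the 2x2 algebra is written out by hand so that only the standard library is needed.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def posterior_known_noise(xs, ys, sigma2, tau2):
    """Posterior N(w_N, V_N) for y = w0 + w1*x + noise, prior N(0, tau2 * I)."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    ratio = sigma2 / tau2
    # A = (sigma2 / tau2) * I + X^T X, with design rows [1, x]
    a_inv = inv2([[ratio + n, sx], [sx, ratio + sxx]])
    # Posterior mean is the ridge estimate A^{-1} X^T y
    w_n = [a_inv[0][0] * sy + a_inv[0][1] * sxy,
           a_inv[1][0] * sy + a_inv[1][1] * sxy]
    # Posterior covariance is V_N = sigma2 * A^{-1}
    v_n = [[sigma2 * v for v in row] for row in a_inv]
    return w_n, v_n


# Illustrative noise-free data generated from w0 = -0.3, w1 = 0.5
xs = [-1.0, 0.0, 1.0]
ys = [-0.3 + 0.5 * x for x in xs]
w_n, v_n = posterior_known_noise(xs, ys, sigma2=0.04, tau2=0.5)
print(w_n, v_n)
```

With no data the function returns the prior itself (zero mean, covariance $\tau^2 \mathbf{I}$), and as $\tau^2$ grows the mean approaches the least-squares fit.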

  5. Posterior: $\sigma^2$ is known • 1D example – the true parameters: $w_0 = -0.3$, $w_1 = 0.5$ • Sequential Bayesian inference • Posterior given the first $n$ data points
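Sequential inference can be sketched by keeping the Gaussian posterior in natural form (precision matrix and precision-times-mean), so that each new point is absorbed by a simple addition. This is an illustrative sketch under the same assumptions as the 1D example (two parameters, known $\sigma^2$, zero-mean prior with covariance $\tau^2 \mathbf{I}$); the helper names and the data values are hypothetical.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def update(precision, eta, x, y, sigma2):
    """Absorb one observation (x, y); eta stores precision @ mean."""
    phi = [1.0, x]  # features for y = w0 + w1 * x
    new_precision = [[precision[i][j] + phi[i] * phi[j] / sigma2
                      for j in range(2)] for i in range(2)]
    new_eta = [eta[i] + phi[i] * y / sigma2 for i in range(2)]
    return new_precision, new_eta


def mean_of(precision, eta):
    """Recover the posterior mean from the natural parameters."""
    cov = inv2(precision)
    return [cov[i][0] * eta[0] + cov[i][1] * eta[1] for i in range(2)]


sigma2, tau2 = 0.04, 0.5
precision = [[1.0 / tau2, 0.0], [0.0, 1.0 / tau2]]  # prior, n = 0
eta = [0.0, 0.0]
data = [(-1.0, -0.8), (0.0, -0.3), (1.0, 0.2)]      # noise-free, from w = (-0.3, 0.5)
for n, (x, y) in enumerate(data, start=1):
    precision, eta = update(precision, eta, x, y, sigma2)
    print(n, mean_of(precision, eta))
```

Processing the points one at a time gives the same posterior as a single batch update, because the natural parameters simply accumulate sums over the data.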

  6. [Figure: posterior after $n = 0$, $n = 1$, $n = 2$, and $n = 20$ data points; true parameters $w_0 = -0.3$, $w_1 = 0.5$]

  7. Posterior Predictive: $\sigma^2$ is known • The posterior predictive distribution at a test point $\mathbf{x}$: Gaussian • The plug-in approximation: constant error bar
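The contrast between the two error bars can be sketched as follows, assuming the standard result that the posterior predictive variance is $\sigma^2 + \boldsymbol{\phi}^T \mathbf{V}_N \boldsymbol{\phi}$ while the plug-in approximation keeps only $\sigma^2$. The two-parameter feature map $[1, x]$, the function names, and the covariance values are hypothetical.

```python
def predictive_variance(x, v_n, sigma2):
    """Posterior predictive variance at x: noise plus parameter uncertainty."""
    phi = [1.0, x]
    quad = sum(phi[i] * v_n[i][j] * phi[j]
               for i in range(2) for j in range(2))
    return sigma2 + quad


def plugin_variance(sigma2):
    """Plug-in approximation: ignores uncertainty in w, so it is constant in x."""
    return sigma2


# Hypothetical posterior covariance V_N and noise variance
v_n = [[0.013, 0.0], [0.0, 0.019]]
sigma2 = 0.04
for x in (0.0, 1.0, 3.0):
    print(x, predictive_variance(x, v_n, sigma2), plugin_variance(sigma2))
```

The full predictive variance grows as the test point moves away from the origin in feature space, whereas the plug-in error bar stays flat.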

  8. Posterior Predictive: $\sigma^2$ is known

  9. Posterior Predictive: $\sigma^2$ is known • [Figure: 10 samples from the posterior predictive] • [Figure: 10 samples from the plug-in approximation to the posterior predictive]

  10. Bayesian linear regression: $\sigma^2$ is unknown • The likelihood: Gaussian • The natural conjugate prior: normal inverse gamma (NIG)
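The slide's equations were not recovered. Under the usual textbook parameterization, the likelihood and the NIG prior could be written as below. The hyper-parameter names $\mathbf{w}_0$, $\mathbf{V}_0$, $a_0$, $b_0$ are assumed here, with $a_0$ and $b_0$ chosen to match the symbols used on the later uninformative-prior slide.

```latex
% Likelihood
p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}, \sigma^2)
  = \mathcal{N}(\mathbf{y} \mid \mathbf{X}\mathbf{w},\; \sigma^2 \mathbf{I}_N)

% Normal inverse gamma prior (symbol names assumed)
p(\mathbf{w}, \sigma^2)
  = \mathrm{NIG}(\mathbf{w}, \sigma^2 \mid \mathbf{w}_0, \mathbf{V}_0, a_0, b_0)
  = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_0, \sigma^2 \mathbf{V}_0)\,
    \mathrm{IG}(\sigma^2 \mid a_0, b_0)
```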

  11. Inverse Wishart Distribution • Similarly, if D = 1, the Wishart reduces to the Gamma distribution

  12. Inverse Wishart Distribution • If D = 1, this reduces to the inverse Gamma distribution

  13. Bayesian linear regression: $\sigma^2$ is unknown • The posterior: • The posterior marginals
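A minimal sketch of the conjugate NIG update, assuming the standard formulas $\mathbf{V}_N = (\mathbf{V}_0^{-1} + \mathbf{X}^T\mathbf{X})^{-1}$, $\mathbf{w}_N = \mathbf{V}_N(\mathbf{V}_0^{-1}\mathbf{w}_0 + \mathbf{X}^T\mathbf{y})$, $a_N = a_0 + N/2$, and $b_N = b_0 + \tfrac{1}{2}(\mathbf{w}_0^T\mathbf{V}_0^{-1}\mathbf{w}_0 + \mathbf{y}^T\mathbf{y} - \mathbf{w}_N^T\mathbf{V}_N^{-1}\mathbf{w}_N)$. For brevity it further assumes $\mathbf{w}_0 = \mathbf{0}$, $\mathbf{V}_0 = v_0\mathbf{I}$, and the two-parameter model $y = w_0 + w_1 x$; the function name and data are hypothetical. Under these formulas the marginal of $\sigma^2$ is inverse gamma with parameters $(a_N, b_N)$ and the marginal of $\mathbf{w}$ is a Student t with $2a_N$ degrees of freedom.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def nig_posterior(xs, ys, v0, a0, b0):
    """NIG posterior parameters for prior NIG(w, sigma2 | 0, v0 * I, a0, b0)."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)
    # V_N^{-1} = V_0^{-1} + X^T X, with design rows [1, x]
    v_n = inv2([[1.0 / v0 + n, sx], [sx, 1.0 / v0 + sxx]])
    # w_N = V_N X^T y because the prior mean is zero
    w_n = [v_n[0][0] * sy + v_n[0][1] * sxy,
           v_n[1][0] * sy + v_n[1][1] * sxy]
    # w_N^T V_N^{-1} w_N equals w_N^T X^T y when the prior mean is zero
    quad = w_n[0] * sy + w_n[1] * sxy
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * (syy - quad)
    return w_n, v_n, a_n, b_n


w_n, v_n, a_n, b_n = nig_posterior([-1.0, 1.0], [1.0, 3.0], v0=1.0, a0=1.0, b0=1.0)
print(w_n, a_n, b_n)
```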

  14. Bayesian linear regression: $\sigma^2$ is unknown • The posterior predictive: Student t distribution • Given new test inputs
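For reference, the standard form of this Student t predictive for $m$ new test inputs $\tilde{\mathbf{X}}$ is sketched below. It assumes NIG posterior parameters named $\mathbf{w}_N$, $\mathbf{V}_N$, $a_N$, $b_N$; these names are an assumption, since the slide's own equation was not recovered.

```latex
p(\tilde{\mathbf{y}} \mid \tilde{\mathbf{X}}, \mathcal{D})
  = \mathcal{T}\!\left(\tilde{\mathbf{y}} \,\middle|\,
      \tilde{\mathbf{X}}\mathbf{w}_N,\;
      \frac{b_N}{a_N}\left(\mathbf{I}_m + \tilde{\mathbf{X}}\mathbf{V}_N\tilde{\mathbf{X}}^T\right),\;
      2a_N\right)
```

The scale has two parts: the $\mathbf{I}_m$ term reflects measurement noise, and the $\tilde{\mathbf{X}}\mathbf{V}_N\tilde{\mathbf{X}}^T$ term reflects uncertainty in $\mathbf{w}$, which grows for test inputs far from the training data.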

  15. Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior • It is common to set $a_0 = b_0 = 0$, corresponding to an uninformative prior for $\sigma^2$, and to set • The unit information prior:
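The settings elided from this slide are most likely those of Zellner's g-prior, which the next slide refers to by name. As a hedged reconstruction under that assumption (the symbols $\mathbf{w}_0$ and $\mathbf{V}_0$ are assumed names for the prior mean and scale matrix):

```latex
% g-prior, for any positive g
\mathbf{w}_0 = \mathbf{0}, \qquad \mathbf{V}_0 = g\,(\mathbf{X}^T\mathbf{X})^{-1}

% Unit information prior: as much information as one sample, i.e. g = N
\mathbf{V}_0^{-1} = \frac{1}{N}\,\mathbf{X}^T\mathbf{X}
```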

  16. Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior • An uninformative prior: use the uninformative limit of the conjugate g-prior, which corresponds to setting $g = \infty$

  17. Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior • The marginal distribution of the weights:
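Assuming the uninformative limit $g = \infty$ with $a_0 = b_0 = 0$, the standard result is sketched below. Here $D$ denotes the number of weights and $\hat{\mathbf{w}}$ the least-squares estimate; the symbols $\mathbf{C}$ and $s^2$ are introduced only for this sketch.

```latex
% Improper prior in the limit
p(\mathbf{w}, \sigma^2) \propto \sigma^{-(D+2)}

% Posterior NIG parameters
\mathbf{w}_N = \hat{\mathbf{w}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}, \quad
\mathbf{V}_N = (\mathbf{X}^T\mathbf{X})^{-1}, \quad
a_N = \frac{N - D}{2}, \quad
b_N = \frac{s^2}{2}, \quad
s^2 = (\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})^T(\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})

% Marginal distribution of the weights
p(\mathbf{w} \mid \mathcal{D})
  = \mathcal{T}\!\left(\mathbf{w} \,\middle|\, \hat{\mathbf{w}},\;
      \frac{s^2}{N - D}\,\mathbf{C},\; N - D\right),
\qquad \mathbf{C} = (\mathbf{X}^T\mathbf{X})^{-1}
```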

  18. Bayesian linear regression: Evidence procedure • Evidence procedure – an empirical Bayes procedure for picking the hyper-parameters – Choose the hyper-parameters $(\alpha, \lambda)$ to maximize the marginal likelihood, where $\lambda = 1/\sigma^2$ is the precision of the observation noise and $\alpha$ is the precision of the prior – Provides an alternative to using cross validation
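As an illustrative sketch of the evidence procedure, the code below evaluates the log marginal likelihood in closed form for the prior $\mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I})$ and picks $\alpha$ from a grid. Several simplifications are assumptions of this sketch rather than part of the slides: the model has two parameters ($y = w_0 + w_1 x$), the noise precision $\lambda$ is held fixed, a coarse grid replaces a proper optimizer, and the data are made up.

```python
import math


def log_evidence(xs, ys, alpha, lam):
    """Log marginal likelihood log p(y | alpha, lam) for y = w0 + w1*x + noise."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # A = alpha * I + lam * X^T X (posterior precision), design rows [1, x]
    a00, a01, a11 = alpha + lam * n, lam * sx, alpha + lam * sxx
    det = a00 * a11 - a01 * a01
    # Posterior mean m = lam * A^{-1} X^T y
    m0 = lam * (a11 * sy - a01 * sxy) / det
    m1 = lam * (a00 * sxy - a01 * sy) / det
    sse = sum((y - m0 - m1 * x) ** 2 for x, y in zip(xs, ys))
    energy = 0.5 * lam * sse + 0.5 * alpha * (m0 * m0 + m1 * m1)
    return (math.log(alpha)                 # (D/2) * log(alpha) with D = 2
            + 0.5 * n * math.log(lam)
            - energy
            - 0.5 * math.log(det)
            - 0.5 * n * math.log(2.0 * math.pi))


# Made-up noisy observations; lam = 25 corresponds to sigma = 0.2
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-0.75, -0.6, -0.28, -0.1, 0.25]
lam = 25.0
grid = [10.0 ** k for k in range(-3, 4)]
best_alpha = max(grid, key=lambda a: log_evidence(xs, ys, a, lam))
print(best_alpha)
```

Unlike cross validation, this needs no held-out data: every candidate $\alpha$ is scored on the full training set through the marginal likelihood.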

  19. Bayesian linear regression: Evidence procedure
