Bayesian Linear Regression Seung-Hoon Na Chonbuk National University
Bayesian Linear Regression
• Compute the full posterior over $\mathbf{w}$ and $\sigma^2$
• Case 1) the noise variance $\sigma^2$ is known
  – Use a Gaussian prior
• Case 2) the noise variance $\sigma^2$ is unknown
  – Use a normal inverse gamma (NIG) prior
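Posterior: $\sigma^2$ is known
• The likelihood: $p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}, \mu, \sigma^2) = \mathcal{N}(\mathbf{y} \mid \mu + \mathbf{X}\mathbf{w}, \sigma^2 \mathbf{I}_N)$, where $\mu$ is an offset
  – Integrate out the offset by putting an improper prior $p(\mu) \propto 1$ on $\mu$
  – Further assume that the output is centered, so that $\mathbf{y}$ has zero mean
• The conjugate prior: $p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_0, \mathbf{V}_0)$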
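Posterior: $\sigma^2$ is known
• The posterior: $p(\mathbf{w} \mid \mathbf{X}, \mathbf{y}, \sigma^2) = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_N, \mathbf{V}_N)$, where $\mathbf{w}_0$ and $\mathbf{V}_0$ are the prior mean and covariance, $\mathbf{V}_N^{-1} = \mathbf{V}_0^{-1} + \frac{1}{\sigma^2}\mathbf{X}^T\mathbf{X}$, and $\mathbf{w}_N = \mathbf{V}_N\mathbf{V}_0^{-1}\mathbf{w}_0 + \frac{1}{\sigma^2}\mathbf{V}_N\mathbf{X}^T\mathbf{y}$
  – If $\mathbf{w}_0 = \mathbf{0}$ and $\mathbf{V}_0 = \tau^2\mathbf{I}$, then the posterior mean reduces to the ridge estimate with $\lambda = \sigma^2/\tau^2$:
    $\mathbf{w}_N = \left(\frac{\sigma^2}{\tau^2}\mathbf{I} + \mathbf{X}^T\mathbf{X}\right)^{-1}\mathbf{X}^T\mathbf{y}$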
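The ridge connection on this slide can be checked numerically. The following is a minimal sketch, assuming synthetic data and illustrative values for $\sigma^2$ and $\tau^2$; the function name `gaussian_posterior` is hypothetical and not part of the slides.

```python
import numpy as np

def gaussian_posterior(X, y, sigma2, w0, V0):
    """Posterior N(w | wN, VN) for y ~ N(Xw, sigma2 * I) and prior N(w | w0, V0)."""
    V0_inv = np.linalg.inv(V0)
    VN = np.linalg.inv(V0_inv + X.T @ X / sigma2)
    wN = VN @ (V0_inv @ w0 + X.T @ y / sigma2)
    return wN, VN

# Synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(0)
N, D = 50, 3
X = rng.normal(size=(N, D))
w_true = np.array([1.0, -2.0, 0.5])
sigma2, tau2 = 0.25, 4.0
y = X @ w_true + rng.normal(scale=np.sqrt(sigma2), size=N)

# Zero-mean isotropic prior: w0 = 0, V0 = tau2 * I.
wN, VN = gaussian_posterior(X, y, sigma2, np.zeros(D), tau2 * np.eye(D))

# Ridge estimate with lambda = sigma2 / tau2.
lam = sigma2 / tau2
w_ridge = np.linalg.solve(lam * np.eye(D) + X.T @ X, X.T @ y)

print(np.allclose(wN, w_ridge))
```

The posterior mean and the ridge solution agree up to floating-point error, while the posterior additionally carries the covariance `VN`, which ridge regression does not provide.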
Posterior: $\sigma^2$ is known
• 1D example
  – The true parameters: $w_0 = -0.3$, $w_1 = 0.5$
• Sequential Bayesian inference
• Posterior given the first $n$ data points
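[Figure: posterior given the first $n$ data points, with panels for $n = 0$, $n = 1$, $n = 2$, and $n = 20$; true parameters $w_0 = -0.3$, $w_1 = 0.5$]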
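Sequential inference for the 1D example can be sketched as follows: the posterior after $n$ points serves as the prior for point $n + 1$. Only the true parameters come from the slides; the noise standard deviation (0.2), the prior covariance (0.5 I), the input range, and the helper name `update` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
w_true = np.array([-0.3, 0.5])    # true parameters from the slide
sigma2 = 0.2 ** 2                 # assumed known noise variance
prior_prec = np.eye(2) / 0.5      # assumed prior: N(0, 0.5 * I)

# 20 points from y = w0 + w1 * x + noise, with x uniform on [-1, 1] (assumed).
x = rng.uniform(-1, 1, size=20)
Phi = np.column_stack([np.ones_like(x), x])
y = Phi @ w_true + rng.normal(scale=np.sqrt(sigma2), size=20)

def update(mean, prec, phi_row, y_n, sigma2):
    """One Bayesian update: the current posterior acts as the prior for the next point."""
    new_prec = prec + np.outer(phi_row, phi_row) / sigma2
    new_mean = np.linalg.solve(new_prec, prec @ mean + phi_row * y_n / sigma2)
    return new_mean, new_prec

mean, prec = np.zeros(2), prior_prec.copy()
history = {0: mean.copy()}
for n in range(20):
    mean, prec = update(mean, prec, Phi[n], y[n], sigma2)
    if n + 1 in (1, 2, 20):
        history[n + 1] = mean.copy()   # posterior mean after n + 1 points

# Batch posterior over all 20 points, for comparison.
prec_batch = prior_prec + Phi.T @ Phi / sigma2
mean_batch = np.linalg.solve(prec_batch, Phi.T @ y / sigma2)

for n, m in history.items():
    print(n, m)
```

Processing the points one at a time gives the same posterior as a single batch update, which is why the panels for increasing $n$ can be read as successive refinements of one distribution.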
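Posterior Predictive: $\sigma^2$ is known
• The posterior predictive distribution at a test point $\mathbf{x}$ is Gaussian: $p(y \mid \mathbf{x}, \mathcal{D}, \sigma^2) = \mathcal{N}(y \mid \mathbf{w}_N^T\mathbf{x}, \sigma_N^2(\mathbf{x}))$ with $\sigma_N^2(\mathbf{x}) = \sigma^2 + \mathbf{x}^T\mathbf{V}_N\mathbf{x}$, where $\mathbf{w}_N$ and $\mathbf{V}_N$ are the posterior mean and covariance of the weights
• The plug-in approximation uses a point estimate $\hat{\mathbf{w}}$, $p(y \mid \mathbf{x}, \mathcal{D}, \sigma^2) \approx \mathcal{N}(y \mid \hat{\mathbf{w}}^T\mathbf{x}, \sigma^2)$, which gives a constant error bar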
Posterior Predictive: $\sigma^2$ is known
[Figure: 10 samples from the posterior predictive, and 10 samples from the plug-in approximation to the posterior predictive]
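The difference between the two error bars can be illustrated with a short sketch. It assumes a straight-line model, synthetic training inputs on [-1, 1], a broad prior N(0, 10 I), and a noise variance of 0.1; none of these values come from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 0.1                                  # assumed known noise variance

# Synthetic training data on [-1, 1] (assumed).
x_train = rng.uniform(-1, 1, size=15)
X = np.column_stack([np.ones_like(x_train), x_train])
y = X @ np.array([-0.3, 0.5]) + rng.normal(scale=np.sqrt(sigma2), size=15)

# Posterior over the weights with an assumed prior N(0, 10 * I).
V0 = 10.0 * np.eye(2)
VN = np.linalg.inv(np.linalg.inv(V0) + X.T @ X / sigma2)
wN = VN @ (X.T @ y / sigma2)

# Test points extending beyond the training range.
x_test = np.linspace(-3, 3, 7)
X_test = np.column_stack([np.ones_like(x_test), x_test])

pred_mean = X_test @ wN
# Full predictive variance: sigma2 + x^T VN x at each test point.
pred_var = sigma2 + np.einsum("ij,jk,ik->i", X_test, VN, X_test)
# The plug-in approximation ignores weight uncertainty: constant variance.
plugin_var = np.full_like(pred_var, sigma2)

# 10 functions sampled from the weight posterior, evaluated at the test points.
w_samples = rng.multivariate_normal(wN, VN, size=10)
f_samples = X_test @ w_samples.T              # shape (7, 10)

print(np.sqrt(pred_var))
print(np.sqrt(plugin_var))
```

The full predictive variance grows as the test point moves away from the training data, whereas the plug-in variance stays at $\sigma^2$ everywhere. Functions drawn under the plug-in approximation would all coincide with the single line defined by the point estimate.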
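Bayesian linear regression: $\sigma^2$ is unknown
• The likelihood: $p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}, \sigma^2) = \mathcal{N}(\mathbf{y} \mid \mathbf{X}\mathbf{w}, \sigma^2\mathbf{I}_N)$
• The natural conjugate prior: $p(\mathbf{w}, \sigma^2) = \mathrm{NIG}(\mathbf{w}, \sigma^2 \mid \mathbf{w}_0, \mathbf{V}_0, a_0, b_0) = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_0, \sigma^2\mathbf{V}_0)\,\mathrm{IG}(\sigma^2 \mid a_0, b_0)$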
Inverse Wishart Distribution
• If $D = 1$, the Wishart reduces to the Gamma distribution
Inverse Wishart Distribution
• If $D = 1$, the inverse Wishart reduces to the inverse Gamma
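The one-dimensional reductions can be read off the unnormalized densities. The sketch below assumes one common parameterization (Wishart with scale matrix $\mathbf{S}$, inverse Wishart with scale matrix $\boldsymbol{\Psi}$, both with $\nu$ degrees of freedom); other texts parameterize the inverse Wishart by $\boldsymbol{\Psi}^{-1}$, in which case the resulting Gamma parameters change accordingly.

```latex
% Wishart, scale matrix S, nu degrees of freedom:
\mathrm{Wi}(\boldsymbol{\Lambda} \mid \mathbf{S}, \nu) \propto
  |\boldsymbol{\Lambda}|^{(\nu - D - 1)/2}
  \exp\!\left(-\tfrac{1}{2}\,\mathrm{tr}(\boldsymbol{\Lambda}\mathbf{S}^{-1})\right)
% Setting D = 1 (Lambda -> lambda, S -> s):
\lambda^{\nu/2 - 1} \exp\!\left(-\frac{\lambda}{2s}\right)
  \;\Rightarrow\; \mathrm{Ga}\!\left(\lambda \,\middle|\, \text{shape} = \tfrac{\nu}{2},\ \text{rate} = \tfrac{1}{2s}\right)

% Inverse Wishart, scale matrix Psi, nu degrees of freedom:
\mathrm{IW}(\boldsymbol{\Sigma} \mid \boldsymbol{\Psi}, \nu) \propto
  |\boldsymbol{\Sigma}|^{-(\nu + D + 1)/2}
  \exp\!\left(-\tfrac{1}{2}\,\mathrm{tr}(\boldsymbol{\Psi}\boldsymbol{\Sigma}^{-1})\right)
% Setting D = 1 (Sigma -> sigma^2, Psi -> psi):
(\sigma^2)^{-(\nu/2 + 1)} \exp\!\left(-\frac{\psi}{2\sigma^2}\right)
  \;\Rightarrow\; \mathrm{IG}\!\left(\sigma^2 \,\middle|\, \text{shape} = \tfrac{\nu}{2},\ \text{scale} = \tfrac{\psi}{2}\right)
```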
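Bayesian linear regression: $\sigma^2$ is unknown
• The posterior: $p(\mathbf{w}, \sigma^2 \mid \mathcal{D}) = \mathrm{NIG}(\mathbf{w}, \sigma^2 \mid \mathbf{w}_N, \mathbf{V}_N, a_N, b_N)$
• The posterior marginals: $p(\sigma^2 \mid \mathcal{D}) = \mathrm{IG}(\sigma^2 \mid a_N, b_N)$ and $p(\mathbf{w} \mid \mathcal{D}) = \mathcal{T}(\mathbf{w} \mid \mathbf{w}_N, \frac{b_N}{a_N}\mathbf{V}_N, 2a_N)$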
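For reference, the standard conjugate update for the NIG parameters can be written as follows, assuming a prior of the form $\mathcal{N}(\mathbf{w} \mid \mathbf{w}_0, \sigma^2\mathbf{V}_0)\,\mathrm{IG}(\sigma^2 \mid a_0, b_0)$ and $N$ observations; this is a restatement in that notation, not a transcription of the slide.

```latex
\begin{aligned}
\mathbf{V}_N &= \left(\mathbf{V}_0^{-1} + \mathbf{X}^T\mathbf{X}\right)^{-1} \\
\mathbf{w}_N &= \mathbf{V}_N\left(\mathbf{V}_0^{-1}\mathbf{w}_0 + \mathbf{X}^T\mathbf{y}\right) \\
a_N &= a_0 + \tfrac{N}{2} \\
b_N &= b_0 + \tfrac{1}{2}\left(\mathbf{w}_0^T\mathbf{V}_0^{-1}\mathbf{w}_0
        + \mathbf{y}^T\mathbf{y}
        - \mathbf{w}_N^T\mathbf{V}_N^{-1}\mathbf{w}_N\right)
\end{aligned}
```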
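Bayesian linear regression: $\sigma^2$ is unknown
• The posterior predictive: Student T distribution
• Given $m$ new test inputs $\tilde{\mathbf{X}}$: $p(\tilde{\mathbf{y}} \mid \tilde{\mathbf{X}}, \mathcal{D}) = \mathcal{T}(\tilde{\mathbf{y}} \mid \tilde{\mathbf{X}}\mathbf{w}_N, \frac{b_N}{a_N}(\mathbf{I}_m + \tilde{\mathbf{X}}\mathbf{V}_N\tilde{\mathbf{X}}^T), 2a_N)$, where $(\mathbf{w}_N, \mathbf{V}_N, a_N, b_N)$ are the NIG posterior parameters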
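A minimal sketch of the unknown-variance case, assuming the standard NIG conjugate update and returning only the parameters of the Student T predictive (location, scale matrix, degrees of freedom). The function names, the synthetic data, and the prior values are hypothetical.

```python
import numpy as np

def nig_posterior(X, y, w0, V0, a0, b0):
    """Update NIG(w0, V0, a0, b0) with data (X, y) under y ~ N(Xw, sigma2 * I)."""
    N = X.shape[0]
    V0_inv = np.linalg.inv(V0)
    VN_inv = V0_inv + X.T @ X
    VN = np.linalg.inv(VN_inv)
    wN = VN @ (V0_inv @ w0 + X.T @ y)
    aN = a0 + N / 2
    bN = b0 + 0.5 * (w0 @ V0_inv @ w0 + y @ y - wN @ VN_inv @ wN)
    return wN, VN, aN, bN

def student_predictive(X_new, wN, VN, aN, bN):
    """Parameters of the multivariate Student T predictive at the rows of X_new."""
    m = X_new.shape[0]
    mean = X_new @ wN
    scale = (bN / aN) * (np.eye(m) + X_new @ VN @ X_new.T)
    dof = 2 * aN
    return mean, scale, dof

# Synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(3)
N, D = 200, 2
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, size=N)])
y = X @ np.array([-0.3, 0.5]) + rng.normal(scale=0.5, size=N)

a0, b0 = 1.0, 1.0                      # assumed prior hyperparameters
wN, VN, aN, bN = nig_posterior(X, y, np.zeros(D), 10.0 * np.eye(D), a0, b0)

X_new = np.array([[1.0, 0.0], [1.0, 2.0]])   # one point inside, one outside the data
mean, scale, dof = student_predictive(X_new, wN, VN, aN, bN)
print(mean, np.diag(scale), dof)
```

The ratio `bN / aN` plays the role that the known $\sigma^2$ played before, and the degrees of freedom `2 * aN` grow with the sample size, so the predictive approaches a Gaussian as more data arrive.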
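Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior
• It is common to set $a_0 = b_0 = 0$, corresponding to an uninformative prior for $\sigma^2$, and to set $\mathbf{w}_0 = \mathbf{0}$ and $\mathbf{V}_0 = g(\mathbf{X}^T\mathbf{X})^{-1}$ for a positive value $g$ (the g-prior)
• The unit information prior: $g = N$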
Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior
• An uninformative prior: use the uninformative limit of the conjugate g-prior, which corresponds to setting $g = \infty$
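Bayesian linear regression: $\sigma^2$ is unknown – Uninformative prior
• The marginal distribution of the weights: $p(\mathbf{w} \mid \mathcal{D}) = \mathcal{T}(\mathbf{w} \mid \hat{\mathbf{w}}, \frac{s^2}{N - D}\mathbf{C}, N - D)$, where $\hat{\mathbf{w}}$ is the least squares estimate, $\mathbf{C} = (\mathbf{X}^T\mathbf{X})^{-1}$, and $s^2 = (\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})^T(\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})$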
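Under the uninformative prior, the parameters of the Student T marginal over the weights depend only on least squares quantities. A minimal sketch, assuming the form $\mathcal{T}(\mathbf{w} \mid \hat{\mathbf{w}}, \frac{s^2}{N-D}\mathbf{C}, N-D)$ with $\mathbf{C} = (\mathbf{X}^T\mathbf{X})^{-1}$ and $s^2$ the residual sum of squares; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def uninformative_weight_marginal(X, y):
    """Location, scale matrix, and degrees of freedom of the Student T marginal p(w | D)."""
    N, D = X.shape
    C = np.linalg.inv(X.T @ X)
    w_hat = C @ X.T @ y                # least squares estimate
    resid = y - X @ w_hat
    s2 = resid @ resid                 # residual sum of squares
    dof = N - D
    scale = (s2 / dof) * C
    return w_hat, scale, dof

# Synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(4)
N = 30
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.3, size=N)

w_hat, scale, dof = uninformative_weight_marginal(X, y)
std_err = np.sqrt(np.diag(scale))      # per-weight scale of the marginal
print(w_hat, std_err, dof)
```

The square roots of the diagonal of the scale matrix coincide with the usual frequentist standard errors of the least squares coefficients, so credible intervals from this marginal match the classical confidence intervals numerically.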
Bayesian linear regression: Evidence procedure
• Evidence procedure
  – An empirical Bayes procedure for picking the hyperparameters
  – Choose $(\alpha, \lambda)$ to maximize the marginal likelihood, where $\lambda = 1/\sigma^2$ is the precision of the observation noise and $\alpha$ is the precision of the prior
  – Provides an alternative to using cross validation
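The idea can be sketched with a simple grid search over the two precisions. This assumes a zero-mean prior $\mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I})$, so that marginally $\mathbf{y} \sim \mathcal{N}(\mathbf{0}, \lambda^{-1}\mathbf{I} + \alpha^{-1}\mathbf{X}\mathbf{X}^T)$. The grid search stands in for whatever optimizer is used in practice, and the function name, grids, and synthetic data are hypothetical.

```python
import numpy as np

def log_evidence(X, y, alpha, lam):
    """log p(y | X, alpha, lam) with w ~ N(0, I / alpha) and noise precision lam."""
    N = X.shape[0]
    cov = np.eye(N) / lam + X @ X.T / alpha
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (N * np.log(2 * np.pi) + logdet + quad)

# Synthetic data drawn from the assumed model (alpha = 1, lam = 4).
rng = np.random.default_rng(5)
N, D = 40, 5
X = rng.normal(size=(N, D))
w = rng.normal(scale=1.0, size=D)
y = X @ w + rng.normal(scale=0.5, size=N)

# Evaluate the log marginal likelihood on a log-spaced grid and take the maximizer.
alphas = np.logspace(-3, 3, 13)
lams = np.logspace(-2, 3, 11)
scores = np.array([[log_evidence(X, y, a, l) for l in lams] for a in alphas])
i, j = np.unravel_index(np.argmax(scores), scores.shape)
alpha_hat, lam_hat = alphas[i], lams[j]
print(alpha_hat, lam_hat)
```

Unlike cross validation, this uses all of the training data in a single pass and needs no held-out folds; the cost is one $N \times N$ solve per grid point in this naive form.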