  1. Bayesian Estimation & Information Theory. Jonathan Pillow, Mathematical Tools for Neuroscience (NEU 314), Spring 2016, lecture 18.

  2. Bayesian Estimation. Three basic ingredients: 1. Likelihood and 2. Prior (which jointly determine the posterior), and 3. Loss function L(θ̂, θ), the "cost" of making an estimate θ̂ if the true value is θ. • Together these fully specify how to generate an estimate from the data. • The Bayesian estimator is defined as the minimizer of the "Bayes' risk", i.e. the expected loss under the posterior: θ̂(m) = argmin_θ̂ ∫ L(θ̂, θ) p(θ | m) dθ.
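
As a minimal numerical sketch of this definition (not from the lecture, and with an arbitrary example posterior), the Bayes estimator can be found by brute force: put the posterior on a grid, and for each candidate estimate compute the expected loss and keep the minimizer.

```python
import numpy as np

# Sketch: brute-force Bayesian estimation on a grid.
# Given a posterior p(theta | m) and a loss L(theta_hat, theta), the Bayes
# estimator is the theta_hat that minimizes the expected posterior loss.
theta = np.linspace(-8, 8, 801)              # parameter grid
post = np.exp(-0.5 * (theta - 1.0) ** 2)     # example (unnormalized) posterior
post /= np.trapz(post, theta)                # normalize to integrate to 1

def bayes_estimate(loss, theta, post):
    """Grid point minimizing the expected loss ("Bayes' risk") under the posterior."""
    risk = [np.trapz(loss(th_hat, theta) * post, theta) for th_hat in theta]
    return theta[np.argmin(risk)]

squared_error = lambda th_hat, th: (th_hat - th) ** 2
print(bayes_estimate(squared_error, theta, post))   # ~1.0 (the posterior mean here)
```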

  3. Typical loss functions and Bayesian estimators. 1. Squared error loss: L(θ̂, θ) = (θ̂ − θ)². We need to find the θ̂ minimizing the expected loss ∫ (θ̂ − θ)² p(θ | m) dθ. Differentiating with respect to θ̂ and setting to zero gives θ̂(m) = ∫ θ p(θ | m) dθ, the "posterior mean", also known as the Bayes' Least Squares (BLS) estimator.
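
A hedged sketch of this estimator (the example posterior and grid are mine, not the slide's): differentiating ∫ (θ̂ − θ)² p(θ | m) dθ with respect to θ̂ gives 2θ̂ − 2∫ θ p(θ | m) dθ = 0, so the minimizer is the posterior mean, which on a grid is a single weighted sum.

```python
import numpy as np

# Sketch: the BLS estimate is just the mean of the posterior.
theta = np.linspace(0, 20, 2001)
post = theta ** 2 * np.exp(-theta)        # unnormalized, gamma-shaped example posterior
post /= np.trapz(post, theta)             # normalize

theta_bls = np.trapz(theta * post, theta) # posterior mean = Bayes' least squares estimate
print(theta_bls)                          # ~3.0 for this example
```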

  4. Typical loss functions and Bayesian estimators. 2. "Zero-one" loss: L(θ̂, θ) = 1 − δ(θ̂ − θ), i.e. the loss is 1 unless θ̂ = θ. Expected loss: ∫ [1 − δ(θ̂ − θ)] p(θ | m) dθ = 1 − p(θ̂ | m), which is minimized by θ̂(m) = argmax_θ p(θ | m). • This is the posterior maximum (or "mode"). • Known as the maximum a posteriori (MAP) estimate.
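
Under the same assumptions (an example posterior of my choosing), the MAP estimate on a grid is simply an argmax of the posterior; normalization can be skipped since it does not change where the maximum is.

```python
import numpy as np

# Sketch: the MAP estimate is the posterior mode.
theta = np.linspace(0, 20, 2001)
post = theta ** 2 * np.exp(-theta)    # unnormalized example posterior (gamma-shaped)
theta_map = theta[np.argmax(post)]    # argmax is unaffected by normalization
print(theta_map)                      # ~2.0 for this example
```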

  5. MAP vs. posterior mean estimate. [Figure: a gamma pdf plotted over 0 to 10, with its mode and mean at different locations.] Note: the posterior maximum and the posterior mean are not always the same!
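
A small check of this point using a gamma density with an assumed shape parameter (not necessarily the one plotted on the slide): because the gamma pdf is skewed, its mode and mean sit at different places, so the MAP and BLS estimates disagree.

```python
from scipy.stats import gamma

a = 3.0                  # assumed shape parameter, scale = 1
mode = a - 1.0           # mode of a gamma(shape=a, scale=1) density
mean = gamma.mean(a)     # equals a
print(mode, mean)        # 2.0 vs 3.0: posterior maximum != posterior mean
```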

  6. Typical loss functions and Bayesian estimators. 3. "L1" loss: L(θ̂, θ) = |θ̂ − θ|. Expected loss: ∫ |θ̂ − θ| p(θ | m) dθ. HW problem: what is the Bayesian estimator for this loss function?

  7. Simple example: Gaussian noise & prior. 1. Likelihood: additive Gaussian noise, m = θ + n with n ~ N(0, σ_n²). 2. Prior: zero-mean Gaussian, θ ~ N(0, σ_p²). 3. Loss function: doesn't matter (all the estimators above agree here). The posterior distribution is Gaussian, with MAP estimate (= posterior mean) θ̂(m) = σ_p² / (σ_p² + σ_n²) · m and variance σ_p² σ_n² / (σ_p² + σ_n²).
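
A minimal sketch of this example, with notation of my choosing (σ_n for the noise standard deviation, σ_p for the prior standard deviation): the posterior is Gaussian, its mean shrinks the measurement toward the prior mean of zero, and its variance is smaller than either the noise or the prior variance.

```python
def gaussian_posterior(m, sigma_n, sigma_p):
    """Posterior for m = theta + noise, noise ~ N(0, sigma_n^2), theta ~ N(0, sigma_p^2)."""
    w = sigma_p**2 / (sigma_p**2 + sigma_n**2)                     # shrinkage weight on m
    post_mean = w * m                                              # = MAP = BLS estimate
    post_var = sigma_p**2 * sigma_n**2 / (sigma_p**2 + sigma_n**2)
    return post_mean, post_var

print(gaussian_posterior(m=4.0, sigma_n=1.0, sigma_p=2.0))         # (3.2, 0.8)
```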

  8.–11. Likelihood. [Figure, built up over four slides: the likelihood p(m | θ) plotted against the parameter θ and the measurement m, each on the range −8 to 8.]

  12. Prior. [Figure: the prior p(θ) on the same −8 to 8 axes.]

  13. Computing the posterior. [Figure: likelihood × prior ∝ posterior, shown as three panels over θ, with the likelihood centered at the measurement m.]
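
A minimal grid version of this slide (the example likelihood and prior widths are mine): evaluate the likelihood as a function of θ for the observed m, multiply it by the prior pointwise, and renormalize to obtain the posterior.

```python
import numpy as np

# Sketch: posterior ∝ likelihood × prior, computed pointwise on a theta grid.
theta = np.linspace(-8, 8, 1601)
m = 4.0                                                   # observed measurement
lik = np.exp(-0.5 * (m - theta) ** 2 / 1.0**2)            # p(m | theta), sigma_n = 1
prior = np.exp(-0.5 * theta ** 2 / 2.0**2)                # p(theta), sigma_p = 2
post = lik * prior
post /= np.trapz(post, theta)                             # normalize the product
print(np.trapz(theta * post, theta))                      # posterior mean, ~3.2
```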

  14. Making a Bayesian estimate. [Figure: likelihood × prior ∝ posterior; the estimate m* falls between the measurement m and the prior mean, i.e. the estimate is biased toward the prior.]

  15. High measurement noise: large bias. [Figure: likelihood × prior ∝ posterior; a broader likelihood yields a posterior pulled further toward the prior, giving a larger bias.]

  16. Low measurement noise: small bias. [Figure: likelihood × prior ∝ posterior; a narrow likelihood dominates, so the posterior stays near the measurement and the bias is small.]

  17. Bayesian Estimation: • Likelihood and prior combine to form the posterior • The Bayesian estimate is always biased toward the prior (relative to the ML estimate)
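
Slides 14–16 can be reproduced with the Gaussian example above; the numbers below are illustrative, not the lecture's. Holding the prior fixed and increasing the measurement noise broadens the likelihood, which pulls the estimate further from the measurement toward the prior, so the bias grows.

```python
def bayes_estimate(m, sigma_n, sigma_p=2.0):
    """Posterior mean for a zero-mean Gaussian prior and Gaussian noise (assumed setup)."""
    return sigma_p**2 / (sigma_p**2 + sigma_n**2) * m

m = 4.0
for sigma_n in [0.5, 1.0, 2.0, 4.0]:          # low noise -> high noise
    est = bayes_estimate(m, sigma_n)
    print(f"sigma_n={sigma_n}: estimate={est:.2f}, bias={m - est:.2f}")
# The bias (distance from the measurement back toward the prior) increases
# with the measurement noise: small bias for low noise, large bias for high noise.
```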

  18.–19. Application #1: Biases in motion perception. [Demo, two slides: gratings presented around a fixation cross. Which grating moves faster?]

  20. Explanation from Weiss, Simoncelli & Adelson (2002). [Figure: prior, likelihood, and posterior over velocity, with the prior centered at 0.] • Noisier measurements make the likelihood broader ⇒ the posterior shifts further toward 0 (the prior favors no motion). • In the limit of a zero-contrast grating, the likelihood becomes infinitely broad ⇒ the percept goes to zero motion. • Claim: this explains why people actually speed up when driving in fog!
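
A caricature of this explanation using the same Gaussian shrinkage math (all numbers and the Gaussian form are assumptions for illustration, not the paper's fitted model): a slow-motion prior centered at zero plus a likelihood whose width grows as contrast falls makes the perceived speed collapse toward zero at low contrast.

```python
def perceived_speed(v_measured, sigma_lik, sigma_prior=1.0):
    """Posterior-mean speed for a zero-mean Gaussian prior and Gaussian likelihood."""
    w = sigma_prior**2 / (sigma_prior**2 + sigma_lik**2)
    return w * v_measured

v = 5.0
for sigma_lik in [0.2, 1.0, 5.0]:     # high contrast -> low contrast (broader likelihood)
    print(sigma_lik, round(perceived_speed(v, sigma_lik), 2))
# 0.2 -> 4.81, 1.0 -> 2.5, 5.0 -> 0.19: as the likelihood broadens, the
# percept shifts toward zero motion, matching the fog / low-contrast intuition.
```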

  21. Summary: • 3 ingredients for Bayesian estimation (prior, likelihood, loss) • Bayes' least squares (BLS) estimator (posterior mean) • maximum a posteriori (MAP) estimator (posterior mode) • accounts for the stimulus-quality-dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)
