

  1. Introduction to the Read Paper, Young Statisticians Section. Mark Girolami, Department of Statistical Science, University College London. The Royal Statistical Society, Errol Street, London, October 13, 2010

  2–7. Riemann manifold Langevin and Hamiltonian Monte Carlo Methods, Girolami, M. & Calderhead, B., J. R. Statist. Soc. B (2011), 73, Part 2

  ◮ Advancing MC methods via the underlying geometry of fundamental objects
  ◮ Develop proposal mechanisms based on
    ◮ stochastic diffusions on a Riemann manifold
    ◮ deterministic mechanics on a Riemann manifold
  ◮ Focus on Hamiltonian Monte Carlo for the next 27 minutes

  8–14. Hamiltonian Monte Carlo for Computational Statistical Inference

  ◮ Target density $p(\theta)$; introduce the auxiliary variable $p \sim p(p) = \mathcal{N}(p \mid 0, M)$.
  ◮ Writing the log-density as $\mathcal{L}(\theta) \equiv \log p(\theta)$, the Hamiltonian is
  $$H(\theta, p) = -\mathcal{L}(\theta) + \frac{1}{2}\log\!\left((2\pi)^D |M|\right) + \frac{1}{2}\, p^{\mathsf T} M^{-1} p$$
  ◮ This is interpreted as a separable Hamiltonian in position and momentum variables:
  $$\frac{d\theta}{d\tau} = \frac{\partial H}{\partial p} = M^{-1} p, \qquad \frac{dp}{d\tau} = -\frac{\partial H}{\partial \theta} = \nabla_\theta \mathcal{L}(\theta)$$
  ◮ An (approximately) energy-conserving, volume-preserving and reversible integrator, the leapfrog scheme, follows:
  $$p(\tau + \epsilon/2) = p(\tau) + \frac{\epsilon}{2}\,\nabla_\theta \mathcal{L}(\theta(\tau))$$
  $$\theta(\tau + \epsilon) = \theta(\tau) + \epsilon\, M^{-1} p(\tau + \epsilon/2)$$
  $$p(\tau + \epsilon) = p(\tau + \epsilon/2) + \frac{\epsilon}{2}\,\nabla_\theta \mathcal{L}(\theta(\tau + \epsilon))$$
  ◮ Detailed balance is satisfied by accepting proposals with probability $\min\{1, \exp\{-H(\theta^*, p^*) + H(\theta, p)\}\}$
  ◮ The complete method to sample from the desired marginal $p(\theta)$ follows the Gibbs scheme
  $$p_{n+1} \mid \theta_n \sim p(p_{n+1}) = \mathcal{N}(0, M), \qquad \theta_{n+1} \mid p_{n+1} \sim p(\theta_{n+1} \mid p_{n+1})$$
  ◮ The integrator provides the proposals for the $p(\theta \mid p)$ conditional
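
  A minimal sketch of this scheme in Python (the function and parameter names here are our own illustration, not code from the paper): it composes the leapfrog integrator, the Metropolis accept/reject correction, and the momentum refreshment Gibbs step.

  ```python
  import numpy as np

  def hmc_sample(log_density, grad_log_density, theta0, n_samples,
                 epsilon=0.1, n_leapfrog=20, M=None, rng=None):
      """Minimal HMC: leapfrog proposals corrected by a Metropolis step."""
      rng = np.random.default_rng() if rng is None else rng
      theta = np.asarray(theta0, dtype=float)
      D = theta.size
      M = np.eye(D) if M is None else M            # mass matrix / metric
      M_inv = np.linalg.inv(M)
      chol_M = np.linalg.cholesky(M)
      samples = np.empty((n_samples, D))

      def hamiltonian(th, p):
          # Constant terms in H cancel in the acceptance ratio, so omit them.
          return -log_density(th) + 0.5 * p @ M_inv @ p

      for n in range(n_samples):
          p = chol_M @ rng.standard_normal(D)      # Gibbs step: p ~ N(0, M)
          theta_star, p_star = theta.copy(), p.copy()

          # Leapfrog integration: half momentum step, alternating full steps,
          # closing half momentum step; volume-preserving and reversible.
          p_star += 0.5 * epsilon * grad_log_density(theta_star)
          for _ in range(n_leapfrog - 1):
              theta_star += epsilon * (M_inv @ p_star)
              p_star += epsilon * grad_log_density(theta_star)
          theta_star += epsilon * (M_inv @ p_star)
          p_star += 0.5 * epsilon * grad_log_density(theta_star)

          # Accept with probability min{1, exp(H(theta, p) - H(theta*, p*))}.
          if np.log(rng.uniform()) < hamiltonian(theta, p) - hamiltonian(theta_star, p_star):
              theta = theta_star
          samples[n] = theta
      return samples
  ```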

  15. Illustrative Example: Bivariate Gaussian

  ◮ Target density $\mathcal{N}(0, \Sigma)$ where
  $$\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$
  ◮ For ρ large, e.g. ρ = 0.98, sampling from this distribution is challenging
  ◮ Overall Hamiltonian: $\frac{1}{2} x^{\mathsf T} \Sigma^{-1} x + \frac{1}{2} p^{\mathsf T} p$
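
  For concreteness, the sketch above can be exercised on this target (ε = 0.18 and L = 20 are the settings from the next slide; hmc_sample is the illustrative helper defined earlier):

  ```python
  import numpy as np

  rho = 0.98
  Sigma = np.array([[1.0, rho], [rho, 1.0]])
  Sigma_inv = np.linalg.inv(Sigma)

  log_density = lambda x: -0.5 * x @ Sigma_inv @ x   # up to an additive constant
  grad_log_density = lambda x: -Sigma_inv @ x

  # Identity mass matrix, epsilon = 0.18, L = 20 leapfrog steps.
  samples = hmc_sample(log_density, grad_log_density,
                       theta0=np.zeros(2), n_samples=5000,
                       epsilon=0.18, n_leapfrog=20)
  print(np.cov(samples.T))  # should be close to Sigma
  ```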

  16. HMC Integration: ε = 0.18, L = 20, identity matrix implicitly used as the metric. [Figure: three panels showing the leapfrog trajectory in the (θ1, θ2) plane, the corresponding trajectory in the (p1, p2) plane, and the value of the Hamiltonian (roughly 1.76 to 1.88) over the 20 integration steps.]

  17. Metropolis Algorithm, Parameters of Stoch. Vol. Model, Acceptance Rate 25% [Figure: sampled parameter pairs; axes roughly 0.1 to 0.5 and 0.8 to 1.0]

  18. Metropolis Algorithm, Parameters of Stoch. Vol. Model, Acceptance Rate 25% [Figure: sampled parameter pairs; axes roughly 0.13 to 0.18 and 0.965 to 1.0]

  19. HMC Algorithm, Parameters of Stoch. Vol. Model, Acceptance Rate 95% [Figure: sampled parameter pairs; axes roughly 0.1 to 0.5 and 0.8 to 1.0]

  20. HMC Algorithm, Parameters of Stoch. Vol. Model, Acceptance Rate 95% [Figure: sampled parameter pairs; axes roughly 0.13 to 0.18 and 0.965 to 1.0]

  21–24. Hamiltonian Monte Carlo for Posterior Inference

  ◮ Deterministic proposals for θ give greater efficiency than a Metropolis random walk
  ◮ A small fly in the ointment: tuning the values of the matrix M is essential for efficient HMC performance
  ◮ Diagonal elements of M reflect the scale of the target, and off-diagonal elements capture its correlation structure (no off-diagonal terms arise in the physical interpretation)
  ◮ Setting M requires knowledge of the target density, which in turn requires extensive tuning via pilot runs of the sampler, as sketched below
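
  One illustrative version of the pilot-run tuning the last bullet describes (a sketch under our own assumptions, not the paper's proposal; the paper's resolution is to replace the fixed M with a position-dependent Riemannian metric): run a short chain with M = I, then set M to the inverse of the pilot sample covariance so the main run is rescaled and decorrelated.

  ```python
  import numpy as np

  # Pilot run with the identity mass matrix (hmc_sample and the target
  # functions are the illustrative definitions from the earlier sketches).
  pilot = hmc_sample(log_density, grad_log_density,
                     theta0=np.zeros(2), n_samples=1000,
                     epsilon=0.18, n_leapfrog=20)

  # Heuristic: M = (pilot sample covariance)^{-1}, so that the momentum
  # distribution N(0, M) matches the scale and correlation of the target.
  M_tuned = np.linalg.inv(np.cov(pilot.T))

  samples = hmc_sample(log_density, grad_log_density,
                       theta0=pilot[-1], n_samples=5000,
                       epsilon=0.18, n_leapfrog=20, M=M_tuned)
  ```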
