Introduction to the Read Paper
Young Statisticians Section
Mark Girolami, Department of Statistical Science, University College London
The Royal Statistical Society, Errol Street, London
October 13, 2010
Riemann manifold Langevin and Hamiltonian Monte Carlo Methods, Girolami, M. & Calderhead, B., J. R. Statist. Soc. B (2011), 73, Part 2

◮ Advancing MC methods via the underlying geometry of fundamental objects
◮ Develop proposal mechanisms based on
  ◮ Stochastic diffusions on a Riemann manifold
  ◮ Deterministic mechanics on a Riemann manifold
◮ Focus on Hamiltonian Monte Carlo for the next 27 minutes
Hamiltonian Monte Carlo for Computational Statistical Inference

◮ Target density p(θ); introduce auxiliary variable p ∼ p(p) = N(p | 0, M)
◮ Log-density L(θ) ≡ log p(θ), then

    H(θ, p) = −L(θ) + ½ log{(2π)^D |M|} + ½ pᵀM⁻¹p

◮ Interpreted as a separable Hamiltonian in position and momentum variables

    dθ/dτ = ∂H/∂p = M⁻¹p
    dp/dτ = −∂H/∂θ = ∇_θ L(θ)

◮ An (approximately) energy conserving, volume preserving and reversible leapfrog integrator follows

    p(τ + ε/2) = p(τ) + (ε/2) ∇_θ L(θ(τ))
    θ(τ + ε) = θ(τ) + ε M⁻¹ p(τ + ε/2)
    p(τ + ε) = p(τ + ε/2) + (ε/2) ∇_θ L(θ(τ + ε))

◮ Detailed balance satisfied by accepting with probability min{1, exp{−H(θ*, p*) + H(θ, p)}}
◮ The complete method to sample from the desired marginal p(θ) follows the Gibbs scheme

    p^{n+1} | θ^n ∼ p(p^{n+1}) = N(0, M)
    θ^{n+1} | p^{n+1} ∼ p(θ^{n+1} | p^{n+1})

◮ Integrator provides proposals for the p(θ | p) conditional; a minimal sketch of the scheme follows
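A minimal NumPy sketch of this scheme, assuming a generic setting: the names hmc_sample, log_p and grad_log_p are illustrative, not from the paper, and the constant ½ log{(2π)^D |M|} is dropped since it cancels in the Metropolis ratio.

```python
import numpy as np

def hmc_sample(theta0, grad_log_p, log_p, M, eps, L, n_samples, rng=None):
    """Hamiltonian Monte Carlo with a leapfrog integrator.

    theta0     : initial position, shape (D,)
    grad_log_p : gradient of the log target density, i.e. grad L(theta)
    log_p      : log target density L(theta)
    M          : mass matrix, shape (D, D)
    eps, L     : leapfrog step size and number of leapfrog steps
    """
    rng = np.random.default_rng() if rng is None else rng
    M_inv = np.linalg.inv(M)
    M_chol = np.linalg.cholesky(M)
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_samples):
        # Gibbs step: draw momentum p ~ N(0, M)
        p = M_chol @ rng.standard_normal(theta.size)
        H_old = -log_p(theta) + 0.5 * p @ M_inv @ p

        # Leapfrog integration of Hamilton's equations:
        # half momentum step, L position steps, half momentum step
        theta_new, p_new = theta.copy(), p.copy()
        p_new += 0.5 * eps * grad_log_p(theta_new)
        for step in range(L):
            theta_new += eps * (M_inv @ p_new)
            if step != L - 1:
                p_new += eps * grad_log_p(theta_new)
        p_new += 0.5 * eps * grad_log_p(theta_new)

        # Metropolis accept/reject preserves detailed balance
        H_new = -log_p(theta_new) + 0.5 * p_new @ M_inv @ p_new
        if rng.uniform() < np.exp(H_old - H_new):
            theta = theta_new
        samples.append(theta.copy())
    return np.array(samples)
```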
Illustrative Example - Bivariate Gaussian

◮ Target density N(0, Σ) where

    Σ = [ 1  ρ ]
        [ ρ  1 ]

◮ For large ρ, e.g. ρ = 0.98, sampling from this distribution is challenging
◮ Overall Hamiltonian

    ½ xᵀΣ⁻¹x + ½ pᵀp
HMC Integration: ε = 0.18, L = 20, implicit identity matrix for the metric

[Figure: leapfrog trajectory in (θ1, θ2); momentum path in (p1, p2); Hamiltonian value at each of the 20 integration steps.]
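Using the hmc_sample sketch above, the slide's settings (ε = 0.18, L = 20, identity mass matrix) might be reproduced along these lines; this usage example is an assumption, not code from the paper.

```python
import numpy as np

rho = 0.98
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

# Log-density and gradient of the N(0, Sigma) target (constants dropped)
log_p = lambda x: -0.5 * x @ Sigma_inv @ x
grad_log_p = lambda x: -Sigma_inv @ x

samples = hmc_sample(theta0=np.zeros(2), grad_log_p=grad_log_p, log_p=log_p,
                     M=np.eye(2), eps=0.18, L=20, n_samples=5000)
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])
```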
Metropolis Algorithm, Parameters of Stoch Vol Model, Acc Rate 25%

[Figure: samples of two parameters of the stochastic volatility model under the Metropolis algorithm (acceptance rate 25%); second panel shows a zoomed view.]
HMC Algorithm, Parameters of Stoch Vol Model, Acc Rate 95%

[Figure: samples of the same two stochastic volatility model parameters under HMC (acceptance rate 95%); second panel shows a zoomed view.]
Hamiltonian Monte Carlo for Posterior Inference

◮ Deterministic proposal for θ ensures greater efficiency than a Metropolis random walk
◮ Small fly in the ointment: tuning the values of the matrix M is essential for efficient performance of HMC
◮ Diagonal elements of M reflect the scale, and off-diagonal elements capture the correlation structure, of the target (no off-diagonal terms in the physical interpretation)
◮ Setting M requires knowledge of the target density, and hence extensive tuning via pilot runs of the sampler; a sketch of this heuristic follows
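One common tuning heuristic of the kind the slide alludes to is setting M to the inverse of an empirical covariance estimated from a pilot run. A purely illustrative sketch, continuing the bivariate Gaussian example above (the paper's contribution is precisely to replace such tuning with geometry):

```python
# Pilot run with a default identity mass matrix
pilot = hmc_sample(np.zeros(2), grad_log_p, log_p, M=np.eye(2),
                   eps=0.1, L=10, n_samples=2000)

# Set M to the inverse sample covariance (after discarding burn-in):
# the momenta then match the scale and correlation structure of the target
M = np.linalg.inv(np.cov(pilot[1000:].T))
samples = hmc_sample(pilot[-1], grad_log_p, log_p, M=M,
                     eps=0.18, L=20, n_samples=5000)
```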