Introduction to the Read Paper
Young Statisticians Section
Mark Girolami, Department of Statistical Science, University College London
The Royal Statistical Society, Errol Street, London
October 13, 2010
Riemann manifold Langevin and Hamiltonian Monte Carlo Methods, Girolami, M. & Calderhead, B., J. R. Statist. Soc. B (2011), 73, Part 2

◮ Advancing MC methods via the underlying geometry of fundamental objects
◮ Develop proposal mechanisms based on
  ◮ Stochastic diffusions on a Riemann manifold
  ◮ Deterministic mechanics on a Riemann manifold
◮ Focus on Hamiltonian Monte Carlo for the next 27 minutes
Hamiltonian Monte Carlo for Computational Statistical Inference

◮ Target density p(θ); introduce auxiliary variable p ∼ p(p) = N(p | 0, M)
◮ Log-density L(θ) ≡ log p(θ), then

    H(θ, p) = −L(θ) + ½ log{(2π)^D |M|} + ½ pᵀM⁻¹p

◮ Interpreted as a separable Hamiltonian in position and momentum variables

    dθ/dτ = ∂H/∂p = M⁻¹p
    dp/dτ = −∂H/∂θ = ∇_θ L(θ)

◮ An (approximately) energy conserving, volume preserving and reversible leapfrog integrator follows

    p(τ + ε/2) = p(τ) + (ε/2) ∇_θ L(θ(τ))
    θ(τ + ε) = θ(τ) + ε M⁻¹ p(τ + ε/2)
    p(τ + ε) = p(τ + ε/2) + (ε/2) ∇_θ L(θ(τ + ε))

◮ Detailed balance satisfied by accepting with probability min{1, exp{−H(θ*, p*) + H(θ, p)}}
◮ The complete method to sample from the desired marginal p(θ) follows the Gibbs scheme

    p^{n+1} | θ^n ∼ p(p^{n+1}) = N(0, M)
    θ^{n+1} | p^{n+1} ∼ p(θ^{n+1} | p^{n+1})

◮ Integrator provides proposals for the p(θ | p) conditional; a minimal sketch of the scheme follows
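A minimal NumPy sketch of this scheme, assuming a generic setting: the names hmc_sample, log_p and grad_log_p are illustrative, not from the paper, and the constant ½ log{(2π)^D |M|} is dropped since it cancels in the Metropolis ratio.

```python
import numpy as np

def hmc_sample(theta0, grad_log_p, log_p, M, eps, L, n_samples, rng=None):
    """Hamiltonian Monte Carlo with a leapfrog integrator.

    theta0     : initial position, shape (D,)
    grad_log_p : gradient of the log target density, i.e. grad L(theta)
    log_p      : log target density L(theta)
    M          : mass matrix, shape (D, D)
    eps, L     : leapfrog step size and number of leapfrog steps
    """
    rng = np.random.default_rng() if rng is None else rng
    M_inv = np.linalg.inv(M)
    M_chol = np.linalg.cholesky(M)
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_samples):
        # Gibbs step: draw momentum p ~ N(0, M)
        p = M_chol @ rng.standard_normal(theta.size)
        H_old = -log_p(theta) + 0.5 * p @ M_inv @ p

        # Leapfrog integration of Hamilton's equations:
        # half momentum step, L position steps, half momentum step
        theta_new, p_new = theta.copy(), p.copy()
        p_new += 0.5 * eps * grad_log_p(theta_new)
        for step in range(L):
            theta_new += eps * (M_inv @ p_new)
            if step != L - 1:
                p_new += eps * grad_log_p(theta_new)
        p_new += 0.5 * eps * grad_log_p(theta_new)

        # Metropolis accept/reject preserves detailed balance
        H_new = -log_p(theta_new) + 0.5 * p_new @ M_inv @ p_new
        if rng.uniform() < np.exp(H_old - H_new):
            theta = theta_new
        samples.append(theta.copy())
    return np.array(samples)
```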
Illustrative Example - Bivariate Gaussian

◮ Target density N(0, Σ) where

    Σ = [ 1  ρ ]
        [ ρ  1 ]

◮ For large ρ, e.g. ρ = 0.98, sampling from this distribution is challenging
◮ Overall Hamiltonian

    ½ xᵀΣ⁻¹x + ½ pᵀp
HMC Integration: ε = 0.18, L = 20, implicit identity matrix for the metric

[Figure: leapfrog trajectory in (θ1, θ2); momentum path in (p1, p2); Hamiltonian value at each of the 20 integration steps.]
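Using the hmc_sample sketch above, the slide's settings (ε = 0.18, L = 20, identity mass matrix) might be reproduced along these lines; this usage example is an assumption, not code from the paper.

```python
import numpy as np

rho = 0.98
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

# Log-density and gradient of the N(0, Sigma) target (constants dropped)
log_p = lambda x: -0.5 * x @ Sigma_inv @ x
grad_log_p = lambda x: -Sigma_inv @ x

samples = hmc_sample(theta0=np.zeros(2), grad_log_p=grad_log_p, log_p=log_p,
                     M=np.eye(2), eps=0.18, L=20, n_samples=5000)
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])
```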
Metropolis Algorithm, Parameters of Stoch Vol Model, Acc Rate 25%

[Figure: samples of two parameters of the stochastic volatility model under the Metropolis algorithm (acceptance rate 25%); second panel shows a zoomed view.]
HMC Algorithm, Parameters of Stoch Vol Model, Acc Rate 95%

[Figure: samples of the same two stochastic volatility model parameters under HMC (acceptance rate 95%); second panel shows a zoomed view.]
Hamiltonian Monte Carlo for Posterior Inference

◮ Deterministic proposal for θ ensures greater efficiency than a Metropolis random walk
◮ Small fly in the ointment: tuning the values of the matrix M is essential for efficient performance of HMC
◮ Diagonal elements of M reflect the scale, and off-diagonal elements capture the correlation structure, of the target (no off-diagonal terms in the physical interpretation)
◮ Setting M requires knowledge of the target density, and hence extensive tuning via pilot runs of the sampler; a sketch of this heuristic follows
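One common tuning heuristic of the kind the slide alludes to is setting M to the inverse of an empirical covariance estimated from a pilot run. A purely illustrative sketch, continuing the bivariate Gaussian example above (the paper's contribution is precisely to replace such tuning with geometry):

```python
# Pilot run with a default identity mass matrix
pilot = hmc_sample(np.zeros(2), grad_log_p, log_p, M=np.eye(2),
                   eps=0.1, L=10, n_samples=2000)

# Set M to the inverse sample covariance (after discarding burn-in):
# the momenta then match the scale and correlation structure of the target
M = np.linalg.inv(np.cov(pilot[1000:].T))
samples = hmc_sample(pilot[-1], grad_log_p, log_p, M=M,
                     eps=0.18, L=20, n_samples=5000)
```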