second order particle mcmc for bayesian parameter
play

Second-order particle MCMC for Bayesian parameter inference IFAC - PowerPoint PPT Presentation

Second-order particle MCMC for Bayesian parameter inference IFAC World Congress 2014, Cape Town, South Africa, August 28, 2014. Johan Dahlin johan.dahlin@liu.se Division of Automatic Control, Linkping University, Sweden. AUTOMATIC CONTROL


  1. Second-order particle MCMC for Bayesian parameter inference IFAC World Congress 2014, Cape Town, South Africa, August 28, 2014. Johan Dahlin johan.dahlin@liu.se Division of Automatic Control, Linköping University, Sweden. AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  2. This is collaborative work with Dr. Fredrik Lindsten (University of Cambridge, United Kingdom) Prof. Thomas B. Schön (Uppsala University, Sweden) AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  3. Summary Aim Efficient Bayesian parameter inference in nonlinear SSMs. Methods Markov chain Monte Carlo. Sequential Monte Carlo methods. Contributions Efficient estimation of first/second order information. Inclusion of first/second order in the proposal. PMH1 and PMH2. AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  4. Example: Earthquakes between 1900 and 2013 40 Number of major earthquakes 35 30 25 20 15 10 5 1900 1920 1940 1960 1980 2000 2020 Year 1.0 0.06 0.8 0.6 0.04 Density ACF 0.4 0.02 0.2 0.0 0.00 −0.2 0 10 20 30 40 50 0 5 10 15 20 Number of earthquakes Lag AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  5. Example: A simple model of annual earthquake counts � x t +1 ; φ x t , σ 2 � x t +1 | x t ∼ N , � � y t | x t ∼ P y t ; β exp( x t ) . Task: Estimate π ( θ ) = p ( θ | y 1: T ) ∝ p θ ( y 1: T ) p ( θ ) with θ = { φ , σ } . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  6. Metropolis-Hastings algorithm Compute Accept or Propose acc. prob reject? - Propose: θ ′ ∼ q ( θ ′ | θ k − 1 ) . - Compute acceptance probability: � � π ( θ ′ ) q ( θ k − 1 | θ ′ ) α ( θ ′ , θ k − 1 ) = min 1 , π ( θ k − 1 ) q ( θ ′ | θ k − 1 ) - Accept or reject? θ ′ → θ k w.p. α ( θ ′ , θ k − 1 ) . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  7. Particle Metropolis-Hastings algorithm (cont.) Compute Accept or Propose acc. prob reject? - Propose: θ ′ ∼ q ( θ ′ | θ k − 1 , u ′ ) and u ′ ∼ PF ( θ ′ ) . - Compute � p θ ′ ( y 1: T | u ′ ) and the acceptance probability: p ( θ ′ ) p θ ′ ( y 1: T | u ′ ) q ( θ k − 1 | θ ′ , u ′ ) � α ( θ ′ , θ k − 1 ) = 1 ∧ q ( θ ′ | θ k − 1 , u k − 1 ) . p ( θ k − 1 ) p θ k − 1 ( y 1: T | u k − 1 ) � - Accept or reject? θ ′ → θ k and u ′ → u k w.p. α ( θ ′ , θ k − 1 ) . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  8. Particle filtering Resampling Propagation Weighting Given the particle system (the random variables u ) � � N x ( i ) 1: T , w ( i ) u = i =1 , 1: T we can obtain (consistent) estimates of: - the likelihood p θ ( y 1: T ) . - the first and second order information of π ( θ ) . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  9. Zeroth order proposal (PMH0) Gaussian random walk θ ′′ = θ ′ + ǫz, z ∼ N ( z ; 0 , 1) . gives the zeroth order (marginal) proposal � � θ ′′ ; θ ′ , ǫ 2 I d q ( θ ′′ | θ ′ , u ′ ) = N . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  10. Example: Parameter inference in earthquake model −300 Iteration: 2 0.5 ● −320 Log−likelihood ● −340 0.4 −360 −380 ● ● ● −400 0.3 0.0 0.5 1.0 1.5 2.0 σ Iteration 0.2 8 Marginal posterior density 6 0.1 4 2 0.0 0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 φ φ AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  11. Example: Parameter inference in earthquake model AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  12. Example: Parameter inference in earthquake model AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  13. Example: Parameter inference in earthquake model 10 20 8 15 6 Density Density 10 4 5 2 0 0 0.70 0.75 0.80 0.85 0.90 0.95 1.00 0.00 0.05 0.10 0.15 0.20 0.25 0.30 φ σ φ σ Posterior mean 0.86 0.15 Posterior median 0.86 0.15 Posterior mode 0.90 0.14 AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  14. Example: State inference in the earthquake model Number of major earthquakes 50 40 30 20 10 0 1900 1920 1940 1960 1980 2000 2020 Year 1.0 Estimated latent state 0.5 0.0 −0.5 −1.0 1900 1920 1940 1960 1980 2000 2020 Year AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  15. Fixed-lag particle smoothing: overview The first and second order information can be estimated using � � N x ( i ) 1: T , w ( i ) u = i =1 , 1: T and the fixed-lag particle smoother approximation, p θ ( ❞ x t : t − 1 | y 1: T ) ≈ � p θ ( ❞ x t : t − 1 | y 1: κ t ) , κ t = min { T, t + ∆ } , � with no additional computational complexity. AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  16. Fixed-lag particle smoothing: motivation 60 40 20 state 0 -20 -40 -60 0 2 4 6 8 time AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  17. First order proposal (PMH1) Noisy gradient-based ascent update θ ′′ = θ ′ + ǫ 2 2 S ( θ ′ ) + ǫz, z ∼ N ( z ; 0 , 1) , with the first order information � � S ( θ ′ )= ∇ θ log π ( θ ) θ = θ ′ , gives the first order proposal � � θ ′′ ; θ ′ + ǫ 2 � S ( θ ′ | u ′ ) , ǫ 2 I d q ( θ ′′ | θ ′ , u ′ ) = N . 2 AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  18. Second order proposal (PMH2) Noisy Newton update θ ′′ = θ ′ + ǫ 2 � � − 1 S ( θ ′ ) + ǫ � � − 1 / 2 z, J ( θ ′ ) J ( θ ′ ) z ∼ N ( z ; 0 , 1) , 2 with the second order information � J ( θ ′ )= −∇ 2 � θ log π ( θ ) θ = θ ′ , gives the second order proposal � � θ ′′ ; θ ′ + ǫ 2 � � � − 1 , ǫ 2 � � � − 1 � q ( θ ′′ | θ ′ , u ′ ) = N S ( θ ′ | u ′ ) J ( θ ′ | u ′ ) J ( θ ′ | u ′ ) . 2 AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  19. Example: Parameter inference in earthquake model PMH0 PMH1 PMH2 0.5 0.5 0.5 ● ● ● 0.4 0.4 0.4 ● 0.3 0.3 0.3 σ σ σ 0.2 0.2 0.2 0.1 0.1 0.1 0.0 0.0 0.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 φ φ φ AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  20. Example: Parameter inference in earthquake model AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  21. Integrated autocorrelation time 1.0 1.0 1.0 PMH0: φ PMH1: φ PMH2: φ 0.8 0.8 0.8 0.6 0.6 0.6 ACF ACF ACF 0.4 0.4 0.4 0.2 0.2 0.2 0.0 0.0 0.0 0 10 20 30 40 50 0 10 20 30 40 50 0 10 20 30 40 50 Lag Lag Lag 1.0 1.0 1.0 PMH0: σ PMH1: σ PMH2: σ 0.8 0.8 0.8 0.6 0.6 0.6 ACF ACF ACF 0.4 0.4 0.4 0.2 0.2 0.2 0.0 0.0 0.0 0 10 20 30 40 50 0 10 20 30 40 50 0 10 20 30 40 50 Lag Lag Lag AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  22. Integrated autocorrelation time (cont.) IACT: the number of iterations between two uncorrelated samples. Acceptance rate max IACT PMH0 0.47 31.82 PMH1 0.38 21.38 PMH2 0.54 14.15 AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  23. Conclusions Results Shorter burn-in phase. Increased efficiency (about 2 times). Simplified tuning. Retained linear computational complexity in N . Methods Include u into the proposal. Particle fixed-lag smoothing (almost for free). Laplace approximation / Random walk on a Riemann manifold. Future work Non-reversible Markov chains. Adaptive methods. Approximate Bayesian computations. AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  24. Thank you for your attention! Questions, comments and suggestions are most welcome. Further developments Extended version available at arXiv. The paper and code to replicate the results within it are found at: http://work.johandahlin.com/ . AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  25. Particle Metropolis-Hastings algorithm The target distribution is given by the parameter proposal π ( θ ) = p θ ( y 1: T ) p ( θ ) . p ( y 1: T ) An unbiased estimator of the likelihood is given by � � � p θ ( y 1: T | u ) � = p θ ( y 1: T | u ) m θ ( u ) ❞ u = p θ ( y 1: T ) . � E m An extended target is given by p θ ( y 1: T | u ) m θ ( u ) p ( θ ) p θ ( y 1: T | u ) m θ ( u ) π ( θ ) π ( θ, u ) = � = � . p ( y 1: T ) p θ ( y 1: T ) AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  26. Particle Metropolis-Hastings algorithm (cont.) � � � p θ ( y 1: T | u ) m θ ( u ) π ( θ ) π ( θ, u ) ❞ u = ❞ u p θ ( y 1: T ) � π ( θ ) p θ ( y 1: T | u ) m θ ( u ) ❞ u = � , p θ ( y 1: T ) � �� � = p θ ( y 1: T ) = π ( θ ) . That is, the marginal is the desired target distribution and the Markov chain is kept invariant. AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

  27. (bootstrap) Particle filtering Resampling Propagation Weighting t − 1 = x a ( i ) - Resampling: P ( a ( i ) w ( j ) x ( i ) = j ) = � t − 1 and set � t − 1 . t t � � � - Propagation: x ( i ) x ( i ) x ( i ) �� ∼ R θ = f θ ( x t | � t − 1 ) . x t t t − 1 � � - Weighting: w ( i ) x ( i ) x ( i ) = W θ t , � = g θ ( y t | x t ) . t t − 1 AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

Recommend


More recommend