  1. Sequential Monte Carlo: Selected Methodological Applications. Adam M. Johansen, a.m.johansen@warwick.ac.uk, Warwick University Centre for Scientific Computing.

  2. Outline
  ◮ Sequential Monte Carlo
  ◮ Applications
    ◮ Parameter Estimation
    ◮ Rare Event Simulation
    ◮ Filtering of Piecewise Deterministic Processes

  3. Background

  4. Monte Carlo: Estimating π
  ◮ Rain falls uniformly over the square.
  ◮ A circle is inscribed in the square.
  ◮ A_square = 4r².
  ◮ A_circle = πr².
  ◮ p = A_circle / A_square = π/4.
  ◮ 383 of 500 drops are "successes" (they land inside the circle).
  ◮ π̂ = 4 × 383/500 = 3.06.
  ◮ Confidence intervals can also be obtained.
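
  A minimal numerical sketch of this raindrop experiment, assuming Python with numpy; the sample size matches the slide, but the particular success count (and hence the estimate) will vary with the random seed.

    import numpy as np

    rng = np.random.default_rng(1)
    N = 500
    # Uniform "rain" on the square [-1, 1]^2 with the unit circle inscribed.
    x, y = rng.uniform(-1.0, 1.0, size=(2, N))
    successes = np.sum(x**2 + y**2 <= 1.0)         # drops landing inside the circle
    p_hat = successes / N
    pi_hat = 4.0 * p_hat                            # p = A_circle / A_square = pi / 4
    se = 4.0 * np.sqrt(p_hat * (1.0 - p_hat) / N)   # binomial standard error, scaled by 4
    print(f"pi_hat = {pi_hat:.3f}, approx. 95% CI ({pi_hat - 1.96*se:.3f}, {pi_hat + 1.96*se:.3f})")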

  5. Monte Carlo: The Monte Carlo Method
  ◮ Given a probability density f, consider I = ∫_E φ(x) f(x) dx.
  ◮ Simple Monte Carlo solution:
    ◮ Sample X_1, ..., X_N iid ~ f.
    ◮ Estimate Î = (1/N) Σ_{i=1}^N φ(X_i).
  ◮ Justified by the law of large numbers...
  ◮ ...and the central limit theorem.
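
  A minimal sketch of the plain Monte Carlo estimator; the density f = N(0, 1), test function φ(x) = x² (so the true value is 1) and sample size are illustrative choices, not from the slide.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 10_000
    phi = lambda x: x**2                       # illustrative test function; E[phi(X)] = 1 here
    X = rng.normal(size=N)                     # X_1, ..., X_N iid from f = N(0, 1)
    I_hat = np.mean(phi(X))                    # (1/N) * sum_i phi(X_i), justified by the LLN
    se = np.std(phi(X), ddof=1) / np.sqrt(N)   # CLT-based standard error
    print(f"I_hat = {I_hat:.4f} +/- {1.96*se:.4f}")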

  6. Monte Carlo: Importance Sampling
  ◮ Given g such that
    ◮ f(x) > 0 ⇒ g(x) > 0,
    ◮ and f(x)/g(x) < ∞,
    define w(x) = f(x)/g(x), so that
    I = ∫ φ(x) f(x) dx = ∫ φ(x) w(x) g(x) dx.
  ◮ This suggests the importance sampling estimator:
    ◮ Sample X_1, ..., X_N iid ~ g.
    ◮ Estimate Î = (1/N) Σ_{i=1}^N w(X_i) φ(X_i).
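
  A minimal sketch of that estimator, assuming numpy and scipy are available; the target f = N(0, 1), the heavier-tailed Student-t proposal g (chosen so that f/g stays bounded) and φ(x) = x² are all illustrative choices.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    N = 10_000
    f = stats.norm(0, 1)             # target density f
    g = stats.t(df=3)                # heavier-tailed proposal g, so f(x)/g(x) < infinity
    phi = lambda x: x**2             # illustrative test function; E_f[phi(X)] = 1

    X = g.rvs(size=N, random_state=rng)   # X_1, ..., X_N iid from g
    w = f.pdf(X) / g.pdf(X)               # importance weights w(x) = f(x) / g(x)
    I_hat = np.mean(w * phi(X))           # (1/N) * sum_i w(X_i) phi(X_i)
    print(f"I_hat = {I_hat:.4f}")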

  7. Monte Carlo: Markov Chain Monte Carlo
  ◮ It is typically difficult to construct a good proposal density.
  ◮ MCMC works by constructing an ergodic Markov chain X_n with invariant distribution π and using its ergodic averages,
    (1/N) Σ_{i=1}^N φ(X_i),
    to approach E_π[φ].
  ◮ Justified by ergodic theorems / central limit theorems.
  ◮ We aren't going to take this approach.
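
  For completeness, a minimal random-walk Metropolis sketch of the MCMC idea (not the approach pursued in the rest of the talk); the bimodal target, step size and chain length are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    # Unnormalised illustrative target: an equal-weight mixture of N(-2, 1) and N(2, 1).
    log_pi = lambda x: np.logaddexp(-0.5 * (x - 2.0)**2, -0.5 * (x + 2.0)**2)

    N, step = 50_000, 1.0
    chain = np.empty(N)
    x = 0.0
    for i in range(N):
        prop = x + step * rng.normal()                         # random-walk proposal
        if np.log(rng.uniform()) < log_pi(prop) - log_pi(x):   # Metropolis accept/reject
            x = prop
        chain[i] = x
    print(np.mean(chain), np.mean(chain**2))   # ergodic averages approach E_pi[phi]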

  8. Sequential Monte Carlo: A Motivating Example: Filtering
  ◮ Let X_1, ... denote the position of an object which follows Markovian dynamics.
  ◮ Let Y_1, ... denote a collection of observations: Y_i | X_i = x_i ~ g(·|x_i).
  ◮ We wish to estimate, as observations arrive, p(x_{1:n} | y_{1:n}).
  ◮ A recursion obtained from Bayes' rule exists, but it is intractable in most cases.

  9. Sequential Monte Carlo: More Generally
  ◮ The problem in the previous example is really one of tracking a sequence of distributions.
  ◮ Key structural property of the smoothing distributions: increasing state spaces.
  ◮ Other problems with the same structure exist.
  ◮ Any problem of sequentially approximating such a sequence of distributions, p_n, can be addressed in the same way.

  10. Sequential Monte Carlo: Importance Sampling in This Setting
  ◮ We are given p_n(x_{1:n}) for n = 1, 2, ....
  ◮ We could sample from a separate proposal q_n(x_{1:n}) for each n.
  ◮ Or we could let q_n(x_{1:n}) = q_n(x_n | x_{1:n-1}) q_{n-1}(x_{1:n-1}) and re-use our samples.
  ◮ The importance weights then become:
    w_n(x_{1:n}) ∝ p_n(x_{1:n}) / q_n(x_{1:n})
                 = p_n(x_{1:n}) / [q_n(x_n | x_{1:n-1}) q_{n-1}(x_{1:n-1})]
                 = [p_n(x_{1:n}) / (q_n(x_n | x_{1:n-1}) p_{n-1}(x_{1:n-1}))] w_{n-1}(x_{1:n-1}).

  11. Sequential Monte Carlo: Sequential Importance Sampling
  At time 1:
    For i = 1:N, sample X_1^(i) ~ q_1(·).
    For i = 1:N, compute W_1^(i) ∝ w_1(X_1^(i)) = p_1(X_1^(i)) / q_1(X_1^(i)).
  At time n, n ≥ 2:
    Sampling step: for i = 1:N, sample X_n^(i) ~ q_n(· | X_{n-1}^(i)).
    Weighting step: for i = 1:N, compute
      w_n(X_{1:n-1}^(i), X_n^(i)) = p_n(X_{1:n-1}^(i), X_n^(i)) / [p_{n-1}(X_{1:n-1}^(i)) q_n(X_n^(i) | X_{n-1}^(i))]
    and W_n^(i) ∝ W_{n-1}^(i) w_n(X_{1:n-1}^(i), X_n^(i)).
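
  A minimal SIS sketch for the filtering example, assuming numpy; the linear Gaussian model, its parameters, and the choice of the prior dynamics as proposal (so that the incremental weight p_n / (p_{n-1} q_n) reduces to the observation density g(y_n | x_n)) are all illustrative assumptions, not specifics of the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    T, N = 50, 1_000
    sigma_v, sigma_w = 1.0, 0.5      # illustrative model parameters

    # Simulate a toy linear Gaussian state-space model: X_n = 0.9 X_{n-1} + V_n, Y_n = X_n + W_n.
    x_true = np.zeros(T)
    y = np.zeros(T)
    for n in range(T):
        x_true[n] = (0.9 * x_true[n - 1] if n > 0 else 0.0) + sigma_v * rng.normal()
        y[n] = x_true[n] + sigma_w * rng.normal()

    def log_g(y_n, x):               # observation log-density g(y_n | x_n)
        return -0.5 * ((y_n - x) / sigma_w)**2 - np.log(sigma_w * np.sqrt(2 * np.pi))

    # Sequential importance sampling with the prior dynamics as proposal.
    X = sigma_v * rng.normal(size=N)            # time 1: sample from q_1
    logW = log_g(y[0], X)
    for n in range(1, T):
        X = 0.9 * X + sigma_v * rng.normal(size=N)   # sampling step: X_n ~ q_n(. | X_{n-1})
        logW = logW + log_g(y[n], X)                 # weighting step: W_n propto W_{n-1} w_n
    W = np.exp(logW - logW.max())
    W /= W.sum()
    print("filtering mean estimate:", np.sum(W * X), "truth:", x_true[-1])
    print("effective sample size:", 1.0 / np.sum(W**2))   # tends to collapse without resampling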

  12. Sequential Monte Carlo: Sequential Importance Resampling
  At time n, n ≥ 2:
    Sampling step: for i = 1:N, sample X̃_{n,n}^(i) ~ q_n(· | X_{n-1}^(i)).
    Resampling step: for i = 1:N, compute
      w_n(X_{n-1}^(i), X̃_{n,n}^(i)) = p_n(X_{n-1}^(i), X̃_{n,n}^(i)) / [p_{n-1}(X_{n-1}^(i)) q_n(X̃_{n,n}^(i) | X_{n-1}^(i))]
    and W_n^(i) = w_n(X_{n-1}^(i), X̃_{n,n}^(i)) / Σ_{j=1}^N w_n(X_{n-1}^(j), X̃_{n,n}^(j)).
    For i = 1:N, sample X_n^(i) ~ Σ_{j=1}^N W_n^(j) δ_{(X_{n-1}^(j), X̃_{n,n}^(j))}(dx_{1:n}).
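
  A minimal multinomial resampling step that could be slotted into the SIS sketch above after each weighting step; the function name is hypothetical, and lower-variance schemes (systematic or residual resampling) are common alternatives.

    import numpy as np

    def multinomial_resample(particles, weights, rng):
        # Draw N indices with probabilities given by the normalised weights and
        # return the resampled particles together with uniform weights.
        N = len(weights)
        idx = rng.choice(N, size=N, p=weights)
        return particles[idx], np.full(N, 1.0 / N)

    # Usage example with dummy values (inside the SIS loop, normalise the weights first).
    rng = np.random.default_rng(0)
    X = rng.normal(size=5)
    W = np.array([0.7, 0.1, 0.1, 0.05, 0.05])
    X, W = multinomial_resample(X, W, rng)
    print(X, W)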

  13. Sequential Monte Carlo: SMC Samplers
  Actually, these techniques can be used to sample from any sequence of distributions (Del Moral et al., 2006).
  ◮ Given a sequence of target distributions η_n on E_n...
  ◮ construct a synthetic sequence η̃_n on the spaces ∏_{p=1}^n E_p
  ◮ by introducing Markov kernels L_p from E_{p+1} to E_p:
    η̃_n(x_{1:n}) = η_n(x_n) ∏_{p=1}^{n-1} L_p(x_{p+1}, x_p).
  ◮ These distributions
    ◮ have the target distributions as time marginals,
    ◮ have the correct structure to employ SMC techniques.

  14. Sequential Monte Carlo: SMC Outline
  ◮ Given a sample {X_{1:n-1}^(i)}_{i=1}^N targeting η̃_{n-1},
  ◮ sample X_n^(i) ~ K_n(X_{n-1}^(i), ·),
  ◮ calculate
    W_n(X_{1:n}^(i)) = η_n(X_n^(i)) L_{n-1}(X_n^(i), X_{n-1}^(i)) / [η_{n-1}(X_{n-1}^(i)) K_n(X_{n-1}^(i), X_n^(i))].
  ◮ Resample, yielding {X_{1:n}^(i)}_{i=1}^N targeting η̃_n.
  ◮ This hints that we would like to use
    L_{n-1}(x_n, x_{n-1}) = η_{n-1}(x_{n-1}) K_n(x_{n-1}, x_n) / ∫ η_{n-1}(x'_{n-1}) K_n(x'_{n-1}, x_n) dx'_{n-1}.
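
  A minimal sketch of one common instantiation of this scheme (a standard simplification from Del Moral et al. (2006), not the optimal kernel above): K_n is an η_n-invariant Metropolis kernel and L_{n-1} is its time-reversal, under which the incremental weight reduces to η_n(X_{n-1}) / η_{n-1}(X_{n-1}). The annealed sequence of targets, the tempering schedule and all numerical settings are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    N, n_steps = 2_000, 20

    # Illustrative annealing problem: move from eta_0 = N(0, 5^2) towards a bimodal target.
    log_target = lambda x: np.logaddexp(-0.5 * ((x - 3.0) / 0.5)**2,
                                        -0.5 * ((x + 3.0) / 0.5)**2)
    log_eta0 = lambda x: -0.5 * (x / 5.0)**2
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    log_eta = lambda x, b: (1.0 - b) * log_eta0(x) + b * log_target(x)

    X = 5.0 * rng.normal(size=N)          # initial sample from eta_0
    for n in range(1, n_steps + 1):
        # Incremental weight eta_n(X_{n-1}) / eta_{n-1}(X_{n-1}) under the time-reversal L_{n-1}.
        logW = log_eta(X, betas[n]) - log_eta(X, betas[n - 1])
        W = np.exp(logW - logW.max()); W /= W.sum()
        X = X[rng.choice(N, size=N, p=W)]                 # resample to equal weights
        # Move each particle with an eta_n-invariant random-walk Metropolis kernel K_n.
        prop = X + rng.normal(size=N)
        accept = np.log(rng.uniform(size=N)) < log_eta(prop, betas[n]) - log_eta(X, betas[n])
        X = np.where(accept, prop, X)
    print("target mean estimate:", X.mean(), "(true value 0 by symmetry)")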

  15. Parameter Estimation in Latent Variable Models
  Joint work with Arnaud Doucet and Manuel Davy.

  16. Parameter Estimation in Latent Variable Models: Maximum {Likelihood | a Posteriori} Estimation
  ◮ Consider a model with:
    ◮ parameters θ,
    ◮ latent variables x, and
    ◮ observed data y.
  ◮ Aim to maximise the marginal likelihood
    p(y|θ) = ∫ p(x, y|θ) dx
    or the posterior
    p(θ|y) ∝ ∫ p(x, y|θ) p(θ) dx.
  ◮ The traditional approach is Expectation-Maximisation (EM), which
    ◮ requires the objective function in closed form, and
    ◮ is susceptible to trapping in local optima.

  17. Parameter Estimation in Latent Variable Models: A Probabilistic Approach
  ◮ A distribution of the form π(θ|y) ∝ p(θ) p(y|θ)^γ becomes concentrated, as γ → ∞, on the maximisers of p(y|θ) under weak conditions (Hwang, 1980).
  ◮ Key point: synthetic distributions of the form
    π̄_γ(θ, x_{1:γ}|y) ∝ p(θ) ∏_{i=1}^γ p(x_i, y|θ)
    admit the marginals π̄_γ(θ|y) ∝ p(θ) p(y|θ)^γ.
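
  A small numerical illustration of this concentration effect on a toy model (all specifics assumed for illustration): a single observation y with p(y|θ) = N(y; θ, 2) and prior p(θ) = N(0, 10²), evaluated on a grid so that the spread of p(θ) p(y|θ)^γ can be seen to shrink as γ grows.

    import numpy as np

    y = 1.7
    theta = np.linspace(-5, 5, 2001)
    log_prior = -0.5 * (theta / 10.0)**2        # p(theta) = N(0, 10^2), up to a constant
    log_lik = -0.25 * (y - theta)**2            # log N(y; theta, 2), up to a constant
    for gamma in (1, 10, 100, 1000):
        log_post = log_prior + gamma * log_lik  # log of p(theta) p(y | theta)^gamma
        w = np.exp(log_post - log_post.max()); w /= w.sum()
        mean = np.sum(w * theta)
        sd = np.sqrt(np.sum(w * (theta - mean)**2))
        print(f"gamma = {gamma:5d}: mode ~ {theta[np.argmax(log_post)]:.3f}, sd ~ {sd:.3f}")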

  18. Parameter Estimation in Latent Variable Models: Maximum Likelihood via SMC
  ◮ Use a sequence of distributions η_n = π̄_{γ_n} for some schedule {γ_n}.
  ◮ This has previously been suggested in an MCMC context (Doucet et al., 2002), but
    ◮ it requires extremely slow "annealing", because
    ◮ the separation between successive distributions is large.
  ◮ SMC has two main advantages:
    ◮ Introducing bridging distributions, for γ = ⌊γ⌋ + ⟨γ⟩, of
      π̄_γ(θ, x_{1:⌊γ⌋+1}|y) ∝ p(θ) p(x_{⌊γ⌋+1}, y|θ)^⟨γ⟩ ∏_{i=1}^{⌊γ⌋} p(x_i, y|θ)
      is straightforward.
    ◮ The population of samples improves robustness.

  19. Parameter Estimation in Latent Variable Models: Three Algorithms
  ◮ A generic SMC sampler can be written down directly...
  ◮ Easy case (see the sketch below):
    ◮ Sample from p(x_n | y, θ_{n-1}) and p(θ_n | x_n, y).
    ◮ Weight according to p(y | θ_{n-1})^{γ_n − γ_{n-1}}.
  ◮ General case:
    ◮ Sample the existing variables from an η_{n-1}-invariant kernel:
      (θ_n, X_{n,1:γ_{n-1}}) ~ K_{n-1}((θ_{n-1}, X_{n-1}), ·).
    ◮ Sample the new variables from an arbitrary proposal:
      X_{n,γ_{n-1}+1:γ_n} ~ q(· | θ_n).
    ◮ Use the composition of a time-reversal and the optimal auxiliary kernel.
    ◮ The weight expression does not involve the marginal likelihood.
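
  A sketch of the "easy case" for a toy conjugate model, with everything about the model assumed for illustration: y_j | x_j ~ N(x_j, 1), x_j | θ ~ N(θ, 1), prior θ ~ N(0, 10²), and the integer schedule γ_n = n, so that p(y|θ), p(x|y,θ) and p(θ|x,y) are all Gaussian and available in closed form. Each particle refreshes γ_n replicates of the latent variables from p(x|y,θ), draws θ from its full conditional, and is weighted by p(y|θ_{n-1})^{γ_n − γ_{n-1}}.

    import numpy as np

    rng = np.random.default_rng(0)
    tau2 = 100.0                               # prior variance for theta (illustrative)
    theta_true, D = 1.5, 5
    y = theta_true + rng.normal(size=D) + rng.normal(size=D)   # marginally y_j ~ N(theta, 2)

    def log_marginal_lik(theta):               # log p(y | theta) = sum_j log N(y_j; theta, 2)
        return np.sum(-0.25 * (y - theta[:, None])**2, axis=1) - 0.5 * D * np.log(4 * np.pi)

    N, n_steps = 500, 30                       # gamma_n = n
    theta = 10.0 * rng.normal(size=N)          # initialise particles from the prior
    for n in range(1, n_steps + 1):
        # Weight according to p(y | theta_{n-1})^(gamma_n - gamma_{n-1}); here the exponent is 1.
        logW = log_marginal_lik(theta)
        W = np.exp(logW - logW.max()); W /= W.sum()
        theta = theta[rng.choice(N, size=N, p=W)]              # resample
        # Sample x_{1:gamma_n} | y, theta: n replicates, with x_j ~ N((y_j + theta)/2, 1/2).
        x = (0.5 * (y[None, None, :] + theta[:, None, None])
             + np.sqrt(0.5) * rng.normal(size=(N, n, D)))
        # Sample theta_n | x, y from its Gaussian full conditional.
        prec = 1.0 / tau2 + n * D
        theta = x.sum(axis=(1, 2)) / prec + rng.normal(size=N) / np.sqrt(prec)
    print("theta estimate:", theta.mean(), " MLE ~", y.mean())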
