Ergodicity and Accuracy in Optimal Particle Filters


  1. Ergodicity and Accuracy in Optimal Particle Filters. David Kelly and Andrew Stuart, Courant Institute, New York University, New York, NY. www.dtbkelly.com. December 9, 2016. Math colloquium, CMU.

  2. What is data assimilation?

  3. What is data assimilation? Suppose $u$ is an evolving state satisfying known dynamics
$$\frac{du(t)}{dt} = F(u(t)) + \dot{W} \quad \text{(e.g.)}$$
with some unknown initial condition $u_0 \sim \mu_0$. There is a true trajectory of $u$ that is producing partial, noisy observations at times $t = h, 2h, \dots, nh$:
$$y_n = H u_n + \xi_n,$$
where $H$ is a linear operator (think low-rank projection), $u_n = u(nh)$, and $\xi_n \sim N(0, \Gamma)$ i.i.d. The aim of data assimilation is to characterize the conditional distribution of $u_n$ given the observations $\{y_1, \dots, y_n\}$.

  4. E.g. (a chaotic toy model). The state $u \in \mathbb{R}^d$ satisfies the Lorenz-96 equations
$$\frac{du_i}{dt} = -u_{i-2} u_{i-1} + u_{i-1} u_{i+1} - u_i + F, \qquad i = 1, \dots, d,$$
with periodic BCs ($u_0 = u_d$). Model for turbulent 'atmospheric variables' in the midlatitudes. We observe the nodes with odd index:
$$y(nh) = (u_1(nh), u_3(nh), \dots, u_d(nh)) + \text{noise}.$$
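As a concrete illustration (not from the slides), here is a minimal Python sketch that integrates Lorenz-96 and generates noisy observations of the odd-indexed nodes. The forcing $F = 8$, dimension $d = 40$, observation interval $h = 0.1$, observation noise level, and the RK4 integrator are all illustrative assumptions.

```python
import numpy as np

def lorenz96(u, F=8.0):
    """Right-hand side of Lorenz-96: du_i/dt = (u_{i+1} - u_{i-2}) u_{i-1} - u_i + F."""
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1) - u + F

def rk4_step(u, dt, F=8.0):
    """One fourth-order Runge-Kutta step of length dt."""
    k1 = lorenz96(u, F)
    k2 = lorenz96(u + 0.5 * dt * k1, F)
    k3 = lorenz96(u + 0.5 * dt * k2, F)
    k4 = lorenz96(u + dt * k3, F)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Generate a true trajectory and noisy observations of the odd-indexed nodes
# at times t = h, 2h, ..., n*h (all numerical values here are illustrative).
d, h, n_obs, obs_std = 40, 0.1, 100, 1.0
rng = np.random.default_rng(0)
u = rng.standard_normal(d)                  # u_0 ~ mu_0 (here a standard Gaussian)
dt = 0.01
steps_per_obs = round(h / dt)
obs_idx = np.arange(0, d, 2)                # Python indices 0,2,4,... = nodes u_1, u_3, ...
ys = []
for _ in range(n_obs):
    for _ in range(steps_per_obs):
        u = rk4_step(u, dt)
    ys.append(u[obs_idx] + obs_std * rng.standard_normal(obs_idx.size))
```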

  5. The conditional distribution is updated via the filtering cycle.

  6. Illustration (Initialization). Figure: The blue circle represents our initial uncertainty $u_0 \sim \mu_0$.

  7. Illustration (Forecast step). Figure: Apply the time-$h$ flow map $\Psi$. This produces a new probability measure, which is our forecasted estimate of $u_1$. This is called the forecast step.

  8. Illustration (Make an observation). Figure: We make an observation $y_1 = H u_1 + \xi_1$. In the picture, we only observe the $x$ variable.

  9. Illustration (Analysis step). Figure: Using Bayes' formula we compute the conditional distribution of $u_1 \mid y_1$. This new measure (called the posterior) is the new estimate of $u_1$. The uncertainty of the estimate is reduced by incorporating the observation. The forecast distribution steers the update from the observation.

  10. Bayes' formula filtering update. Let $Y_n = \{y_1, \dots, y_n\}$. We want to compute the conditional density $P(u_{n+1} \mid Y_{n+1})$, using $P(u_n \mid Y_n)$ and $y_{n+1}$. By Bayes' formula, we have
$$P(u_{n+1} \mid Y_{n+1}) = P(u_{n+1} \mid Y_n, y_{n+1}) \propto P(y_{n+1} \mid u_{n+1}) \, P(u_{n+1} \mid Y_n).$$
But we need to compute the integral
$$P(u_{n+1} \mid Y_n) = \int P(u_{n+1} \mid Y_n, u_n) \, P(u_n \mid Y_n) \, du_n.$$
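To make the recursion concrete, here is a toy one-dimensional sketch (not from the talk) that evaluates the prediction integral and then the Bayes update on a grid. The scalar dynamics $\psi(u) = 0.9u$, the noise variances, the observation value, and the grid are hypothetical choices used only for illustration.

```python
import numpy as np

# Toy 1-D illustration of one filtering cycle on a grid: the prediction integral
# P(u_{k+1} | Y_k) = \int P(u_{k+1} | u_k) P(u_k | Y_k) du_k, followed by the
# Bayes update P(u_{k+1} | Y_{k+1}) \propto P(y_{k+1} | u_{k+1}) P(u_{k+1} | Y_k).
grid = np.linspace(-10.0, 10.0, 2001)
du = grid[1] - grid[0]

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

prior = gauss(grid, 0.0, 4.0)               # current filtering density P(u_k | Y_k)
psi = lambda u: 0.9 * u                      # placeholder deterministic dynamics
sigma2, gamma2, y_next = 1.0, 0.5, 1.3       # model noise, obs noise, next observation

# Prediction step: integrate the transition density against the current filter.
transition = gauss(grid[:, None], psi(grid)[None, :], sigma2)   # P(u' | u) on grid x grid
forecast = transition @ prior * du                              # P(u_{k+1} | Y_k)

# Analysis step: multiply by the likelihood (here H = identity) and renormalise.
posterior = gauss(y_next, grid, gamma2) * forecast
posterior /= posterior.sum() * du
```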

  11. In general there are no closed formulas for the Bayesian densities. One typically approximates the densities with a sampling procedure.

  12. Particle filters approximate the posterior empirically:
$$P(u_k \mid Y_k) \approx \frac{1}{N} \sum_{n=1}^N \delta\big(u_k - u_k^{(n)}\big).$$
The particles $\{u_k^{(n)}\}_{n=1}^N$ can be updated in different ways, giving rise to different particle filters.

  13. Model Assumption. We always assume a conditionally Gaussian model
$$u_{k+1} = \psi(u_k) + \eta_k, \qquad \eta_k \sim N(0, \Sigma) \text{ i.i.d.},$$
where $\psi$ is deterministic, with observations
$$y_{k+1} = H u_{k+1} + \xi_{k+1}, \qquad \xi_{k+1} \sim N(0, \Gamma) \text{ i.i.d.}$$
This facilitates the implementation and theory for particle filters and is a realistic assumption for many practical problems. We denote the posterior, with density $P(u_k \mid Y_k)$, by $\mu_k$ and denote the particle filter approximations by $\mu_k^N$.
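A minimal sketch of simulating this conditionally Gaussian model; the particular $\psi$ (a planar rotation), $H$, $\Sigma$, $\Gamma$, and dimensions below are placeholder choices, not from the talk.

```python
import numpy as np

# Simulate the assumed conditionally Gaussian model
#   u_{k+1} = psi(u_k) + eta_k,      eta_k ~ N(0, Sigma) i.i.d.,
#   y_{k+1} = H u_{k+1} + xi_{k+1},  xi_{k+1} ~ N(0, Gamma) i.i.d.
rng = np.random.default_rng(1)
d, K = 2, 50
theta = 0.5
psi = lambda u: np.array([u[0] * np.cos(theta) - u[1] * np.sin(theta),
                          u[0] * np.sin(theta) + u[1] * np.cos(theta)])  # placeholder dynamics
H = np.array([[1.0, 0.0]])                  # observe the first coordinate only
Sigma = 0.1 * np.eye(d)
Gamma = 0.2 * np.eye(1)

u = rng.standard_normal(d)                  # u_0 ~ mu_0
us, ys = [], []
for _ in range(K):
    u = psi(u) + rng.multivariate_normal(np.zeros(d), Sigma)   # state update
    y = H @ u + rng.multivariate_normal(np.zeros(1), Gamma)    # noisy observation
    us.append(u)
    ys.append(y)
```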

  14. The motivation: importance sampling. If $p(x)$ is a probability density, the empirical approximation of $p$ is given by
$$p(x) \approx \frac{1}{N} \sum_{n=1}^N \delta\big(x - x^{(n)}\big),$$
where the $x^{(n)}$ are samples from $p$. When $p$ is difficult to sample from, we can instead use the importance sampling approximation
$$p(x) \approx Z_N^{-1} \sum_{n=1}^N \frac{p(\tilde{x}^{(n)})}{q(\tilde{x}^{(n)})} \, \delta\big(x - \tilde{x}^{(n)}\big),$$
where the $\tilde{x}^{(n)}$ are samples from a different probability density $q$ and $Z_N$ is the normalization constant $Z_N = \sum_{n=1}^N p(\tilde{x}^{(n)}) / q(\tilde{x}^{(n)})$.
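A short sketch of the self-normalized importance sampling estimator described above; the target $p$ (a two-component Gaussian mixture) and the proposal $q = N(0, 16)$ are hypothetical examples, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10_000

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

p = lambda x: 0.5 * normal_pdf(x, 2.0, 1.0) + 0.5 * normal_pdf(x, -2.0, 1.0)  # target density
q = lambda x: normal_pdf(x, 0.0, 16.0)                                        # proposal density

x = 4.0 * rng.standard_normal(N)     # samples x~^(n) drawn from the proposal N(0, 16)
w_star = p(x) / q(x)                 # unnormalised weights p(x~)/q(x~)
w = w_star / w_star.sum()            # normalised weights; Z_N is the sum of w_star

# The weighted sample {(x~^(n), w^(n))} approximates p; e.g. estimate E_p[x^2]:
second_moment = np.sum(w * x ** 2)   # should be close to 1 + 4 = 5 for this mixture
```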

  15. The motivation: standard particle filter. We have samples $\{u_k^{(n)}\}_{n=1}^N$ from $P(u_k \mid Y_k)$ that we wish to update into samples from $P(u_{k+1} \mid Y_{k+1})$. Note that $u_k \mid Y_k$ is a Markov chain with kernel
$$p_k(u_k, du_{k+1}) = Z^{-1} P(y_{k+1} \mid u_{k+1}) \, P(u_{k+1} \mid u_k).$$
If we could draw $u_{k+1}^{(n)}$ from $p_k(u_k^{(n)}, du_{k+1})$, then we would have $u_{k+1}^{(n)} \sim P(u_{k+1} \mid Y_{k+1})$.

  16. The motivation: standard particle filter. It is too difficult to sample directly, so we instead draw $\tilde{u}_{k+1}^{(n)}$ from $q(u_{k+1}) = P(u_{k+1} \mid u_k^{(n)})$ and get the importance sampling approximation
$$P(u_{k+1} \mid Y_{k+1}) \approx Z_N^{-1} \sum_{n=1}^N P\big(y_{k+1} \mid \tilde{u}_{k+1}^{(n)}\big) \, \delta\big(u_{k+1} - \tilde{u}_{k+1}^{(n)}\big).$$
Thus the weights in $\sum_{n=1}^N w_{k+1}^{(n)} \delta(u_{k+1} - u_{k+1}^{(n)})$ are computed by
$$w_{k+1}^{(n),*} = P\big(y_{k+1} \mid \tilde{u}_{k+1}^{(n)}\big) \propto \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H \tilde{u}_{k+1}^{(n)}\big|_\Gamma^2\Big), \qquad w_{k+1}^{(n)} = \frac{w_{k+1}^{(n),*}}{\sum_{n=1}^N w_{k+1}^{(n),*}}.$$
Notation: $|v|_A^2 = \langle A^{-1} v, v \rangle$.
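One assimilation step of this standard ('bootstrap') particle filter under the conditionally Gaussian model of slide 13, written as a sketch; the function name, its inputs ($\psi$, $H$, $\Sigma$, $\Gamma$ and the placeholder model in the usage lines), and the choice of multinomial resampling are illustrative assumptions.

```python
import numpy as np

def standard_pf_step(particles, y_next, psi, H, Sigma, Gamma, rng):
    """One step of the standard particle filter: propagate with the model,
    weight by the likelihood exp(-0.5 |y - H u|^2_Gamma), then resample."""
    N, d = particles.shape
    # Proposal q = P(u_{k+1} | u_k^(n)): push each particle through the dynamics, add noise.
    forecast = np.array([psi(u) for u in particles]) \
        + rng.multivariate_normal(np.zeros(d), Sigma, size=N)
    # Unnormalised weights w* = exp(-0.5 |y - H u~|^2_Gamma).
    innov = y_next[None, :] - forecast @ H.T
    Gamma_inv = np.linalg.inv(Gamma)
    w_star = np.exp(-0.5 * np.einsum('ni,ij,nj->n', innov, Gamma_inv, innov))
    w = w_star / w_star.sum()
    # Multinomial resampling: keep particle m with probability w^(m).
    idx = rng.choice(N, size=N, p=w)
    return forecast[idx]

# Illustrative usage with placeholder model ingredients:
rng = np.random.default_rng(0)
d = 2
psi = lambda u: 0.95 * u
H = np.array([[1.0, 0.0]])
Sigma, Gamma = 0.1 * np.eye(d), 0.2 * np.eye(1)
particles = rng.standard_normal((100, d))
particles = standard_pf_step(particles, np.array([0.5]), psi, H, Sigma, Gamma, rng)
```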

  17. A different approach. Another approach:
$$p_k(u_k^{(n)}, du_{k+1}) \propto P(y_{k+1} \mid u_{k+1}) \, P(u_{k+1} \mid u_k^{(n)})$$
$$= Z_\Gamma^{-1} \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H u_{k+1}\big|_\Gamma^2\Big) \, Z_\Sigma^{-1} \exp\Big(-\tfrac{1}{2}\big|u_{k+1} - \psi(u_k^{(n)})\big|_\Sigma^2\Big)$$
$$= Z_S^{-1} \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H \psi(u_k^{(n)})\big|_S^2\Big) \, Z_C^{-1} \exp\Big(-\tfrac{1}{2}\big|u_{k+1} - m_{k+1}^{(n)}\big|_C^2\Big)$$
by the product of Gaussian densities formulae, with
$$C^{-1} = \Sigma^{-1} + H^T \Gamma^{-1} H, \qquad S = H \Sigma H^T + \Gamma,$$
$$m_{k+1}^{(n)} = C\big(\Sigma^{-1} \psi(u_k^{(n)}) + H^T \Gamma^{-1} y_{k+1}\big) = (I - KH)\,\psi(u_k^{(n)}) + K y_{k+1},$$
where $K = \Sigma H^T S^{-1}$ is the Kalman gain.
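A quick numerical sanity check (not from the talk) that the two factorizations above agree pointwise, using SciPy's multivariate normal density; the matrices, vectors, and dimensions are arbitrary test data.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

# Check that N(y; H u', Gamma) N(u'; psi_u, Sigma) = N(y; H psi_u, S) N(u'; m, C)
# with C^{-1} = Sigma^{-1} + H^T Gamma^{-1} H, S = H Sigma H^T + Gamma,
# and m = C (Sigma^{-1} psi_u + H^T Gamma^{-1} y).
rng = np.random.default_rng(3)
d, p = 3, 2
H = rng.standard_normal((p, d))
Sigma = np.eye(d)
Gamma = 0.5 * np.eye(p)
psi_u = rng.standard_normal(d)              # psi(u_k^(n)) for some particle
y = rng.standard_normal(p)                  # y_{k+1}
u_next = rng.standard_normal(d)             # an arbitrary test point u_{k+1}

C = np.linalg.inv(np.linalg.inv(Sigma) + H.T @ np.linalg.inv(Gamma) @ H)
S = H @ Sigma @ H.T + Gamma
m = C @ (np.linalg.inv(Sigma) @ psi_u + H.T @ np.linalg.inv(Gamma) @ y)

lhs = mvn.pdf(y, mean=H @ psi_u, cov=S) * mvn.pdf(u_next, mean=m, cov=C)
rhs = mvn.pdf(y, mean=H @ u_next, cov=Gamma) * mvn.pdf(u_next, mean=psi_u, cov=Sigma)
assert np.isclose(lhs, rhs)
```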

  18. If $q(u_{k+1}) = Z_C^{-1} \exp\big(-\tfrac{1}{2}|u_{k+1} - m_{k+1}^{(n)}|_C^2\big)$, then the importance sampling approximation is given by
$$P(u_{k+1} \mid Y_{k+1}) \approx Z_N^{-1} \sum_{n=1}^N \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H\psi(u_k^{(n)})\big|_S^2\Big) \, \delta\big(u_{k+1} - \tilde{u}_{k+1}^{(n)}\big),$$
where the $\tilde{u}_{k+1}^{(n)}$ are sampled from $q$, i.e.
$$\tilde{u}_{k+1}^{(n)} = m_{k+1}^{(n)} + \zeta_{k+1}^{(n)} = (I - KH)\,\psi(u_k^{(n)}) + K y_{k+1} + \zeta_{k+1}^{(n)},$$
where $\zeta_{k+1}^{(n)} \sim N(0, C)$. The weights are computed by
$$w_{k+1}^{(n),*} = \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H\psi(u_k^{(n)})\big|_S^2\Big), \qquad w_{k+1}^{(n)} = \frac{w_{k+1}^{(n),*}}{\sum_{n=1}^N w_{k+1}^{(n),*}}.$$
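A sketch of the proposal-and-weighting step that these formulas define: compute $S$, $C$, and the gain $K$, move each particle through $(I - KH)\psi(u) + Ky + \zeta$, and weight by $\exp(-\tfrac{1}{2}|y - H\psi(u)|_S^2)$. The helper name and its inputs are illustrative assumptions; composed with a resampling step (sketched after slide 23), this gives one full update.

```python
import numpy as np

def optimal_proposal_step(particles, y_next, psi, H, Sigma, Gamma, rng):
    """Propagate particles with the optimal proposal and return them together
    with their normalised weights (resampling is done separately)."""
    N, d = particles.shape
    S = H @ Sigma @ H.T + Gamma
    C = np.linalg.inv(np.linalg.inv(Sigma) + H.T @ np.linalg.inv(Gamma) @ H)
    K = Sigma @ H.T @ np.linalg.inv(S)          # Kalman gain, K = Sigma H^T S^{-1}
    psi_u = np.array([psi(u) for u in particles])
    # Sample u~_{k+1}^(n) = (I - K H) psi(u_k^(n)) + K y_{k+1} + zeta,  zeta ~ N(0, C).
    mean = psi_u @ (np.eye(d) - K @ H).T + y_next @ K.T
    proposals = mean + rng.multivariate_normal(np.zeros(d), C, size=N)
    # Weights w* = exp(-0.5 |y_{k+1} - H psi(u_k^(n))|^2_S); note they depend only
    # on the previous particle positions, not on the sampled noise.
    innov = y_next[None, :] - psi_u @ H.T
    S_inv = np.linalg.inv(S)
    w_star = np.exp(-0.5 * np.einsum('ni,ij,nj->n', innov, S_inv, innov))
    return proposals, w_star / w_star.sum()
```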

  19. The optimal particle filter

  20. Propagate the particles.
$$\tilde{u}_{k+1}^{(n)} = (I - KH)\,\psi(u_k^{(n)}) + K y_{k+1} + \zeta_{k+1}^{(n)}, \qquad \zeta_{k+1}^{(n)} \sim N(0, C) \text{ i.i.d.}$$

  21. Weight the particles using the observation.
$$w_{k+1}^{(n),*} = \exp\Big(-\tfrac{1}{2}\big|y_{k+1} - H\psi(u_k^{(n)})\big|_S^2\Big), \qquad w_{k+1}^{(n)} = \frac{w_{k+1}^{(n),*}}{\sum_{n=1}^N w_{k+1}^{(n),*}}.$$

  22. Resample the weighted particles.
$$P(u_{k+1} \mid Y_{k+1}) \approx \frac{1}{N} \sum_{n=1}^N \delta\big(u_{k+1} - u_{k+1}^{(n)}\big).$$

  23. The optimal particle filter. We represent the optimal particle filter as a random dynamical system:
$$\tilde{u}_{k+1}^{(n)} = (I - KH)\,\psi(u_k^{(n)}) + K y_{k+1} + \zeta_{k+1}^{(n)}, \qquad \zeta_{k+1}^{(n)} \sim N(0, C) \text{ i.i.d.},$$
$$u_{k+1}^{(n)} = \sum_{m=1}^N \mathbf{1}_{[x_{k+1}^{(m)},\, x_{k+1}^{(m+1)})}\big(r_{k+1}^{(n)}\big) \, \tilde{u}_{k+1}^{(m)},$$
where $r_{k+1}^{(n)}$ is uniformly distributed on $[0, 1]$ and $x_{k+1}^{(1)} = 0$, $x_{k+1}^{(m+1)} = x_{k+1}^{(m)} + w_{k+1}^{(m)}$, i.e. pick $\tilde{u}_{k+1}^{(m)}$ with probability $w_{k+1}^{(m)}$. Note that $U_k = (u_k^{(1)}, \dots, u_k^{(N)})$ is a Markov chain.
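A sketch of the resampling map written as on this slide (cumulative weights plus i.i.d. uniform draws); the helper name and the dummy weighted proposals at the end are illustrative, and in practice the proposals and weights would come from an optimal-proposal step like the one sketched after slide 18.

```python
import numpy as np

def opf_resample(proposals, weights, rng):
    """Resampling step of the optimal particle filter:
    u_{k+1}^(n) = sum_m 1_{[x^(m), x^(m+1))}(r^(n)) u~_{k+1}^(m), where
    x^(1) = 0, x^(m+1) = x^(m) + w^(m), and r^(n) ~ Uniform[0, 1] i.i.d."""
    N = proposals.shape[0]
    x = np.concatenate(([0.0], np.cumsum(weights)))    # cumulative weights x^(1), ..., x^(N+1)
    r = rng.uniform(size=N)                             # r^(n) ~ U[0, 1]
    # Index m with x^(m) <= r^(n) < x^(m+1); the min guards against round-off at the right end.
    idx = np.minimum(np.searchsorted(x, r, side='right') - 1, N - 1)
    return proposals[idx]

# Illustrative usage with hypothetical weighted proposals:
rng = np.random.default_rng(4)
proposals = rng.standard_normal((100, 2))
w_star = rng.uniform(size=100)
weights = w_star / w_star.sum()
particles_next = opf_resample(proposals, weights, rng)
```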

  24. What do we know about particle filters?

  25. Theory for filtering distributions. The true posterior (filtering distribution) $\mu_k$ is known to be:
Conditionally ergodic: if $\mu_k'$, $\mu_k''$ are two copies of the filtering distribution with $\mu_0' = \delta_{u_0'}$ and $\mu_0'' = \delta_{u_0''}$, then $d_{TV}(\mu_k', \mu_k'') = O(\delta^k)$ as $k \to \infty$, where $\delta \in (0, 1)$. This is a type of stability.
And accurate (with sufficient assumptions on observations):
$$\limsup_{k \to \infty} \, \mathbb{E}\,\|m_k - u_k\|^2 = O(\text{obs noise}),$$
where $u_k$ is the trajectory producing $Y_k$, $m_k = \mathbb{E}(u_k \mid Y_k)$, and we take $\mathbb{E}$ over all randomness.
