Ergodicity in data assimilation methods
David Kelly, Andy Majda, Xin Tong
Courant Institute, New York University, New York, NY
www.dtbkelly.com
March 16, 2016. Applied math seminar, UC Berkeley / LBNL.
What is data assimilation?

Suppose $u$ satisfies

$\frac{du}{dt} = F(u)$

with some unknown initial condition $u_0$. We are most interested in geophysical models, so think high dimensional, nonlinear, possibly stochastic.

Suppose we make partial, noisy observations at times $t = h, 2h, \dots, nh, \dots$

$y_n = H u_n + \xi_n$

where $H$ is a linear operator (think low-rank projection), $u_n = u(nh)$, and $\xi_n \sim N(0, \Gamma)$ iid.

The aim of data assimilation is to say something about the conditional distribution of $u_n$ given the observations $\{y_1, \dots, y_n\}$.
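To fix ideas, here is a minimal numerical sketch of this setup. The Lorenz-63 vector field, the Euler integrator, the step sizes, and the choice of observing only the first coordinate are all illustrative assumptions, not anything specified in the talk:

```python
# A toy instance of the setup: dynamics du/dt = F(u), flow map Psi,
# and partial noisy observations y_n = H u_n + xi_n.
import numpy as np

def F(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz-63 vector field, a hypothetical stand-in for a geophysical model."""
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - y) - z, x * y - beta * z])

def Psi(u, h=0.1, dt=1e-3):
    """Time-h flow map, approximated here by forward Euler steps."""
    for _ in range(int(h / dt)):
        u = u + dt * F(u)
    return u

H = np.array([[1.0, 0.0, 0.0]])   # observe the x coordinate only (low-rank projection)
Gamma = np.array([[0.1]])         # observational noise covariance (illustrative value)

def observe(u, rng):
    """y_n = H u_n + xi_n with xi_n ~ N(0, Gamma)."""
    return H @ u + rng.multivariate_normal(np.zeros(1), Gamma)
```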
Illustration (Initialization)

Figure: The blue circle represents our guess of $u_0$. Due to the uncertainty in $u_0$, this is a probability measure.
Illustration (Forecast step)

Figure: Apply the time-$h$ flow map $\Psi$. This produces a new probability measure, which is our forecasted estimate of $u_1$. This is called the forecast step.
Illustration (Make an observation)

Figure: We make an observation $y_1 = H u_1 + \xi_1$. In the picture, we only observe the $x$ variable.
Illustration (Analysis step)

Figure: Using Bayes' formula we compute the conditional distribution of $u_1 \mid y_1$. This new measure (called the posterior) is the new estimate of $u_1$. The uncertainty of the estimate is reduced by incorporating the observation, and the forecast distribution steers how the observation updates the estimate.
Bayes' formula filtering update

Let $Y_n = \{y_0, y_1, \dots, y_n\}$. We want to compute the conditional density $P(u_{n+1} \mid Y_{n+1})$, using $P(u_n \mid Y_n)$ and $y_{n+1}$.

By Bayes' formula, we have

$P(u_{n+1} \mid Y_{n+1}) = P(u_{n+1} \mid Y_n, y_{n+1}) \propto P(y_{n+1} \mid u_{n+1})\, P(u_{n+1} \mid Y_n)$.

But we need to compute the integral

$P(u_{n+1} \mid Y_n) = \int P(u_{n+1} \mid Y_n, u_n)\, P(u_n \mid Y_n)\, du_n$.

In geophysical models, we can have $u \in \mathbb{R}^N$ where $N = O(10^8)$. The rigorous Bayesian approach is computationally infeasible.
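To see why the exact recursion is infeasible at scale, it helps to write down its simplest sequential Monte Carlo form. The sketch below is a bootstrap particle filter step, a hypothetical stand-in rather than a method from the talk; its weight degeneracy and cost grow rapidly with dimension:

```python
# One step of the exact Bayesian recursion P(u_n | Y_n) -> P(u_{n+1} | Y_{n+1}),
# approximated by weighted particles; feasible only in low dimension.
import numpy as np

def particle_filter_step(particles, weights, y, Psi, H, Gamma, rng):
    # Forecast: push each particle through the dynamics
    # (a Monte Carlo version of the integral over u_n; Psi is deterministic here).
    particles = np.array([Psi(u) for u in particles])
    # Analysis: reweight by the likelihood P(y_{n+1} | u_{n+1}).
    innov = y - particles @ H.T
    Ginv = np.linalg.inv(Gamma)
    weights = weights * np.exp(-0.5 * np.einsum('ij,jk,ik->i', innov, Ginv, innov))
    weights /= weights.sum()
    # Resample to combat weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```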
The Kalman Filter

For a linear model $u_{n+1} = M u_n + \eta_{n+1}$, the Bayesian integral is Gaussian and can be computed explicitly. The conditional density is characterized by its mean and covariance:

$m_{n+1} = (I - K_{n+1} H)\,\hat{m}_{n+1} + K_{n+1} y_{n+1}$
$C_{n+1} = (I - K_{n+1} H)\,\hat{C}_{n+1}$,

where
• $(\hat{m}_{n+1}, \hat{C}_{n+1})$ is the forecast mean and covariance.
• $K_{n+1} = \hat{C}_{n+1} H^T (\Gamma + H \hat{C}_{n+1} H^T)^{-1}$ is the Kalman gain.

The procedure of updating $(m_n, C_n) \mapsto (m_{n+1}, C_{n+1})$ is known as the Kalman filter.
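As a sketch, one Kalman cycle can be written in a few lines. Here `M` and `Sigma` (the model noise covariance) are assumed inputs for a linear stochastic model $u_{n+1} = M u_n + \eta_{n+1}$, $\eta_n \sim N(0, \Sigma)$; the function name is an illustrative choice:

```python
# One Kalman filter cycle: forecast, gain, analysis. Notation follows the slide.
import numpy as np

def kalman_step(m, C, y, M, Sigma, H, Gamma):
    # Forecast mean and covariance (m_hat, C_hat).
    m_hat = M @ m
    C_hat = M @ C @ M.T + Sigma
    # Kalman gain K = C_hat H^T (Gamma + H C_hat H^T)^{-1}.
    K = C_hat @ H.T @ np.linalg.inv(Gamma + H @ C_hat @ H.T)
    # Analysis: m <- (I - K H) m_hat + K y,  C <- (I - K H) C_hat.
    I = np.eye(len(m))
    return (I - K @ H) @ m_hat + K @ y, (I - K @ H) @ C_hat
```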
Ensemble Kalman filter (Evensen 94)

Figure: Start with $K$ ensemble members drawn from some distribution, an empirical representation of $u_0$. The ensemble members are denoted $u_0^{(k)}$.

Only $KN$ numbers are stored. Better than the Kalman filter if $K < N$.
Ensemble Kalman filter (Forecast step)

Figure: Apply the dynamics $\Psi$ to each ensemble member.
Ensemble Kalman filter (Make obs)

Figure: Make an observation.
Ensemble Kalman filter (Analysis step)

Figure: Approximate the forecast distribution with a Gaussian. Fit the Gaussian using the empirical statistics of the ensemble.
How to implement the Gaussian approximation

The naive method is to simply write

$P(y_1 \mid u_1)\, P(u_1) \propto \exp\!\left(-\tfrac{1}{2} |\Gamma^{-1/2}(y_1 - H u_1)|^2\right) \exp\!\left(-\tfrac{1}{2} |\hat{C}_1^{-1/2}(u_1 - \hat{m}_1)|^2\right)$

with the empirical statistics

$\hat{m}_1 = \frac{1}{K} \sum_{k=1}^{K} \Psi(u_0^{(k)})$
$\hat{C}_1 = \frac{1}{K-1} \sum_{k=1}^{K} \left(\Psi(u_0^{(k)}) - \hat{m}_1\right)\left(\Psi(u_0^{(k)}) - \hat{m}_1\right)^T$.

In the linear model case $\Psi(u_n) = M u_n + \eta_n$, this produces an unbiased estimate of the posterior mean, but a biased estimate of the covariance.
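A sketch of the empirical statistics, assuming the forecast ensemble $\{\Psi(u_0^{(k)})\}$ is stored as a $K \times N$ array (the layout is an illustrative choice):

```python
# Fit the forecast Gaussian (m_hat, C_hat) from ensemble statistics.
import numpy as np

def empirical_stats(forecast):        # forecast: array of shape (K, N)
    K = forecast.shape[0]
    m_hat = forecast.mean(axis=0)
    dev = forecast - m_hat
    C_hat = dev.T @ dev / (K - 1)     # sample covariance, as on the slide
    return m_hat, C_hat
```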
How to implement the Gaussian approximation

A better approach is to sample using the Randomized Maximum Likelihood (RML) method: draw the sample $u_1^{(k)}$ by minimizing the functional

$\tfrac{1}{2} |\Gamma^{-1/2}(y_1^{(k)} - H u)|^2 + \tfrac{1}{2} |\hat{C}_1^{-1/2}(u - \Psi(u_0^{(k)}))|^2$,

where $y_1^{(k)} = y_1 + \xi_1^{(k)}$ is a perturbed observation.

In the linear case $\Psi(u_n) = M u_n + \eta_n$, this produces iid Gaussian samples with mean and covariance satisfying the Kalman update equations, with $\hat{C}_1$ in place of the true forecast covariance. We end up with

$u_1^{(k)} = (I - K_1 H)\Psi(u_0^{(k)}) + K_1 y_1^{(k)}$.
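A sketch of one RML draw via the normal equations of the quadratic functional above. It assumes $\hat{C}_1$ is invertible, which fails when $K < N$; in practice one uses the equivalent gain form on the next slide, which never inverts $\hat{C}_1$:

```python
# One RML sample: minimize the quadratic functional, i.e. solve
# (H^T Gamma^{-1} H + C_hat^{-1}) u = H^T Gamma^{-1} y_pert + C_hat^{-1} Psi(u_0).
import numpy as np

def rml_update(u_forecast, y, H, Gamma, C_hat, rng):
    """u_forecast = Psi(u_0^{(k)}); returns the analysis member u_1^{(k)}."""
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), Gamma)  # perturbed obs
    Ginv, Cinv = np.linalg.inv(Gamma), np.linalg.inv(C_hat)
    A = H.T @ Ginv @ H + Cinv
    b = H.T @ Ginv @ y_pert + Cinv @ u_forecast
    return np.linalg.solve(A, b)
```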
Ensemble Kalman filter (Perturb obs)

Figure: Turn the observation into $K$ artificial observations by perturbing it with independent copies of the observational noise:

$y_1^{(k)} = y_1 + \xi_1^{(k)}$.
Ensemble Kalman filter (Analysis step)

Figure: Update each member using the Kalman update formula. The Kalman gain $K_1$ is computed using the ensemble covariance.

$u_1^{(k)} = (I - K_1 H)\Psi(u_0^{(k)}) + K_1 y_1^{(k)}$
$K_1 = \hat{C}_1 H^T (\Gamma + H \hat{C}_1 H^T)^{-1}$
$\hat{C}_1 = \frac{1}{K-1} \sum_{k=1}^{K} \left(\Psi(u_0^{(k)}) - \hat{m}_1\right)\left(\Psi(u_0^{(k)}) - \hat{m}_1\right)^T$
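Putting the pieces together, a full EnKF cycle (forecast, empirical statistics, perturbed observations, Kalman-style update) might look as follows; the function name and array layout are illustrative choices:

```python
# One EnKF cycle in gain form, which avoids inverting the (typically
# singular) ensemble covariance C_hat.
import numpy as np

def enkf_step(ensemble, y, Psi, H, Gamma, rng):
    K_ens, N = ensemble.shape
    # Forecast step: push each member through the dynamics.
    forecast = np.array([Psi(u) for u in ensemble])
    m_hat = forecast.mean(axis=0)
    dev = forecast - m_hat
    C_hat = dev.T @ dev / (K_ens - 1)
    # Kalman gain from the ensemble covariance.
    K = C_hat @ H.T @ np.linalg.inv(Gamma + H @ C_hat @ H.T)
    # Analysis step with perturbed observations.
    analysis = np.empty_like(forecast)
    for k in range(K_ens):
        y_pert = y + rng.multivariate_normal(np.zeros(len(y)), Gamma)
        analysis[k] = (np.eye(N) - K @ H) @ forecast[k] + K @ y_pert
    return analysis
```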
Stability / ergodicity of filters

We ask whether the filter inherits important physical properties from the underlying model. For instance, if the model is known to be ergodic, can the same be said of the filter?

The truth-filter process $(u_n, u_n^{(1)}, \dots, u_n^{(K)})$ is a homogeneous Markov chain. We will seek ergodicity results for the pair rather than for the filter alone.
The theoretical framework

A Markov chain $\{X_n\}_{n \in \mathbb{N}}$ on a state space $\mathcal{X}$ is called geometrically ergodic if it has a unique invariant measure $\pi$ and for any initialization $X_0$ we have

$\left| \mathbb{E}_{X_0} f(X_n) - \int f(x)\, \pi(dx) \right| \le C(X_0)\, r^n$

for some $r \in (0, 1)$ and any measurable bounded $f$.

The Meyn-Tweedie approach is to verify two assumptions that guarantee geometric ergodicity:

1 - Lyapunov function / energy dissipation: $\mathbb{E}_n |X_{n+1}|^2 \le \alpha |X_n|^2 + \beta$ with $\alpha \in (0, 1)$, $\beta > 0$.

2 - Minorization: find a compact set $C \subset \mathcal{X}$, a measure $\nu$ supported on $C$, and $\kappa > 0$ such that $P(x, A) \ge \kappa \nu(A)$ for all $x \in C$ and measurable $A \subset \mathcal{X}$.
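The minorization condition is usually checked analytically, but the Lyapunov condition lends itself to a quick Monte Carlo sanity check. A sketch, assuming a hypothetical `step(x, rng)` implementing one draw from the Markov kernel:

```python
# Monte Carlo estimate of the drift E[|X_{n+1}|^2 | X_n = x].
import numpy as np

def estimate_drift(step, x, n_samples=10_000, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    return np.mean([np.sum(step(x, rng) ** 2) for _ in range(n_samples)])

# To fit (alpha, beta): evaluate estimate_drift on states x of growing |x|^2
# and regress the estimates against |x|^2; a slope alpha < 1 is consistent
# with the dissipation condition.
```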
Inheriting an energy principle

Suppose we know the model satisfies an energy principle

$\mathbb{E}_n |\Psi(u)|^2 \le \alpha |u|^2 + \beta$ for $\alpha \in (0, 1)$, $\beta > 0$.

Does the filter inherit the energy principle?

$\mathbb{E}_n |u_{n+1}^{(k)}|^2 \le \alpha' |u_n^{(k)}|^2 + \beta'$?
Observable energy (Tong, Majda, K. 15)

We have

$u_{n+1}^{(k)} = (I - K_{n+1} H)\Psi(u_n^{(k)}) + K_{n+1} y_{n+1}^{(k)}$.

Start by looking at the observed part:

$H u_{n+1}^{(k)} = (H - H K_{n+1} H)\Psi(u_n^{(k)}) + H K_{n+1} y_{n+1}^{(k)}$.

But notice that (taking $\Gamma = I$)

$H - H K_{n+1} H = H - H \hat{C}_{n+1} H^T (I + H \hat{C}_{n+1} H^T)^{-1} H = (I + H \hat{C}_{n+1} H^T)^{-1} H$.

Since $H \hat{C}_{n+1} H^T$ is positive semidefinite, the prefactor $(I + H \hat{C}_{n+1} H^T)^{-1}$ has norm at most one, hence

$|(H - H K_{n+1} H)\Psi(u_n^{(k)})| \le |H \Psi(u_n^{(k)})|$.
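The key identity above is easy to verify numerically. A sketch with $\Gamma = I$ and a random positive semidefinite $\hat{C}$; the dimensions and the projection-type $H$ are arbitrary illustrative choices:

```python
# Sanity check: H - H K H = (I + H C_hat H^T)^{-1} H.
import numpy as np

rng = np.random.default_rng(0)
N, q = 5, 2
H = np.hstack([np.eye(q), np.zeros((q, N - q))])   # project onto observed part
A = rng.standard_normal((N, N))
C_hat = A @ A.T                                    # random PSD covariance
K = C_hat @ H.T @ np.linalg.inv(np.eye(q) + H @ C_hat @ H.T)
lhs = H - H @ K @ H
rhs = np.linalg.inv(np.eye(q) + H @ C_hat @ H.T) @ H
assert np.allclose(lhs, rhs)
```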
Observable energy (Tong, Majda, K. 15)

We have the energy estimate

$\mathbb{E}_n |H u_{n+1}^{(k)}|^2 \le (1 + \delta) |H \Psi(u_n^{(k)})|^2 + \beta'$

for arbitrarily small $\delta$. Unfortunately, the same trick does not work for the unobserved variables...

However, if we instead assume an observable energy criterion

$|H \Psi(u_n^{(k)})|^2 \le \alpha |H u_n^{(k)}|^2 + \beta \qquad (\star)$

then we obtain a Lyapunov function for the observed components of the filter:

$\mathbb{E}_n |H u_{n+1}^{(k)}|^2 \le \alpha' |H u_n^{(k)}|^2 + \beta'$.

E.g., $(\star)$ holds for linear dynamics if there is no interaction between observed and unobserved variables at infinity.
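As an illustration of the closing remark, under one reading of "no interaction at infinity" as block structure of the dynamics, here is a linear toy model where $H M = A H$ with $\|A\| < 1$, so $(\star)$ holds exactly; the dimensions and matrices are hypothetical:

```python
# For Psi(u) = M u + eta with the observed block of M decoupled from the
# unobserved variables, |H M u| = 0.5 |H u|, so (*) holds with alpha = 0.25.
import numpy as np

rng = np.random.default_rng(1)
q, N = 2, 5
H = np.hstack([np.eye(q), np.zeros((q, N - q))])
M = np.zeros((N, N))
M[:q, :q] = 0.5 * np.eye(q)              # observed block: a contraction
M[q:, :] = rng.standard_normal((N - q, N))  # unobserved rows: arbitrary
u = rng.standard_normal(N)
assert np.isclose(np.linalg.norm(H @ M @ u), 0.5 * np.linalg.norm(H @ u))
```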
Can we get around the problem by tweaking the algorithm?