 
              Time-parallel algorithms for 4D-Var Mike Fisher Selime G¨ urol ECMWF 7 October 2013 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 1 / 30
Outline Motivation 1 Weak-Constraint 4D-Var 2 Notation in terms of 4D vectors and matrices 3 Parallelisation in the time dimension 4 Standard formulations The saddle-point formulation Results from a toy system Conclusions 5 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 2 / 30
Motivation 4D-Var is a highly sequential algorithm: ◮ The minimisation must wait until observations are available. ◮ Each minimisation consists of a sequence of iterations. ◮ Each iteration requires an integration of the tangent-linear model, followed by an integration of the adjoint. ◮ Each integration requires timesteps to be performed in sequence, from one end of the analysis window to the other end. Up to now, parallelisation of 4D-Var has been achieved by a spatial (typically horizontal) decomposition, and distribution over processors, of the model grid. This approach will not be sufficient to keep 4D-Var a viable algorithm on next-generation computer architectures. 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 3 / 30
Motivation Parallelisation in the spatial domain allows the number of gridpoints associated with each processor to be kept constant as resolution increases. However, since higher resolutions require shorter timesteps, the work per processor increases with resolution. To keep the work per processor independent of the resolution of the model, we need to parallelise in the time domain. Parallelisation in time and space allows the number of gridpoints and the number of timesteps allocated to each processor to be kept constant as resolution increases. The aim of the talk is to show that parallelisation in time is possible. 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 4 / 30
Weak-constraint 4D-Var I will concentrate on Weak-constraint 4D-Var. ◮ However, we believe that parallelisation in time is also possible in strong-constraint 4D-Var, using the saddle-point algorithem defined later in the talk. Let us define the analysis window as t 0 ≤ t ≤ t N +1 We wish to estimate the sequence of states x 0 . . . x N (valid at times t 0 . . . t N ), given: ◮ A prior x b (valid at t 0 ). ◮ A set of observations y 0 . . . y N Each y k is a vector containing, typically, a large number of measurements of a variety of variables distributed spatially and in the time interval [ t k , t k +1 ). 4D-Var is a maximum likelihood method. We define the estimate as the sequence of states that minimizes the cost function: J ( x 0 . . . x N ) = − log ( p ( x 0 . . . x N | x b ; y 0 . . . y N )) + const . 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 5 / 30
Weak-constraint 4D-Var Using Bayes’ theorem, and assuming unbiased Gaussian errors, the weak-constraint 4D-Var cost function can be written as: ( x 0 − x b ) T B − 1 ( x 0 − x b ) J ( x 0 . . . x N ) = N ( H k ( x k ) − y k ) T R − 1 � + ( H k ( x k ) − y k ) k k =0 N q ) T Q − 1 � + ( q k − ¯ ( q k − ¯ q ) . k k =1 where q k = x k − M k ( x k − 1 ) B , R k and Q k are covariance matrices of background, observation and model error. H k is an operator that maps model variables x k to observed variables y k , and M k represents an integration of the numerical model from time t k − 1 to time t k . 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 6 / 30
Weak Constraint 4D-Var: Quadratic Inner Loops The inner loops of incremental weak-constraint 4D-Var minimise: 1 2 ( δ x 0 − b ) T B − 1 ( δ x 0 − b ) J ( δ x 0 , . . . , δ x N ) = N +1 ( H k δ x k − d k ) T R − 1 � ( H k δ x k − d k ) k 2 k =0 N +1 ( δ q k − c k ) T Q − 1 � ( δ q k − c k ) k 2 k =1 where δ q k = δ x k − M k δ x k − 1 , and where b , c k and d k come from the outer loop: b = x b − x 0 = ¯ q − q k c k d k = y k − H k ( x k ) 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 7 / 30
Weak Constraint 4D-Var: Quadratic Inner Loops We simplify the notation by defining some 4D vectors and matrices:     δ x 0 δ x 0 δ x 1 δ q 1     δ x = δ p =  .   .  . .     . .     δ x N δ q N These vectors are related through δ q k = δ x k − M k δ x k − 1 . We can write this relationship in matrix form as: δ p = L δ x where:   I − M 1 I     − M 2 I L =     ... ...     − M N I 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 8 / 30
Weak Constraint 4D-Var: Quadratic Inner Loops We will also define:     R 0 B R 1 Q 1     R =  , D =  ,     ... ...       R N Q N       H 0 b d 0 H 1 c 1 d 1       H = b = d =  ,  .    .   .  ... . .       . .     H N c N d N 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 9 / 30
Weak Constraint 4D-Var: Quadratic Inner Loops With these definitions, we can write the inner-loop cost function either as a function of δ x : J ( δ x ) = ( L δ x − b ) T D − 1 ( L δ x − b ) + ( H δ x − d ) T R − 1 ( H δ x − d ) Or as a function of δ p : J ( δ p ) = ( δ p − b ) T D − 1 ( δ p − b ) + ( HL − 1 δ p − d ) T R − 1 ( HL − 1 δ p − d ) 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 10 / 30
Weak Constraint 4D-Var: Quadratic Inner Loops   I − M 1 I     − M 2 I L =     ... ...     − M N I δ p = L δ x can be done in parallel: δ q k = δ x k − M k δ x k − 1 . We know all the δ x k − 1 ′ s . We can apply all the M k ′ s simultaneously. An algorithm involving only L is time-parallel. δ x = L − 1 δ p is sequential: δ x k = M k δ x k − 1 + δ q k . We have to generate each δ x k − 1 in turn before we can apply the next M k . An algorithm involving L − 1 is sequential. 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 11 / 30
Forcing Formulation J ( δ p ) = ( δ p − b ) T D − 1 ( δ p − b ) + ( HL − 1 δ p − d ) T R − 1 ( HL − 1 δ p − d ) This version of the cost function is sequential, since it contains L − 1 . The form of cost function resembles that of strong-constraint 4D-Var, and it can be minimised using techniques that have been developed for strong-constrint 4D-Var. In particular, we can precondition it using D 1 / 2 to diagonalise the first term: J ( χ ) = χ T χ + ( HL − 1 δ p − d ) T R − 1 ( HL − 1 δ p − d ) where δ p = D 1 / 2 χ + b . 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 12 / 30
4D State Formulation J ( δ x ) = ( L δ x − b ) T D − 1 ( L δ x − b ) + ( H δ x − d ) T R − 1 ( H δ x − d ) This version of the cost function is parallel. It does not contain L − 1 . Unfortunately, it is difficult to precondition. 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 13 / 30
4D State Formulation J ( δ x ) = ( L δ x − b ) T D − 1 ( L δ x − b ) + ( H δ x − d ) T R − 1 ( H δ x − d ) The usual method of preconditioning used in 4D-Var defines a control variable χ that diagonalizes the first term of the cost function δ x = L − 1 ( D 1 / 2 χ + b ) With this change-of-variable, the cost function becomes: J ( χ ) = χ T χ + ( H δ x − d ) T R − 1 ( H δ x − d ) But, we have introduced a sequential model integration (i.e. L − 1 ) into the preconditioner. Replacing L − 1 by something cheaper destroys the preconditioning because D is extremely ill-conditioned. 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 14 / 30
4D State Formulation If we approximate L by ˜ L in the preconditioner, the Hessian matrix of the first term of the cost function becomes D 1 / 2 ˜ L − T L T D − 1 L˜ L − 1 D 1 / 2 Because D is highly ill-conditioned, this matrix is not close to the identity matrix unless ˜ L is a very good approximation of L . 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 15 / 30
Lagrangian Dual (4D-PSAS) A third possibility for minimising the cost function is the Lagrangian dual (known as 4D-PSAS in the meteorological community): L − 1 DL − T H T δ w = δ x where δ w = arg min δ w F ( δ w ) 1 2 δ w T ( R + HL − 1 DL − T H T ) δ w + δ w T z and where F ( δ w ) = with z a complicated expression involving b and d . Clearly, this is a sequential algorithm, since it contains L − 1 . 6 th WMO Data Assimilation Symposium Mike Fisher, Selime G¨ urol (ECMWF) 7 October 2013 16 / 30
Recommend
More recommend