Definition and minimization of the cost function
Least squares problems

Formalism: "background value + new observations"

The data vector gathers the background and the new observations:
\[
\begin{pmatrix} x_b \\ y \end{pmatrix}
\qquad \begin{array}{l} \leftarrow \text{background} \\ \leftarrow \text{new observations} \end{array}
\]
The cost function becomes
\[
J(x) = \underbrace{\frac{1}{2}\,\|x - x_b\|_b^2}_{J_b} + \underbrace{\frac{1}{2}\,\|H(x) - y\|_o^2}_{J_o}
     = \frac{1}{2}(x - x_b)^T B^{-1}(x - x_b) + \frac{1}{2}(H(x) - y)^T R^{-1}(H(x) - y)
\]
The necessary condition for the existence of a unique minimum (\(p \ge n\)) is automatically fulfilled.
If the problem is time dependent

- Observations are distributed in time: \(y = y(t)\).
- The observation cost function becomes
\[
J_o(x) = \frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|_o^2
\]
- There is a model describing the evolution of x:
\[
\frac{dx}{dt} = M(x), \qquad x(t = 0) = x_0
\]
  Then J is often no longer minimized w.r.t. x, but w.r.t. \(x_0\) only, or w.r.t. some other parameters:
\[
J_o(x_0) = \frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|_o^2
         = \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|_o^2
\]
The full cost function is then
\[
J(x_0) = \underbrace{\frac{1}{2}\,\|x_0 - x_0^b\|_b^2}_{\text{background term } J_b}
       + \underbrace{\frac{1}{2} \sum_{i=0}^{N} \|H_i(x(t_i)) - y(t_i)\|_o^2}_{\text{observation term } J_o}
\]
Uniqueness of the minimum?

\[
J(x_0) = J_b(x_0) + J_o(x_0)
       = \frac{1}{2}\,\|x_0 - x_b\|_b^2
       + \frac{1}{2} \sum_{i=0}^{N} \|H_i(M_{0 \to t_i}(x_0)) - y(t_i)\|_o^2
\]

- If H and M are linear, then \(J_o\) is quadratic.
- However it generally does not have a unique minimum, since the number of observations is generally smaller than the size of \(x_0\) (the problem is underdetermined: \(p < n\)).

  Example: let \((x_1^t, x_2^t) = (1, 1)\), and let \(y = 1.1\) be an observation of \(\frac{1}{2}(x_1 + x_2)\):
\[
J_o(x_1, x_2) = \frac{1}{2} \left( \frac{x_1 + x_2}{2} - 1.1 \right)^2
\]
- Adding \(J_b\) makes the problem of minimizing \(J = J_o + J_b\) well posed.

  Example (continued): let \((x_1^b, x_2^b) = (0.9, 1.05)\). Then
\[
J(x_1, x_2) = \underbrace{\frac{1}{2} \left( \frac{x_1 + x_2}{2} - 1.1 \right)^2}_{J_o}
            + \underbrace{\frac{1}{2} \left[ (x_1 - 0.9)^2 + (x_2 - 1.05)^2 \right]}_{J_b}
\quad \longrightarrow \quad (x_1^*, x_2^*) = (0.94166\ldots,\ 1.09166\ldots)
\]
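As a quick sanity check, here is a minimal NumPy sketch (not from the slides) that solves the stationarity condition \(\nabla J = 0\) for this toy problem; for this quadratic J it is a 2x2 linear system.

```python
import numpy as np

H = np.array([[0.5, 0.5]])      # observation operator: mean of x1 and x2
y = np.array([1.1])             # single observation
xb = np.array([0.9, 1.05])      # background

# grad J = H^T (H x - y) + (x - xb) = 0   =>   (H^T H + I) x = H^T y + xb
A = H.T @ H + np.eye(2)
b = H.T @ y + xb
x_star = np.linalg.solve(A, b)
print(x_star)                   # [0.94166667 1.09166667]
```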
- If H and/or M are nonlinear, then \(J_o\) is no longer quadratic.

  Example: the Lorenz system (1963)
\[
\frac{dx}{dt} = \alpha (y - x), \qquad
\frac{dy}{dt} = \beta x - y - xz, \qquad
\frac{dz}{dt} = -\gamma z + xy
\]
(Illustration of the Lorenz attractor: http://www.chaos-math.org)
With observations \(x^{obs}(t_i)\) of the x component, and taking the initial value \(y_0\) of the y component as the control variable, the observation cost function reads
\[
J_o(y_0) = \frac{1}{2} \sum_{i=0}^{N} \left( x(t_i) - x^{obs}(t_i) \right)^2
\]
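A small illustrative sketch of this non-quadraticity. The classical parameter values \((\alpha, \beta, \gamma) = (10, 28, 8/3)\), the RK4 time stepping, and all other numerical choices below are assumptions, not from the slides.

```python
import numpy as np

alpha, beta, gamma = 10.0, 28.0, 8.0 / 3.0
dt, nsteps = 0.01, 500

def f(s):
    x, y, z = s
    return np.array([alpha * (y - x), beta * x - y - x * z, -gamma * z + x * y])

def observed_x(s0):
    s, xs = np.array(s0, dtype=float), []
    for _ in range(nsteps):                  # RK4 time stepping
        k1 = f(s); k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2); k4 = f(s + dt * k3)
        s = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(s[0])                      # observe the x component only
    return np.array(xs)

x_obs = observed_x([1.0, 2.0, 3.0])          # synthetic truth, y_0 = 2

def Jo(y0):
    return 0.5 * np.sum((observed_x([1.0, y0, 3.0]) - x_obs) ** 2)

for y0 in np.linspace(-4.0, 8.0, 13):
    print(f"y0 = {y0:5.1f}   Jo = {Jo(y0):12.2f}")
# The printed profile is far from a parabola; with longer assimilation
# windows it becomes increasingly rugged, with several local minima.
```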
- Adding \(J_b\) makes J "more quadratic" (\(J_b\) is a regularization term), but \(J = J_o + J_b\) may still have several (local) minima.
A fundamental remark before going into minimization aspects

Once J is defined (i.e. once all its ingredients are chosen: control variables, norms, observations...), the problem is entirely defined, and hence so is its solution. The "physical" (i.e. the most important) part of data assimilation lies in the definition of J. The rest of the job, i.e. minimizing J, is "only" technical work.
Linear (time independent) problems
Reminder: norms and scalar products

Let \(u = (u_1, \ldots, u_n)^T \in \mathbb{R}^n\).

- Euclidean norm: \( \|u\|^2 = u^T u = \sum_{i=1}^{n} u_i^2 \)
- Associated scalar product: \( (u, v) = u^T v = \sum_{i=1}^{n} u_i v_i \)
- Generalized norm: let M be a symmetric positive definite matrix. The M-norm is
  \( \|u\|_M^2 = u^T M u = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij} u_i u_j \),
  with associated scalar product \( (u, v)_M = u^T M v = \sum_{i=1}^{n} \sum_{j=1}^{n} m_{ij} u_i v_j \).
For functions \(u : \Omega \subset \mathbb{R} \to \mathbb{R},\ x \mapsto u(x)\), with \(u \in L^2(\Omega)\):

- Euclidean (or \(L^2\)) norm: \( \|u\|^2 = \int_\Omega u^2(x)\, dx \)
- Associated scalar product: \( (u, v) = \int_\Omega u(x)\, v(x)\, dx \)
Reminder: derivatives and gradients

Let \(f : E \to \mathbb{R}\) (E being of finite or infinite dimension).

- Directional (or Gâteaux) derivative of f at point \(x \in E\) in direction \(d \in E\):
\[
\frac{\partial f}{\partial d}(x) = \hat{f}[x](d) = \lim_{\alpha \to 0} \frac{f(x + \alpha d) - f(x)}{\alpha}
\]
  Example: the partial derivatives \(\partial f / \partial x_i\) are the directional derivatives in the directions of the members of the canonical basis (\(d = e_i\)).
- Gradient (or Fréchet derivative): E being a Hilbert space, f is Fréchet differentiable at point \(x \in E\) iff there exists \(p \in E\) such that
\[
f(x + h) = f(x) + (p, h) + o(\|h\|) \qquad \forall h \in E
\]
  p is the derivative, or gradient, of f at point x, denoted \(f'(x)\) or \(\nabla f(x)\).
- \(h \mapsto (p(x), h)\) is a linear function, called the differential function, or tangent linear function, or Jacobian, of f at point x.
- Important (obvious) relationship:
\[
\frac{\partial f}{\partial d}(x) = (\nabla f(x), d)
\]
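A quick numerical illustration (a sketch, on a test function of our choosing) of this relationship: for \(f(x) = \frac{1}{2}\|x\|^2\), whose gradient is x itself, a growth rate reproduces \((\nabla f(x), d)\).

```python
import numpy as np

f = lambda x: 0.5 * np.dot(x, x)          # gradient of f is x itself
x = np.array([1.0, -2.0, 0.5])
d = np.array([0.3, 0.1, -0.7])

alpha = 1e-6
growth_rate = (f(x + alpha * d) - f(x)) / alpha   # directional derivative
print(growth_rate, np.dot(x, d))                  # the two values agree
```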
Minimum of a quadratic function in finite dimension

Theorem (generalized, or Moore-Penrose, inverse). Let M be a \(p \times n\) matrix with rank n (hence \(p \ge n\)), and \(b \in \mathbb{R}^p\). Let
\[
J(x) = \|Mx - b\|^2 = (Mx - b)^T (Mx - b).
\]
J is minimum for \(\hat{x} = M^+ b\), where \(M^+ = (M^T M)^{-1} M^T\) is the generalized (Moore-Penrose) inverse.

Corollary (with a generalized norm). Let N be a \(p \times p\) symmetric positive definite matrix, and let
\[
J_1(x) = \|Mx - b\|_N^2 = (Mx - b)^T N (Mx - b).
\]
\(J_1\) is minimum for \(\hat{x} = (M^T N M)^{-1} M^T N\, b\).
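The theorem and its corollary are easy to check numerically; a sketch with random data (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 8, 3
M = rng.standard_normal((p, n))            # full column rank (almost surely)
b = rng.standard_normal(p)

x_hat = np.linalg.solve(M.T @ M, M.T @ b)  # x = (M^T M)^{-1} M^T b
print(np.allclose(x_hat, np.linalg.lstsq(M, b, rcond=None)[0]))  # True

# Corollary: generalized N-norm, with N symmetric positive definite
A = rng.standard_normal((p, p))
N = A @ A.T + p * np.eye(p)
x1 = np.linalg.solve(M.T @ N @ M, M.T @ N @ b)
# x1 minimizes (Mx-b)^T N (Mx-b): its gradient 2 M^T N (M x1 - b) vanishes
print(np.allclose(M.T @ N @ (M @ x1 - b), 0))    # True (up to round-off)
```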
Link with data assimilation

This gives the solution to the problem
\[
\min_{x \in \mathbb{R}^n} J_o(x) = \frac{1}{2} \|Hx - y\|_o^2
\]
in the case of a linear observation operator H:
\[
J_o(x) = \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y)
\quad \longrightarrow \quad
\hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} y
\]
Similarly:
\[
J(x) = J_b(x) + J_o(x)
     = \frac{1}{2} \|x - x_b\|_b^2 + \frac{1}{2} \|Hx - y\|_o^2
     = \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y)
     = \frac{1}{2} \|Mx - b\|_N^2
\]
with
\[
M = \begin{pmatrix} I_n \\ H \end{pmatrix}, \qquad
b = \begin{pmatrix} x_b \\ y \end{pmatrix}, \qquad
N = \begin{pmatrix} B^{-1} & 0 \\ 0 & R^{-1} \end{pmatrix}
\]
which leads to
\[
\hat{x} = x_b + \underbrace{(B^{-1} + H^T R^{-1} H)^{-1} H^T R^{-1}}_{\text{gain matrix}}\ \underbrace{(y - H x_b)}_{\text{innovation vector}}
\]
Remark: the gain matrix also reads \( B H^T (H B H^T + R)^{-1} \) (Sherman-Morrison-Woodbury formula).
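The equivalence of the two gain-matrix expressions can be checked numerically; a sketch with random SPD matrices B and R (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 3
H = rng.standard_normal((p, n))
B = (lambda A: A @ A.T + n * np.eye(n))(rng.standard_normal((n, n)))
R = (lambda A: A @ A.T + p * np.eye(p))(rng.standard_normal((p, p)))

Bi, Ri = np.linalg.inv(B), np.linalg.inv(R)
K1 = np.linalg.inv(Bi + H.T @ Ri @ H) @ H.T @ Ri   # (B^-1 + H^T R^-1 H)^-1 H^T R^-1
K2 = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)      # B H^T (H B H^T + R)^-1
print(np.allclose(K1, K2))                          # True
```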
Remark:
\[
\underbrace{\text{Hess}(J)}_{\text{convexity}} = B^{-1} + H^T R^{-1} H = \underbrace{[\text{Cov}(\hat{x})]^{-1}}_{\text{accuracy}}
\qquad \text{(cf. BLUE)}
\]
Remark

Given the size of n and p, it is generally impossible to handle H, B and R explicitly, so the direct computation of the gain matrix is impossible. Hence, even in the linear case (for which we have an explicit expression for \(\hat{x}\)), the computation of \(\hat{x}\) is performed using an optimization algorithm.
The adjoint method

Outline: Rationale; A simple example; A more complex (but still linear) example; Control of the initial condition; The adjoint method as a constrained minimization.
Rationale
Descent methods

Descent methods for minimizing the cost function require the knowledge of (an estimate of) its gradient:
\[
x_{k+1} = x_k + \alpha_k d_k
\]
with, for instance,
\[
d_k = \begin{cases}
-\nabla J(x_k) & \text{gradient method} \\[2pt]
-[\text{Hess}(J)(x_k)]^{-1}\, \nabla J(x_k) & \text{Newton method} \\[2pt]
-B_k\, \nabla J(x_k) & \text{quasi-Newton methods (BFGS, \ldots)} \\[2pt]
-\nabla J(x_k) + \dfrac{\|\nabla J(x_k)\|^2}{\|\nabla J(x_{k-1})\|^2}\, d_{k-1} & \text{conjugate gradient} \\[4pt]
\ldots
\end{cases}
\]
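A minimal sketch of a fixed-step gradient method applied to the quadratic 3D-Var cost function from the previous section; the step-size rule, tolerance and data are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 4, 2
H = rng.standard_normal((p, n))
B = np.eye(n)
R = 0.1 * np.eye(p)
xb = rng.standard_normal(n)
y = rng.standard_normal(p)
Bi, Ri = np.linalg.inv(B), np.linalg.inv(R)

A = Bi + H.T @ Ri @ H                 # Hessian of J (constant for a quadratic J)
step = 1.0 / np.linalg.norm(A, 2)     # safe fixed step (< 2 / largest eigenvalue)

def gradJ(x):
    return Bi @ (x - xb) + H.T @ Ri @ (H @ x - y)

x = xb.copy()                          # start from the background
for k in range(10000):
    g = gradJ(x)
    if np.linalg.norm(g) < 1e-10:
        break
    x = x - step * g                   # x_{k+1} = x_k + alpha_k d_k, d_k = -grad J

x_exact = xb + np.linalg.solve(A, H.T @ Ri @ (y - H @ xb))
print(k, np.allclose(x, x_exact))      # converges to the explicit solution
```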
The computation of \(\nabla J(x_k)\) may be difficult if the dependency of J on the control variable x is not direct.

Example:
- \(u(x)\) solution of an ODE
- K a coefficient of this ODE
- \(u^{obs}(x)\) an observation of \(u(x)\)
- \( J(K) = \frac{1}{2} \|u(x) - u^{obs}(x)\|^2 \)

Then
\[
\hat{J}[K](k) = (\nabla J(K), k) = \langle \hat{u},\, u - u^{obs} \rangle
\qquad \text{with} \quad
\hat{u} = \frac{\partial u}{\partial k}(K) = \lim_{\alpha \to 0} \frac{u_{K + \alpha k} - u_K}{\alpha}
\]
It is often difficult (or even impossible) to obtain the gradient through the computation of growth rates.

Example:
\[
\begin{cases}
\dfrac{dx}{dt} = M(x(t)) & t \in [0, T] \\[4pt]
x(t = 0) = u
\end{cases}
\qquad \text{with} \quad u = \begin{pmatrix} u_1 \\ \vdots \\ u_N \end{pmatrix}
\]
\[
J(u) = \frac{1}{2} \int_0^T \|x(t) - x^{obs}(t)\|^2\, dt
\quad \longrightarrow \quad \text{requires one model run}
\]
\[
\nabla J(u) = \begin{pmatrix} \dfrac{\partial J}{\partial u_1}(u) \\ \vdots \\ \dfrac{\partial J}{\partial u_N}(u) \end{pmatrix}
\simeq \begin{pmatrix} [J(u + \alpha e_1) - J(u)] / \alpha \\ \vdots \\ [J(u + \alpha e_N) - J(u)] / \alpha \end{pmatrix}
\quad \longrightarrow \quad N + 1 \text{ model runs}
\]
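A sketch of this "growth rate" estimate, with the expensive model run replaced by a cheap stand-in function (hypothetical, for illustration only); the structure, one reference run plus N perturbed runs, is the point.

```python
import numpy as np

def model_run_and_cost(u):
    # stand-in for: integrate dx/dt = M(x), x(0) = u, then evaluate J(u)
    return 0.5 * np.sum((np.tanh(u) - 0.3) ** 2)

def fd_gradient(J, u, alpha=1e-6):
    J0 = J(u)                               # 1 reference model run
    g = np.zeros_like(u)
    for i in range(u.size):                 # N perturbed model runs
        e_i = np.zeros_like(u); e_i[i] = 1.0
        g[i] = (J(u + alpha * e_i) - J0) / alpha
    return g

u = np.array([0.1, -0.4, 0.7, 0.0])
print(fd_gradient(model_run_and_cost, u))
# Exact gradient of the stand-in, for comparison:
print((np.tanh(u) - 0.3) * (1 - np.tanh(u) ** 2))
```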
In most actual applications, \(N = \dim(u)\) is large (or even very large: e.g. N of order \(10^8\) to \(10^9\) in meteorology), so this method cannot be used. The adjoint method provides a very efficient alternative for computing \(\nabla J\).

Conversely, do not forget that, if the size of the control variable is very small (fewer than 10 to 20 components), \(\nabla J\) can easily be estimated by the computation of growth rates.
Reminder: adjoint operator

General definition: let X and Y be two prehilbertian spaces (i.e. vector spaces with scalar products), and let \(A : X \to Y\) be an operator. The adjoint operator \(A^* : Y \to X\) is defined by
\[
\langle Ax, y \rangle_Y = \langle x, A^* y \rangle_X \qquad \forall x \in X,\ \forall y \in Y
\]
In the case where X and Y are Hilbert spaces and A is linear, \(A^*\) always exists (and is unique).

Adjoint operator in finite dimension: let \(A : \mathbb{R}^n \to \mathbb{R}^m\) be a linear operator (i.e. a matrix). Then its adjoint operator \(A^*\) (w.r.t. Euclidean norms) is \(A^T\).
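In finite dimension the defining identity is a one-line check (a sketch with random data):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))                  # A : R^6 -> R^4
x, y = rng.standard_normal(6), rng.standard_normal(4)
print(np.isclose((A @ x) @ y, x @ (A.T @ y)))    # <Ax, y> = <x, A^T y>: True
```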
A simple example
The continuous case

The assimilation problem:
\[
\begin{cases}
-u''(x) + c(x)\, u'(x) = f(x) & x \in\, ]0, 1[ \\
u(0) = u(1) = 0
\end{cases}
\qquad f \in L^2(]0, 1[)
\]
- \(c(x)\) is unknown
- \(u^{obs}(x)\) is an observation of \(u(x)\)
- Cost function: \( J(c) = \frac{1}{2} \int_0^1 \left( u(x) - u^{obs}(x) \right)^2 dx \)

Gâteaux derivative: \(\hat{J}[c](\delta c) = \langle \nabla J(c), \delta c \rangle\), with
\[
\hat{J}[c](\delta c) = \int_0^1 \left( u(x) - u^{obs}(x) \right) \hat{u}(x)\, dx
\qquad \text{where} \quad
\hat{u} = \lim_{\alpha \to 0} \frac{u_{c + \alpha\, \delta c} - u_c}{\alpha}
\]
What is the equation satisfied by \(\hat{u}\)?
It satisfies the tangent linear model (TLM):
\[
\begin{cases}
-\hat{u}''(x) + c(x)\, \hat{u}'(x) = -\delta c(x)\, u'(x) & x \in\, ]0, 1[ \\
\hat{u}(0) = \hat{u}(1) = 0
\end{cases}
\]
Going back to \(\hat{J}\): take the scalar product of the TLM with a variable p:
\[
\int_0^1 -\hat{u}'' p + \int_0^1 c\, \hat{u}' p = -\int_0^1 \delta c\, u'\, p
\]
Integration by parts:
\[
\int_0^1 \hat{u} \left( -p'' - (c\, p)' \right) = \hat{u}'(1)\, p(1) - \hat{u}'(0)\, p(0) - \int_0^1 \delta c\, u'\, p
\]
Choosing p as the solution of the adjoint model
\[
\begin{cases}
-p''(x) - (c(x)\, p(x))' = u(x) - u^{obs}(x) & x \in\, ]0, 1[ \\
p(0) = p(1) = 0
\end{cases}
\]
makes the boundary terms vanish and identifies \(\hat{J}[c](\delta c) = -\int_0^1 \delta c\, u'\, p\), hence
\[
\nabla J(c)(x) = -u'(x)\, p(x)
\]
Remark: formally, we just made
\[
(\text{TLM}(\hat{u}), p) = (\hat{u}, \text{TLM}^*(p))
\]
We indeed computed the adjoint of the tangent linear model.

Actual calculations:
- Solve the direct model
\[
\begin{cases}
-u''(x) + c(x)\, u'(x) = f(x) & x \in\, ]0, 1[ \\
u(0) = u(1) = 0
\end{cases}
\]
- Then solve the adjoint model
\[
\begin{cases}
-p''(x) - (c(x)\, p(x))' = u(x) - u^{obs}(x) & x \in\, ]0, 1[ \\
p(0) = p(1) = 0
\end{cases}
\]
- Hence the gradient: \(\nabla J(c)(x) = -u'(x)\, p(x)\)
The discrete case

Model (grid step h, interior points \(i = 1, \ldots, N\)):
\[
\begin{cases}
-\dfrac{u_{i+1} - 2u_i + u_{i-1}}{h^2} + c_i\, \dfrac{u_{i+1} - u_i}{h} = f_i & i = 1 \ldots N \\[4pt]
u_0 = u_{N+1} = 0
\end{cases}
\]
Cost function:
\[
J(c) = \frac{1}{2} \int_0^1 \left( u(x) - u^{obs}(x) \right)^2 dx
\quad \longrightarrow \quad
\frac{1}{2} \sum_{i=1}^{N} \left( u_i - u_i^{obs} \right)^2
\]
Gâteaux derivative:
\[
\hat{J}[c](\delta c) = \int_0^1 \left( u(x) - u^{obs}(x) \right) \hat{u}(x)\, dx
\quad \longrightarrow \quad
\sum_{i=1}^{N} \left( u_i - u_i^{obs} \right) \hat{u}_i
\]
Tangent linear model:
\[
\begin{cases}
-\dfrac{\hat{u}_{i+1} - 2\hat{u}_i + \hat{u}_{i-1}}{h^2} + c_i\, \dfrac{\hat{u}_{i+1} - \hat{u}_i}{h} = -\delta c_i\, \dfrac{u_{i+1} - u_i}{h} & i = 1 \ldots N \\[4pt]
\hat{u}_0 = \hat{u}_{N+1} = 0
\end{cases}
\]
Adjoint model:
\[
\begin{cases}
-\dfrac{p_{i+1} - 2p_i + p_{i-1}}{h^2} - \dfrac{c_i\, p_i - c_{i-1}\, p_{i-1}}{h} = u_i - u_i^{obs} & i = 1 \ldots N \\[4pt]
p_0 = p_{N+1} = 0
\end{cases}
\]
Gradient:
\[
\nabla J(c)(x) = -u'(x)\, p(x)
\quad \longrightarrow \quad
\left( \nabla J(c) \right)_i = -p_i\, \frac{u_{i+1} - u_i}{h}
\]
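A sketch implementing this discrete recipe end to end (grid size, forcing and "true" coefficient are illustrative choices): build the direct model matrix, solve the transposed system for the adjoint, form the gradient, and compare one component with a finite difference. Because the adjoint is the exact transpose of the discrete model, the two values agree up to the finite-difference truncation error.

```python
import numpy as np

N = 50; h = 1.0 / (N + 1)
x = np.linspace(h, 1.0 - h, N)
f = np.sin(np.pi * x)                                # illustrative forcing

def build_M(c):
    # tridiagonal matrix of the discrete model (alpha = 1/h^2, beta_i = c_i/h)
    alpha, beta = 1.0 / h**2, c / h
    M = np.zeros((N, N))
    for i in range(N):
        M[i, i] = 2 * alpha - beta[i]
        if i > 0:     M[i, i - 1] = -alpha
        if i < N - 1: M[i, i + 1] = -alpha + beta[i]
    return M

def solve_direct(c):
    return np.linalg.solve(build_M(c), f)

c_true = 1.0 + 0.5 * np.sin(2 * np.pi * x)
u_obs = solve_direct(c_true)                         # synthetic observations

def J(c):
    return 0.5 * np.sum((solve_direct(c) - u_obs) ** 2)

def gradient(c):
    u = solve_direct(c)
    p = np.linalg.solve(build_M(c).T, u - u_obs)     # adjoint model: transposed system
    du = (np.append(u[1:], 0.0) - u) / h             # (u_{i+1} - u_i)/h, with u_{N+1} = 0
    return -p * du                                   # grad J(c)_i = -p_i (u_{i+1} - u_i)/h

c0 = np.ones(N)
g = gradient(c0)
i, eps = N // 2, 1e-7
c1 = c0.copy(); c1[i] += eps
print(g[i], (J(c1) - J(c0)) / eps)                   # the two values agree
```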
Remark: with matrix notations

What we do when determining the adjoint model is simply transposing the matrix which defines the tangent linear model:
\[
(M \hat{U}, P) = (\hat{U}, M^T P)
\]
In the preceding example, the tangent linear model reads \(M \hat{U} = \hat{F}\) (the same matrix M as the direct model, with a different right-hand side), with
\[
M = \begin{pmatrix}
2\alpha - \beta_1 & -\alpha + \beta_1 & 0 & \cdots & 0 \\
-\alpha & 2\alpha - \beta_2 & -\alpha + \beta_2 & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & -\alpha & 2\alpha - \beta_{N-1} & -\alpha + \beta_{N-1} \\
0 & \cdots & 0 & -\alpha & 2\alpha - \beta_N
\end{pmatrix},
\qquad \alpha = 1/h^2, \quad \beta_i = c_i / h
\]
But M is generally not explicitly built in actual complex models...
A more complex (but still linear) example
Control of the coefficient of a 1-D diffusion equation

\[
\begin{cases}
\dfrac{\partial u}{\partial t} - \dfrac{\partial}{\partial x} \left( K(x)\, \dfrac{\partial u}{\partial x} \right) = f(x, t) & x \in\, ]0, L[,\ t \in\, ]0, T[ \\[4pt]
u(0, t) = u(L, t) = 0 & t \in [0, T] \\
u(x, 0) = u_0(x) & x \in [0, L]
\end{cases}
\]
- \(K(x)\) is unknown
- \(u^{obs}(x, t)\) is an available observation of \(u(x, t)\)
- Minimize \( J(K) = \frac{1}{2} \int_0^T \!\! \int_0^L \left( u(x, t) - u^{obs}(x, t) \right)^2 dx\, dt \)
Gâteaux derivative:
\[
\hat{J}[K](k) = \int_0^T \!\! \int_0^L \left( u(x, t) - u^{obs}(x, t) \right) \hat{u}(x, t)\, dx\, dt
\]
Tangent linear model:
\[
\begin{cases}
\dfrac{\partial \hat{u}}{\partial t} - \dfrac{\partial}{\partial x} \left( K(x)\, \dfrac{\partial \hat{u}}{\partial x} \right) = \dfrac{\partial}{\partial x} \left( k(x)\, \dfrac{\partial u}{\partial x} \right) & x \in\, ]0, L[,\ t \in\, ]0, T[ \\[4pt]
\hat{u}(0, t) = \hat{u}(L, t) = 0 & t \in [0, T] \\
\hat{u}(x, 0) = 0 & x \in [0, L]
\end{cases}
\]
Adjoint model:
\[
\begin{cases}
\dfrac{\partial p}{\partial t} + \dfrac{\partial}{\partial x} \left( K(x)\, \dfrac{\partial p}{\partial x} \right) = u - u^{obs} & x \in\, ]0, L[,\ t \in\, ]0, T[ \\[4pt]
p(0, t) = p(L, t) = 0 & t \in [0, T] \\
p(x, T) = 0 & x \in [0, L]
\end{cases}
\]
Note the final condition: the adjoint model is integrated backward in time.
Gâteaux derivative of J:
\[
\hat{J}[K](k) = \int_0^T \!\! \int_0^L \left( u - u^{obs} \right) \hat{u}\, dx\, dt
             = \int_0^T \!\! \int_0^L k(x)\, \frac{\partial u}{\partial x}\, \frac{\partial p}{\partial x}\, dx\, dt
\]
Gradient of J:
\[
\nabla J = \int_0^T \frac{\partial u}{\partial x}(\cdot, t)\, \frac{\partial p}{\partial x}(\cdot, t)\, dt
\qquad \text{(a function of x)}
\]
Discrete version: same as for the preceding ODE, but with double sums \(\sum_{n=0}^{N} \sum_{i=1}^{I}\) over time steps and grid points \((u_i^n)\).

Matrix interpretation: M is much more complex than previously!
Control of the initial condition
General formal derivation

- Model:
\[
\begin{cases}
\dfrac{dX}{dt}(x, t) = \mathcal{M}(X(x, t)) & (x, t) \in \Omega \times [0, T] \\[4pt]
X(x, 0) = U(x)
\end{cases}
\]
- Observations Y, with observation operator \(\mathcal{H}\): \(\mathcal{H}(X) \equiv Y\)
- Cost function: \( J(U) = \frac{1}{2} \int_0^T \|\mathcal{H}(X) - Y\|^2\, dt \)

Gâteaux derivative of J:
\[
\hat{J}[U](u) = \int_0^T \langle \hat{X},\, H^*(HX - Y) \rangle\, dt
\qquad \text{with} \quad
\hat{X} = \lim_{\alpha \to 0} \frac{X_{U + \alpha u} - X_U}{\alpha}
\]
where H is the tangent linear operator of \(\mathcal{H}\), and \(H^*\) is its adjoint.
Tangent linear model:
\[
\begin{cases}
\dfrac{d\hat{X}}{dt}(x, t) = M(\hat{X}) & (x, t) \in \Omega \times [0, T] \\[4pt]
\hat{X}(x, 0) = u(x)
\end{cases}
\]
where M is the tangent linear operator of \(\mathcal{M}\).

Adjoint model:
\[
\begin{cases}
\dfrac{dP}{dt}(x, t) + M^*(P) = H^*(HX - Y) & (x, t) \in \Omega \times [0, T] \\[4pt]
P(x, T) = 0
\end{cases}
\qquad \text{backward integration}
\]
Gradient:
\[
\nabla J(U) = -P(\cdot, 0) \qquad \text{(a function of x)}
\]
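A sketch of this recipe for an ODE integrated with explicit Euler, using the Lorenz system from earlier as the model and observing the x component (all numerical choices are illustrative). It uses the discrete adjoint, i.e. the transpose of the discrete time-stepping operator, so the computed gradient is exact for the discrete cost function; the sign conventions then differ slightly from the continuous formulation above, where the gradient is \(-P(\cdot, 0)\).

```python
import numpy as np

alpha, beta, gamma = 10.0, 28.0, 8.0 / 3.0
dt, nsteps = 0.002, 1000
H = np.array([[1.0, 0.0, 0.0]])                # observe x only

def M(s):
    x, y, z = s
    return np.array([alpha*(y - x), beta*x - y - x*z, -gamma*z + x*y])

def M_jac(s):                                  # tangent linear operator at s
    x, y, z = s
    return np.array([[-alpha, alpha, 0.0],
                     [beta - z, -1.0, -x],
                     [y, x, -gamma]])

def forward(U):
    X = [np.array(U, dtype=float)]
    for _ in range(nsteps):
        X.append(X[-1] + dt * M(X[-1]))        # explicit Euler
    return X

Y = [H @ s for s in forward([1.0, 2.0, 3.0])]  # synthetic observations

def J(U):
    return 0.5 * sum(float((H @ s - y) @ (H @ s - y)) for s, y in zip(forward(U), Y))

def gradJ(U):
    X = forward(U)
    p = H.T @ (H @ X[-1] - Y[-1])              # adjoint "final condition"
    for n in range(nsteps - 1, -1, -1):        # backward integration
        p = (np.eye(3) + dt * M_jac(X[n])).T @ p + H.T @ (H @ X[n] - Y[n])
    return p                                    # gradient of J w.r.t. U = X(0)

U0 = np.array([1.2, 1.8, 2.9])
g = gradJ(U0)
eps = 1e-6
fd = np.array([(J(U0 + eps * e) - J(U0)) / eps for e in np.eye(3)])
print(g, fd)                                   # the two gradients agree
```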
Example: the Burgers equation

The assimilation problem:
\[
\begin{cases}
\dfrac{\partial u}{\partial t} + u\, \dfrac{\partial u}{\partial x} - \nu\, \dfrac{\partial^2 u}{\partial x^2} = f & x \in\, ]0, L[,\ t \in [0, T] \\[4pt]
u(0, t) = \psi_1(t), \quad u(L, t) = \psi_2(t) & t \in [0, T] \\
u(x, 0) = u_0(x) & x \in [0, L]
\end{cases}
\]
- \(u_0(x)\) is unknown
- \(u^{obs}(x, t)\) is an observation of \(u(x, t)\)
- Cost function: \( J(u_0) = \frac{1}{2} \int_0^T \!\! \int_0^L \left( u(x, t) - u^{obs}(x, t) \right)^2 dx\, dt \)
Gâteaux derivative:
\[
\hat{J}[u_0](h_0) = \int_0^T \!\! \int_0^L \left( u(x, t) - u^{obs}(x, t) \right) \hat{u}(x, t)\, dx\, dt
\]
Tangent linear model:
\[
\begin{cases}
\dfrac{\partial \hat{u}}{\partial t} + \dfrac{\partial (u \hat{u})}{\partial x} - \nu\, \dfrac{\partial^2 \hat{u}}{\partial x^2} = 0 & x \in\, ]0, L[,\ t \in [0, T] \\[4pt]
\hat{u}(0, t) = \hat{u}(L, t) = 0 & t \in [0, T] \\
\hat{u}(x, 0) = h_0(x) & x \in [0, L]
\end{cases}
\]
Adjoint model:
\[
\begin{cases}
\dfrac{\partial p}{\partial t} + u\, \dfrac{\partial p}{\partial x} + \nu\, \dfrac{\partial^2 p}{\partial x^2} = u - u^{obs} & x \in\, ]0, L[,\ t \in [0, T] \\[4pt]
p(0, t) = p(L, t) = 0 & t \in [0, T] \\
p(x, T) = 0 & x \in [0, L]
\end{cases}
\]
Note the final condition: the adjoint model is integrated backward in time.
Gâteaux derivative of J:
\[
\hat{J}[u_0](h_0) = \int_0^T \!\! \int_0^L \left( u - u^{obs} \right) \hat{u}\, dx\, dt
                  = -\int_0^L h_0(x)\, p(x, 0)\, dx
\]
Gradient of J:
\[
\nabla J = -p(\cdot, 0) \qquad \text{(a function of x)}
\]
The adjoint method as a constrained minimization
Minimization with equality constraints

Optimization problem:
- \( J : \mathbb{R}^n \to \mathbb{R} \) differentiable
- \( K = \{ x \in \mathbb{R}^n \ \text{such that} \ h_1(x) = \ldots = h_p(x) = 0 \} \), where the functions \( h_i : \mathbb{R}^n \to \mathbb{R} \) are continuously differentiable.

Find the solution of the constrained minimization problem \(\min_{x \in K} J(x)\).

Theorem: if \( x^* \in K \) is a local minimum of J in K, and if the vectors \(\nabla h_i(x^*)\) (\(i = 1, \ldots, p\)) are linearly independent, then there exists \(\lambda^* = (\lambda_1^*, \ldots, \lambda_p^*) \in \mathbb{R}^p\) such that
\[
\nabla J(x^*) + \sum_{i=1}^{p} \lambda_i^* \nabla h_i(x^*) = 0
\]
Let
\[
\mathcal{L}(x; \lambda) = J(x) + \sum_{i=1}^{p} \lambda_i\, h_i(x)
\]
- The \(\lambda_i\) are the Lagrange multipliers associated with the constraints.
- \(\mathcal{L}\) is the Lagrangian function associated with J.

Then minimizing J in K is equivalent to solving \(\nabla \mathcal{L} = 0\) in \(\mathbb{R}^n \times \mathbb{R}^p\), since
\[
\nabla_x \mathcal{L} = \nabla J + \sum_{i=1}^{p} \lambda_i \nabla h_i,
\qquad
\nabla_{\lambda_i} \mathcal{L} = h_i \quad (i = 1, \ldots, p)
\]
This is a saddle point problem.
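A small sketch of such a saddle point problem: minimizing \(J(x) = \frac{1}{2}\|x - b\|^2\) under a linear constraint \(Ax = c\) by solving \(\nabla \mathcal{L} = 0\) (the KKT system) directly; the data are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 4, 2
A = rng.standard_normal((m, n)); c = rng.standard_normal(m)
b = rng.standard_normal(n)

# grad_x L = (x - b) + A^T lam = 0 ;  grad_lam L = A x - c = 0
KKT = np.block([[np.eye(n), A.T],
                [A, np.zeros((m, m))]])
sol = np.linalg.solve(KKT, np.concatenate([b, c]))
x_star, lam_star = sol[:n], sol[n:]
print(np.allclose(A @ x_star, c))                    # constraint satisfied: True
print(np.allclose(x_star - b + A.T @ lam_star, 0))   # stationarity: True
```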
The adjoint method as a constrained minimization

The adjoint method can be interpreted as a minimization of J(x) under the constraint that the model equations must be satisfied. From this point of view, the adjoint variable corresponds to a Lagrange multiplier.
Example: control of the initial condition of the Burgers equation

- Model:
\[
\begin{cases}
\dfrac{\partial u}{\partial t} + u\, \dfrac{\partial u}{\partial x} - \nu\, \dfrac{\partial^2 u}{\partial x^2} = f & x \in\, ]0, L[,\ t \in [0, T] \\[4pt]
u(0, t) = \psi_1(t), \quad u(L, t) = \psi_2(t) & t \in [0, T] \\
u(x, 0) = u_0(x) & x \in [0, L]
\end{cases}
\]
- Full observation field \(u^{obs}(x, t)\)
- Cost function: \( J(u_0) = \frac{1}{2} \int_0^T \!\! \int_0^L \left( u(x, t) - u^{obs}(x, t) \right)^2 dx\, dt \)

We will consider here that J is a function of both \(u_0\) and u, and will minimize \(J(u_0, u)\) under the constraint of the model equations.
Lagrangian function:
\[
\mathcal{L}(u_0, u; p) = \underbrace{J(u_0, u)}_{\text{data ass. cost function}}
+ \underbrace{\int_0^T \!\! \int_0^L \left( \frac{\partial u}{\partial t} + u\, \frac{\partial u}{\partial x} - \nu\, \frac{\partial^2 u}{\partial x^2} - f \right) p}_{\text{model}}
\]
Remark: there is no additional term (i.e. no Lagrange multiplier) for the initial condition or for the boundary conditions: their values are fixed.

By integration by parts, \(\mathcal{L}\) can also be written:
\[
\mathcal{L}(u_0, u; p) = J(u_0, u)
+ \int_0^T \!\! \int_0^L \left( -u\, \frac{\partial p}{\partial t} - \frac{1}{2} u^2\, \frac{\partial p}{\partial x} - \nu\, u\, \frac{\partial^2 p}{\partial x^2} - f p \right)
\]
\[
+ \int_0^L \left[ u(\cdot, T)\, p(\cdot, T) - u_0\, p(\cdot, 0) \right]
+ \int_0^T \left[ \frac{1}{2} \psi_2^2\, p(L, \cdot) - \frac{1}{2} \psi_1^2\, p(0, \cdot) \right]
\]
\[
- \nu \int_0^T \left[ \frac{\partial u}{\partial x}(L, \cdot)\, p(L, \cdot) - \frac{\partial u}{\partial x}(0, \cdot)\, p(0, \cdot) - \psi_2\, \frac{\partial p}{\partial x}(L, \cdot) + \psi_1\, \frac{\partial p}{\partial x}(0, \cdot) \right]
\]
Saddle point:
\[
(\nabla_p \mathcal{L}, h_p) = \int_0^T \!\! \int_0^L \left( \frac{\partial u}{\partial t} + u\, \frac{\partial u}{\partial x} - \nu\, \frac{\partial^2 u}{\partial x^2} - f \right) h_p
\]
\[
(\nabla_u \mathcal{L}, h_u) = \int_0^T \!\! \int_0^L \left( (u - u^{obs}) - \frac{\partial p}{\partial t} - u\, \frac{\partial p}{\partial x} - \nu\, \frac{\partial^2 p}{\partial x^2} \right) h_u
+ \int_0^L h_u(\cdot, T)\, p(\cdot, T)
- \nu \int_0^T \left[ \frac{\partial h_u}{\partial x}(L, \cdot)\, p(L, \cdot) - \frac{\partial h_u}{\partial x}(0, \cdot)\, p(0, \cdot) \right]
\]
\[
(\nabla_{u_0} \mathcal{L}, h_0) = -\int_0^L h_0\, p(\cdot, 0)
\]
Setting these three derivatives to zero recovers, respectively, the direct model, the adjoint model (with its final condition), and the gradient \(\nabla J(u_0) = -p(\cdot, 0)\).

E. Blayo, Variational approach to data assimilation