Adjoint approach to optimization using automatic differentiation (AD)

Praveen. C
praveen@math.tifrbng.res.in
Tata Institute of Fundamental Research
Center for Applicable Mathematics
Bangalore - 560 065
http://math.tifrbng.res.in

Indo-German Workshop
IIT Madras
November 2008

Praveen. C (TIFR-CAM) Adjoints and optimization IITM, Nov 2008 1 / 62
Outline
1. Mathematical formulation
2. Computing gradients
3. Quasi 1-D flow
4. Gradient smoothing
5. Quasi 1-D optimization: Pressure matching
6. Example codes
Introduction
- Maturing of high fidelity analysis tools like
  - Computational Fluid Dynamics (CFD)
  - Finite Element Method (FEM)
- Increase in computational power
- Shift towards optimization and control
- Fluid dynamics
  - Design aircraft wing shape to reduce drag
  - Ship hull shape optimization to reduce drag
  - Minimize unsteady forces through boundary suction/blowing
  - Suppress boundary layer separation
  - Enhance mixing
Objectives and controls
- Objective function $J(\alpha) = J(\alpha, u)$: mathematical representation of system performance
- Control variables $\alpha$
  - Parametric controls $\alpha \in \mathbb{R}^n$
  - Infinite dimensional controls $\alpha : X \to Y$
  - Shape $\alpha \in$ set of admissible shapes
- State variable $u$: solution of an ODE or PDE
  \[ R(\alpha, u) = 0 \]
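Mathematical formulation
Constrained minimization problem
\[ \min_{\alpha} J(\alpha, u) \quad \text{subject to} \quad R(\alpha, u) = 0 \]
Find $\delta\alpha$ such that $\delta J < 0$
\[
\begin{aligned}
\delta J &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u \\
         &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha}\,\delta\alpha \\
         &= \left[ \frac{\partial J}{\partial \alpha} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha} \right] \delta\alpha =: G\,\delta\alpha
\end{aligned}
\]
Steepest descent
\[ \delta\alpha = -\epsilon\, G^\top, \qquad \delta J = -\epsilon\, G G^\top = -\epsilon\, \| G \|^2 < 0 \]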
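The steepest descent update can be tried on a toy problem. The sketch below is not from the slides: the quadratic objective, its hand-derived gradient, the step size `eps` and the function names are all hypothetical choices made only to show that the update $\delta\alpha = -\epsilon G^\top$ reduces $J$ at every step when $\epsilon$ is small enough.

```python
def cost(alpha):
    """Hypothetical quadratic objective with its minimum at (1, -0.5)."""
    return (alpha[0] - 1.0) ** 2 + 2.0 * (alpha[1] + 0.5) ** 2


def gradient(alpha):
    """Gradient G of the objective above, derived by hand."""
    return [2.0 * (alpha[0] - 1.0), 4.0 * (alpha[1] + 0.5)]


def steepest_descent(alpha, eps=0.1, n_iter=50):
    """Apply delta_alpha = -eps * G^T repeatedly and record J after each step."""
    history = [cost(alpha)]
    for _ in range(n_iter):
        G = gradient(alpha)
        alpha = [a - eps * g for a, g in zip(alpha, G)]
        history.append(cost(alpha))
    return alpha, history


alpha_opt, history = steepest_descent([0.0, 0.0])
print("alpha =", alpha_opt)
print("J: first =", history[0], "last =", history[-1])
```

In a real design problem the gradient would not be available in closed form; supplying it cheaply is the purpose of the sensitivity and adjoint approaches.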
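Sensitivity approach
Linearized state equation
\[ \frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u = 0 \qquad \text{or} \qquad \frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha} = -\frac{\partial R}{\partial \alpha} \]
Solve sensitivity equation iteratively
\[ \frac{\partial}{\partial t}\frac{\partial u}{\partial \alpha} + \frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha} = -\frac{\partial R}{\partial \alpha} \]
Gradient
\[ \frac{dJ}{d\alpha} = \frac{\partial J}{\partial \alpha} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha} \]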
Sensitivity approach: Computational cost
- $n$ design variables: $\alpha = (\alpha_1, \ldots, \alpha_n)$
- Solve primal problem $R(\alpha, u) = 0$ to get $u(\alpha)$
- For $i = 1, \ldots, n$
  - Solve sensitivity equation wrt $\alpha_i$
    \[ \frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha_i} = -\frac{\partial R}{\partial \alpha_i} \]
  - Compute derivative wrt $\alpha_i$
    \[ \frac{dJ}{d\alpha_i} = \frac{\partial J}{\partial \alpha_i} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha_i} \]
- One primal equation, $n$ sensitivity equations
- Computational cost = $n + 1$
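The sensitivity loop can be sketched on a small model problem. Everything specific here is an assumption made for illustration and does not come from the slides: a linear state equation $R(\alpha, u) = A u - b(\alpha) = 0$ with two unknowns and two design variables, the cost $J = \tfrac{1}{2} u^\top u + 0.1\,\alpha_2^2$, and the helper names.

```python
A = [[2.0, 1.0], [-1.0, 3.0]]  # dR/du for the hypothetical state equation


def solve2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def state(alpha):
    """Primal solve: R(alpha, u) = A u - b(alpha) = 0, b = (a1, a1*a2)."""
    return solve2(A, [alpha[0], alpha[0] * alpha[1]])


def cost(alpha):
    """J(alpha, u(alpha)) = 0.5 * u.u + 0.1 * alpha_2^2."""
    u = state(alpha)
    return 0.5 * (u[0] ** 2 + u[1] ** 2) + 0.1 * alpha[1] ** 2


def gradient_sensitivity(alpha):
    """One primal solve plus one sensitivity solve per design variable."""
    u = state(alpha)
    dR_dalpha = [[-1.0, -alpha[1]], [0.0, -alpha[0]]]  # dR/dalpha_i, i = 1, 2
    dJ_dalpha = [0.0, 0.2 * alpha[1]]
    dJ_du = u
    grad = []
    for i in range(2):
        # (dR/du) du/dalpha_i = -dR/dalpha_i
        du = solve2(A, [-r for r in dR_dalpha[i]])
        grad.append(dJ_dalpha[i] + dJ_du[0] * du[0] + dJ_du[1] * du[1])
    return grad


print(gradient_sensitivity([1.0, 2.0]))
```

The loop performs one linear solve per design variable, which is where the $n + 1$ cost comes from.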
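Adjoint approach
We have
\[ \delta J = \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u \qquad \text{and} \qquad \frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u = 0 \]
Introduce a new unknown $v$
\[
\begin{aligned}
\delta J &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u + v^\top \left[ \frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u \right] \\
         &= \left[ \frac{\partial J}{\partial \alpha} + v^\top \frac{\partial R}{\partial \alpha} \right] \delta\alpha + \left[ \frac{\partial J}{\partial u} + v^\top \frac{\partial R}{\partial u} \right] \delta u
\end{aligned}
\]
Adjoint equation
\[ \left( \frac{\partial R}{\partial u} \right)^{\!\top} v = -\left( \frac{\partial J}{\partial u} \right)^{\!\top} \]
Iterative solution
\[ \frac{\partial v}{\partial t} + \left( \frac{\partial R}{\partial u} \right)^{\!\top} v = -\left( \frac{\partial J}{\partial u} \right)^{\!\top} \]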
Adjoint approach: Computational cost
- $n$ design variables: $\alpha = (\alpha_1, \ldots, \alpha_n)$
- Solve primal problem $R(\alpha, u) = 0$ to get $u(\alpha)$
- Solve adjoint problem
  \[ \left( \frac{\partial R}{\partial u} \right)^{\!\top} v = -\left( \frac{\partial J}{\partial u} \right)^{\!\top} \]
- For $i = 1, \ldots, n$
  - Compute derivative wrt $\alpha_i$
    \[ \frac{dJ}{d\alpha_i} = \frac{\partial J}{\partial \alpha_i} + v^\top \frac{\partial R}{\partial \alpha_i} \]
- One primal equation, one adjoint equation
- Computational cost = 2, independent of $n$
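The adjoint recipe can be sketched on a small model problem. As before, the specifics are assumptions made only for illustration and are not taken from the slides: a linear state equation $R(\alpha, u) = A u - b(\alpha) = 0$ with a non-symmetric $A$ (so that the transpose matters), the cost $J = \tfrac{1}{2} u^\top u + 0.1\,\alpha_2^2$, and the helper names.

```python
A = [[2.0, 1.0], [-1.0, 3.0]]  # dR/du for the hypothetical state equation


def solve2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def state(alpha):
    """Primal solve: R(alpha, u) = A u - b(alpha) = 0, b = (a1, a1*a2)."""
    return solve2(A, [alpha[0], alpha[0] * alpha[1]])


def cost(alpha):
    """J(alpha, u(alpha)) = 0.5 * u.u + 0.1 * alpha_2^2."""
    u = state(alpha)
    return 0.5 * (u[0] ** 2 + u[1] ** 2) + 0.1 * alpha[1] ** 2


def gradient_adjoint(alpha):
    """One primal solve and one adjoint solve, whatever the number of alphas."""
    u = state(alpha)
    A_T = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    # (dR/du)^T v = -(dJ/du)^T, with dJ/du = u^T for this cost
    v = solve2(A_T, [-u[0], -u[1]])
    dR_dalpha = [[-1.0, -alpha[1]], [0.0, -alpha[0]]]  # dR/dalpha_i, i = 1, 2
    dJ_dalpha = [0.0, 0.2 * alpha[1]]
    # dJ/dalpha_i = dJ/dalpha_i (partial) + v^T dR/dalpha_i: no further solves
    return [dJ_dalpha[i] + v[0] * dR_dalpha[i][0] + v[1] * dR_dalpha[i][1]
            for i in range(2)]


print(gradient_adjoint([1.0, 2.0]))
```

Only the final list comprehension depends on the number of design variables, and it contains inner products rather than linear solves.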
Continuous vs Discrete
Continuous approach:
- Start with governing PDE $R(\alpha, u) = 0$
- Derive adjoint PDE and boundary conditions
- Discretize adjoint PDE and solve
- Must be re-derived whenever cost function changes
- Gradient is not consistent: discretization error
Discrete approach:
- Start with discrete approximation $R(\alpha, u) = 0$
- Derive discrete adjoint equations
- Solve discrete adjoint equations
- True gradient of discrete solution
- Can be automated using AD
Techniques for computing gradients
- Hand differentiation
- Finite difference method
- Complex variable method
- Automatic Differentiation (AD)
  - Computer code to compute some function
  - Chain rule of differentiation
  - Generates a code to compute derivatives
  - ADIFOR, ADOLC, ODYSEE, TAMC, TAF, TAPENADE
  - see http://www.autodiff.org
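Two of the listed techniques, the finite difference method and the complex variable method, fit in a few lines. The sketch below is illustrative only: the test function, the step sizes and the function names are assumptions, and the exact derivative is included just for comparison.

```python
import cmath
import math


def f(x, y):
    """Test function; cmath.sin lets it accept complex arguments as well."""
    return (x * y + cmath.sin(x) + 4.0) * (3.0 * y ** 2 + 6.0)


def dfdx_finite_difference(x, y, h=1e-6):
    """Forward difference: truncation error O(h) plus round-off from the subtraction."""
    return ((f(x + h, y) - f(x, y)) / h).real


def dfdx_complex_step(x, y, h=1e-30):
    """Complex variable method: Im f(x + i h) / h, with no subtraction involved."""
    return f(complex(x, h), y).imag / h


def dfdx_exact(x, y):
    """Hand differentiation, used here only as a reference value."""
    return (y + math.cos(x)) * (3.0 * y ** 2 + 6.0)


x0, y0 = 1.0, 2.0
print("finite difference:", dfdx_finite_difference(x0, y0))
print("complex step     :", dfdx_complex_step(x0, y0))
print("exact            :", dfdx_exact(x0, y0))
```

Because the complex step involves no subtraction of nearly equal numbers, `h` can be taken very small, whereas the forward difference has to balance truncation error against round-off.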
Derivatives
Given a program $P$ computing a function $F$
\[ F : \mathbb{R}^m \to \mathbb{R}^n, \qquad X \mapsto Y \]
build a program that computes derivatives of $F$
- $X$: independent variables
- $Y$: dependent variables
Derivatives
Jacobian matrix:
\[ J = \left[ \frac{\partial y_j}{\partial x_i} \right] \]
Directional or tangent derivative
\[ \dot{Y} = J \dot{X} \]
Adjoint mode
\[ \bar{X} = J^\top \bar{Y} \]
Gradients ($n = 1$ output)
\[ J = \left[ \frac{\partial y}{\partial x_i} \right] \]
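A short consequence of the tangent and adjoint definitions, added here as a sketch rather than taken from the slides, shows how the two modes relate and why the adjoint mode suits gradients. The unit vector $e_i$ is notation introduced only for this note.

```latex
% Duality between tangent and adjoint mode (dot-product identity)
\bar{Y}^\top \dot{Y}
  = \bar{Y}^\top J \dot{X}
  = \left( J^\top \bar{Y} \right)^{\top} \dot{X}
  = \bar{X}^\top \dot{X}

% Scalar output, n = 1:
%   adjoint mode with \bar{Y} = 1 returns the whole gradient in one sweep,
%   tangent mode with \dot{X} = e_i returns one component per sweep.
\bar{X} = J^\top \cdot 1 = \left[ \frac{\partial y}{\partial x_i} \right]^{\top},
\qquad
\dot{Y} = J e_i = \frac{\partial y}{\partial x_i}, \quad i = 1, \ldots, m
```

The first identity is also a common consistency check for a pair of tangent and adjoint codes, since both sides can be evaluated independently.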
Forward differentiation
Program $P$ is a sequence of instructions $F_k$
\[ T_0 = X \ \text{(given)}, \qquad k\text{'th line:} \quad T_k = F_k(T_{k-1}) \]
Function is a composition
\[ F = F_p \circ F_{p-1} \circ \ldots \circ F_1 \]
Chain rule
\[ \dot{Y} = F'(X)\,\dot{X} = F'_p(T_{p-1})\, F'_{p-1}(T_{p-2}) \ldots F'_1(T_0)\,\dot{X} \]
\[ X, \dot{X} \to Y, \dot{Y} \]
\[ \mathrm{cost}(\dot{Y}) = 4 * \mathrm{cost}(Y) \]
Differentiation: Example
A simple example
\[ f = (xy + \sin x + 4)(3y^2 + 6) \]
Computer code, $f = t_{10}$
\[
\begin{aligned}
t_1 &= x \\
t_2 &= y \\
t_3 &= t_1 t_2 \\
t_4 &= \sin t_1 \\
t_5 &= t_3 + t_4 \\
t_6 &= t_5 + 4 \\
t_7 &= t_2^2 \\
t_8 &= 3 t_7 \\
t_9 &= t_8 + 6 \\
t_{10} &= t_6 t_9
\end{aligned}
\]
F77 code: costfunc.f

c     Evaluate f = (x*y + sin(x) + 4)*(3*y**2 + 6) one operation per line
      subroutine costfunc(x, y, f)
      t1 = x
      t2 = y
      t3 = t1*t2
      t4 = sin(t1)
      t5 = t3 + t4
      t6 = t5 + 4
      t7 = t2**2
      t8 = 3.0*t7
      t9 = t8 + 6.0
      t10 = t6*t9
      f = t10
      end
Differentiation: Direct mode
Apply chain rule of differentiation
\[
\begin{aligned}
t_1 &= x              & \dot{t}_1 &= \dot{x} \\
t_2 &= y              & \dot{t}_2 &= \dot{y} \\
t_3 &= t_1 t_2        & \dot{t}_3 &= \dot{t}_1 t_2 + t_1 \dot{t}_2 \\
t_4 &= \sin(t_1)      & \dot{t}_4 &= \cos(t_1)\,\dot{t}_1 \\
t_5 &= t_3 + t_4      & \dot{t}_5 &= \dot{t}_3 + \dot{t}_4 \\
t_6 &= t_5 + 4        & \dot{t}_6 &= \dot{t}_5 \\
t_7 &= t_2^2          & \dot{t}_7 &= 2 t_2 \dot{t}_2 \\
t_8 &= 3 t_7          & \dot{t}_8 &= 3 \dot{t}_7 \\
t_9 &= t_8 + 6        & \dot{t}_9 &= \dot{t}_8 \\
t_{10} &= t_6 t_9     & \dot{t}_{10} &= \dot{t}_6 t_9 + t_6 \dot{t}_9
\end{aligned}
\]
\[ \dot{x} = 1,\ \dot{y} = 0 \ \Rightarrow\ \dot{t}_{10} = \frac{\partial f}{\partial x} \qquad \text{and} \qquad \dot{x} = 0,\ \dot{y} = 1 \ \Rightarrow\ \dot{t}_{10} = \frac{\partial f}{\partial y} \]

tapenade -d -vars "x y" -outvars f costfunc.f
Automatic Differentiation: Direct mode

c     Tangent code: each derivative statement precedes its original statement
      SUBROUTINE COSTFUNC_D(x, xd, y, yd, f, fd)
      t1d = xd
      t1 = x
      t2d = yd
      t2 = y
      t3d = t1d*t2 + t1*t2d
      t3 = t1*t2
      t4d = t1d*COS(t1)
      t4 = SIN(t1)
      t5d = t3d + t4d
      t5 = t3 + t4
      t6d = t5d
      t6 = t5 + 4
      t7d = 2*t2*t2d
      t7 = t2**2
      t8d = 3.0*t7d
      t8 = 3.0*t7
      t9d = t8d
      t9 = t8 + 6.0
      t10d = t6d*t9 + t6*t9d
      t10 = t6*t9
      fd = t10d
      f = t10
      END
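For readers without a Fortran compiler, the tangent routine can be mirrored line by line in Python. This is a hand transcription for illustration, not Tapenade output; the name `costfunc_d` and the evaluation point are assumptions.

```python
import math


def costfunc_d(x, xd, y, yd):
    """Tangent (direct) mode: returns f and its directional derivative fd."""
    t1d = xd
    t1 = x
    t2d = yd
    t2 = y
    t3d = t1d * t2 + t1 * t2d
    t3 = t1 * t2
    t4d = t1d * math.cos(t1)
    t4 = math.sin(t1)
    t5d = t3d + t4d
    t5 = t3 + t4
    t6d = t5d
    t6 = t5 + 4
    t7d = 2 * t2 * t2d
    t7 = t2 ** 2
    t8d = 3.0 * t7d
    t8 = 3.0 * t7
    t9d = t8d
    t9 = t8 + 6.0
    t10d = t6d * t9 + t6 * t9d
    t10 = t6 * t9
    return t10, t10d


# One call per input direction: two inputs need two tangent sweeps.
f, df_dx = costfunc_d(1.0, 1.0, 2.0, 0.0)  # seed xd = 1, yd = 0
_, df_dy = costfunc_d(1.0, 0.0, 2.0, 1.0)  # seed xd = 0, yd = 1
print(f, df_dx, df_dy)
```

Each call propagates a single direction $(\dot{x}, \dot{y})$, so a function with $m$ inputs needs $m$ tangent sweeps to assemble its full gradient.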
Backward differentiation
Program $P$ is a sequence of instructions $F_k$
\[ T_0 = X \ \text{(given)}, \qquad k\text{'th line:} \quad T_k = F_k(T_{k-1}) \]
Function is a composition
\[ F = F_p \circ F_{p-1} \circ \ldots \circ F_1 \]
Chain rule
\[ \bar{X} = [F'(X)]^\top \bar{Y} = [F'_1(T_0)]^\top [F'_2(T_1)]^\top \ldots [F'_p(T_{p-1})]^\top \bar{Y} \]
\[ X, \bar{Y} \to \bar{X} \]
\[ \mathrm{cost}(\bar{X}) = 4 * \mathrm{cost}(Y) \]
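The backward chain rule can be written out by hand for the example $f = (xy + \sin x + 4)(3y^2 + 6)$. The sketch below is a hand-written illustration, not the output of an AD tool; the name `costfunc_b` and the `*b` naming of adjoint variables are assumptions. A forward sweep stores the intermediates, then a backward sweep applies the transposed derivative of each line in reverse order.

```python
import math


def costfunc_b(x, y, fb=1.0):
    """Adjoint (backward) mode: returns f and (xb, yb) = J^T * fb."""
    # Forward sweep: evaluate and keep the intermediate values.
    t1 = x
    t2 = y
    t3 = t1 * t2
    t4 = math.sin(t1)
    t5 = t3 + t4
    t6 = t5 + 4
    t7 = t2 ** 2
    t8 = 3.0 * t7
    t9 = t8 + 6.0
    t10 = t6 * t9
    # Backward sweep: visit the statements in reverse order.
    t10b = fb
    t6b = t9 * t10b            # from t10 = t6*t9
    t9b = t6 * t10b
    t8b = t9b                  # from t9 = t8 + 6
    t7b = 3.0 * t8b            # from t8 = 3*t7
    t2b = 2 * t2 * t7b         # from t7 = t2**2
    t5b = t6b                  # from t6 = t5 + 4
    t3b = t5b                  # from t5 = t3 + t4
    t4b = t5b
    t1b = math.cos(t1) * t4b   # from t4 = sin(t1)
    t1b += t2 * t3b            # from t3 = t1*t2
    t2b += t1 * t3b
    return t10, t1b, t2b       # t1 = x and t2 = y, so xb = t1b, yb = t2b


f, df_dx, df_dy = costfunc_b(1.0, 2.0)
print(f, df_dx, df_dy)
```

A single call with `fb = 1` returns both partial derivatives, in contrast to tangent mode, which needs one sweep per input.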