
Adjoint approach to optimization using automatic differentiation (AD)
Praveen. C
praveen@math.tifrbng.res.in
Tata Institute of Fundamental Research
Center for Applicable Mathematics
Bangalore - 560 065
http://math.tifrbng.res.in
Indo-German Workshop, IIT Madras, November 2008
Praveen. C (TIFR-CAM) Adjoints and optimization IITM, Nov 2008 1 / 62

Outline
1. Mathematical formulation
2. Computing gradients
3. Quasi 1-D flow
4. Gradient smoothing
5. Quasi 1-D optimization: Pressure matching
6. Example codes

Introduction
Maturing of high fidelity analysis tools like
  Computational Fluid Dynamics (CFD)
  Finite Element Method (FEM)
Increase in computational power
Shift towards optimization and control
Fluid dynamics
  Design aircraft wing shape to reduce drag
  Ship hull shape optimization to reduce drag
  Minimize unsteady forces through boundary suction/blowing
  Suppress boundary layer separation
  Enhance mixing

Objectives and controls
Objective function $J(\alpha) = J(\alpha, u)$: mathematical representation of system performance
Control variables $\alpha$
  Parametric controls: $\alpha \in \mathbb{R}^n$
  Infinite dimensional controls: $\alpha : X \to Y$
  Shape: $\alpha \in$ set of admissible shapes
State variable $u$: solution of an ODE or PDE, $R(\alpha, u) = 0$

Mathematical formulation
Constrained minimization problem
$$\min_{\alpha} J(\alpha, u) \quad \text{subject to} \quad R(\alpha, u) = 0$$
Find $\delta\alpha$ such that $\delta J < 0$
$$\begin{aligned}
\delta J &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u \\
         &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha}\,\delta\alpha \\
         &= \left[ \frac{\partial J}{\partial \alpha} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha} \right] \delta\alpha =: G\,\delta\alpha
\end{aligned}$$
Steepest descent
$$\delta\alpha = -\epsilon\, G^\top, \qquad \delta J = -\epsilon\, G G^\top = -\epsilon \|G\|^2 < 0$$
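The steepest descent update can be tried on a toy problem. The sketch below is only an illustration: the scalar state equation, the cost function, the step size `eps` and all function names are hypothetical choices, not taken from the slides.

```python
# Toy problem (hypothetical): two controls, one scalar state.
# State equation R(alpha, u) = u - (alpha[0] + 2*alpha[1]) = 0.
# Cost J(alpha, u) = (u - 3)^2 + 0.1*(alpha[0]^2 + alpha[1]^2).

def solve_state(alpha):
    """Solve R(alpha, u) = 0 for u."""
    return alpha[0] + 2.0 * alpha[1]

def cost(alpha):
    u = solve_state(alpha)
    return (u - 3.0) ** 2 + 0.1 * (alpha[0] ** 2 + alpha[1] ** 2)

def gradient(alpha):
    """G = dJ/dalpha + dJ/du * du/dalpha."""
    u = solve_state(alpha)
    dJ_du = 2.0 * (u - 3.0)
    dJ_dalpha = [0.2 * alpha[0], 0.2 * alpha[1]]
    du_dalpha = [1.0, 2.0]
    return [dJ_dalpha[i] + dJ_du * du_dalpha[i] for i in range(2)]

def steepest_descent(alpha, eps=0.05, steps=50):
    """Repeat the update delta_alpha = -eps * G^T and record J."""
    history = [cost(alpha)]
    for _ in range(steps):
        G = gradient(alpha)
        alpha = [a - eps * g for a, g in zip(alpha, G)]
        history.append(cost(alpha))
    return alpha, history

alpha_opt, history = steepest_descent([0.0, 0.0])
print(alpha_opt, history[0], history[-1])
```

For this quadratic cost the step size is small enough that every update reduces $J$, in line with $\delta J = -\epsilon \|G\|^2 < 0$; a larger `eps` may not preserve this.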

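Sensitivity approach
Linearized state equation
$$\frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u = 0 \quad \text{or} \quad \frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha} = -\frac{\partial R}{\partial \alpha}$$
Solve sensitivity equation iteratively
$$\frac{\partial}{\partial t}\frac{\partial u}{\partial \alpha} + \frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha} = -\frac{\partial R}{\partial \alpha}$$
Gradient
$$\frac{dJ}{d\alpha} = \frac{\partial J}{\partial \alpha} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha}$$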

Sensitivity approach: Computational cost
$n$ design variables: $\alpha = (\alpha_1, \ldots, \alpha_n)$
Solve primal problem $R(\alpha, u) = 0$ to get $u(\alpha)$
For $i = 1, \ldots, n$
  Solve sensitivity equation wrt $\alpha_i$
  $$\frac{\partial R}{\partial u}\frac{\partial u}{\partial \alpha_i} = -\frac{\partial R}{\partial \alpha_i}$$
  Compute derivative wrt $\alpha_i$
  $$\frac{dJ}{d\alpha_i} = \frac{\partial J}{\partial \alpha_i} + \frac{\partial J}{\partial u}\frac{\partial u}{\partial \alpha_i}$$
One primal equation, $n$ sensitivity equations
Computational cost = $n + 1$
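The loop above can be sketched on a small model problem. Everything specific here is an assumption made for illustration: the state equation $R(\alpha, u) = A u - b(\alpha)$ with a fixed 2x2 matrix `A`, the right-hand side `b`, the cost $J = \frac{1}{2}|u|^2 + \frac{1}{2}|\alpha|^2$, and the helper names.

```python
# Hypothetical model: R(alpha, u) = A u - b(alpha), with n = 3 controls
# and 2 state variables.  J = 0.5*|u|^2 + 0.5*|alpha|^2.

A = [[4.0, 1.0], [2.0, 3.0]]          # dR/du (constant in this sketch)

def solve2(M, rhs):
    """Solve the 2x2 system M x = rhs by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]

def b(alpha):
    return [alpha[0] ** 2 + alpha[1], alpha[1] * alpha[2] + alpha[2]]

def db_dalpha(alpha):
    """Entry i is the vector db/dalpha_i, which equals -dR/dalpha_i."""
    return [[2.0 * alpha[0], 0.0], [1.0, alpha[2]], [0.0, alpha[1] + 1.0]]

def cost(alpha):
    u = solve2(A, b(alpha))
    return 0.5 * (u[0] ** 2 + u[1] ** 2) + 0.5 * sum(a * a for a in alpha)

def gradient_sensitivity(alpha):
    """Gradient via one primal solve plus one sensitivity solve per control."""
    u = solve2(A, b(alpha))                      # primal solve
    n_solves = 1
    grad = []
    for i, rhs in enumerate(db_dalpha(alpha)):
        du = solve2(A, rhs)                      # dR/du du/dalpha_i = -dR/dalpha_i
        n_solves += 1
        # dJ/dalpha_i = partial J/partial alpha_i + (partial J/partial u) . du/dalpha_i
        grad.append(alpha[i] + u[0] * du[0] + u[1] * du[1])
    return grad, n_solves

def finite_difference_gradient(alpha, h=1.0e-6):
    """Central differences, used only as a check."""
    grad = []
    for i in range(len(alpha)):
        ap = list(alpha); ap[i] += h
        am = list(alpha); am[i] -= h
        grad.append((cost(ap) - cost(am)) / (2.0 * h))
    return grad

alpha0 = [1.0, 2.0, 0.5]
grad, n_solves = gradient_sensitivity(alpha0)
fd = finite_difference_gradient(alpha0)
print(grad, fd, n_solves)
```

With three controls the sketch performs four linear solves, matching the $n + 1$ count on the slide.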

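Adjoint approach
We have
$$\delta J = \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u \quad \text{and} \quad \frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u = 0$$
Introduce a new unknown $v$
$$\begin{aligned}
\delta J &= \frac{\partial J}{\partial \alpha}\,\delta\alpha + \frac{\partial J}{\partial u}\,\delta u + v^\top \left[ \frac{\partial R}{\partial \alpha}\,\delta\alpha + \frac{\partial R}{\partial u}\,\delta u \right] \\
         &= \left[ \frac{\partial J}{\partial \alpha} + v^\top \frac{\partial R}{\partial \alpha} \right] \delta\alpha + \left[ \frac{\partial J}{\partial u} + v^\top \frac{\partial R}{\partial u} \right] \delta u
\end{aligned}$$
Adjoint equation
$$\left( \frac{\partial R}{\partial u} \right)^\top v = -\left( \frac{\partial J}{\partial u} \right)^\top$$
Iterative solution
$$\frac{\partial v}{\partial t} + \left( \frac{\partial R}{\partial u} \right)^\top v = -\left( \frac{\partial J}{\partial u} \right)^\top$$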

Adjoint approach: Computational cost
$n$ design variables: $\alpha = (\alpha_1, \ldots, \alpha_n)$
Solve primal problem $R(\alpha, u) = 0$ to get $u(\alpha)$
Solve adjoint problem
$$\left( \frac{\partial R}{\partial u} \right)^\top v = -\left( \frac{\partial J}{\partial u} \right)^\top$$
For $i = 1, \ldots, n$
  Compute derivative wrt $\alpha_i$
  $$\frac{dJ}{d\alpha_i} = \frac{\partial J}{\partial \alpha_i} + v^\top \frac{\partial R}{\partial \alpha_i}$$
One primal equation, one adjoint equation
Computational cost = 2, independent of $n$
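A minimal sketch of the adjoint procedure follows. The model is a hypothetical one chosen for illustration: $R(\alpha, u) = A u - b(\alpha)$ with a fixed, non-symmetric 2x2 matrix `A` and cost $J = \frac{1}{2}|u|^2 + \frac{1}{2}|\alpha|^2$. None of these choices or helper names come from the slides.

```python
# Hypothetical model: R(alpha, u) = A u - b(alpha), with n = 3 controls
# and 2 state variables.  J = 0.5*|u|^2 + 0.5*|alpha|^2.

A = [[4.0, 1.0], [2.0, 3.0]]          # dR/du (constant, non-symmetric)

def solve2(M, rhs):
    """Solve the 2x2 system M x = rhs by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]

def transpose2(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def b(alpha):
    return [alpha[0] ** 2 + alpha[1], alpha[1] * alpha[2] + alpha[2]]

def dR_dalpha(alpha):
    """Entry i is the vector dR/dalpha_i = -db/dalpha_i."""
    return [[-2.0 * alpha[0], 0.0], [-1.0, -alpha[2]], [0.0, -(alpha[1] + 1.0)]]

def cost(alpha):
    u = solve2(A, b(alpha))
    return 0.5 * (u[0] ** 2 + u[1] ** 2) + 0.5 * sum(a * a for a in alpha)

def gradient_adjoint(alpha):
    """Gradient via one primal solve and one adjoint solve."""
    u = solve2(A, b(alpha))                          # primal solve
    v = solve2(transpose2(A), [-u[0], -u[1]])        # (dR/du)^T v = -(dJ/du)^T
    n_solves = 2
    # dJ/dalpha_i = partial J/partial alpha_i + v^T dR/dalpha_i  (no further solves)
    grad = [alpha[i] + v[0] * r[0] + v[1] * r[1]
            for i, r in enumerate(dR_dalpha(alpha))]
    return grad, n_solves

def finite_difference_gradient(alpha, h=1.0e-6):
    """Central differences, used only as a check."""
    grad = []
    for i in range(len(alpha)):
        ap = list(alpha); ap[i] += h
        am = list(alpha); am[i] -= h
        grad.append((cost(ap) - cost(am)) / (2.0 * h))
    return grad

alpha0 = [1.0, 2.0, 0.5]
grad, n_solves = gradient_adjoint(alpha0)
fd = finite_difference_gradient(alpha0)
print(grad, fd, n_solves)
```

The number of linear solves stays at two however many controls are added; only the cheap inner products in the final loop grow with $n$.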

Continuous vs Discrete
Continuous approach:
  Start with governing PDE $R(\alpha, u) = 0$
  Derive adjoint PDE and boundary conditions
  Discretize adjoint PDE and solve
  Must be re-derived whenever cost function changes
  Gradient is not consistent: discretization error
Discrete approach:
  Start with discrete approximation $R(\alpha, u) = 0$
  Derive discrete adjoint equations
  Solve discrete adjoint equations
  True gradient of discrete solution
  Can be automated using AD

Techniques for computing gradients
Hand differentiation
Finite difference method
Complex variable method
Automatic Differentiation (AD)
  Computer code to compute some function
  Chain rule of differentiation
  Generates a code to compute derivatives
  ADIFOR, ADOLC, ODYSEE, TAMC, TAF, TAPENADE
  see http://www.autodiff.org
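Two of the listed techniques can be compared in a few lines. The sketch below contrasts a forward finite difference with the complex variable (complex-step) method; the test function $f(x) = e^x \sin x$, the step sizes and the function names are arbitrary choices made for illustration.

```python
import cmath
import math

def f(z):
    """Test function; written with cmath so it accepts complex arguments."""
    return cmath.exp(z) * cmath.sin(z)

def exact_derivative(x):
    return math.exp(x) * (math.sin(x) + math.cos(x))

def forward_difference(x, h=1.0e-6):
    """(f(x+h) - f(x)) / h: truncation error O(h) plus cancellation error."""
    return (f(x + h).real - f(x).real) / h

def complex_step(x, h=1.0e-20):
    """Im f(x + i h) / h: no subtraction, so h can be made tiny."""
    return f(complex(x, h)).imag / h

x0 = 1.0
err_fd = abs(forward_difference(x0) - exact_derivative(x0))
err_cs = abs(complex_step(x0) - exact_derivative(x0))
print(err_fd, err_cs)
```

The finite difference error is limited by the trade-off between truncation and cancellation, while the complex-step result is accurate to roughly machine precision, provided the code being differentiated is written so that it works with complex numbers.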

Derivatives
Given a program $P$ computing a function $F$
$$F : \mathbb{R}^m \to \mathbb{R}^n, \qquad X \mapsto Y$$
build a program that computes derivatives of $F$
$X$: independent variables
$Y$: dependent variables

Derivatives
Jacobian matrix: $J = \left[ \dfrac{\partial y_j}{\partial x_i} \right]$
Directional or tangent derivative: $\dot{Y} = J \dot{X}$
Adjoint mode: $\bar{X} = J^\top \bar{Y}$
Gradients ($n = 1$ output): $J = \left[ \dfrac{\partial y}{\partial x_i} \right]$
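The tangent and adjoint products can be illustrated with a small hand-coded Jacobian. The function `F` below (with $m = 2$ inputs and $n = 3$ outputs) and the helper names are hypothetical. The check at the end uses the identity $\bar{Y}^\top (J \dot{X}) = (J^\top \bar{Y})^\top \dot{X}$, a common consistency test for a tangent/adjoint pair.

```python
import math

def F(x):
    """Hypothetical F: R^2 -> R^3."""
    return [x[0] * x[1], math.sin(x[0]), x[0] + x[1] ** 2]

def jacobian(x):
    """Row j holds dy_j/dx_i."""
    return [[x[1], x[0]],
            [math.cos(x[0]), 0.0],
            [1.0, 2.0 * x[1]]]

def tangent(x, xdot):
    """Ydot = J Xdot (one directional derivative of all outputs)."""
    J = jacobian(x)
    return [sum(J[j][i] * xdot[i] for i in range(2)) for j in range(3)]

def adjoint(x, ybar):
    """Xbar = J^T Ybar (sensitivity of one weighted output to all inputs)."""
    J = jacobian(x)
    return [sum(J[j][i] * ybar[j] for j in range(3)) for i in range(2)]

x = [0.7, -1.3]
xdot = [1.0, 2.0]
ybar = [0.5, -1.0, 3.0]

ydot = tangent(x, xdot)
xbar = adjoint(x, ybar)

lhs = sum(yb * yd for yb, yd in zip(ybar, ydot))   # Ybar . (J Xdot)
rhs = sum(xb * xd for xb, xd in zip(xbar, xdot))   # (J^T Ybar) . Xdot
print(lhs, rhs)
```

An AD tool does not form `jacobian` explicitly; it generates code that evaluates these two products directly.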

Forward differentiation
Program $P$ is a sequence of instructions $F_k$
  $T_0 = X$, given
  $k$'th line: $T_k = F_k(T_{k-1})$
Function is a composition
$$F = F_p \circ F_{p-1} \circ \ldots \circ F_1$$
Chain rule
$$\dot{Y} = F'(X)\,\dot{X} = F'_p(T_{p-1})\, F'_{p-1}(T_{p-2}) \ldots F'_1(T_0)\, \dot{X}$$
$X, \dot{X} \to Y, \dot{Y}$
$\text{cost}(\dot{Y}) = 4 * \text{cost}(Y)$

Differentiation: Example
A simple example
$$f = (xy + \sin x + 4)(3y^2 + 6)$$
Computer code, $f = t_{10}$
$$\begin{aligned}
t_1 &= x \\
t_2 &= y \\
t_3 &= t_1 t_2 \\
t_4 &= \sin t_1 \\
t_5 &= t_3 + t_4 \\
t_6 &= t_5 + 4 \\
t_7 &= t_2^2 \\
t_8 &= 3 t_7 \\
t_9 &= t_8 + 6 \\
t_{10} &= t_6 t_9
\end{aligned}$$

F77 code: costfunc.f

      subroutine costfunc(x, y, f)
      t1 = x
      t2 = y
      t3 = t1*t2
      t4 = sin(t1)
      t5 = t3 + t4
      t6 = t5 + 4
      t7 = t2**2
      t8 = 3.0*t7
      t9 = t8 + 6.0
      t10 = t6*t9
      f = t10
      end

Differentiation: Direct mode
Apply chain rule of differentiation
$$\begin{array}{ll}
t_1 = x & \dot{t}_1 = \dot{x} \\
t_2 = y & \dot{t}_2 = \dot{y} \\
t_3 = t_1 t_2 & \dot{t}_3 = \dot{t}_1 t_2 + t_1 \dot{t}_2 \\
t_4 = \sin(t_1) & \dot{t}_4 = \cos(t_1)\,\dot{t}_1 \\
t_5 = t_3 + t_4 & \dot{t}_5 = \dot{t}_3 + \dot{t}_4 \\
t_6 = t_5 + 4 & \dot{t}_6 = \dot{t}_5 \\
t_7 = t_2^2 & \dot{t}_7 = 2 t_2 \dot{t}_2 \\
t_8 = 3 t_7 & \dot{t}_8 = 3 \dot{t}_7 \\
t_9 = t_8 + 6 & \dot{t}_9 = \dot{t}_8 \\
t_{10} = t_6 t_9 & \dot{t}_{10} = \dot{t}_6 t_9 + t_6 \dot{t}_9
\end{array}$$
$\dot{x} = 1$, $\dot{y} = 0$ gives $\dot{t}_{10} = \dfrac{\partial f}{\partial x}$, and $\dot{x} = 0$, $\dot{y} = 1$ gives $\dot{t}_{10} = \dfrac{\partial f}{\partial y}$

      tapenade -d -vars "x y" -outvars f costfunc.f

Automatic Differentiation: Direct mode

      SUBROUTINE COSTFUNC_D(x, xd, y, yd, f, fd)
      t1d = xd
      t1 = x
      t2d = yd
      t2 = y
      t3d = t1d*t2 + t1*t2d
      t3 = t1*t2
      t4d = t1d*COS(t1)
      t4 = SIN(t1)
      t5d = t3d + t4d
      t5 = t3 + t4
      t6d = t5d
      t6 = t5 + 4
      t7d = 2*t2*t2d
      t7 = t2**2
      t8d = 3.0*t7d
      t8 = 3.0*t7
      t9d = t8d
      t9 = t8 + 6.0
      t10d = t6d*t9 + t6*t9d
      t10 = t6*t9
      fd = t10d
      f = t10
      END
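To see what the tangent code computes, here is a line-by-line Python transliteration of `COSTFUNC_D`, offered only as an illustration (it is not Tapenade output). It is called twice, once per seed direction, and compared with partial derivatives of $f = (xy + \sin x + 4)(3y^2 + 6)$ worked out by hand; the evaluation point is arbitrary.

```python
import math

def costfunc_d(x, xd, y, yd):
    """Tangent (forward) mode: returns f and its directional derivative fd."""
    t1d = xd
    t1 = x
    t2d = yd
    t2 = y
    t3d = t1d * t2 + t1 * t2d
    t3 = t1 * t2
    t4d = t1d * math.cos(t1)
    t4 = math.sin(t1)
    t5d = t3d + t4d
    t5 = t3 + t4
    t6d = t5d
    t6 = t5 + 4.0
    t7d = 2.0 * t2 * t2d
    t7 = t2 ** 2
    t8d = 3.0 * t7d
    t8 = 3.0 * t7
    t9d = t8d
    t9 = t8 + 6.0
    t10d = t6d * t9 + t6 * t9d
    t10 = t6 * t9
    return t10, t10d

x, y = 1.0, 2.0

# One call per independent variable: seed (xd, yd) = (1, 0), then (0, 1).
f, df_dx = costfunc_d(x, 1.0, y, 0.0)
_, df_dy = costfunc_d(x, 0.0, y, 1.0)

# Hand-derived partial derivatives for comparison.
exact_dx = (y + math.cos(x)) * (3.0 * y ** 2 + 6.0)
exact_dy = x * (3.0 * y ** 2 + 6.0) + (x * y + math.sin(x) + 4.0) * 6.0 * y
print(df_dx, exact_dx, df_dy, exact_dy)
```

Each call yields one directional derivative, so a full gradient with respect to $n$ inputs needs $n$ calls in this mode.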

Backward differentiation
Program $P$ is a sequence of instructions $F_k$
  $T_0 = X$, given
  $k$'th line: $T_k = F_k(T_{k-1})$
Function is a composition
$$F = F_p \circ F_{p-1} \circ \ldots \circ F_1$$
Chain rule
$$\bar{X} = [F'(X)]^\top \bar{Y} = [F'_1(T_0)]^\top [F'_2(T_1)]^\top \ldots [F'_p(T_{p-1})]^\top \bar{Y}$$
$X, \bar{Y} \to \bar{X}$
$\text{cost}(\bar{X}) = 4 * \text{cost}(Y)$
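The backward chain rule can be sketched by hand for $f = (xy + \sin x + 4)(3y^2 + 6)$, the function that `costfunc.f` computes in the slides. The Python below is a hand-written illustration, not Tapenade output; the function name `costfunc_b` and the variable suffix `b` are naming assumptions. A forward sweep stores the intermediates, then a backward sweep applies the transposed derivative of each line in reverse order.

```python
import math

def costfunc_b(x, y, fb=1.0):
    """Reverse (adjoint) mode: returns f and (xb, yb) = fb * (df/dx, df/dy)."""
    # Forward sweep: evaluate and keep the intermediate values.
    t1 = x
    t2 = y
    t3 = t1 * t2
    t4 = math.sin(t1)
    t5 = t3 + t4
    t6 = t5 + 4.0
    t7 = t2 ** 2
    t8 = 3.0 * t7
    t9 = t8 + 6.0
    t10 = t6 * t9
    f = t10

    # Backward sweep: visit the lines in reverse order.
    t10b = fb                       # f = t10
    t6b = t9 * t10b                 # t10 = t6*t9
    t9b = t6 * t10b
    t8b = t9b                       # t9 = t8 + 6
    t7b = 3.0 * t8b                 # t8 = 3*t7
    t2b = 2.0 * t2 * t7b            # t7 = t2**2
    t5b = t6b                       # t6 = t5 + 4
    t3b = t5b                       # t5 = t3 + t4
    t4b = t5b
    t1b = math.cos(t1) * t4b        # t4 = sin(t1)
    t1b += t2 * t3b                 # t3 = t1*t2
    t2b += t1 * t3b
    xb = t1b                        # t1 = x
    yb = t2b                        # t2 = y
    return f, xb, yb

x, y = 1.0, 2.0
f, xb, yb = costfunc_b(x, y)

# Hand-derived partial derivatives for comparison.
exact_dx = (y + math.cos(x)) * (3.0 * y ** 2 + 6.0)
exact_dy = x * (3.0 * y ** 2 + 6.0) + (x * y + math.sin(x) + 4.0) * 6.0 * y
print(xb, exact_dx, yb, exact_dy)
```

A single backward sweep returns both partial derivatives, which is why this mode suits a scalar cost function with many design variables.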
