Adjoint approach to optimization




  1. Adjoint approach to optimization
Praveen. C
praveen@math.tifrbng.res.in
Tata Institute of Fundamental Research, Center for Applicable Mathematics, Bangalore 560065
http://math.tifrbng.res.in
Health, Safety and Environment Group, BARC, 6-7 October, 2010
Praveen. C (TIFR-CAM), Optimization, BARC, 6 Oct 2010

  2. Outline
• Minimum of a function
• Constrained minimization
• Finite difference approach
• Adjoint approach
• Automatic differentiation
• Example

  3. Minimum of a function
[Figure: graph of f(x) with a minimum at x = x0, where f'(x0) = 0.]

  4. Steepest descent method
[Figure: contours of f(x1, x2) with the gradient direction ∇f(x1, x2).]
Update rule:
    x_{n+1} = x_n − s_n ∇f(x_n)
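The update rule above can be sketched in a few lines of Python. This is an illustrative fragment only; the quadratic objective and the fixed step size s_n = 0.1 are assumptions made for the example, not part of the slides:

```python
import numpy as np

def steepest_descent(grad_f, x0, step=0.1, n_iter=100):
    """Fixed-step steepest descent: x_{n+1} = x_n - s_n * grad f(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - step * grad_f(x)
    return x

# Quadratic bowl f(x1, x2) = (x1 - 1)^2 + 2*(x2 + 3)^2, minimum at (1, -3).
grad_f = lambda x: np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)])
x_min = steepest_descent(grad_f, [0.0, 0.0])
```

In practice the step size s_n is chosen by a line search rather than held fixed.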

  5. Objectives and controls
• Objective function J(α) = J(α, u): mathematical representation of system performance
• Control variables α
  ◮ Parametric controls: α ∈ R^n
  ◮ Infinite dimensional controls: α : X → Y
  ◮ Shape: α ∈ set of admissible shapes
• State variable u: solution of an ODE or PDE
    R(α, u) = 0   ⇒   u = u(α)

  6. Gradient-based minimization: blackbox/FD approach
    min over β ∈ R^N of I(β, Q(β))
• Initialize β^0, n = 0
• For n = 0, ..., N_iter
  ◮ Solve R(β^n, Q^n) = 0
  ◮ For j = 1, ..., N
    ⋆ β^{n(j)} = [β^n_1, ..., β^n_j + Δβ_j, ..., β^n_N]^⊤
    ⋆ Solve R(β^{n(j)}, Q^{n(j)}) = 0
    ⋆ dI/dβ_j ≈ [I(β^{n(j)}, Q^{n(j)}) − I(β^n, Q^n)] / Δβ_j
  ◮ Steepest descent step: β^{n+1} = β^n − s^n (dI/dβ)(β^n)
Cost of FD-based steepest descent:
    Cost = O(N + 1) · N_iter = O(N + 1) · O(N) = O(N^2)
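The inner FD loop above can be sketched as follows. Here the expensive state solve R(β, Q) = 0 is hidden inside each evaluation of I; the algebraic toy objective is an assumption standing in for a real solver:

```python
import numpy as np

def fd_gradient(I, beta, dbeta=1e-6):
    """One-sided finite-difference gradient: N + 1 evaluations of I,
    each of which would hide a full state solve R(beta, Q) = 0."""
    beta = np.asarray(beta, dtype=float)
    I0 = I(beta)                        # baseline solve
    g = np.zeros_like(beta)
    for j in range(beta.size):          # one extra solve per design variable
        pert = beta.copy()
        pert[j] += dbeta
        g[j] = (I(pert) - I0) / dbeta
    return g

# Toy objective standing in for I(beta, Q(beta)); exact gradient at
# beta = (1, 2) is (2*b0 + 3*b1, 3*b0) = (8, 3).
I = lambda b: b[0] ** 2 + 3.0 * b[0] * b[1]
g = fd_gradient(I, [1.0, 2.0])
```

The N + 1 solves per descent step are exactly why the total cost scales as O(N^2).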

  7. Accuracy of FD: Choice of step size
    (d/dx) f(x0) = [f(x0 + δ) − f(x0)] / δ + O(δ)
In principle, choosing a small δ reduces the error. But computers have finite precision: instead of f(x0) the computer gives f(x0) + O(ε), where ε = machine precision. Then
    ([f(x0 + δ) + O(ε)] − [f(x0) + O(ε)]) / δ = [f(x0 + δ) − f(x0)] / δ + C1 ε/δ
                                              = (d/dx) f(x0) + C2 δ + C1 ε/δ
where C1, C2 depend on f, x0 and the precision, and C2 δ + C1 ε/δ is the total error. The total error is least when
    δ = δ_opt = C3 √ε,   C3 = √(C1 / C2)
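The truncation/roundoff trade-off can be seen numerically; this small experiment (f = exp and the specific step sizes are assumptions chosen for illustration) compares the forward-difference error near, below, and above δ_opt ≈ √ε ≈ 1e-8 in double precision:

```python
import numpy as np

# Forward-difference error for f(x) = exp(x) at x0 = 1: truncation error
# grows like C2*delta, roundoff like C1*eps/delta; the total is smallest
# near delta = sqrt(eps) ~ 1e-8 in double precision.
f, x0, exact = np.exp, 1.0, np.exp(1.0)
err = lambda d: abs((f(x0 + d) - f(x0)) / d - exact)

err_opt = err(1e-8)      # near the optimal step
err_small = err(1e-12)   # roundoff-dominated
err_large = err(1e-3)    # truncation-dominated
```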

  8. Accuracy of FD: Choice of step size
    Precision | ε       | δ_opt
    Single    | 10^-8   | 10^-4
    Double    | 10^-16  | 10^-8
[Figure: total error vs step size δ; the truncation error C2 δ grows with δ, the roundoff error C1 ε/δ decays, and the minimum lies at δ_opt.]
See: Brian J. McCartin, "Seven Deadly Sins of Numerical Computation", The American Mathematical Monthly, Vol. 105, No. 10 (Dec. 1998), pp. 929-941.

  9. Drag gradient using FD (Samareh)
[Figure: % error in finite difference approximations of the drag gradient vs scaled step size, for the design variables mid chord, tip chord, LE sweep2, LE sweep3 and twist (root, mid, tip); errors are roundoff-dominated at small steps and truncation-dominated at large steps.]

  10. Iterative problems
    I(β, Q), where R(β, Q) = 0
• Q is implicitly defined and requires an iterative solution method
• Assume a Q^0 and iterate Q^n → Q^{n+1} until ||R(β, Q^n)|| ≤ TOL
• If TOL is too small, we need too many iterations
• In many problems we cannot reduce ||R(β, Q^n)|| to small values
• This means the numerical value of I is noisy: a finite difference will contain too much error, and is useless
RAE5243 airfoil, Mach = 0.68, Re = 19 million, AOA = 2.5 deg:
    iter    Lift                 Drag
    41496   0.824485788042416    1.627593747613790E-002
    41497   0.824485782714867    1.627593516695762E-002
    41498   0.824485777387834    1.627593285794193E-002
    41499   0.824485772061306    1.627593054909022E-002
    41500   0.824485766735297    1.627592824040324E-002

  11. Complex variable method
    f(x0 + iδ) = f(x0) + iδ f'(x0) + O(δ^2) + i O(δ^3)
    f'(x0) = (1/δ) imag[f(x0 + iδ)] + O(δ^2)
• No roundoff error (no subtractive cancellation)
• We can take δ to be very small, e.g. δ = 10^-20
• Can be easily implemented
  ◮ fortran: redeclare real variables as complex
  ◮ matlab: no change
• Iterative problems: β → β + iΔβ
  ◮ Obtain Q̃ = Q(β + iΔβ) by solving R(β + iΔβ, Q̃) = 0
  ◮ Then gradient: I'(β) ≈ (1/Δβ) imag[I(β + iΔβ, Q(β + iΔβ))]
• Computational cost is O(N^2) or higher (due to complex arithmetic)
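The complex-step formula is as easy to use as FD; this sketch (the test function f is an arbitrary choice for illustration) shows the tiny step δ = 1e-20 causing no cancellation:

```python
import numpy as np

def complex_step(f, x0, delta=1e-20):
    """Complex-step derivative: f'(x0) ~ imag(f(x0 + i*delta)) / delta.
    No subtraction occurs, so delta can be far below sqrt(machine eps)."""
    return np.imag(f(x0 + 1j * delta)) / delta

# f(x) = exp(x) sin(x); exact derivative exp(x)(sin(x) + cos(x)).
f = lambda x: np.exp(x) * np.sin(x)
d = complex_step(f, 1.0)
```

Unlike FD, the result here is accurate to machine precision, matching the slide's point that roundoff disappears.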

  12. Objectives and controls
• Objective function J(α) = J(α, u): mathematical representation of system performance
• Control variables α
  ◮ Parametric controls: α ∈ R^n
  ◮ Infinite dimensional controls: α : X → Y
  ◮ Shape: α ∈ set of admissible shapes
• State variable u: solution of an ODE or PDE, R(α, u) = 0

  13. Mathematical formulation
• Constrained minimization problem:
    min over α of J(α, u) subject to R(α, u) = 0
• Find δα such that δJ < 0:
    δJ = (∂J/∂α) δα + (∂J/∂u) δu
       = (∂J/∂α) δα + (∂J/∂u)(∂u/∂α) δα
       = [∂J/∂α + (∂J/∂u)(∂u/∂α)] δα =: G δα
• Steepest descent: δα = −ε G^⊤, so that
    δJ = −ε G G^⊤ = −ε ||G||^2 < 0


  15. Sensitivity approach
• Linearized state equation R(α, u) = 0:
    (∂R/∂α) δα + (∂R/∂u) δu = 0,   or   (∂R/∂u)(∂u/∂α) = −∂R/∂α
• Solve the sensitivity equation iteratively, e.g.
    (∂/∂t)(∂u/∂α) + (∂R/∂u)(∂u/∂α) = −∂R/∂α
• Gradient:
    dJ/dα = ∂J/∂α + (∂J/∂u)(∂u/∂α)

  16. Sensitivity approach: Computational cost
• n design variables: α = (α_1, ..., α_n)
• Solve primal problem R(α, u) = 0 to get u(α)
• For i = 1, ..., n
  ◮ Solve sensitivity equation wrt α_i:
      (∂R/∂u)(∂u/∂α_i) = −∂R/∂α_i
  ◮ Compute derivative wrt α_i:
      dJ/dα_i = ∂J/∂α_i + (∂J/∂u)(∂u/∂α_i)
• One primal equation, n sensitivity equations: computational cost = n + 1
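The sensitivity loop can be made concrete on a toy linear state equation; R(α, u) = A u − B α = 0 with objective J = c^⊤ u is an assumption made up for illustration (a real case would be a discretized PDE):

```python
import numpy as np

# Toy steady state: R(alpha, u) = A u - B alpha = 0, objective J = c^T u,
# so dR/du = A, dR/dalpha_i = -B[:, i], dJ/du = c^T, dJ/dalpha = 0.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])   # n = 3 design variables
c = np.array([1.0, -1.0])

n = B.shape[1]
dJ = np.zeros(n)
for i in range(n):                       # one linear solve PER design variable
    du_dai = np.linalg.solve(A, B[:, i]) # (dR/du) du/dalpha_i = -dR/dalpha_i
    dJ[i] = c @ du_dai                   # dJ/dalpha_i = (dJ/du) du/dalpha_i
```

The loop makes the n + 1 cost visible: n solves here, plus the primal solve that a real problem would need first.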

  17. Adjoint approach
• We have
    δJ = (∂J/∂α) δα + (∂J/∂u) δu   and   (∂R/∂α) δα + (∂R/∂u) δu = 0
• Introduce a new unknown v:
    δJ = (∂J/∂α) δα + (∂J/∂u) δu + v^⊤ [(∂R/∂α) δα + (∂R/∂u) δu]
       = [∂J/∂α + v^⊤ (∂R/∂α)] δα + [∂J/∂u + v^⊤ (∂R/∂u)] δu
• Adjoint equation (chosen to eliminate the δu term):
    (∂R/∂u)^⊤ v = −(∂J/∂u)^⊤
• Iterative solution:
    ∂v/∂t + (∂R/∂u)^⊤ v = −(∂J/∂u)^⊤

  18. Adjoint approach: Computational cost
• n design variables: α = (α_1, ..., α_n)
• Solve primal problem R(α, u) = 0 to get u(α)
• Solve adjoint problem:
    (∂R/∂u)^⊤ v = −(∂J/∂u)^⊤
• For i = 1, ..., n
  ◮ Compute derivative wrt α_i:
      dJ/dα_i = ∂J/∂α_i + v^⊤ (∂R/∂α_i)
• One primal equation, one adjoint equation: computational cost = 2, independent of n
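For contrast, here is the adjoint computation on a toy linear state equation, R(α, u) = A u − B α = 0 with J = c^⊤ u (the matrices are made up for illustration): one transposed solve replaces the n sensitivity solves.

```python
import numpy as np

# Toy steady state: R(alpha, u) = A u - B alpha = 0, objective J = c^T u,
# so dR/du = A, dR/dalpha_i = -B[:, i], dJ/du = c^T, dJ/dalpha = 0.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])   # n = 3 design variables
c = np.array([1.0, -1.0])

# ONE adjoint solve, independent of n: (dR/du)^T v = -(dJ/du)^T
v = np.linalg.solve(A.T, -c)
# dJ/dalpha_i = v^T (dR/dalpha_i), assembled for all i at once
dJ = v @ (-B)
```

Note the per-variable step is now just a dot product; the gradient cost no longer grows with the number of design variables.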

  19. Adjoint: Two approaches
• Continuous (differentiate-then-discretize): PDE → adjoint PDE → discrete adjoint
• Discrete (discretize-then-differentiate): PDE → discrete PDE → discrete adjoint

  20. Techniques for computing gradients
• Hand differentiation
• Finite difference method
• Complex variable method
• Automatic Differentiation (AD)
  ◮ Computer code to compute J(α, u) and R(α, u)
  ◮ Chain rule of differentiation
  ◮ Generates a code to compute derivatives
  ◮ Tools: ADIFOR, ADOLC, ODYSEE, TAMC, TAF, TAPENADE; see http://www.autodiff.org
[Diagram: code for J(α, u) → AD tool → code for ∂J/∂α]

  21. Derivatives
• Given a program P computing a function F : R^m → R^n, X ↦ Y
• Build a program that computes derivatives of F
• X: independent variables; Y: dependent variables

  22. Derivatives
• Jacobian matrix: J = [∂Y_j / ∂X_i]
• Directional or tangent derivative: Ẏ = J Ẋ
• Adjoint mode: X̄ = J^⊤ Ȳ
• Gradients (n = 1 output): J = [∂Y / ∂X_i] = ∇Y
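The tangent mode Ẏ = J Ẋ can be illustrated with a minimal dual-number class; this is a hand-rolled sketch of the idea behind AD tools, not the interface of any of the tools listed above:

```python
class Dual:
    """Minimal forward-mode AD: carry (value, derivative) pairs and apply
    the chain rule operation by operation, as an AD tool would."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)  # product rule
    __rmul__ = __mul__

# Tangent derivative of Y = 3*X*X + 2*X at X = 2, seeded with X-dot = 1;
# the exact derivative is 6*X + 2 = 14.
x = Dual(2.0, 1.0)
y = 3.0 * x * x + 2.0 * x
```

One such sweep with seed Ẋ = e_i yields one Jacobian column; the adjoint (reverse) mode instead propagates X̄ = J^⊤ Ȳ backwards to get a full gradient in one sweep.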
