SLIDE 1

Automatic Differentiation-based perturbation methods for uncertainties and errors

Anca Belme, Massimiliano Martinelli, Laurent Hascoët, Valérie Pascual, Alain Dervieux — Tropics project, http://www-sop.inria.fr/tropics/

INRIA Sophia-Antipolis. Support of FP6-NODESIM-CFD. 44ème AAAF, Nantes, 23-25 mars 2009

SLIDE 2

Overview:

  • 1. Progress in Automatic Differentiation,
  • 2. Numerical error reduction,
  • 3. Perturbation methods for uncertainties.
SLIDE 3
  • 1. Progress in Automatic Differentiation

The TAPENADE program differentiator: TAPENADE (115,000 lines of Java) transforms a FORTRAN program into a FORTRAN program that computes the derivatives of the given program. TAPENADE is organised around an intermediate step: the analysis and differentiation of an abstract imperative language. http://www-sop.inria.fr/tropics/

Overall architecture of TAPENADE: Fortran77, Fortran95 and C parsers feed the Imperative Language (IL) analyzer and the differentiation engine; Fortran77, Fortran95 and C printers produce the differentiated source.

SLIDE 4

Differentiability/Differentiation modes

Direct/Tangent mode:

  • Differentiated code computes (c, δc) → j′(c)·δc, the Gâteaux derivative of a functional
  • Computational cost factor: αT ≈ 4
  • Chain rule applied in the same order as the initial code
  • Does not store intermediate variables

Backward/Reverse mode:

  • Differentiated code computes (c, δj) → (j′(c))∗·δj, the gradient of a functional
  • Computational cost factor: αR ≈ 5
  • Chain rule applied in reverse order to the initial code
  • Needs to store intermediate variables in a first direct sweep
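The two modes can be sketched on a toy functional. A minimal Python illustration (the function j(c) = sin(c²) is an arbitrary stand-in, not from the talk):

```python
import math

# Direct/tangent mode: propagate (value, derivative) pairs in the same
# order as the original computation; no intermediate storage is needed.
def j_tangent(c, dc):
    v1, d1 = c * c, 2.0 * c * dc              # v1 = c^2
    v2, d2 = math.sin(v1), math.cos(v1) * d1  # v2 = sin(v1)
    return v2, d2                             # (j(c), j'(c).dc)

# Backward/reverse mode: a first direct sweep stores the intermediate
# values, then the chain rule is applied in reverse order.
def j_reverse(c, dj=1.0):
    v1 = c * c                   # direct sweep (v1 is stored)
    v2 = math.sin(v1)
    v1_bar = math.cos(v1) * dj   # reverse sweep: d v2 / d v1
    c_bar = 2.0 * c * v1_bar     #               d v1 / d c
    return v2, c_bar             # (j(c), gradient of j)
```

For a scalar c both modes cost about the same; reverse mode pays off when c has many components and j is scalar, since one reverse sweep yields the whole gradient.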
SLIDE 5
SLIDE 6

Reverse differentiation strategies for iterative solvers

Find Wh such that |Ψh(Wh)| ≤ ε.

If the iteration for Wh is stopped by a test with ε > 0, then Wh depends on the initial conditions of the solution algorithm, much like an unsteady system; moreover, Wh is only piecewise differentiable. It is then possible to apply R-AD to the whole program, including the iterative solution algorithms. Because of the dependence on intermediate states, these states need to be stored (or recomputed). For this case, and for unsteady problems, efficient strategies of storage or recomputation can be introduced by the user by means of "checkpoint" controls.

Assuming ε = 0, the state Wh does not depend on the initial conditions of the solution algorithm and is differentiable with respect to ch. The iteration should then be flagged as "independent" (parallel) in order to avoid unnecessary storage of intermediate values.
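The storage/recomputation trade-off can be sketched in Python on a toy fixed-point loop (the Newton iteration for √c is an illustrative stand-in; TAPENADE generates such sweeps automatically):

```python
def step(w, c):
    # one fixed-point iteration (Newton for w = sqrt(c))
    return 0.5 * (w + c / w)

def step_bar(w, c, w_bar):
    # reverse-mode derivative of one step w.r.t. (w, c)
    return 0.5 * (1.0 - c / w**2) * w_bar, 0.5 / w * w_bar

def reverse_store_all(c, n, w0=1.0):
    # store every intermediate state during the direct sweep
    ws = [w0]
    for _ in range(n):
        ws.append(step(ws[-1], c))
    w_bar, c_bar = 1.0, 0.0
    for k in range(n - 1, -1, -1):       # chain rule in reverse order
        w_bar, dc = step_bar(ws[k], c, w_bar)
        c_bar += dc
    return ws[-1], c_bar

def reverse_one_checkpoint(c, n, w0=1.0):
    # store only the state at mid-loop; recompute each half's states
    # during the reverse sweep (less memory, more computation)
    mid = n // 2
    w = w0
    for _ in range(mid):
        w = step(w, c)
    w_mid = w                            # the single checkpoint
    for _ in range(n - mid):
        w = step(w, c)
    w_final, w_bar, c_bar = w, 1.0, 0.0
    for seg_w0, seg_len in ((w_mid, n - mid), (w0, mid)):
        ws = [seg_w0]                    # recompute this segment's states
        for _ in range(seg_len):
            ws.append(step(ws[-1], c))
        for k in range(seg_len - 1, -1, -1):
            w_bar, dc = step_bar(ws[k], c, w_bar)
            c_bar += dc
    return w_final, c_bar
```

Both variants return the same gradient; the checkpointed one halves the peak storage at the price of rerunning each segment of the direct sweep once.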
SLIDE 7
  • 1. Progress in Automatic Differentiation

The TAPENADE program differentiator, recent advances: In its version 3 (alpha), TAPENADE takes into account user directives inserted into the sources. The directive II-LOOP allows a more efficient reverse differentiation of parallel loops by inhibiting storage that is useless in the parallel case. The directive NOCHECKPOINT can be used for optimal checkpointing in reverse differentiation: TAPENADE lets the user specify finely which procedure calls must be checkpointed or not. TAPENADE V3 has been extended to deal with Fortran95 and with ANSI C. TAPENADE is optimised for successive reverse differentiations.
SLIDE 8
  • 2. Numerical error reduction

Abstract representation of the Partial Differential Equation: Ψ(W) = 0.

Discretisation of the PDE: Ψh(Wh) = 0 ∈ R^N, with Wh ∈ R^N, Wh = [(Wh)i].

Operators between continuous and discrete spaces:

  Rh : R^N → V ⊂ L²(Ω) ∩ C⁰(Ω̄),  vh ↦ Rh vh
  Th : V → R^N,  v ↦ Th v

W̄h(x, y, z) = (Rh Wh)(x, y, z) ;  W − W̄h = small? corrector?

SLIDE 9
  • 2. Numerical error reduction

A priori estimate:

  Ψh(W) − Ψh(Wh) = Ψh(W)  ⇒  W − Wh ≈ [∂Ψh/∂Wh]⁻¹ Ψh(W),

in practice: δW̄h = Rh [∂Ψh/∂Wh]⁻¹ (Ψh − Th Ψ)(W(h)).

A posteriori estimate:

  Ψ(W) − Ψ(W̄h) = −Ψ(W̄h)  ⇒  W − W̄h ≈ −[∂Ψ/∂W]⁻¹ Ψ(W̄h),

in practice: δW̄h = −Rh [∂Ψh/∂Wh]⁻¹ Th Ψ(Rh Wh); quadrature formulas can be used for Ψ(Rh Wh).

SLIDE 10
  • 2. Numerical error reduction

Goal-oriented error:

Continuous: j(W) = (g, W)_{L²(Ω)} s.t. Ψ(W) = 0, with adjoint (∂Ψ/∂W)∗ p = g.

Discrete: jh = (g, Rh Wh)_{L²(Ω)} s.t. Ψh(Wh) = 0, with gh = Th g and

  (∂Ψh/∂Wh)∗ ph = gh  ⇔  ph = [∂Ψh/∂Wh]⁻∗ Th g,   p̄h = Rh ph.

A fundamental assumption of the present analysis is that this discrete adjoint is a good enough approximation of the continuous adjoint.

SLIDE 11
  • 2. Numerical error reduction

A posteriori goal-oriented errors and correctors

Field corrector: δW̄h = −Rh [∂Ψh/∂Wh]⁻¹ Th Ψ(W̄h)

Direct-linearized goal-oriented corrector: δ1j = −( g , Rh [∂Ψh/∂Wh]⁻¹ Th Ψ(W̄h) )_{L²(Ω)}

Adjoint-based goal-oriented corrector (Giles-Pierce): δ1j = −( ph , Th Ψ(W̄h) )_{L²(Ω)}

SLIDE 12
  • 2. Numerical error reduction

An equivalence: assuming that we have chosen Th = Rh∗, then the direct-linearised and adjoint-based correctors are identical:

  −( g , Rh [∂Ψh/∂Wh]⁻¹ Th Ψ(W̄h) )_{L²(Ω)}
  = −( [∂Ψh/∂Wh]⁻∗ Rh∗ g , Th Ψ(W̄h) )_{R^N}
  = −( Th∗ [∂Ψh/∂Wh]⁻∗ Rh∗ g , Ψ(W̄h) )_{L²(Ω)}
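The identity can be checked numerically with random stand-ins for the operators (a NumPy sketch; the matrices are arbitrary, only the relation Th = Rh∗ with respect to a weighted L² product matters):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 6, 15                 # discrete dofs, fine-grid ("continuous") dofs
h = 1.0 / M                  # quadrature weight of the discrete L2 product

A = rng.normal(size=(N, N)) + 4.0 * np.eye(N)  # stand-in for dPsi_h/dW_h
Rh = rng.normal(size=(M, N))                   # R^N -> V (fine grid)
Th = h * Rh.T                                  # Th = Rh^* for (u, v) = h sum u_i v_i
g = rng.normal(size=M)                         # functional weight g in V
Psi = rng.normal(size=M)                       # residual Psi(W_h) on the fine grid

def l2(u, v):
    # discrete approximation of the L2(Omega) inner product
    return h * float(np.dot(u, v))

# direct-linearized corrector: -(g, Rh A^{-1} Th Psi)_{L2}
d_direct = -l2(g, Rh @ np.linalg.solve(A, Th @ Psi))

# adjoint-based corrector: p_h = A^{-*} Th g, then -(p_h, Th Psi)_{R^N}
p_h = np.linalg.solve(A.T, Th @ g)
d_adjoint = -float(p_h @ (Th @ Psi))
```

Both expressions reduce to −h² gᵀ Rh A⁻¹ Rhᵀ Ψ, so the two correctors agree to rounding error.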

SLIDE 13
  • 2. Numerical error reduction

A preliminary numerical example 3D Euler flow around a wing with NACA0012 section.

  • 1. Direct calculation
  • 2. Direct calculation + Correction

CPU time is 30% larger; the error on entropy deviation is 10 times smaller.

Spurious entropy generation for a direct computation of a steady flow and for a corrected one.

SLIDE 14
  • 2. Numerical error reduction by correction, concluded

Which a posteriori goal-oriented errors and correctors?

Direct-linearized goal-oriented corrector: δ1j = −( g , Rh [∂Ψh/∂Wh]⁻¹ Th Ψ(W̄h) )_{L²(Ω)}

Adjoint-based goal-oriented corrector: δ1j = −( ph , Th Ψ(W̄h) )_{L²(Ω)}

Adjoints, as compared with linearised perturbations, are slightly more complex to compute for steady PDEs, and much more complex for unsteady PDEs (even though TAPENADE is well adapted to this exercise). For computing correctors, and for several mesh adaptation applications, the direct-linearized goal-oriented corrector can be used for an identical result.

SLIDE 15
  • 2. A priori numerical error reduction, mesh adaption

Minimizing the P1-interpolation error (Loseille, PhD, Paris VI, 2008):

  ||u − πM u||² = ∫ ( |∂²u/∂ξ²|·m²ξ + |∂²u/∂η²|·m²η )² dxdy

where ξ and η are the directions of diagonalization of the Hessian of u.

  minM EM  under the constraint  NM = N.

This can be solved analytically.
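A one-dimensional analogue can be sketched in Python: equidistributing mesh nodes against a power of |u″| reduces the P1-interpolation error compared with a uniform mesh (the test function and the exponent 1/3 are illustrative assumptions, not the talk's 2D result):

```python
import numpy as np

def p1_l2_error(nodes, u, xq):
    # L2 norm of u - (P1 interpolant of u on `nodes`), on quadrature grid xq
    err = u(xq) - np.interp(xq, nodes, u(nodes))
    e2 = 0.5 * (err[:-1] ** 2 + err[1:] ** 2) * np.diff(xq)  # trapezoid rule
    return float(np.sqrt(np.sum(e2)))

u = lambda x: np.arctan(40.0 * (x - 0.5))   # sharp internal layer at x = 0.5
xq = np.linspace(0.0, 1.0, 4001)

# |u''| on the fine grid (finite differences)
d2u = np.abs(np.gradient(np.gradient(u(xq), xq), xq))

# equidistribute a mesh density ~ |u''|^(1/3) (illustrative exponent)
density = (d2u + 1e-6) ** (1.0 / 3.0)
cdf = np.cumsum(density)
cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])
N = 21
adapted = np.interp(np.linspace(0.0, 1.0, N), cdf, xq)  # inverse-CDF node placement
uniform = np.linspace(0.0, 1.0, N)

e_uniform = p1_l2_error(uniform, u, xq)
e_adapted = p1_l2_error(adapted, u, xq)
```

The adapted mesh concentrates nodes where the Hessian of u is large, which is the 1D version of the anisotropic metric optimisation above.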

SLIDE 16
  • 2. A priori numerical error reduction, mesh adaption

Optimal metric for interpolation:

  Mopt = CN R⁻¹ diag( |∂²u/∂η²|^(−5/6) |∂²u/∂ξ²|^(1/6) , |∂²u/∂ξ²|^(−5/6) |∂²u/∂η²|^(1/6) ) R    (1)

with: C = ∫ ( |∂²u/∂ξ²| · |∂²u/∂η²| )^(2/6) dxdy.

But this only minimises an interpolation error...

SLIDE 17
  • 2. A priori numerical error reduction, mesh adaption

Approximation a priori error (with Alauzet & Loseille)

Steady Euler equations: W ∈ V = [H¹(Ω)]⁵, ∀φ ∈ V:

  (Ψ(W) , φ) = ∫Ω φ ∇·F(W) dΩ − ∫Γ φ F̂(W)·n dΓ = 0

where Γ = ∂Ω and F̂ denotes the boundary-condition fluxes.

Mixed-Element-Volume approximation: ∀φh ∈ Vh:

  ∫Ωh φh ∇·ΠhF(Wh) dΩh − ∫Γh φh ΠhF̂(Wh)·n dΓh = − ∫Ωh φh Dh(Wh) dΩh,

where Πh is the usual elementwise linear interpolation and Dh denotes a numerical dissipation term.

SLIDE 18
  • 2. A priori numerical error reduction, mesh adaption

A priori adjoint-based error estimate:

  (g, W − Wh) ≈ ((Ψh − Ψ)(W), P),  with  (∂Ψ/∂W)∗ P = g.

The optimal mesh is obtained, after some transformations, by solving:

  Find Mopt = ArgminM ∫Ω |∇P| |F(W) − πMF(W)| dΩ + ∫Γ |P| |(F̄(W) − πMF̄(W))·n| dΓ    (2)

under the constraint C(M) = N. Solved analytically, as in the interpolation case.

Remark: the adjoint-based formulation is compulsory.

SLIDE 19

Application to sonic boom : Hessian-based (Mach L2)

SLIDE 20

Goal-oriented + Hessian-based ("foot print" functional)

SLIDE 21

Pressure track at ground

SLIDE 22
  • 3. Hessian-based perturbation methods for uncertainties

A differentiation problem: given a computer program computing a functional j(c) for c ∈ R^n,

  j(c) = J(c, W(c)),  such that  Ψ(c, W) = 0,

we want, by applying a source-to-source program differentiation software, viz. TAPENADE, to get a computer program computing the second derivatives:

  d²j/dc² = (Hik).

The solution W(c) ∈ R^N of the state equation Ψ(c, W) = 0 is supposed here to be a steady state. It can be obtained by explicit or implicit pseudo-time advancing techniques. An unsteady problem can be treated in a similar manner, but with important extra difficulties.

SLIDE 23

Why do we need the Hessian?

Perturbative methods for uncertainty propagation:

  • Method of Moments:

      μj ≈ j(μc) + (1/2) Σi Hii σi²
      σj² ≈ Σi Gi² σi² + (1/2) Σi,k Hik² σi² σk²

Robust optimization:

  • Gradient-based methods for jrobust(c) = j(c) + ε ||dj/dc||
  • Gradient-free methods for jrobust(c) = j(c) + (1/2) Σi Hii σi²
  • Gradient-free methods for jrobust(c) = j(c) + k σj²

Adjoint-corrected functionals:

  • Gradient-based methods for jcorr(c) = j(c) − (Ψex(c, W), Π0)
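The Method of Moments formulas translate directly into code. A Python sketch for independent inputs, with the gradient G and Hessian H obtained here by finite differences as a stand-in for the AD-computed derivatives:

```python
import numpy as np

def moments_second_order(j, mu_c, sigma, eps=1e-4):
    """mu_j  ~ j(mu) + 1/2 sum_i H_ii s_i^2
       var_j ~ sum_i G_i^2 s_i^2 + 1/2 sum_{i,k} H_ik^2 s_i^2 s_k^2"""
    mu_c = np.asarray(mu_c, float)
    s2 = np.asarray(sigma, float) ** 2
    n = len(mu_c)
    G = np.zeros(n)
    H = np.zeros((n, n))
    # gradient and Hessian diagonal by central differences
    for i in range(n):
        e = np.zeros(n); e[i] = eps
        G[i] = (j(mu_c + e) - j(mu_c - e)) / (2.0 * eps)
        H[i, i] = (j(mu_c + e) - 2.0 * j(mu_c) + j(mu_c - e)) / eps**2
    # off-diagonal Hessian entries by mixed central differences
    for i in range(n):
        for k in range(i + 1, n):
            ei = np.zeros(n); ei[i] = eps
            ek = np.zeros(n); ek[k] = eps
            H[i, k] = H[k, i] = (j(mu_c + ei + ek) - j(mu_c + ei - ek)
                                 - j(mu_c - ei + ek) + j(mu_c - ei - ek)) / (4.0 * eps**2)
    mu_j = j(mu_c) + 0.5 * float(np.sum(np.diag(H) * s2))
    var_j = float(np.sum(G**2 * s2) + 0.5 * np.sum(H**2 * np.outer(s2, s2)))
    return mu_j, var_j
```

For a quadratic functional with Gaussian inputs these formulas are exact, which makes a convenient sanity check.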
SLIDE 24

Second derivative: Tangent-on-Tangent (ToT) approach

  d²j/(dci dck) = −Π0∗ D²i,kΨ + D²i,kJ

with the adjoint Π0 and θk = dW/dck solutions of:

  (∂Ψ/∂W)∗ Π = (∂J/∂W)∗,    (∂Ψ/∂W) dW/dck = −(∂Ψ/∂c) ek

and:

  D²i,kJ = ∂/∂c(∂J/∂c ei) ek + ∂/∂W(∂J/∂c ei) θk + ∂/∂W(∂J/∂c ek) θi + ∂/∂W(∂J/∂W θi) θk
  D²i,kΨ = ∂/∂c(∂Ψ/∂c ei) ek + ∂/∂W(∂Ψ/∂c ei) θk + ∂/∂W(∂Ψ/∂c ek) θi + ∂/∂W(∂Ψ/∂W θi) θk

SLIDE 25

Second derivative: Tangent-on-Reverse (ToR) approach

  ∂/∂ci (∂j/∂c)∗ = (∂²j/∂c²) ei
    = ∂/∂c(∂J/∂c)∗ ei + ∂/∂W(∂J/∂c)∗ θi
      − ∂/∂c[(∂Ψ/∂c)∗ Π0] ei − ∂/∂W[(∂Ψ/∂c)∗ Π0] θi − (∂Ψ/∂c)∗ λi

with Π0 solution of the adjoint system (∂Ψ/∂W)∗ Π = (∂J/∂W)∗, and θi and λi solutions of the systems:

  (∂Ψ/∂W) θi = −(∂Ψ/∂c) ei
  (∂Ψ/∂W)∗ λi = ∂/∂c(∂J/∂W)∗ ei + ∂/∂W(∂J/∂W)∗ θi − ∂/∂c[(∂Ψ/∂W)∗ Π0] ei − ∂/∂W[(∂Ψ/∂W)∗ Π0] θi

SLIDE 26

Full Hessian: ToT vs. ToR

Common part:

  • Adjoint state: (∂Ψ/∂W)∗ Π = (∂J/∂W)∗
  • n linear systems to solve: (∂Ψ/∂W) dW/dci = −(∂Ψ/∂c) ei

Specific parts (assuming unitary cost for a single residual evaluation Ψ(c, W)):

  • ToT: computation of D²ikΨ and D²ikJ  ⇒  n(n + 1)/2 · αT²   (1 < αT < 4)
  • ToR: n (adjoint) linear systems to solve: (∂Ψ/∂W)∗ λi = J̄̇W − Ψ̄̇W  ⇒  n(niter + αT) αR   (1 < αR ≤ 5)

Use ToT when n < 2 αR (niter + αT) / αT².
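The crossover can be tabulated with a short Python helper (the values of αT, αR and niter are illustrative; the criterion reproduces the slide's rule up to the n ≈ n + 1 approximation):

```python
def cost_tot(n, alpha_t=3.0):
    # ToT: n(n+1)/2 second-order terms, each ~ alpha_t^2 residual evaluations
    return n * (n + 1) / 2.0 * alpha_t**2

def cost_tor(n, n_iter=100, alpha_t=3.0, alpha_r=5.0):
    # ToR: n adjoint linear systems, each ~ (n_iter + alpha_t) * alpha_r residuals
    return n * (n_iter + alpha_t) * alpha_r

def prefer_tot(n, n_iter=100, alpha_t=3.0, alpha_r=5.0):
    # slide criterion: ToT is cheaper when n < 2 alpha_r (n_iter + alpha_t) / alpha_t^2
    return n < 2.0 * alpha_r * (n_iter + alpha_t) / alpha_t**2
```

With niter = 100, αT = 3, αR = 5 the threshold is about n ≈ 114: ToT wins for a handful of uncertain parameters, ToR for many.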

SLIDE 27

Application to reduced order modeling of 3D-Euler flow

Functional to model:

  • Inputs: Mach number and angle of attack
  • Output: drag coefficient.

Numerical basis: Upwind vertex-centered finite-volume, implicit pseudo-time advancing with first-order Jacobian

SLIDE 28

Building the reduced quadratic model

[Figure: four surface plots of drag versus Mach number (0.78-0.88) and angle of attack (1.6-2.4): nonlinear simulations; first-order Taylor expansion around (α = 2.0, M = 0.83); second-order Taylor expansion around (α = 2.0, M = 0.83); and the relative difference (%) between the nonlinear simulations and the second-order Taylor model.]

SLIDE 29

Comparing the reduced quadratic model with metamodels (∗)

  Metamodel      Points   Expectation   Rel. error   Variance     Rel. error
  RBF            8        6.855e-3      0.029%       1.562e-7     0.579%
  KRG            8        6.853e-3      0.058%       1.551e-7     0.128%
  First-order             6.830e-3      0.393%       1.547e-7     0.386%
  Second-order            6.860e-3      0.043%       1.558e-7     0.321%

  Memory (MB)
  Flow solver: first-order Jacobian + block-diagonal precond. + work vectors    86 + 10 + 35
  First derivatives: ILU(1) precond. + GMRES(200) + differentiated vectors      170 + 250 + 120
  Second derivatives: ILU(1) precond. + GMRES(200) + differentiated vectors     170 + 250 + 240

  CPU time (s)
  Flow solver    403      403 × 8
  Gradient       255
  Hessian        278
  Total          936      3224

(∗) Martinelli & Duvigneau, submitted to a journal.

SLIDE 30

Concluding remarks

Efficiency-oriented AD application:

  • First- and second-derivative architectures well identified.
  • Optimal use of modern iterative solvers.
  • Use reverse mode whenever necessary.
  • When using reverse mode, better efficiency is obtained with the new TAPENADE functionalities.

Current and future works:

  • Ongoing AD studies concerning MPI efficiency.
  • Mesh adaptation, correction and accuracy control for large unsteady state systems.