

  1. Fast Algorithms for Nonlinear Optimal Control for Diffeomorphic Registration
     Andreas Mang, Department of Mathematics, University of Houston
     RICAM, New Trends in PDE-Constrained Optimization, 10/17/2019

  2. Teaser: CLAIRE
        unknowns           CPUs    GPUs   runtime
        50M  (256^3)        512     --    < 2 sec
        50M  (256^3)          1      1    ≈ 6 sec
        200B (4096^3)      8192     --    ≈ 3.5 min
     http://andreasmang.github.io/claire
     [Mang et al., 2016, Gholami et al., 2017, Mang et al., 2019]

  3. Collaborators:
     R. Azencott (Math, UHouston), G. Biros (Oden, UT Austin), M. Brunn (CS, UStuttgart),
     C. Davatzikos (CBIA, UPenn), J. He (Math, UHouston), M. Mehl (CS, UStuttgart),
     J. Herring (Math, UHouston), N. Himthani (Oden, UT Austin), J. Kim (Math, UHouston)

  4. (figure-only slide)

  5. Inverse Problem: find a plausible map $y : \mathbb{R}^d \to \mathbb{R}^d$ such that $(m_0 \circ y)(x) = m_1(x)$ for all $x \in \mathbb{R}^d$.
     (figure: template image $m_0$, deformed template $m_0 \circ y$, reference image $m_1$)
     [Amit, 1994, Modersitzki, 2009, Modersitzki, 2004, Fischer and Modersitzki, 2008]

  6. (figure: reference image $m_1$ and template image $m_0$)
     [Amit, 1994, Modersitzki, 2009, Modersitzki, 2004, Fischer and Modersitzki, 2008]

  7. (figure: registration of $m_0$ to $m_1$ with a diffeomorphic map $y \in \mathrm{diff}(\Omega)$)
     [Amit, 1994, Modersitzki, 2009, Modersitzki, 2004, Fischer and Modersitzki, 2008]

  8. (figure: registration of $m_0$ to $m_1$ with a non-diffeomorphic map $y \notin \mathrm{diff}(\Omega)$)
     [Amit, 1994, Modersitzki, 2009, Modersitzki, 2004, Fischer and Modersitzki, 2008]

  9. Building Blocks

  10. Flows of Diffeomorphisms: introduce a pseudo-time variable $t \in [0, 1]$ and parameterize $y$ by a velocity field $v$:
      $\partial_t y = v(y), \quad y(0) = \mathrm{id}_{\mathbb{R}^d}$
      (figure: two-parameter flow map $y(s, t, x)$ between time points $s$ and $t$)
      [Younes, 2010]
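
A minimal sketch (not from the talk) of integrating the flow equation $\partial_t y = v(y)$ with a classical RK4 scheme; the velocity field `v_rot`, the grid, and the number of time steps are illustrative placeholders.

```python
import numpy as np

def integrate_flow(v, y0, nt=32, T=1.0):
    """Integrate dy/dt = v(y), y(0) = y0, with classical RK4.

    v  : callable mapping an (n, d) array of positions to velocities
    y0 : (n, d) array of initial positions (e.g. grid nodes, identity map)
    """
    y, h = y0.copy(), T / nt
    for _ in range(nt):
        k1 = v(y)
        k2 = v(y + 0.5 * h * k1)
        k3 = v(y + 0.5 * h * k2)
        k4 = v(y + h * k3)
        y += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

# toy example: rigid rotation about the origin in 2D
v_rot = lambda y: np.stack([-y[:, 1], y[:, 0]], axis=1)
y0 = np.stack(np.meshgrid(np.linspace(-1, 1, 8),
                          np.linspace(-1, 1, 8)), axis=-1).reshape(-1, 2)
y1 = integrate_flow(v_rot, y0)   # end point y(1) of the flow
```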

  11. Optimal Control Problem (Prototype)
      $\min_{v, y} \; \mathrm{dist}(y(1) \cdot m_0, m_1) + \mathrm{reg}(v)$
      subject to $\partial_t y = v(y), \quad y(0) = \mathrm{id}_{\mathbb{R}^d}$
      Large Deformation Diffeomorphic Metric Mapping (LDDMM)
      [Younes, 2010, Beg et al., 2005]

  12. (figure-only slide)

  13. Regularity
      $\partial_t y = v(y), \quad y(0) = \mathrm{id}$
      $v \in L^2([0, 1], V), \quad V \hookrightarrow W^{s,2}(\mathbb{R}^3)^3, \; s > 5/2$
      $\Rightarrow \; y \in G_V \subseteq \mathrm{diff}(\mathbb{R}^3)$ (smoothness class $1 \le r \le s - 3/2$)
      [Beg et al., 2005, Trouvé, 1998, Dupuis et al., 1998]

  14. Regularity
      $\int_0^1 \|v(t)\|_V^2 \, \mathrm{d}t = \int_0^1 \langle \mathcal{L} v(t), v(t) \rangle_{L^2(\Omega)^d} \, \mathrm{d}t$
      $\mathcal{L} : V \to V^*, \quad \mathcal{L} := (1 - \gamma^2 \nabla^2)^{\kappa}\, \mathrm{id}, \quad \gamma, \kappa > 0$
      $\mathrm{dist}_G(\mathrm{id}_{\mathbb{R}^d}, \phi)^2 = \inf \Big\{ \int_0^1 \|v\|_V^2 \, \mathrm{d}t \; : \; \phi = y(1), \; \partial_t y = v(y), \; y(0) = \mathrm{id}_{\mathbb{R}^d} \Big\}$
      [Beg et al., 2005]
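
As an illustration (my sketch, not CLAIRE code), the norm $\langle \mathcal{L} v, v \rangle$ for one time slice can be evaluated spectrally on a periodic grid, since the operator $(1 - \gamma^2 \nabla^2)^{\kappa}$ is diagonal in Fourier space with symbol $(1 + \gamma^2 |\xi|^2)^{\kappa}$; the grid, spacing, and parameter values are assumptions.

```python
import numpy as np

def sobolev_norm_sq(v, gamma=1.0, kappa=2, h=None):
    """Return <L v, v>_{L^2} for one time slice of a periodic velocity field.

    v : array of shape (d, n1, ..., nd), the d components of v on the grid
    L = (1 - gamma^2 * Laplacian)^kappa, applied via its Fourier symbol.
    """
    dims = v.shape[1:]
    h = h or [2 * np.pi / n for n in dims]
    # |xi|^2 on the Fourier grid (angular frequencies)
    freqs = np.meshgrid(*[2 * np.pi * np.fft.fftfreq(n, d=hi)
                          for n, hi in zip(dims, h)], indexing="ij")
    xi2 = sum(f**2 for f in freqs)
    symbol = (1.0 + gamma**2 * xi2) ** kappa
    total, cell = 0.0, np.prod(h)
    for vi in v:                      # apply L component-wise
        Lvi = np.real(np.fft.ifftn(symbol * np.fft.fftn(vi)))
        total += np.sum(Lvi * vi) * cell
    return total
```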

  15. Regularity (RKHS)
      $V \equiv V_\kappa$ (RKHS with associated kernel $\kappa$)
      $v(t, x) := \sum_{j=1}^q \kappa(x_j(t), x)\, \alpha_j(t)$
      $\|v(t)\|_V^2 = \sum_{j=1}^q \sum_{k=1}^q \kappa(x_j(t), x_k(t))\, \alpha_j(t)^{\mathsf{T}} \alpha_k(t)$
      $\kappa(x, y) \propto \exp(-0.5\, \|x - y\|^2_{\Sigma^{-1}})$
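
A small sketch of the RKHS parameterization at one time instant, assuming an isotropic Gaussian kernel ($\Sigma = \sigma^2 I$); the number of control points, their locations, and $\sigma$ are placeholders of mine.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=0.2):
    """k(x, y) = exp(-0.5 ||x - y||^2 / sigma^2) for rows of X and Y."""
    d2 = np.sum((X[:, None, :] - Y[None, :, :])**2, axis=-1)
    return np.exp(-0.5 * d2 / sigma**2)

def rkhs_velocity(x_eval, x_ctrl, alpha, sigma=0.2):
    """v(x) = sum_j k(x_j, x) alpha_j, evaluated at the rows of x_eval."""
    return gaussian_kernel(x_eval, x_ctrl, sigma) @ alpha

def rkhs_norm_sq(x_ctrl, alpha, sigma=0.2):
    """||v||_V^2 = sum_{j,k} k(x_j, x_k) alpha_j^T alpha_k."""
    K = gaussian_kernel(x_ctrl, x_ctrl, sigma)
    return np.einsum("jk,jd,kd->", K, alpha, alpha)

# toy example with q = 3 control points in 2D
x_ctrl = np.array([[0.2, 0.3], [0.5, 0.5], [0.8, 0.7]])
alpha  = np.array([[0.1, 0.0], [0.0, -0.2], [0.05, 0.05]])
print(rkhs_norm_sq(x_ctrl, alpha))
```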

  16. Distance Functional
      $\mathrm{dist}^{\mathrm{SSD}}(m_0, m_1) = \|m_0 - m_1\|^2_{L^2(\Omega)}$
      (figure: images $m_0$, $m_1$, and the difference image $1 - |m_0 - m_1|$)
      [Sotiras et al., 2013, Modersitzki, 2009]

  17. Distance Functional (figure slide)

  18. Distance Functional
      $\mathrm{dist}^{\mathrm{CC}}(m_0, m_1) = \dfrac{\langle m_1, m_0 \rangle^2_{L^2(\Omega)}}{\langle m_1, m_1 \rangle_{L^2(\Omega)}\, \langle m_0, m_0 \rangle_{L^2(\Omega)}}$
      $\mathrm{dist}^{\mathrm{NGF}}(m_0, m_1) = \int_\Omega 1 - \big((\tilde{\nabla} m_0)^{\mathsf{T}} \tilde{\nabla} m_1\big)^2 \, \mathrm{d}x$
      [Sotiras et al., 2013, Modersitzki, 2009, Haber and Modersitzki, 2006]
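
For concreteness, a sketch (mine, assuming plain NumPy arrays, unit cell volume, and a central-difference gradient) of the three image-distance terms; the $\varepsilon$ safeguard in the normalized gradient field term is an added assumption.

```python
import numpy as np

def dist_ssd(m0, m1):
    """Sum of squared differences: ||m0 - m1||_{L^2}^2 (up to the cell volume)."""
    return np.sum((m0 - m1)**2)

def dist_cc(m0, m1):
    """Normalized cross-correlation: <m1, m0>^2 / (<m1, m1> <m0, m0>)."""
    return np.sum(m0 * m1)**2 / (np.sum(m1 * m1) * np.sum(m0 * m0))

def dist_ngf(m0, m1, eps=1e-3):
    """Normalized gradient fields: integral of 1 - ((~grad m0)^T ~grad m1)^2."""
    g0 = np.stack(np.gradient(m0))
    g1 = np.stack(np.gradient(m1))
    n0 = g0 / np.sqrt(np.sum(g0**2, axis=0) + eps**2)
    n1 = g1 / np.sqrt(np.sum(g1**2, axis=0) + eps**2)
    return np.sum(1.0 - np.sum(n0 * n1, axis=0)**2)
```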

  19. Distance Functional (RKHS)
      point sets $s_j := \{x_{j,1}, \ldots, x_{j,k}\}, \quad j = 1, 2, \ldots$
      [Azencott et al., 2010]

  20. Distance Functional (RKHS)
      $\mathrm{dist}(s_1, s_2) = \dfrac{1}{k^2} \Big( \sum_{i=1}^k \sum_{j=1}^k \kappa(x_{1,i}, x_{1,j}) - 2 \sum_{i=1}^k \sum_{j=1}^k \kappa(x_{1,i}, x_{2,j}) + \sum_{i=1}^k \sum_{j=1}^k \kappa(x_{2,i}, x_{2,j}) \Big)$
      $\kappa(x, y) \propto \exp(-0.5\, \|x - y\|^2_{\Sigma^{-1}})$
      [Azencott et al., 2010]
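
A sketch of this kernel distance between two landmark sets, under the same isotropic Gaussian kernel assumption as before and equal-size point sets; nothing here is CLAIRE code.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=0.2):
    """k(x, y) = exp(-0.5 ||x - y||^2 / sigma^2) for rows of X and Y."""
    d2 = np.sum((X[:, None, :] - Y[None, :, :])**2, axis=-1)
    return np.exp(-0.5 * d2 / sigma**2)

def kernel_dist(s1, s2, sigma=0.2):
    """(1/k^2) (sum k(x1, x1') - 2 sum k(x1, x2) + sum k(x2, x2'))."""
    k = s1.shape[0]
    return (gaussian_kernel(s1, s1, sigma).sum()
            - 2.0 * gaussian_kernel(s1, s2, sigma).sum()
            + gaussian_kernel(s2, s2, sigma).sum()) / k**2
```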

  21. Formulations

  22. Optimal Control Problem
      $\min_{v, y} \; \tfrac{1}{2} \|y(1) \cdot m_0 - m_1\|^2_{L^2(\Omega)} + \tfrac{\beta}{2} \|v\|^2_{L^2([0,1], V)}$
      subject to $\partial_t y = v(y), \quad y(0) = \mathrm{id}_{\mathbb{R}^d}$
      [Younes, 2010, Beg et al., 2005]

  23. Deformation Model
      $\partial_t m + \langle v, \nabla m \rangle = 0$ in $\Omega \times (0, 1]$
      $m = m_0$ in $\Omega \times \{0\}$

  24. Optimization Problem
      $\min_{v, m} \; \tfrac{1}{2} \|m(1) - m_1\|^2_{L^2(\Omega)} + \tfrac{\beta}{2} \|v\|^2_{L^2([0,1], V)}$
      subject to $\partial_t m + \langle v, \nabla m \rangle = 0, \quad m = m_0$ $(\mathrm{div}\, v = 0)$
      [Arguilière et al., 2016, Chen and Lorenz, 2012, Barbu and Marinoschi, 2016, Borzi et al., 2002, Hart et al., 2009, Herzog et al., 2019, Jarde and Ulbrich, 2019, Vialard et al., 2012]

  25. Solver

  26. Numerical Optimization

  27. Lagrangian
      $\min_{v, m} \; J(v)$ subject to $\mathcal{C}(v, m) = 0$
      $\mathcal{L}(v, m, \lambda) := J(v) + \langle \lambda, \mathcal{C}(v, m) \rangle_{L^2(\Omega)}$
      [Biegler et al., 2003, Borzi and Schulz, 2012, Hinze et al., 2009, Lions, 1971]

  28. Optimality Conditions
      $g(w^\star) = \begin{pmatrix} g_m \\ g_v \\ g_\lambda \end{pmatrix}(w^\star) = 0, \quad w^\star = \begin{pmatrix} m^\star \\ v^\star \\ \lambda^\star \end{pmatrix} \in \mathbb{R}^n, \quad n \gg 10^6$
      $g(w) = \partial_\varepsilon \mathcal{L}(w + \varepsilon \tilde{w})\big|_{\varepsilon = 0}$ (optimize-then-discretize)
      [Biros and Ghattas, 2005a, Biros and Ghattas, 2005b, Haber and Ascher, 2001]

  29. Full Space Method
      $w^{k+1} = w^k + \alpha_k \tilde{w}^k$
      $\underbrace{\begin{pmatrix} H_{mm} & H_{mv} & A^{\mathsf{T}} \\ H_{vm} & H_{\mathrm{reg}} & C^{\mathsf{T}} \\ A & C & 0 \end{pmatrix}_k}_{H_k} \underbrace{\begin{pmatrix} \tilde{m} \\ \tilde{v} \\ \tilde{\lambda} \end{pmatrix}_k}_{\tilde{w}_k} = - \underbrace{\begin{pmatrix} g_m \\ g_v \\ g_\lambda \end{pmatrix}_k}_{g_k}$
      [Biros and Ghattas, 2005a, Biros and Ghattas, 2005b, Haber and Ascher, 2001]

  30. Reduced Space Method
      $g_m = 0$ and $g_\lambda = 0 \; \Longrightarrow \; \tilde{m} = -A^{-1} C \tilde{v}, \quad \tilde{\lambda} = -A^{-\mathsf{T}} (H_{mm} \tilde{m} + H_{mv} \tilde{v})$
      [Biros and Ghattas, 2005a, Biros and Ghattas, 2005b, Haber and Ascher, 2001]

  31. Reduced Space Method
      $v^{k+1} = v^k + \alpha_k \tilde{v}^k, \quad \tilde{v}^k = -\big((H_{\mathrm{reg}} + H_{\mathrm{mis}})_k\big)^{-1} g_v^k$
      $H_{\mathrm{mis}} := C^{\mathsf{T}} A^{-\mathsf{T}} (H_{mm} A^{-1} C - H_{mv}) - H_{vm} A^{-1} C$
      [Biros and Ghattas, 2005a, Biros and Ghattas, 2005b, Haber and Ascher, 2001]

  32. Problem Formulation (Reminder)
      $\min_{v, m} \; \tfrac{1}{2} \|m(1) - m_1\|^2_{L^2(\Omega)} + \tfrac{\beta}{2} \langle \mathcal{L} v, v \rangle_{L^2(\Omega)^d}$
      subject to $\partial_t m + \langle v, \nabla m \rangle = 0, \quad m = m_0$

  33. Reduced Gradient
      $g_v(v) := \beta \mathcal{L} v + \mathcal{Q} \int_0^1 \lambda \nabla m \, \mathrm{d}t$
      state:   $\partial_t m + \langle v, \nabla m \rangle = 0$ in $\Omega \times (0, 1]$, $\quad m = m_0$ in $\Omega \times \{0\}$
      adjoint: $-\partial_t \lambda - \mathrm{div}(\lambda v) = 0$ in $\Omega \times [0, 1)$, $\quad \lambda = m_1 - m$ in $\Omega \times \{1\}$
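
The structure of this evaluation (forward transport solve, backward adjoint solve, quadrature of the time integral) can be sketched as follows. This is my own skeleton, not CLAIRE's API: `solve_state`, `solve_adjoint`, `apply_L`, and `grad_op` are user-supplied callables standing in for the transport, adjoint, regularization, and gradient operators, and $\mathcal{Q}$ is taken as the identity.

```python
import numpy as np

def reduced_gradient(v, m0, m1, beta, solve_state, solve_adjoint,
                     apply_L, grad_op, dt):
    """Evaluate g_v(v) = beta * L v + sum_j lambda_j * grad(m_j) * dt.

    solve_state(v, m0)     -> state snapshots m_j, j = 0..Nt (forward in time)
    solve_adjoint(v, lam1) -> adjoint snapshots lam_j, same time ordering as m
    apply_L(v)             -> regularization operator applied to v
    grad_op(m)             -> spatial gradient of m, shape (d, n1, ..., nd)
    """
    m = solve_state(v, m0)                # forward transport solve
    lam = solve_adjoint(v, m1 - m[-1])    # backward adjoint solve, terminal cond.
    g = beta * apply_L(v)
    for m_j, lam_j in zip(m, lam):        # quadrature of the time integral
        g += lam_j * grad_op(m_j) * dt
    return g
```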

  34. Newton–Krylov Method
      $H_v \tilde{v}_k = -g_v^k, \quad v^{k+1} = v^k + \alpha_k \tilde{v}^k$
      ◮ globalized via Armijo line search
      ◮ (preconditioned) CG method
      ◮ matrix-free (only matvec required)
      ◮ inexactness (Eisenstat & Walker)
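
An inexact Newton–Krylov outer loop of this kind might look as follows. This is a generic sketch, not the CLAIRE implementation: `objective`, `gradient`, and `hess_matvec` are user-supplied callables, and the forcing term only follows the Eisenstat–Walker idea of tightening the CG tolerance as the gradient shrinks.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def newton_krylov(v0, objective, gradient, hess_matvec,
                  max_iter=50, gtol=1e-6, c1=1e-4):
    v = v0.copy()
    for _ in range(max_iter):
        g = gradient(v)
        gnorm = np.linalg.norm(g)
        if gnorm < gtol:
            break
        # matrix-free reduced Hessian; only matvecs are needed
        H = LinearOperator(
            (v.size, v.size),
            matvec=lambda x: hess_matvec(v, x.reshape(v.shape)).ravel())
        # inexact solve: relative CG tolerance shrinks with the gradient norm
        eta = min(0.5, np.sqrt(gnorm))
        dv, _ = cg(H, -g.ravel(), atol=eta * gnorm)
        dv = dv.reshape(v.shape)
        # Armijo backtracking line search
        alpha, f0, slope = 1.0, objective(v), np.vdot(g, dv)
        while objective(v + alpha * dv) > f0 + c1 * alpha * slope and alpha > 1e-6:
            alpha *= 0.5
        v = v + alpha * dv
    return v
```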

  35. (Reduced) Hessian Matvec
      $\mathcal{H}_v[\tilde{v}](v) := \beta \mathcal{L} \tilde{v} + \mathcal{Q} \int_0^1 \lambda \nabla \tilde{m} + \tilde{\lambda} \nabla m \, \mathrm{d}t$
      incremental state:   $\partial_t \tilde{m} + \langle v, \nabla \tilde{m} \rangle + \langle \tilde{v}, \nabla m \rangle = 0$ in $\Omega \times (0, 1]$, $\quad \tilde{m} = 0$ in $\Omega \times \{0\}$
      incremental adjoint: $-\partial_t \tilde{\lambda} - \mathrm{div}(\tilde{\lambda} v + \lambda \tilde{v}) = 0$ in $\Omega \times [0, 1)$, $\quad \tilde{\lambda} = -\tilde{m}$ in $\Omega \times \{1\}$

  36. Computational Bottlenecks
      ◮ evaluating objective: 1 PDE solve
      ◮ evaluating gradient: 2 PDE solves
      ◮ Hessian matvec: 2 PDE solves

  37. Computational Bottlenecks
      ◮ efficient time integrator (fast PDE solves)
      ◮ effective preconditioner (few PDE solves)

  38. PDE Solver

  39. Time Integration
      $\partial_t u + v \cdot \nabla u = f(u, v)$
      characteristic: $\mathrm{d}_t y = v(y)$ in $[t_{j-1}, t_j)$, $\quad y = x$ for $t = t_j$
      (figure: regular grid points at $t_j$ traced back to $t_{j-1}$ along characteristics)

  40. Time Integration
      $\partial_t u + v \cdot \nabla u = f(u, v)$
      along the characteristic: $\mathrm{d}_t u(y) = f$ in $(t_{j-1}, t_j]$, $\quad u = u_0$ for $t = t_{j-1}$
      (figure: regular grid points at $t_j$ traced back to $t_{j-1}$ along characteristics)
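
These two slides describe a characteristics-based (semi-Lagrangian) step. A compact 2D sketch for the pure transport case $f = 0$ follows; this is my illustration, with periodic boundaries, linear interpolation via scipy.ndimage, and a stationary velocity field as assumptions.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def semi_lagrangian_step(u, v, dt, h=1.0):
    """One step of du/dt = 0 along characteristics dy/dt = v(y) (f = 0).

    u : (n, n) scalar field at time t_{j-1}
    v : (2, n, n) stationary velocity field on the same grid
    """
    n = u.shape[0]
    X = np.stack(np.meshgrid(np.arange(n), np.arange(n), indexing="ij")) * h
    # RK2 (midpoint) backward trace of the characteristics, y = x at t = t_j
    Xmid = X - 0.5 * dt * v
    vmid = np.stack([map_coordinates(v[i], Xmid / h, order=1, mode="wrap")
                     for i in range(2)])
    Xdep = X - dt * vmid
    # evaluate u at the departure points; u is constant along characteristics,
    # so this is exact up to interpolation error
    return map_coordinates(u, Xdep / h, order=1, mode="wrap")
```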

  41. Preconditioner

  42. Spectral Preconditioner
      $(H_{\mathrm{reg}} + H_{\mathrm{mis}})\, \tilde{v} = -g_v$
      $(I + H_{\mathrm{reg}}^{-1} H_{\mathrm{mis}})\, \tilde{v} = -H_{\mathrm{reg}}^{-1} g_v$
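
Applying $H_{\mathrm{reg}}^{-1}$ is cheap because the regularization operator is diagonal in Fourier space. A sketch on a periodic grid (my own, with $\beta$, $\gamma$, $\kappa$ as placeholders matching the symbol used earlier):

```python
import numpy as np

def apply_inv_Hreg(g, beta=1.0, gamma=1.0, kappa=2, h=None):
    """Spectral preconditioner: return H_reg^{-1} g with H_reg = beta * L,
    L = (1 - gamma^2 * Laplacian)^kappa, inverted via its Fourier symbol.

    g : (d, n1, ..., nd) reduced gradient (one array per velocity component)
    """
    dims = g.shape[1:]
    h = h or [2 * np.pi / n for n in dims]
    freqs = np.meshgrid(*[2 * np.pi * np.fft.fftfreq(n, d=hi)
                          for n, hi in zip(dims, h)], indexing="ij")
    xi2 = sum(f**2 for f in freqs)
    inv_symbol = 1.0 / (beta * (1.0 + gamma**2 * xi2) ** kappa)
    return np.stack([np.real(np.fft.ifftn(inv_symbol * np.fft.fftn(gi)))
                     for gi in g])
```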

  43. 2L Preconditioner
      $F_H + F_L = I$
      $H e_k = F_H H F_H e_k + F_L H F_L e_k$
      $\tilde{v} = \tilde{v}_L + \tilde{v}_H$
      $H_L \tilde{v}_L = (F_L H F_L)\, \tilde{v}_L = -F_L g$
      $H_H \tilde{v}_H = (F_H H F_H)\, \tilde{v}_H = -F_H g$
      [Adavani and Biros, 2008, Biros and Doğan, 2008, Giraud et al., 2006, Kaltenbacher, 2003, Kaltenbacher, 2001, King, 1990]

  44. 2L Preconditioner
      $\tilde{H} w = -H_{\mathrm{reg}}^{-1/2} g, \quad w := H_{\mathrm{reg}}^{1/2} \tilde{v}, \quad \tilde{H} := I + H_{\mathrm{reg}}^{-1/2} H_{\mathrm{mis}} H_{\mathrm{reg}}^{-1/2}$

  45. 2L Preconditioner
      $H u = s, \quad u = u_L + u_H \approx F_L Q_P \bar{u}_L + F_H s, \quad \bar{u}_L \approx \tilde{H}_c^{-1} Q_R F_L s$

  46. 2L Preconditioner
      $\tilde{H}_c = Q_R \tilde{H} Q_P$
      $\tilde{H}_c = I_c + H_{\mathrm{reg},c}^{-1/2} H_{\mathrm{mis},c} H_{\mathrm{reg},c}^{-1/2}$
      $H_{\mathrm{mis},c} = C_c^{\mathsf{T}} A_c^{-\mathsf{T}} (H_{mm,c} A_c^{-1} C_c - H_{mv,c}) - H_{vm,c} A_c^{-1} C_c$
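
The action $u \approx F_L Q_P \bar{u}_L + F_H s$ with $\bar{u}_L \approx \tilde{H}_c^{-1} Q_R F_L s$ can be sketched roughly as below. This is a simplified 1D construction of my own, not CLAIRE's implementation: $F_L$/$F_H$ are ideal spectral cutoff filters, restriction/prolongation are spectral truncation/zero-padding, `coarse_matvec` is a user-supplied coarse Hessian matvec, and the coarse system is solved with a fixed small number of CG iterations.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def twolevel_apply(s, coarse_matvec, nc):
    """Apply a simplified two-level preconditioner to a residual s (1D sketch).

    u = F_L Q_P u_bar + F_H s,  with  u_bar ~= Hc^{-1} Q_R F_L s.
    """
    n = s.size
    shat = np.fft.fft(s)
    low = np.zeros(n, dtype=bool)
    low[: nc // 2], low[-(nc // 2):] = True, True              # low-frequency modes
    s_high = np.real(np.fft.ifft(np.where(low, 0.0, shat)))    # F_H s

    # Q_R F_L s: restrict the low modes to the coarse grid (spectral truncation)
    rhs = np.real(np.fft.ifft(shat[low] * (nc / n)))

    # coarse solve u_bar ~= Hc^{-1} (Q_R F_L s) with a few CG iterations
    Hc = LinearOperator((nc, nc), matvec=coarse_matvec)
    u_bar, _ = cg(Hc, rhs, maxiter=10)

    # F_L Q_P u_bar: zero-pad the coarse spectrum back to the fine grid
    pad = np.zeros(n, dtype=complex)
    pad[low] = np.fft.fft(u_bar) * (n / nc)
    return np.real(np.fft.ifft(pad)) + s_high
```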

  47. Parallel Implementation

  48. MPI Parallelism
      ◮ AccFFT http://accfft.org
      ◮ PETSc + TAO https://www.mcs.anl.gov/petsc/
      [Gholami et al., 2016, Munson et al., 2015, Balay et al., 2014]
