Fast Algorithms for Nonlinear Optimal Control for Diffeomorphic Registration Andreas Mang Department of Mathematics, University of Houston RICAM, New Trends in PDE-Constrained Optimization, 10/17/2019
Teaser: CLAIRE unknowns CPUs GPUs runtime 50M ( 256 3 ) 512 — < 2 sec 50M ( 256 3 ) 1 1 ≈ 6 sec 200B (4096 3 ) 8192 — ≈ 3.5 min http://andreasmang.github.io/claire 2 [Mang et al., 2016,Gholami et al., 2017,Mang et al., 2019]
R. Azencott G. Biros M. Brunn C. Davatzikos J. He CS UStuttgart Math UHouston Oden UTAustin CBIA UPenn Math UHouston M. Mehl J. Herring N. Himthani J. Kim CS UStuttgart Math UHouston Oden UTAustin Math UHouston 3
4
Inverse Problem find a plausible map y : R d → R d such that ( m 0 ◦ y )( x ) = m 1 ( x ) , for all x ∈ R d y ◮ m 0 ◦ y m 0 m 1 5 [Amit, 1994,Modersitzki, 2009,Modersitzki, 2004,Fischer and Modersitzki, 2008]
m 1 m 0 6 [Amit, 1994,Modersitzki, 2009,Modersitzki, 2004,Fischer and Modersitzki, 2008]
m 1 m 0 y ∈ diff (Ω) 7 [Amit, 1994,Modersitzki, 2009,Modersitzki, 2004,Fischer and Modersitzki, 2008]
m 1 m 0 y �∈ diff (Ω) 8 [Amit, 1994,Modersitzki, 2009,Modersitzki, 2004,Fischer and Modersitzki, 2008]
Building Blocks
Flows of Diffeomorphisms introduce pseudo-time variable t ∈ [0 , 1] and parameterize y by v ∂ t y = v ( y ) , y (0) = id R d • y ( s, t, x ) x ( s ) = y ( s, s, x ) = y ( t, s, y ( s, t, x )) • 10 [Younes, 2010]
Optimal Control Problem (Prototype) minimize dist ( y (1) · m 0 , m 1 ) + reg( v ) v , y subject to ∂ t y = v ( y ) , y (0) = id R d Large Deformation Diffeomorphic Metric Mapping 11 [Younes, 2010,Beg et al., 2005]
12
Regularity ∂ t y = v ( y ) , y (0) = id v ∈ L 2 ([0 , 1] , V ) , V ֒ → W s, 2 ( R 3 ) 3 , s > 5 / 2 ⇒ y ∈ G V ⊆ diff ( R 3 ) = (smoothness class 1 ≤ r ≤ s − 3 / 2 ) 13 [Beg et al., 2005,Trouve, 1998,Dupuis et al., 1998]
Regularity � 1 � 1 � v ( t ) � 2 V d t = �L v ( t ) , v ( t ) � L 2 (Ω) d d t 0 0 L : V → V ∗ , L := (1 − γ 2 ∇ ) κ id , γ, κ > 0 �� 1 � dist G ( id R d , φ ) 2 = inf � v � 2 V d t : φ = y (1) v 0 ∂ t y = v ( y ) , y (0) = id R d 14 [Beg et al., 2005]
Regularity (RKHS) V ≡ V κ ( RKHS with associated kernel κ ) v ( t, x ) := � q j =1 κ ( x j ( t ) , x ) α j ( t ) q q � � � v ( t ) � 2 κ ( x j ( t ) , x k ( t )) α ⊺ V = j ( t ) α k ( t ) j =1 k =1 κ ( x , y ) ∝ exp( − 0 . 5 � x − y � 2 Σ − 1 ) 15
Distance Functional dist SSD ( m 0 , m 1 ) = � m 0 − m 1 � 2 L 2 (Ω) 1 − | m 0 − m 1 | m 0 m 1 16 [Sotiras et al., 2013,Modersitzki, 2009]
Distance Functional 17
Distance Functional � m 1 , m 0 � L 2 (Ω) dist CC ( m 0 , m 1 ) = � m 1 , m 1 � L 2 (Ω) � m 0 , m 0 � L 2 (Ω) � ∇ m 0 ) ⊺ ˜ ∇ m 1 ) 2 d x 1 − (( ˜ dist NGF ( m 0 , m 1 ) = Ω 18 [Sotiras et al., 2013,Modersitzki, 2009,Haber and Modersitzki, 2006]
Distance Functional (RKHS) s j := { x j 1 , . . . , x j k } , j = 1 , 2 , . . . 19 [Azencott et al., 2010]
Distance Functional (RKHS) k 2 ( � k � k dist ( s 1 , s 2 ) = 1 j =1 κ ( x 1 i , x 1 j ) i =1 − � k � m j =1 2 κ ( x 1 i , x 2 j ) i =1 + � k � k j =1 κ ( x 2 i , x 2 j )) i =1 κ ( x , y ) ∝ exp( − 0 . 5 � x − y � 2 Σ − 1 ) 20 [Azencott et al., 2010]
Formulations
Optimal Control Problem 1 L 2 (Ω) + β 2 � y (1) · m 0 − m 1 � 2 2 � v � 2 minimize L 2 ([0 , 1] , V ) v , y subject to ∂ t y = v ( y ) , y (0) = id R d 22 [Younes, 2010,Beg et al., 2005]
Deformation Model ∂ t m + � v , ∇ m � = 0 in Ω × (0 , 1] m = m 0 in Ω × { 0 } 23
Optimization Problem 1 L 2 (Ω) + β 2 � m (1) − m 1 � 2 2 � v � 2 minimize L 2 ([0 , 1] , V ) v , m subject to ∂ t m + � v , ∇ m � = 0 m = m 0 (div v = 0) [Arguilière et al., 2016,Chen and Lorenz, 2012,Barbu and Marinoschi, 2016,Borzi et al., 24 2002,Hart et al., 2009,Herzog et al., 2019,Jarde and Ulbrich, 2019,Vialard et al., 2012]
Solver
Numerical Optimization
Lagrangian minimize J ( v ) subject to C ( v , m ) = 0 v ,m L ( v , m, λ ) := J ( v ) + � λ, C ( v , m ) � L 2 (Ω) k 27 [Biegler et al., 2003,Borzi and Schulz, 2012,Hinze et al., 2009,Lions, 1971]
Optimality Conditions g m m ⋆ g v g ( w ⋆ )= ( w ⋆ )= 0 , w ⋆ = v ⋆ ∈ R n , n ≫ 1 e 6 λ ⋆ g λ g ( w )= ∂ ε L ( w + ε ˜ w ) | ε =0 (optimize-then-discretize) 28 [Biros and Ghattas, 2005a,Biros and Ghattas, 2005b,Haber and Ascher, 2001]
Full Space Method w k +1 = w k + α k ˜ w k g m m ˜ H mm H mv A ⊺ g v ˜ v H vm H reg C ⊺ = − ˜ g λ A C 0 λ k k k � �� � � �� � � �� � g k H k w k ˜ 29 [Biros and Ghattas, 2005a,Biros and Ghattas, 2005b,Haber and Ascher, 2001]
Reduced Space Method g m = 0 g λ = 0 and m = − A − 1 C ˜ = ⇒ ˜ v ˜ λ = − A − T ( H mm ˜ m + H mv ˜ v ) 30 [Biros and Ghattas, 2005a,Biros and Ghattas, 2005b,Haber and Ascher, 2001]
Reduced Space Method v k +1 = v k + α k ˜ v k v k = − (( H reg + H mis ) k ) − 1 g v ˜ k H mis := C T A − T ( H mm A − 1 C − H mv ) − H vm A − 1 C 31 [Biros and Ghattas, 2005a,Biros and Ghattas, 2005b,Haber and Ascher, 2001]
Problem Formulation (Reminder) 1 L 2 (Ω) + β 2 � m (1) − m 1 � 2 minimize 2 �L v , v � L 2 (Ω) d v ,m subject to ∂ t m + � v , ∇ m � = 0 m = m 0 32
Reduced Gradient � 1 g v ( v ) := β L v + Q λ ∇ m d t 0 ∂ t m + � v , ∇ m � = 0 in Ω × (0 , 1] m = m 0 in Ω × { 0 } − ∂ t λ − div λ v = 0 in Ω × [0 , 1) λ = m 1 − m in Ω × { 1 } 33
Newton–Krylov Method H v v k = − g v k ˜ k , v k +1 = v k + α k ˜ v k ◮ globalized via Armijo line search ◮ (preconditioned) CG method ◮ matrix-free (only matvec required) ◮ inexactness (Eisenstat & Walker) 34
(Reduced) Hessian Matvec � 1 m + ˜ H v [˜ v ]( v ) := β L ˜ v + Q λ ∇ ˜ λ ∇ m d t 0 ∂ t ˜ m + � v , ∇ ˜ m � + � ˜ v , ∇ m � = 0 in Ω × (0 , 1] m = 0 ˜ in Ω × { 0 } − ∂ t ˜ λ − div(˜ λ v + λ ˜ v ) = 0 in Ω × [0 , 1) ˜ λ = − ˜ m in Ω × { 1 } 35
Computational Bottlenecks ◮ evaluating objective: 1 PDE solve ◮ evaluating gradient: 2 PDE solves ◮ Hessian matvec: 2 PDE solves 36
Computational Bottlenecks ◮ efficient time integrator (fast PDE solves) ◮ effective preconditioner (few PDE solves) 37
PDE Solver
Time Integration • • • • • • • • • • x • • • • • ∂ t u + v · ∇ u = f ( u, v ) • • • • • • • • • • t j • • • • • d t y = v ( y ) in [ t j − 1 , t j ) • • • • • • • • • • • • • • • y • • for t = t j • • y = x • • • • • • • • • • • • • • • • • • • • • • • • • • t j − 1 39
Time Integration • • • • • • • • • • x • • • • • ∂ t u + v · ∇ u = f ( u, v ) • • • • • • • • • • t j • • • • • • • • • • in ( t j − 1 , t j ] • • d t u ( y ) = f • • • • • • • • y • • • • • • • • • • u = u 0 for t = t j − 1 • • • • • • • • • • • • • • • • • • • • t j − 1 39
Preconditioner
Spectral Preconditioner v = − g v ( H reg + H mis )˜ ( I + H − 1 v = − H − 1 reg g v reg H mis )˜ 41
2L Preconditioner F H + F L = I He k = F H HF H e k + F L HF L e k ˜ v = ˜ v L + ˜ v H H L ˜ v L = ( F L HF L )˜ v L = − F L g H H ˜ v H = ( F H HF H )˜ v H = − F H g [Adavani and Biros, 2008,Biros and Doˇ gan, 2008,Giraud et al., 2006,Kaltenbacher, 42 2003,Kaltenbacher, 2001,King, 1990]
2L Preconditioner ˜ − 1 / 2 Hw = − H reg g ˜ 1 / 2 − 1 / 2 − 1 / 2 w := H H := ( I + H reg ˜ v , reg ) reg H mis H 43
2L Preconditioner Hu = s , u = u L + u H ≈ F L Q P ¯ u L + F H s u L ≈ ˜ H − 1 ¯ c Q R F L s 44
2L Preconditioner ˜ c = Q R ˜ H G HQ P ˜ − 1 / 2 − 1 / 2 H c = I c + H reg ,c H mis ,c H reg ,c c A − T c ( H mm,c A − 1 c C c − H mv,c ) − H vm,c A − 1 H mis ,c = C T c C c 45
Parallel Implementation
MPI Parallelism ◮ AccFFT http://accfft.org ◮ PETSc + TAO https://www.mcs. anl.gov/petsc/ 47 [Gholami et al., 2016,Munson et al., 2015,Balay et al., 2014]
Recommend
More recommend