Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver

Vukasin Strbac†, David M. Pierce‡, Jos Vander Sloten†, Nele Famaey†

† Biomechanics Section, Mechanical Engineering, KULeuven, Leuven, BE
‡ Mechanical Engineering, Biomedical Engineering, Mathematics, Interdisciplinary Mechanics Lab, University of Connecticut, Storrs, CT, US

Vukasin Strbac, GTC2016

Introduction: general biomechanical motivation
- Accelerating FE analysis provides new clinical opportunities:
  - pre-operative (e.g. faster custom stent design)
  - intra-operative stress monitoring
  - post-operative damage monitoring/fatigue estimation
  - at lower cost
- Ever-advancing capabilities of modern hardware, e.g. GPGPUs, offer opportunities to accelerate established algorithms
[Figure labels: angioplasty; stenting; heterogeneous composition; aorta tissue behavior]
Vukasin Strbac, GTC2016, 14.04.16

Introduction: core facts
- Explicit FE is pleasingly parallel (for the most part)
- Explicit FE is sensitive to material and geometric parameters
- A complex material model is necessary for accurate results
- GPUs are sensitive to the floating point precision used
What can we expect?
- How does anisotropy affect GPU explicit FE?
- How do hexahedral element formulations affect GPU explicit FE? Particularly in terms of Gaussian integration schemes
- How does that affect our research?

Introduction: GPU-based FE solver
Equation of motion: $[M]\{\ddot{u}\} + c_d [M]\{\dot{u}\} + \{F(u)\} = \{R\}$
- Nonlinear, explicit, large strain, central differences
- Trilinear hexahedral elements, unstructured grid
- Templated: single/double precision, textures, output, etc.
- Boundary conditions: kinematic, constant force, pressure
- Materials: see following slides (linear, nonlinear)
- Pre-processing: custom input file structure for geometry, material and BCs
- Post-processing: binary .vtu files + Paraview, real-time rendering
- Validated against Abaqus (Dassault Systèmes) and FEAP (University of California, Berkeley)
Flowchart:
1. Assign boundary conditions
2. Compute stress
3. Integrate stress per element
4. Assemble global internal force vector per node
5. Forward time-marching step
6. Check energy balance
7. Converged? (n: loop back; y: End)
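
The central-difference march named on this slide can be sketched for a single degree of freedom with mass-proportional damping. This is a minimal illustration and not the solver's code: the function name, the linear spring standing in for the internal force, and all parameter values are assumptions.

```python
def central_difference(mass, damping, stiffness, load, dt, steps):
    """Explicit central-difference march for m*a + c*m*v + k*u = R (one DOF)."""
    u_prev, u = 0.0, 0.0  # assumed start: u(-dt) = u(0) = 0
    half = 0.5 * damping * dt
    for _ in range(steps):
        internal = stiffness * u  # linear spring as a stand-in for F(u)
        rhs = (load - internal) * dt * dt / mass
        # Solve the central-difference stencil for the next displacement.
        u_next = (rhs + 2.0 * u - (1.0 - half) * u_prev) / (1.0 + half)
        u_prev, u = u, u_next
    return u


# Hypothetical values; dt is well below the stability limit 2 / sqrt(k / m).
u_final = central_difference(mass=1.0, damping=2.0, stiffness=100.0,
                             load=5.0, dt=0.01, steps=5000)
print(u_final)  # settles towards the static solution load / stiffness
```

With damping present the march relaxes to the static equilibrium, which is how an explicit scheme can be used for quasi-static loading. Each step needs only the current and previous displacements, so no system of equations is solved.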

Element technology: Biofidelic materials
(Flowchart steps: Compute stress, Integrate stress)
- Linear elastic model (Hookean, H): $\sigma_{ij} = f(\varepsilon_{ij}) = \lambda \delta_{ij} \varepsilon_{kk} + 2\mu \varepsilon_{ij} = C\varepsilon$
- Nonlinear elastic model, isotropic (neo-Hookean, NH): $\sigma = f\left(\partial\Psi / \partial F\right)$
- Nonlinear elastic, anisotropic (GHO): fiber-reinforced arterial tissue model [Gasser et al., 2006]
- Anisotropic constituent [Weisbecker et al., 2012]
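
As a small worked instance of the simplest of these models, the Hookean law in its standard form, sigma_ij = lambda * delta_ij * eps_kk + 2 * mu * eps_ij, can be evaluated directly. The function name and the material values below are hypothetical and chosen only for illustration; the nonlinear models are not sketched here.

```python
def hooke_stress(strain, lam, mu):
    """Return sigma_ij = lam * delta_ij * eps_kk + 2 * mu * eps_ij (3x3 lists)."""
    trace = strain[0][0] + strain[1][1] + strain[2][2]
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
             for j in range(3)]
            for i in range(3)]


# Hypothetical input: 1% uniaxial strain, lam = 2.0 MPa, mu = 0.5 MPa.
eps = [[0.01, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
sigma = hooke_stress(eps, lam=2.0, mu=0.5)
print(sigma[0][0], sigma[1][1], sigma[0][1])
```

The lateral stress is nonzero even though only the axial strain is applied, because the lambda term couples every normal component to the volume change.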

Element technology: Gaussian integration
(Flowchart steps: Compute stress, Integrate stress)

Under-integration (UI): arithmetic expense 1x, memory expense 1x
- Fast
- Inaccurate
- Hourglassing
- No volumetric locking
- No shear locking
(Not appropriate for anisotropic materials with low mesh density)

Full integration (FI): arithmetic expense ~8x, memory expense ~3x
- Slow
- Very accurate
- Volumetric locking
- Shear locking

Selective reduced (SR): arithmetic expense ~9x, memory expense ~4x
- Very slow
- Very accurate
- No volumetric locking
- Shear locking
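
The accuracy gap between one-point and 2x2x2 quadrature can be seen on the reference cube [-1, 1]^3. The sketch below only compares the two Gauss-Legendre rules on a made-up polynomial integrand; it is not the element formulation itself, and the function names are assumptions.

```python
import math


def gauss_rule(n):
    """1D Gauss-Legendre (point, weight) pairs on [-1, 1] for n = 1 or n = 2."""
    if n == 1:
        return [(0.0, 2.0)]
    if n == 2:
        g = 1.0 / math.sqrt(3.0)
        return [(-g, 1.0), (g, 1.0)]
    raise ValueError("only n = 1 and n = 2 are sketched here")


def integrate_hex(f, n):
    """Integrate f(xi, eta, zeta) over the reference cube with an n x n x n rule."""
    rule = gauss_rule(n)
    return sum(wx * wy * wz * f(x, y, z)
               for x, wx in rule for y, wy in rule for z, wz in rule)


def integrand(xi, eta, zeta):
    # Made-up polynomial; its exact integral over the cube is 32 / 3.
    return 1.0 + xi * xi + eta * zeta


under = integrate_hex(integrand, 1)  # 1 evaluation, misses the quadratic term
full = integrate_hex(integrand, 2)   # 8 evaluations, exact for this polynomial
print(under, full)
```

The eightfold difference in evaluation count is in line with the arithmetic expense quoted for full integration relative to under-integration.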

Ideal case: extension-inflation test
- Extension 5% + systolic pressure
- Reference solutions: FEAP & ABAQUS
- We implement the same materials in all solvers
- We solve using 3 different GPU generations: Fermi, Kepler and Maxwell (no optimization)
- GHO material (+ neo-Hooke for reference)
- Scaling
- Convergence criteria based on reference solutions:
  - RMS < 0.0005 mm
  - deltaRMS < 0.0001 mm
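
The slide gives the two thresholds but not their exact definitions. One plausible reading, assumed here, is that RMS is the root-mean-square nodal displacement difference to the reference solution and deltaRMS is the change in that value between two successive checks. The function names and the sample displacements are hypothetical.

```python
import math

RMS_TOL = 0.0005        # mm, threshold quoted on the slide
DELTA_RMS_TOL = 0.0001  # mm, threshold quoted on the slide


def rms_error(displacements, reference):
    """Root-mean-square difference between two equally long displacement lists."""
    squared = [(u - r) ** 2 for u, r in zip(displacements, reference)]
    return math.sqrt(sum(squared) / len(squared))


def is_converged(rms_now, rms_before):
    """Apply both thresholds: the error itself and its change since the last check."""
    return rms_now < RMS_TOL and abs(rms_now - rms_before) < DELTA_RMS_TOL


# Hypothetical nodal displacements in mm.
reference = [0.100, 0.200, 0.300]
earlier = [0.10025, 0.20025, 0.30025]
latest = [0.10020, 0.20020, 0.30020]
converged = is_converged(rms_error(latest, reference),
                         rms_error(earlier, reference))
print(converged)
```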

Ideal case: extension-inflation test
[Figure panels: Under-integration; Full integration; Selective-reduced integration]
[Figure: FERMI (C2075)]
[Figure: KEPLER (K20c)]
[Figure: MAXWELL (GTX980)]
[Figure panels: Anisotropy cost (GHO/NH); Integration cost (SR/UI)]

Ideal case: conclusions
- Speed-ups are considerable
- Difficult to say exactly why one GPU is faster in a specific scenario
- No architecture-specific considerations are employed; the speedup is free
- Useful for:
  - Parameter-fitting and geometry identification
  - Sensitivity analyses
  - ...anything made possible by large numbers of FE simulations
- Not a clinically accurate scenario

Near incompressibility and floating point precision
[Figure panels: Single precision; Double precision; values in MPa]
[Figure panels: FI, UI and SR, each in double and single precision]
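
One way to see why single precision struggles near incompressibility is to look at a penalty-type volumetric term of the form kappa * (J - 1), where J is the volume ratio. That form is a common choice assumed here for illustration, not necessarily the one used in the solver, and the numbers are hypothetical. The `struct` round trip simply rounds a Python double to the nearest single-precision value.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


bulk_modulus = 1000.0     # hypothetical stiffness against volume change, in MPa
jacobian = 1.0 + 3.0e-8   # hypothetical volume ratio J, very close to 1

pressure_double = bulk_modulus * (jacobian - 1.0)
pressure_single = bulk_modulus * (to_float32(jacobian) - 1.0)
print(pressure_double, pressure_single)
```

The deviation of J from 1 is smaller than half the spacing of single-precision numbers near 1 (about 6e-8), so it is rounded away before the large modulus can amplify it. In double precision the same deviation survives.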

Clinically relevant test case: AAA inflation
- Patient-specific FE meshes of abdominal aortic aneurysms [Tarjuelo-Gutierrez et al., 2014]
[Figure: meshes p1, p2, p3, p4, p5]
- The 'silent killer'
- Peak Wall Stress (PWS) estimate needed
- Thrombus:
  - Separation
  - Different material
- Layer-specific material properties
[Figure labels: thrombus; aorta]

Clinically relevant test case: AAA inflation

            p1      p2      p3      p4      p5
FEAP [h]    21.12   22.79   21.01   21.52   21.86
CUDA [h]    2.93    1.22    2.75    3.03    1.31
factor      x7.2    x18.7   x7.6    x7.1    x16.8
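
The speedup factors are the ratio of FEAP to CUDA wall time per patient mesh. The short check below recomputes them from the tabulated hours; the dictionary names are just for illustration. The recomputed ratios agree with the reported factors to within about 0.1, which may reflect rounding of the tabulated hours.

```python
feap_hours = {"p1": 21.12, "p2": 22.79, "p3": 21.01, "p4": 21.52, "p5": 21.86}
cuda_hours = {"p1": 2.93, "p2": 1.22, "p3": 2.75, "p4": 3.03, "p5": 1.31}
reported = {"p1": 7.2, "p2": 18.7, "p3": 7.6, "p4": 7.1, "p5": 16.8}

# Speedup = reference wall time divided by GPU wall time, per mesh.
speedup = {case: feap_hours[case] / cuda_hours[case] for case in feap_hours}

for case in feap_hours:
    print(f"{case}: recomputed x{speedup[case]:.2f}, reported x{reported[case]}")
```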

Poisson ratio = 0.4995
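
To put the quoted Poisson ratio in perspective, the standard isotropic linear-elastic relations lambda = 2 mu nu / (1 - 2 nu) and kappa = 2 mu (1 + nu) / (3 (1 - 2 nu)) give the ratio of volumetric to shear stiffness. The function name and the comparison values 0.3 and 0.45 are assumptions added for contrast.

```python
def stiffness_ratios(nu):
    """Return (lambda / mu, kappa / mu) for an isotropic solid with Poisson ratio nu."""
    lam_over_mu = 2.0 * nu / (1.0 - 2.0 * nu)
    kappa_over_mu = 2.0 * (1.0 + nu) / (3.0 * (1.0 - 2.0 * nu))
    return lam_over_mu, kappa_over_mu


for nu in (0.3, 0.45, 0.4995):
    lam_ratio, kappa_ratio = stiffness_ratios(nu)
    print(f"nu = {nu}: lambda/mu = {lam_ratio:.2f}, kappa/mu = {kappa_ratio:.2f}")
```

At 0.4995 the volumetric stiffness is roughly a thousand times the shear stiffness, so small errors in the computed volume change are strongly amplified in the stress.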

Conclusion
- We maintain significant speedup even using state-of-the-art materials, high-order integration and double precision on GPUs, with no compromise whatsoever on accuracy, even for less-than-ideal meshes.
- Single precision quickly becomes ineffective, and how quickly depends on the Poisson ratio. Double precision is necessary.
- Practical opportunities, enabling technology:
  - FE sensitivity analysis
  - Inverse FE simulations
  - Indications of clinical use
- Generally:
  - Memory-bound algorithm
  - Lots of random reads and atomic writes due to unstructured grid
- For details on implementation/optimization see: S4497, Strbac GTC2014
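
The remark about atomic writes refers to assembling per-element nodal forces into the global internal force vector. A serial sketch of that scatter-add is shown below with one scalar per node; the function name and the tiny mesh are hypothetical. In a parallel GPU version, where elements are processed concurrently, the accumulation would typically need an atomic add because neighbouring elements share nodes.

```python
def assemble_internal_forces(num_nodes, connectivity, element_forces):
    """Scatter per-element nodal forces into a global vector (one scalar per node)."""
    global_force = [0.0] * num_nodes
    for nodes, forces in zip(connectivity, element_forces):
        for node, force in zip(nodes, forces):
            # Shared nodes receive contributions from several elements.
            # Run in parallel, this read-modify-write must be atomic.
            global_force[node] += force
    return global_force


# Hypothetical 1D mesh: two 2-node elements sharing node 1.
connectivity = [(0, 1), (1, 2)]
element_forces = [(1.0, -1.0), (2.0, -2.0)]
assembled = assemble_internal_forces(3, connectivity, element_forces)
print(assembled)
```

On an unstructured grid the node indices in `connectivity` follow no regular pattern, which is the source of the random memory accesses mentioned above.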

Thank you for your attention. Questions?