
Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver



  1. Simulating Human Aorta Material Behavior Using a GPU Explicit Finite Element Solver
     Vukasin Strbac†, David M. Pierce‡, Jos Vander Sloten†, Nele Famaey†
     †Biomechanics Section, Mechanical Engineering, KULeuven, Leuven, BE
     ‡Mechanical Engineering, Biomedical Engineering, Mathematics, Interdisciplinary Mechanics Lab, University of Connecticut, Storrs, CT, US
     Vukasin Strbac GTC2016

  2. Introduction: general biomech. motivation
     • Accelerating FE analysis provides new clinical opportunities:
       • pre-operative (e.g. faster custom stent design)
       • intra-operative stress monitoring
       • post-operative damage monitoring/fatigue estimation at lower cost
     • Ever-advancing capabilities of modern hardware, e.g. GPGPUs, offer opportunities to accelerate established algorithms
     [Figures: angioplasty; stenting; heterogeneous composition of the aorta; tissue behavior]
     Vukasin Strbac GTC2016 14.04.16

  3. Introduction: core facts
     • Explicit FE is pleasingly parallel (for the most part)
     • Explicit FE is sensitive to material and geometric parameters
     • A complex material model is necessary for accurate results
     • GPUs are sensitive to the floating point precision used
     • What can we expect?
       • How does anisotropy affect GPU explicit FE?
       • How do hexahedral element formulations affect GPU explicit FE?
         • Particularly in terms of Gaussian integration schemes
       • How does that affect our research?

  4. Introduction: GPU-based FE solver
     $[M]\{\ddot{u}\} + c_d [M]\{\dot{u}\} + \{F(u)\} = \{R\}$
     • Nonlinear, explicit, large strain, central differences
     • Trilinear hexahedral elements, unstructured grid
     • Templated: single/double precision, textures, output, etc.
     • Boundary conditions: kinematic, constant force, pressure
     • Materials: see following slides (linear, nonlinear)
     • Pre-processing
       • Custom input file structure for geometry, material and BCs
     • Post-processing
       • Binary .vtu files + Paraview
       • Real-time rendering
     • Validated against
       • Abaqus (Dassault Systèmes) and
       • FEAP (University of California, Berkeley)
     [Flowchart: Assign boundary conditions → Compute stress → Integrate stress (per element) → Assemble global internal force vector → Forward time-marching step (per node) → Check energy balance → Converged? (no: repeat; yes: End)]
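The loop in the flowchart can be sketched on a toy problem. The example below is a minimal illustration, not the solver from the talk: it assumes a 1D chain of linear springs with lumped nodal masses, and every name and value (`relax_bar`, `k`, `m`, `c_d`, `dt`, `tip_force`) is hypothetical. It applies the central-difference update to the mass-damped equation of motion: internal forces are evaluated per element and summed into a global vector, then each node is advanced independently.

```python
def relax_bar(n_el=4, k=100.0, m=1.0, tip_force=1.0, c_d=7.0, dt=0.01, steps=2000):
    """Damped central-difference march for a 1D chain of linear springs.

    Node 0 is fixed and tip_force acts on the last node.
    All parameter values are illustrative assumptions.
    """
    n = n_el + 1
    u_prev = [0.0] * n      # displacements at step n-1
    u = [0.0] * n           # displacements at step n
    a = c_d * dt / 2.0      # damping factor from the central-difference velocity
    for _ in range(steps):
        # Per element: evaluate the spring force and add it to both end nodes.
        f_int = [0.0] * n
        for e in range(n_el):
            f = k * (u[e + 1] - u[e])
            f_int[e] -= f
            f_int[e + 1] += f
        # External load vector R: a single force at the tip.
        r_ext = [0.0] * n
        r_ext[-1] = tip_force
        # Per node: explicit update. The lumped mass means no linear solve.
        u_next = [0.0] * n  # node 0 keeps its kinematic boundary condition u = 0
        for i in range(1, n):
            rhs = dt * dt / m * (r_ext[i] - f_int[i]) + 2.0 * u[i] - (1.0 - a) * u_prev[i]
            u_next[i] = rhs / (1.0 + a)
        u_prev, u = u, u_next
    return u
```

With these assumed values the chain relaxes to the static solution, where each spring stretches by `tip_force / k`. The time step has to stay below the explicit stability limit of roughly 2 divided by the highest natural frequency, which is one reason explicit FE is sensitive to material stiffness and element size.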

  5. Element technology: Biofidelic materials
     [Flowchart steps highlighted: Compute stress, Integrate stress]
     • Linear elastic model (Hookean), H: $\sigma_{ij} = f(\varepsilon_{ij}) = \lambda \delta_{ij} \varepsilon_{kk} + 2\mu \varepsilon_{ij}$, i.e. $\sigma = C\varepsilon$
     • Nonlinear elastic model, isotropic (neo-Hookean), NH: $\sigma = f(\partial\Psi / \partial F)$
     • Nonlinear elastic, anisotropic (fiber-reinforced arterial tissue model [Gasser et al., 2006]), GHO
       • Anisotropic constituent [Weisbecker et al., 2012]
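To make the anisotropic model more concrete, the sketch below evaluates a strain energy of the form published by Gasser et al. (2006): a neo-Hookean matrix term plus an exponential term per fiber family, with a dispersion parameter `kappa`. It is a simplified illustration under stated assumptions, not the implementation used in the talk. It assumes an incompressible stretch state aligned with the coordinate axes and two symmetric fiber families in the 1-2 plane, and it switches the fiber term off when the fiber strain measure is not positive. The function name, argument names and all parameter values are hypothetical.

```python
import math

def gho_energy(stretches, fiber_angle, mu, k1, k2, kappa):
    """Strain energy of a GHO-type model for an incompressible stretch state.

    stretches: principal stretches (l1, l2, l3) along the axes, with l1*l2*l3 = 1.
    fiber_angle: angle in radians between each fiber family and axis 1;
        the two families lie at +angle and -angle in the 1-2 plane.
    kappa: fiber dispersion, 0 (perfectly aligned) to 1/3 (isotropic).
    """
    l1, l2, l3 = stretches
    i1 = l1 ** 2 + l2 ** 2 + l3 ** 2
    # I4 is the squared stretch along a fiber; both families give the same value here.
    i4 = (l1 * math.cos(fiber_angle)) ** 2 + (l2 * math.sin(fiber_angle)) ** 2
    matrix = 0.5 * mu * (i1 - 3.0)
    e_fib = kappa * (i1 - 3.0) + (1.0 - 3.0 * kappa) * (i4 - 1.0)
    fibers = 0.0
    if e_fib > 0.0:  # assumption: fibers carry load only in extension
        n_families = 2
        fibers = n_families * k1 / (2.0 * k2) * (math.exp(k2 * e_fib ** 2) - 1.0)
    return matrix + fibers
```

Setting `kappa = 1/3` removes the dependence on `fiber_angle`, so the model becomes isotropic. The exponential fiber term is what makes the stress evaluation more expensive than for the neo-Hookean model.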

  6. Element technology: Gaussian integration
     [Flowchart steps highlighted: Compute stress, Integrate stress]
     • Under-integration (UI): arithmetic expense 1x, memory expense 1x
       • Fast
       • Inaccurate
       • Hourglassing
       • No volumetric locking
       • No shear locking
       • (Not appropriate for anisotropic materials with low mesh density)
     • Full integration (FI): arithmetic expense ~8x, memory expense ~3x
       • Slow
       • Very accurate
       • Volumetric locking
       • Shear locking
     • Selective reduced (SR): arithmetic expense ~9x, memory expense ~4x
       • Very slow
       • Very accurate
       • No volumetric locking
       • Shear locking
     [Figures: Gauss point locations in the reference hexahedron for UI, FI and SR]
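The accuracy gap between under-integration and full integration can be seen on the reference cube directly. The sketch below is a generic Gauss-Legendre illustration, not code from the talk; the function name `gauss_hex` is hypothetical. It compares the one-point rule (UI) with the 2x2x2 rule (FI) on [-1, 1]^3.

```python
import itertools
import math

def gauss_hex(f, points_per_axis):
    """Integrate f(xi, eta, zeta) over the reference cube [-1, 1]^3."""
    if points_per_axis == 1:
        pts, wts = [0.0], [2.0]            # single point at the centroid
    elif points_per_axis == 2:
        g = 1.0 / math.sqrt(3.0)
        pts, wts = [-g, g], [1.0, 1.0]     # 2-point Gauss-Legendre rule
    else:
        raise ValueError("only 1 or 2 points per axis in this sketch")
    total = 0.0
    # Tensor product of the 1D rule over the three axes.
    for (x, wx), (y, wy), (z, wz) in itertools.product(zip(pts, wts), repeat=3):
        total += wx * wy * wz * f(x, y, z)
    return total
```

For example, `gauss_hex(lambda x, y, z: x * x, 1)` returns 0 while the exact integral is 8/3, which the 2x2x2 rule reproduces. Terms that vanish at the centroid contribute no energy under the one-point rule, which is the origin of hourglass modes, and the eight evaluations per element account for the higher arithmetic cost of FI.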

  7. Ideal case: extension-inflation test
     • Extension 5% + systolic pressure
     • Reference solutions: FEAP & ABAQUS
     • We implement the same materials in all solvers
     • We solve using 3 different GPU generations: Fermi, Kepler and Maxwell (no optimization)
     • GHO material (+ neo-Hooke for ref.)
     • Scaling
     • Convergence criteria based on reference solutions:
       • RMS < 0.0005 mm
       • deltaRMS < 0.0001 mm
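The slide gives the two thresholds but not how RMS and deltaRMS are defined. The sketch below is one plausible reading and should be treated as an assumption: RMS is taken as the root-mean-square difference between nodal displacements and the reference solution, and deltaRMS as the change in that value between two successive refinements. The function names are hypothetical.

```python
import math

def rms_error(u, u_ref):
    """Root-mean-square difference between two equally long displacement lists (mm)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, u_ref)) / len(u))

def is_converged(rms_now, rms_prev, rms_tol=0.0005, delta_tol=0.0001):
    """Both thresholds from the slide must hold (assumed interpretation)."""
    return rms_now < rms_tol and abs(rms_now - rms_prev) < delta_tol
```

Under this reading, a mesh counts as converged only when it is both close to the reference and no longer changing appreciably with further refinement.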

  8. Ideal case: extension-inflation test
     [Figure panels: Under-integration; Full integration; Selective-reduced integration]

  9. Ideal case: extension-inflation test
     [Figure: results on Fermi (C2075)]

  10. Ideal case: extension-inflation test
     [Figure: results on Kepler (K20c)]

  11. Ideal case: extension-inflation test
     [Figure: results on Maxwell (GTX980)]

  12. Ideal case: extension-inflation test
     [Figure panels: Anisotropy cost (GHO/NH); Integration cost (SR/UI)]

  13. Ideal case: conclusions
     • Speed-ups are considerable
     • Difficult to say exactly why one GPU is faster in a specific scenario
     • No architecture-specific considerations are employed, speedup is free
     • Useful for
       • Parameter-fitting and geometry identification
       • Sensitivity analyses
       • …anything made possible by large numbers of FE simulations
     • Not a clinically accurate scenario

  14. Near incompressibility and floating point precision
     [Figure: stress in MPa, single precision vs. double precision]
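One way to see why single precision struggles with nearly incompressible materials is to round the volume ratio J to 32 bits before computing a volumetric pressure. The sketch below is an illustration only: the penalty form p = K (J - 1) and the bulk modulus value are assumptions, not the formulation used in the talk. Single precision is emulated with the standard `struct` module.

```python
import struct

def to_f32(x):
    """Round a Python float (double) to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def volumetric_pressure(j, bulk_modulus, single=False):
    """p = K (J - 1), optionally rounding J to single precision first."""
    if single:
        j = to_f32(j)
    return bulk_modulus * (j - 1.0)
```

Near J = 1 the spacing between adjacent single-precision numbers is about 1.2e-7, so a volume change of 3e-8 is rounded away and the single-precision pressure is exactly zero, while double precision resolves it. A large bulk modulus multiplies whatever rounding error remains in J, so the error shows up directly in the stress.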

  15. [Figure panels: FI, UI, SR; double vs. single precision]

  16. Clinically relevant test case: AAA inflation
     [Figure: patient-specific FE meshes of abdominal aortic aneurysms, cases p1 to p5] [Tarjuelo-Gutierrez et al., 2014]

  17. Clinically relevant test case: AAA inflation
     • The ‘silent killer’
     • Peak Wall Stress (PWS) estimate needed
     • Thrombus:
       • Separation
       • Different material
     • Layer-specific material properties
     [Figure labels: thrombus; aorta]

  19. Clinically relevant test case: AAA inflation
                 p1      p2      p3      p4      p5
      FEAP [h]   21.12   22.79   21.01   21.52   21.86
      CUDA [h]   2.93    1.22    2.75    3.03    1.31
      Factor     x7.2    x18.7   x7.6    x7.1    x16.8
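The factors in the table are the ratio of FEAP to CUDA wall-clock time per patient. The short script below recomputes them from the rounded hours shown on the slide; the variable names are hypothetical. Recomputing from the rounded values gives x16.7 for p5 rather than x16.8, presumably because the slide's factors were derived from unrounded timings.

```python
# Wall-clock hours as printed on the slide (rounded to two decimals).
feap_hours = {"p1": 21.12, "p2": 22.79, "p3": 21.01, "p4": 21.52, "p5": 21.86}
cuda_hours = {"p1": 2.93, "p2": 1.22, "p3": 2.75, "p4": 3.03, "p5": 1.31}

# Speed-up factor = reference solver time / GPU solver time.
speedup = {p: feap_hours[p] / cuda_hours[p] for p in feap_hours}

for patient, factor in speedup.items():
    print(f"{patient}: x{factor:.1f}")
```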

  20. Poisson's ratio = 0.4995
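A Poisson's ratio this close to 0.5 implies a very large ratio between volumetric and shear stiffness. The sketch below uses the standard isotropic linear-elasticity relations to put a number on it; the function names and the Young's modulus argument are hypothetical.

```python
def lame_parameters(young, poisson):
    """Lamé parameters (lambda, mu) from Young's modulus and Poisson's ratio."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu

def bulk_to_shear_ratio(poisson):
    """K / mu for an isotropic linear elastic material."""
    return 2.0 * (1.0 + poisson) / (3.0 * (1.0 - 2.0 * poisson))
```

For a Poisson's ratio of 0.4995 the bulk modulus is roughly 1000 times the shear modulus, compared with roughly 2 for a ratio of 0.3. Small errors in the computed volume change are therefore amplified by about three orders of magnitude relative to the deviatoric stress, which is consistent with the sensitivity to floating point precision reported in the talk.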

  21. Conclusion
     • We maintain significant speedup even when using state-of-the-art materials, high-order integration and double precision on GPUs, with no compromise whatsoever on accuracy, even for less than ideal meshes.
     • Single precision becomes ineffective quickly, depending on the Poisson ratio. Double precision is necessary.
     • Practical opportunities, enabling technology:
       • FE sensitivity analysis
       • Inverse FE simulations
       • Indications of clinical use
     • Generally:
       • Memory-bound algorithm
       • Lots of random reads and atomic writes due to unstructured grid
     • For details on implementation/optimization see: S4497, Strbac GTC2014
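The remark about atomic writes can be illustrated with a serial stand-in for the assembly step. In the sketch below, each element adds its nodal forces into a global vector through its connectivity list; wherever two elements share a node, a parallel version would need an atomic add (for example CUDA's `atomicAdd`) to avoid a race. The mesh, function names and force values are hypothetical and are not taken from the talk.

```python
from collections import Counter

def scatter_add(connectivity, element_forces, n_nodes):
    """Sum per-element nodal forces into a global vector (serial stand-in for atomic adds)."""
    global_force = [0.0] * n_nodes
    for nodes, forces in zip(connectivity, element_forces):
        for node, f in zip(nodes, forces):
            global_force[node] += f  # in a parallel kernel this write would need to be atomic
    return global_force

def writers_per_node(connectivity):
    """Number of elements that write to each node."""
    return Counter(node for nodes in connectivity for node in nodes)

# Two hexahedra sharing one face (nodes 4 to 7).
connectivity = [list(range(0, 8)), list(range(4, 12))]
element_forces = [[1.0] * 8, [1.0] * 8]
global_force = scatter_add(connectivity, element_forces, n_nodes=12)
conflicts = writers_per_node(connectivity)
```

Because the node indices in an unstructured grid follow no regular pattern, both the reads of nodal data and these accumulating writes land at scattered memory addresses, which fits the description of the algorithm as memory-bound.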

  22. Thank you for your attention. Questions?
