
Chrono::SPIKE: A Nonsmooth Contact Dynamics Framework on the GPU


  1. GPU Technology Conference: S5400. Chrono::SPIKE – A Nonsmooth Contact Dynamics Framework on the GPU. Daniel Melanz, Luning Fang, Ang Li, Hammad Mazhar, Radu Serban, Dan Negrut. Simulation-Based Engineering Laboratory, University of Wisconsin-Madison.

  2. Overview: 1) Nonsmooth Contact Dynamics 2) Quadratic Optimization w/ Conic Constraints 3) Preconditioning with SPIKE 4) Numerical Results 5) Conclusions & Future Work. 3/19/2015, University of Wisconsin

  3. Nonsmooth Contact Dynamics

  4. Nonsmooth Dynamics

  5. Nonsmooth Dynamics: Frictionless Case. The Signorini conditions (named after Antonio Signorini): • Every relative velocity should be zero or separating. • Every contact impulse should be non-attractive. • No impulse at separating contacts. (Tonge, 2012)

  6. Nonsmooth Dynamics: Frictionless Case. The Signorini conditions can be written compactly, with all three conditions in one line of math. (Tonge, 2012)
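
For reference, a standard way to write the three Signorini conditions for a single contact, followed by the compact one-line form, is sketched below. The symbols $v_n$ (relative normal velocity) and $\gamma_n$ (normal contact impulse) are assumed notation and may differ from the notation on the slides.

```latex
% Assumed notation (not taken from the slides):
%   v_n      relative normal velocity at the contact
%   \gamma_n normal contact impulse
% The three conditions, one per bullet:
v_n \ge 0, \qquad \gamma_n \ge 0, \qquad \gamma_n \, v_n = 0
% The same three conditions as a single complementarity statement:
0 \le \gamma_n \;\perp\; v_n \ge 0
```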

  7. Nonsmooth Dynamics: Frictionless Case. The final model can be expressed as a set of equations (Tonge, 2012).

  8. Nonsmooth Dynamics: Friction Case. Stewart and Trinkle, 1996

  9. Nonsmooth Dynamics: Friction Case. Anitescu and Hart, 2004

  10. Nonsmooth Dynamics: The Cone Complementarity Problem (CCP)

  11. Nonsmooth Dynamics: The Quadratic Programming Angle… • The CCP captures the first-order optimality condition for a quadratic optimization problem with conic constraints.
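
The link between the CCP and a cone-constrained quadratic program can be sketched as follows. This is the generic textbook form, not a transcription of the slide: the symbols $N$, $r$, $\gamma$, $\mathcal{K}_i$, $\mu_i$ and $n_c$ are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the slides):
%   \gamma  stacked contact impulses, N symmetric positive semidefinite, r a known vector
%   \mathcal{K}_i  friction cone of contact i with friction coefficient \mu_i
%   \mathcal{K} = \mathcal{K}_1 \times \dots \times \mathcal{K}_{n_c}, \mathcal{K}^* its dual cone
\mathcal{K}_i = \left\{ (\gamma_n, \gamma_u, \gamma_v) : \sqrt{\gamma_u^2 + \gamma_v^2} \le \mu_i \gamma_n \right\}
% Quadratic optimization problem with conic constraints:
\min_{\gamma} \; \tfrac{1}{2} \gamma^{T} N \gamma + r^{T} \gamma
\quad \text{subject to} \quad \gamma_i \in \mathcal{K}_i, \quad i = 1, \dots, n_c
% Its first-order optimality condition is a cone complementarity problem:
\gamma \in \mathcal{K}, \qquad N\gamma + r \in \mathcal{K}^{*}, \qquad \gamma^{T} (N\gamma + r) = 0
```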

  12. Quadratic Optimization w/ Conic Constraints (CCQO’s)

  13. CCQO’s: First Order Methods
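
As an illustration of what a first-order method looks like for this class of problems, here is a minimal sketch of plain projected gradient descent with a friction-cone projection. It is not the APGD solver used in the results (APGD adds acceleration), and the function names, the dense list-of-lists matrix `N`, the vector `r`, the fixed `step` and the single friction coefficient `mu` are all hypothetical choices made for this example.

```python
import math


def project_cone(n, u, v, mu):
    """Euclidean projection of (n, u, v) onto the cone sqrt(u^2 + v^2) <= mu * n.

    Assumes mu > 0.
    """
    t = math.hypot(u, v)
    if t <= mu * n:
        # Already inside the friction cone.
        return (n, u, v)
    if mu * t <= -n:
        # Inside the polar cone: the closest point is the apex.
        return (0.0, 0.0, 0.0)
    # Otherwise project onto the cone surface.
    n_p = (n + mu * t) / (1.0 + mu * mu)
    scale = mu * n_p / t
    return (n_p, scale * u, scale * v)


def projected_gradient(N, r, mu, step, iters):
    """Minimize 0.5 * g^T N g + r^T g with each contact triple kept in its cone.

    A fixed step no larger than 1 / (largest eigenvalue of N) is assumed.
    """
    m = len(r)
    gamma = [0.0] * m
    for _ in range(iters):
        # Gradient of the quadratic objective: N * gamma + r.
        grad = [sum(N[i][j] * gamma[j] for j in range(m)) + r[i] for i in range(m)]
        trial = [gamma[i] - step * grad[i] for i in range(m)]
        # Project each contact's (normal, tangent, tangent) triple independently.
        for c in range(0, m, 3):
            gamma[c:c + 3] = project_cone(trial[c], trial[c + 1], trial[c + 2], mu)
    return gamma
```

Each iteration costs one matrix-vector product plus a cheap per-contact projection, which is why such methods have inexpensive iterations but may need many of them.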

  14. CCQO’s: Second Order Methods • Start from the original problem. • Reformulate the constraints via an indicator function (zero where a constraint is satisfied, infinite otherwise). • Approximate the indicator via a logarithmic barrier.
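
The three steps can be sketched for a generic inequality-constrained problem. The functions $f$ and $c_i$ and the barrier parameter $t$ are assumed notation for this illustration; the slides' own formulation for the conic constraints may differ.

```latex
% Assumed generic form (not taken from the slides).
% Original problem:
\min_{x} \; f(x) \quad \text{subject to} \quad c_i(x) \le 0, \quad i = 1, \dots, m
% Reformulation via an indicator function:
\min_{x} \; f(x) + \sum_{i=1}^{m} I_{-}\!\left(c_i(x)\right),
\qquad
I_{-}(u) = \begin{cases} 0 & \text{if } u \le 0 \\ \infty & \text{otherwise} \end{cases}
% Approximation via logarithmic barrier, with parameter t > 0:
\min_{x} \; f(x) - \frac{1}{t} \sum_{i=1}^{m} \log\!\left(-c_i(x)\right)
```

As $t$ grows, the barrier term approaches the indicator function, so the approximation tightens.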

  15. Interior Point

  16. Numerical Results

  17. Results: Physical Model • Several numerical experiments were performed using a model of spheres falling into a bucket.

  18. Results: Comparison of Solver Results • The filling simulation was run for 3 seconds with a step size of h = 10^-3 seconds using the APGD and PDIP solvers. [Figure: two panels, PDIP and APGD, plotting Weight [N] against Time [s]; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5.]

  19. Results: Comparison of Solver Iterations • The filling simulation was run for 3 seconds with a step size of h = 10^-3 seconds using the APGD and PDIP solvers. [Figure: two panels, PDIP and APGD, plotting Iterations [#] against Time [s]; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5.]

  20. Results: Comparison of Solver Execution Time • The filling simulation was run for 3 seconds with a step size of h = 10^-3 seconds using the APGD and PDIP solvers. [Figure: two panels, PDIP and APGD.]

  21. Results: Comparison of Solvers • The filling simulation was run for 3 seconds with a step size of h = 10^-3 seconds using the APGD and PDIP solvers. [Figure: PDIP and APGD, plotting Iterations [#] against Time [s]; legend entries 1e-1, 1e-2, 1e-3, 1e-4, 1e-5.]

  22. Preconditioning with SPIKE

  23. The SPIKE algorithm • SPIKE: a divide-and-conquer approach to solving dense banded systems. • Proposed by A. H. Sameh and D. J. Kuck in 1978 (see also E. Polizzi and A. H. Sameh, Parallel Computing 32(2), 2006). • Basic idea: (1) partition the matrix A; (2) factorize A to isolate independent blocks; (3) solve a reduced system to account for coupling information; (4) recover the solution of the original system. • SPIKE comes in two main flavors: Full-SPIKE recursively solves an exact reduced system (a direct solver for banded matrices); Truncated-SPIKE solves an approximate reduced system in one step (and needs iterative refinement).

  24. SPIKE: algorithmic details. Partitioning and Factorization • Partition A and factorize it into a block-diagonal matrix D and a spike matrix S, so that A = DS.

  25. SPIKE: algorithmic details. Solving Dg = b • Reduces to solving P independent (dense banded) linear systems. • Map these systems to P blocks on the GPU. • Apply classical LU (or UL) methods to each sub-system.

  26. SPIKE factorization in plain math • The right (V_i) and left (W_i) spike blocks can be obtained through the solution of P independent multiple-RHS banded linear systems.

  27. SPIKE: algorithmic details. Solving Sx = g (full SPIKE) • Combine all coupling blocks into a reduced matrix. • (Recursively) solve the reduced system. • Recover the solution from the reduced solution. [Figure: combining the coupling blocks.]

  28. SPIKE: algorithmic details. Solving Sx = g (truncated SPIKE) • Justified for diagonally dominant systems only. • All spike blocks W and V are approximated by their top and bottom parts, respectively. • Results in a decoupling of the reduced matrix into (P - 1) small independent systems of size 2K x 2K. [Figure: truncating the spike blocks.]
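
The partition, factorize, reduce and recover steps can be sketched on the CPU for the simplest banded case, a tridiagonal matrix (K = 1), where each truncated interface system is 2 x 2. This is an illustrative pure-Python sketch, not the GPU implementation from the talk; the function names `thomas` and `truncated_spike`, the array layout (`lower[i]` is the entry left of the diagonal in row i, `upper[i]` the entry to its right) and the assumption that the number of partitions divides the matrix size are all choices made for this example.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def truncated_spike(lower, diag, upper, rhs, parts):
    """Truncated SPIKE for a tridiagonal system split into `parts` equal blocks."""
    size = len(diag) // parts  # assumes parts divides the matrix size
    g, v, w = [], [], []
    for p in range(parts):
        s, e = p * size, (p + 1) * size
        blk = (lower[s:e], diag[s:e], upper[s:e])
        # Solve D g = b: one independent block solve per partition.
        g.append(thomas(*blk, rhs[s:e]))
        # Right spike V_p: block solve against the coupling to the next block.
        ev = [0.0] * size
        if p < parts - 1:
            ev[-1] = upper[e - 1]
        v.append(thomas(*blk, ev))
        # Left spike W_p: block solve against the coupling to the previous block.
        ew = [0.0] * size
        if p > 0:
            ew[0] = lower[s]
        w.append(thomas(*blk, ew))
    # Truncation: keep only the bottom of V_p and the top of W_{p+1}, which
    # decouples the reduced system into (parts - 1) independent 2 x 2 systems.
    a = [0.0] * parts  # a[p]: last unknown of block p
    b = [0.0] * parts  # b[p]: first unknown of block p + 1
    for p in range(parts - 1):
        vb, wt = v[p][-1], w[p + 1][0]
        gb, gt = g[p][-1], g[p + 1][0]
        det = 1.0 - vb * wt
        a[p] = (gb - vb * gt) / det
        b[p] = (gt - wt * gb) / det
    # Recover the full solution block by block using the complete spikes.
    x = []
    for p in range(parts):
        right = b[p] if p < parts - 1 else 0.0
        left = a[p - 1] if p > 0 else 0.0
        x.extend(g[p][i] - v[p][i] * right - w[p][i] * left for i in range(size))
    return x
```

With two partitions nothing is dropped, so the result matches a direct solve up to round-off. With more partitions the result is approximate, and the error shrinks as the matrix becomes more diagonally dominant, which is the reason truncated SPIKE is paired with iterative refinement or used as a preconditioner.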

  29. Truncated SPIKE as a preconditioner • Fundamental idea: reorder a sparse matrix to obtain a banded matrix with as “heavy” a diagonal as possible; drop small entries far from the main diagonal in an attempt to produce an even narrower band; use truncated SPIKE on the resulting banded matrix. • Sparse matrix reordering is critical: non-zeroes can spread, while we prefer them to gather around the diagonal; both truncated SPIKE and BiCGStab(2) prefer diagonal elements with large absolute values. • Reordering strategies: use row permutations to maximize the product of absolute diagonal values, A → QA; apply symmetric RCM (reverse Cuthill-McKee) for bandwidth reduction, QA + A^T Q^T → P (QA + A^T Q^T) P^T.
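
The bandwidth-reduction step can be illustrated with a small reverse Cuthill-McKee sketch that works on the symmetric sparsity pattern of a matrix. This is a textbook variant written for illustration, not the reordering code used in the talk; the adjacency-dictionary input and the names `rcm_order` and `bandwidth` are assumptions of this example, and the diagonal-boosting row permutation is not shown.

```python
from collections import deque


def bandwidth(adj, order):
    """Half-bandwidth of the pattern `adj` when rows/columns follow `order`."""
    pos = {node: k for k, node in enumerate(order)}
    return max((abs(pos[i] - pos[j]) for i in adj for j in adj[i]), default=0)


def rcm_order(adj):
    """Reverse Cuthill-McKee ordering of a symmetric pattern.

    `adj` maps each row index to the column indices of its off-diagonal
    non-zeroes; the pattern is assumed symmetric.
    """
    degree = {i: len(adj[i]) for i in adj}
    visited, order = set(), []
    # Start each connected component from a node of minimum degree.
    for start in sorted(adj, key=lambda i: (degree[i], i)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Breadth-first search, visiting low-degree neighbours first.
            for nb in sorted(adj[node], key=lambda j: (degree[j], j)):
                if nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
    # Reversing the Cuthill-McKee order gives the RCM order.
    return order[::-1]
```

For example, a chain of unknowns whose labels are scrambled has a wide band in its natural order, while the RCM order places neighbouring unknowns next to each other and brings the pattern back to a tridiagonal one.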

  30. Numerical Results

  31. Results: Preconditioned PDIP (P-PDIP) • Adding preconditioning to the search direction computation drastically reduces computation time.

  32. Results: Effect of Problem Size • A series of simulations on filling models of increasing size was performed to estimate how solver performance scales with problem dimension.

  33. Conclusions & Future Work

  34. Conclusions • Interior point methods require far fewer iterations than gradient descent methods, but each iteration is much more computationally expensive. • Preconditioning is responsible for a four-fold reduction in run times when simulating nonsmooth contact problems. • Although demonstrated here with nonsmooth dynamics, this speed-up is independent of the specific formalism adopted for the formulation of the equations of motion.

  35. Future Work • Investigate improvements to the interior point algorithm. • Investigate SPIKE update strategies and preconditioner re-use. • Investigate the effectiveness of spectral reordering methods. • Understand and gauge the software implementation effort and simulation efficiency trade-offs related to moving from the GPU to parallel multi-core CPU architectures.

  36. Thank you. • Source available for download under BSD-3: http://spikegpu.sbel.org/ • For all of our animations, please visit https://vimeo.com/uwsbel • For more information about the Simulation-Based Engineering Laboratory, please visit http://sbel.wisc.edu/

  37. Thank You. melanz@wisc.edu, Simulation Based Engineering Lab, Wisconsin Applied Computing Center
