finding parallelism in general purpose linear programming
play

FINDING PARALLELISM IN GENERAL-PURPOSE LINEAR PROGRAMMING Daniel - PowerPoint PPT Presentation

FINDING PARALLELISM IN GENERAL-PURPOSE LINEAR PROGRAMMING Daniel Thuerck 1,2 (advisors Michael Goesele 1,2 and Marc Pfetsch 1 ) Maxim Naumov 3 1 Graduate School of Computational Engineering, TU Darmstadt 2 Graphics, Capture and Massively Parallel


  1. FINDING PARALLELISM IN GENERAL-PURPOSE LINEAR PROGRAMMING Daniel Thuerck 1,2 (advisors Michael Goesele 1,2 and Marc Pfetsch 1 ) Maxim Naumov 3 1 Graduate School of Computational Engineering, TU Darmstadt 2 Graphics, Capture and Massively Parallel Computing, TU Darmstadt 3 NVIDIA Research 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 1

  2. INTRODUCTION TO LINEAR PROGRAMMING 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 2

  3. Linear Programs min 𝑑 ⊀ 𝑦 Linear objective function s.t. 𝐡𝑦 ≀ 𝑐 Linear constraints 𝑦 β‰₯ 0 𝑑 π‘ˆ 𝑏 1 𝑐 1 𝑐 = 𝐡 = where and π‘ˆ 𝑏 𝑛 𝑐 𝑛 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 3

  4. Linear Programs: Applications [3P Logistics] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 4

  5. Lower-Level Parallelism in LP INTERNALS OF AN LP SOLVER 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 5

  6. Solving LPs min 𝑑 ⊀ 𝑦 οƒ˜ A is 𝑛 Γ— π‘œ matrix, with 𝑛 β‰ͺ π‘œ s.t. 𝐡𝑦 = 𝑐 οƒ˜ A is sparse and has full row-rank . 𝑦 β‰₯ 0 οƒ˜ Variables may be bounded: π‘š ≀ 𝑦 ≀ 𝑣 β€œStandard” LP format 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 6

  7. Solving LPs Simplex Interior Point 𝑑 𝑑 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 7

  8. Solving LPs Simplex Interior Point (IPM) 𝐡 ⊀ 𝐸 β€œAugmented (Newton) System” 𝐡 β€œBasis” (active set) β€œNormal 𝐡𝐸 βˆ’1 𝐡 ⊀ 𝐡 𝐢 = Equations” 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 8

  9. Solving LPs IPM / Aug. System IPM / Normal Equations 𝐡 ⊀ 𝐸 𝐡𝐸 βˆ’1 𝐡 ⊀ 𝐡 οƒ˜ (𝑛 + π‘œ) Γ— (𝑛 + π‘œ) , sparse οƒ˜ 𝑛 Γ— 𝑛 , SPD, mi migh ght be dense se οƒ˜ Symmetric, indefinite οƒ˜ Squared condition number οƒ˜ Solution: Indefinite LDL T or οƒ˜ Solution: Cholesky-factorization MINRES method or CG method 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 9

  10. Solving LPs IPM / Aug. System IPM / Normal Equations 𝐡 ⊀ 𝐸 𝐡𝐸 βˆ’1 𝐡 ⊀ 𝐡 οƒ˜ (𝑛 + π‘œ) Γ— (𝑛 + π‘œ) , sparse οƒ˜ 𝑛 Γ— 𝑛 , SPD, mi migh ght be dense se οƒ˜ Symmetric, indefinite οƒ˜ Squared condition number οƒ˜ Solution: Indefinite LDL T or οƒ˜ Solution: Cholesky-factorization MINRES method or CG method 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 10

  11. Introducing culip-lp … An ongoing implementation of Mehrotra’s Primal-Dual interior point algorithm [1], featuring... οƒΌ (Iterati rative ve ) Linear Algebra based on the β€œ Aug ugment ented ed Matrix rix ” approach, οƒΌ Ful ull-ran rank guarantees, οƒΌ Comprehensive pre repro proce cessi ssing & pre resc scaling aling. Towards solving large-scale LPs on the GPU as open source ce for everybody 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 11

  12. Progress report IMPLEMENTING CULIP-LP 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 12

  13. Solver architecture Preprocess Scale Standardize IPM loop 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 13

  14. Solver architecture Preprocess Scale Standardize IPM loop In Input t data: οƒ˜ Constraints 𝐡 π‘“π‘Ÿ 𝑦 = 𝑐 π‘“π‘Ÿ οƒ˜ Constraints 𝐡 π‘šπ‘“ 𝑦 ≀ 𝑐 π‘šπ‘“ οƒ˜ Objective vector 𝑑 οƒ˜ Bounds (on some variables) π‘š, 𝑣 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 14

  15. Solver architecture Preprocess Scale Standardize IPM loop Storage ge forma mat: t: CSR 𝐡 π‘“π‘Ÿ 𝑦 = 𝑐 π‘“π‘Ÿ οƒ˜ Compressed sparse row format 𝐡 π‘šπ‘“ 𝑦 ≀ 𝑐 π‘šπ‘“ οƒ˜ Provides efficient row-based access by 3 arr rrays ays: 𝑑 𝑏 𝑐 0 row_ptr π‘š, 𝑣 0 1 1 2 0 col_Ind 𝑑 2 a b c d e val 3 𝑒 𝑓 4 5 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 15

  16. Solver architecture Preprocess Scale Standardize IPM loop οƒ˜ Ex Examp mple: e: LP β€œ pb-simp-nonunif ” (see [2]) 𝐡 π‘“π‘Ÿ 𝑦 = 𝑐 π‘“π‘Ÿ οƒ˜ Input matrix: 1,4 Mio x 23k with 4,36 Mio nonzeros 𝐡 π‘šπ‘“ 𝑦 ≀ 𝑐 π‘šπ‘“ οƒ˜ Removed 1 singleton inequality 𝑑 οƒ˜ Removed 3629 low-forcing constraints π‘š, 𝑣 Execute in rounds οƒ˜ Removed 1 fixed variable οƒ˜ Removed 1,1 Mio (!) singleton inequalities οƒ˜ Result: approx. 3,6 6 Mio nonzeros removed 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 16

  17. Solver architecture Preprocess Scale Standardize IPM loop Goal: : Reduce element variance in matrices 𝐡 π‘“π‘Ÿ 𝑦 = 𝑐 π‘“π‘Ÿ οƒ˜ Scaling [3] makes a difference 𝐡 π‘šπ‘“ 𝑦 ≀ 𝑐 π‘šπ‘“ 1. Geometric scaling (1x – 4x) 𝑑 𝐡 𝑗,β‹… 𝐡 𝑗,β‹… = max |𝐡 𝑗,β‹… | min(|𝐡 𝑗,β‹… |) π‘š, 𝑣 2. Equilibration (1x) 𝐡 𝑗,β‹… 𝐡 𝑗,β‹… = 𝐡 𝑗,β‹… 2 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 17

  18. Solver architecture Preprocess Scale Standardize IPM loop Goal: : Forma mat LP in in standar ard form 𝐡 π‘“π‘Ÿ 𝑦 = 𝑐 π‘“π‘Ÿ οƒ˜ Shift variables: l ≀ 𝑦 ≀ 𝑣 β†’ 0 ≀ 𝑦 β€² ≀ 𝑣 + π‘š 𝐡 π‘šπ‘“ 𝑦 ≀ 𝑐 π‘šπ‘“ 𝑑 οƒ˜ Split (free) variables 𝑦 β†’ 𝑦 = 𝑦 + βˆ’ 𝑦 βˆ’ 𝑦 + , 𝑦 βˆ’ β‰₯ 0 π‘š, 𝑣 𝐡 π‘šπ‘“ 𝐽 𝑐 𝑀𝑓 οƒ˜ Build std ’ matrix: = 𝐡 π‘“π‘Ÿ 𝑐 π‘“π‘Ÿ 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 18

  19. Solver architecture Preprocess Scale Standardize IPM loop En Ensure re A has f full rank (sym ymbolica ically ly) 𝐡𝑦 = 𝑐 𝑑 𝑄𝐡𝑅 = 𝑛 𝑣 𝑣 𝑛 𝑑 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 19

  20. Solver architecture Preprocess Scale Standardize IPM loop οƒ˜ Goal: Solve KKT conditions by Newton steps 𝐡𝑦 = 𝑐 οƒ˜ Steps: 𝑑 οƒ˜ Augmented matrix assembly 𝑣 𝑀 𝑑 οƒ˜ Solv lving ing the e (ind ndef efinit inite) ) augmented mented matrix ix 𝑀 π‘ž οƒ˜ Solv lve twice ce: predictor and corrector οƒ˜ Stepsize along 𝑀 = 𝑀 π‘ž + 𝑀 𝑑 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 20

  21. Solving the augmented system Iterat rative ive stra rategy: egy: οƒ˜ Symmetric, indefinite: use MINRES [4] (in parts) οƒ˜ Equilibrate system implicitly 𝐡 ⊀ 𝐸 οƒ˜ Preconditioner: Experiments ongoing 𝐡 Dire rect ct strate rategy: gy: οƒ˜ Symmetric, indefinite: use SPRAL SSIDS [5] οƒ˜ Reordering by METIS [6] οƒ˜ Scaling for large pivots 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 21

  22. Intermediate findings PERFORMANCE EVALUATION 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 22

  23. Benchmark problems Problem name [7] M N NNZ ex9 40,962 10,404 517,112 ex10 696,608 17,680 1,162,000 neos-631710 169,576 167,056 834,166 bley_xl1 175,620 5831 869,391 map06 328,818 164,547 549,920 map10 328,818 164,547 549,920 nb10tb 150,495 73340 1,172,289 neos-142912 58,726 416,040 1,855,220 in 1,526,202 1,449,074 6,811,639 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 23

  24. Performance Problem name [7] NNZ CLP barr [sec] culip-lp [sec] ex9 517,112 X (NC) 81 ex10 1,162,000 X (NS) 141 neos-631710 834,166 172 478 bley_xl1 869,391 X (NS) 1,492 map06 549,920 X (NC) 466 map10 549,920 X (NC) 615 nb10tb 1,172,289 X (NC) 2,461 neos-142912 1,855,220 356 447 in 6,811,639 X (NS) NC X – failed, NS – did not start 1 st iteration, NC – did not converged within 1 hour 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 24

  25. Runtime breakdown 10 MINRES Predictor 9 Corrector 8 7 SPRAL time [sec] 6 5 4 3 2 1 0 1 11 21 31 41 51 61 71 81 91 IPM step Example: map10 [7] Problem: map10 [7] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 25

  26. Iterative vs. direct methods MINRES Iterations MINRES relative residual 6000 1.E+00 1.E-01 5000 Relative Residual 1.E-02 4000 Iterations 1.E-03 3000 1.E-04 Predictor 2000 1.E-05 Corrector 1000 1.E-06 1.E-07 0 1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13 IPM step IPM step Example: map10 [7] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 26

  27. Numerical difficulty Condition of matrix Remedies 𝐡 ⊀ 𝐸 οƒ˜ 2x2 pivoting in factorizations (e.g. 𝑀𝐸𝑀 ⊀ 𝐡 in SPRAL) οƒ˜ Preconditioning for MINRES or GMRES depends mainly on 𝐸 = 𝑒𝑗𝑏𝑕(𝑦) β‹… 𝑒𝑗𝑏𝑕(𝑑) with strong duality towards the end often yielding where max(𝑦 𝑗 𝑑 𝑗 ) β‰ˆ 10 10 x T =[x 1 ,…,x n ] are solution and min 𝑦 𝑗 𝑑 𝑗 s T =[s 1 ,…,s n ] are slack variables 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 27

Recommend


More recommend