FINDING PARALLELISM IN GENERAL-PURPOSE LINEAR PROGRAMMING Daniel Thuerck 1,2 (advisors Michael Goesele 1,2 and Marc Pfetsch 1 ) Maxim Naumov 3 1 Graduate School of Computational Engineering, TU Darmstadt 2 Graphics, Capture and Massively Parallel Computing, TU Darmstadt 3 NVIDIA Research 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 1
INTRODUCTION TO LINEAR PROGRAMMING 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 2
Linear Programs min π β€ π¦ Linear objective function s.t. π΅π¦ β€ π Linear constraints π¦ β₯ 0 π π π 1 π 1 π = π΅ = where and π π π π π 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 3
Linear Programs: Applications [3P Logistics] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 4
Lower-Level Parallelism in LP INTERNALS OF AN LP SOLVER 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 5
Solving LPs min π β€ π¦ ο A is π Γ π matrix, with π βͺ π s.t. π΅π¦ = π ο A is sparse and has full row-rank . π¦ β₯ 0 ο Variables may be bounded: π β€ π¦ β€ π£ βStandardβ LP format 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 6
Solving LPs Simplex Interior Point π π 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 7
Solving LPs Simplex Interior Point (IPM) π΅ β€ πΈ βAugmented (Newton) Systemβ π΅ βBasisβ (active set) βNormal π΅πΈ β1 π΅ β€ π΅ πΆ = Equationsβ 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 8
Solving LPs IPM / Aug. System IPM / Normal Equations π΅ β€ πΈ π΅πΈ β1 π΅ β€ π΅ ο (π + π) Γ (π + π) , sparse ο π Γ π , SPD, mi migh ght be dense se ο Symmetric, indefinite ο Squared condition number ο Solution: Indefinite LDL T or ο Solution: Cholesky-factorization MINRES method or CG method 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 9
Solving LPs IPM / Aug. System IPM / Normal Equations π΅ β€ πΈ π΅πΈ β1 π΅ β€ π΅ ο (π + π) Γ (π + π) , sparse ο π Γ π , SPD, mi migh ght be dense se ο Symmetric, indefinite ο Squared condition number ο Solution: Indefinite LDL T or ο Solution: Cholesky-factorization MINRES method or CG method 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 10
Introducing culip-lp β¦ An ongoing implementation of Mehrotraβs Primal-Dual interior point algorithm [1], featuring... οΌ (Iterati rative ve ) Linear Algebra based on the β Aug ugment ented ed Matrix rix β approach, οΌ Ful ull-ran rank guarantees, οΌ Comprehensive pre repro proce cessi ssing & pre resc scaling aling. Towards solving large-scale LPs on the GPU as open source ce for everybody 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 11
Progress report IMPLEMENTING CULIP-LP 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 12
Solver architecture Preprocess Scale Standardize IPM loop 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 13
Solver architecture Preprocess Scale Standardize IPM loop In Input t data: ο Constraints π΅ ππ π¦ = π ππ ο Constraints π΅ ππ π¦ β€ π ππ ο Objective vector π ο Bounds (on some variables) π, π£ 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 14
Solver architecture Preprocess Scale Standardize IPM loop Storage ge forma mat: t: CSR π΅ ππ π¦ = π ππ ο Compressed sparse row format π΅ ππ π¦ β€ π ππ ο Provides efficient row-based access by 3 arr rrays ays: π π π 0 row_ptr π, π£ 0 1 1 2 0 col_Ind π 2 a b c d e val 3 π π 4 5 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 15
Solver architecture Preprocess Scale Standardize IPM loop ο Ex Examp mple: e: LP β pb-simp-nonunif β (see [2]) π΅ ππ π¦ = π ππ ο Input matrix: 1,4 Mio x 23k with 4,36 Mio nonzeros π΅ ππ π¦ β€ π ππ ο Removed 1 singleton inequality π ο Removed 3629 low-forcing constraints π, π£ Execute in rounds ο Removed 1 fixed variable ο Removed 1,1 Mio (!) singleton inequalities ο Result: approx. 3,6 6 Mio nonzeros removed 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 16
Solver architecture Preprocess Scale Standardize IPM loop Goal: : Reduce element variance in matrices π΅ ππ π¦ = π ππ ο Scaling [3] makes a difference π΅ ππ π¦ β€ π ππ 1. Geometric scaling (1x β 4x) π π΅ π,β π΅ π,β = max |π΅ π,β | min(|π΅ π,β |) π, π£ 2. Equilibration (1x) π΅ π,β π΅ π,β = π΅ π,β 2 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 17
Solver architecture Preprocess Scale Standardize IPM loop Goal: : Forma mat LP in in standar ard form π΅ ππ π¦ = π ππ ο Shift variables: l β€ π¦ β€ π£ β 0 β€ π¦ β² β€ π£ + π π΅ ππ π¦ β€ π ππ π ο Split (free) variables π¦ β π¦ = π¦ + β π¦ β π¦ + , π¦ β β₯ 0 π, π£ π΅ ππ π½ π ππ ο Build std β matrix: = π΅ ππ π ππ 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 18
Solver architecture Preprocess Scale Standardize IPM loop En Ensure re A has f full rank (sym ymbolica ically ly) π΅π¦ = π π ππ΅π = π π£ π£ π π 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 19
Solver architecture Preprocess Scale Standardize IPM loop ο Goal: Solve KKT conditions by Newton steps π΅π¦ = π ο Steps: π ο Augmented matrix assembly π£ π€ π ο Solv lving ing the e (ind ndef efinit inite) ) augmented mented matrix ix π€ π ο Solv lve twice ce: predictor and corrector ο Stepsize along π€ = π€ π + π€ π 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 20
Solving the augmented system Iterat rative ive stra rategy: egy: ο Symmetric, indefinite: use MINRES [4] (in parts) ο Equilibrate system implicitly π΅ β€ πΈ ο Preconditioner: Experiments ongoing π΅ Dire rect ct strate rategy: gy: ο Symmetric, indefinite: use SPRAL SSIDS [5] ο Reordering by METIS [6] ο Scaling for large pivots 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 21
Intermediate findings PERFORMANCE EVALUATION 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 22
Benchmark problems Problem name [7] M N NNZ ex9 40,962 10,404 517,112 ex10 696,608 17,680 1,162,000 neos-631710 169,576 167,056 834,166 bley_xl1 175,620 5831 869,391 map06 328,818 164,547 549,920 map10 328,818 164,547 549,920 nb10tb 150,495 73340 1,172,289 neos-142912 58,726 416,040 1,855,220 in 1,526,202 1,449,074 6,811,639 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 23
Performance Problem name [7] NNZ CLP barr [sec] culip-lp [sec] ex9 517,112 X (NC) 81 ex10 1,162,000 X (NS) 141 neos-631710 834,166 172 478 bley_xl1 869,391 X (NS) 1,492 map06 549,920 X (NC) 466 map10 549,920 X (NC) 615 nb10tb 1,172,289 X (NC) 2,461 neos-142912 1,855,220 356 447 in 6,811,639 X (NS) NC X β failed, NS β did not start 1 st iteration, NC β did not converged within 1 hour 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 24
Runtime breakdown 10 MINRES Predictor 9 Corrector 8 7 SPRAL time [sec] 6 5 4 3 2 1 0 1 11 21 31 41 51 61 71 81 91 IPM step Example: map10 [7] Problem: map10 [7] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 25
Iterative vs. direct methods MINRES Iterations MINRES relative residual 6000 1.E+00 1.E-01 5000 Relative Residual 1.E-02 4000 Iterations 1.E-03 3000 1.E-04 Predictor 2000 1.E-05 Corrector 1000 1.E-06 1.E-07 0 1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13 IPM step IPM step Example: map10 [7] 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 26
Numerical difficulty Condition of matrix Remedies π΅ β€ πΈ ο 2x2 pivoting in factorizations (e.g. ππΈπ β€ π΅ in SPRAL) ο Preconditioning for MINRES or GMRES depends mainly on πΈ = ππππ(π¦) β ππππ(π‘) with strong duality towards the end often yielding where max(π¦ π π‘ π ) β 10 10 x T =[x 1 ,β¦,x n ] are solution and min π¦ π π‘ π s T =[s 1 ,β¦,s n ] are slack variables 10.05.2017 | TU Darmstadt | GCC / GSC CE | Daniel Thuerck, Maxim Naumov | 27
Recommend
More recommend