  1. Improved Convergence for ℓ∞ and ℓ1 Regression via Iteratively Reweighted Least Squares. Alina Ene, Adrian Vladu.

  2. IRLS Method. Basic primitive: min ∑ r_i x_i² subject to Ax = b.

  3. IRLS Method. Basic primitive: min ∑ r_i x_i² subject to Ax = b. The solution is given by one linear system solve: x = R⁻¹Aᵀ(AR⁻¹Aᵀ)⁻¹b, where R = diag(r).
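
A minimal numpy sketch of this primitive, under the convention that A is n × m and r > 0 entrywise; the helper name weighted_ls is ours, not from the talk:

```python
import numpy as np

def weighted_ls(A, b, r):
    """Solve min sum_i r_i x_i^2 subject to Ax = b via the closed
    form x = R^{-1} A^T (A R^{-1} A^T)^{-1} b, where R = diag(r)."""
    Rinv = 1.0 / r                     # R^{-1}, stored as a vector
    M = A @ (Rinv[:, None] * A.T)      # A R^{-1} A^T
    # lstsq tolerates a rank-deficient M (e.g. graph incidence matrices).
    lam = np.linalg.lstsq(M, b, rcond=None)[0]
    return Rinv * (A.T @ lam)          # x = R^{-1} A^T lam
```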

  4. IRLS Method. Basic primitive: min ∑ r_i x_i² subject to Ax = b, solved by one linear system solve: x = R⁻¹Aᵀ(AR⁻¹Aᵀ)⁻¹b, with R = diag(r). "Hard" problem: min |x|_p subject to Ax = b, for p ∈ {1, ∞}.

  5. IRLS Method. Basic primitive: min ∑ r_i x_i² subject to Ax = b, solved by one linear system solve: x = R⁻¹Aᵀ(AR⁻¹Aᵀ)⁻¹b, with R = diag(r). "Hard" problem: min |x|_p subject to Ax = b, for p ∈ {1, ∞}, equivalent to linear programming.

  10. Benchmark: Optimization on Graphs. min |x|∞ subject to Ax = b. (Figure: a directed graph with source s and sink t.)

  11. Benchmark: Optimization on Graphs. min |x|∞ subject to Ax = b: minimize the congestion of the flow x.

  12. Benchmark: Optimization on Graphs. Boundary condition: x routes the demand from s to t.

  13. Benchmark: Optimization on Graphs. This is maximum flow: minimize congestion while routing the demand from s to t. (Figure: the optimal flow puts .5 on every used edge and 0 on the middle edge.)

  14. Benchmark: Optimization on Graphs. min |x|₁ subject to Ax = b. (Figure: a graph with demands +1, +1 at the sources and -1, -1 at the sinks.)

  15. Benchmark: Optimization on Graphs. min |x|₁ subject to Ax = b: minimize the cost of the flow x.

  16. Benchmark: Optimization on Graphs. Boundary condition: x routes the demand from the +1 nodes to the -1 nodes.

  17. Benchmark: Optimization on Graphs. This is minimum cost flow. (Figure: the optimal flow puts 1 on the used edges and 0 elsewhere.)

  18. Benchmark: Optimization on Graphs. Summary: min |x|∞ subject to Ax = b is max flow; min |x|₁ subject to Ax = b is min cost flow.
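
As slide 5 notes, both problems are linear programs. A minimal sketch, assuming scipy is available, solving the ℓ∞ (congestion) version on a small s-t graph; the graph and all variable names here are illustrative, not from the talk:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical tiny s-t graph: nodes {s=0, a=1, b=2, t=3}.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
n, m = 4, len(edges)

# Edge-vertex incidence matrix: (Ax)_v = net flow out of node v.
A = np.zeros((n, m))
for j, (u, v) in enumerate(edges):
    A[u, j] = 1.0   # edge j leaves u
    A[v, j] = -1.0  # edge j enters v

b = np.array([1.0, 0.0, 0.0, -1.0])  # route one unit of demand s -> t
A, b = A[:-1], b[:-1]                 # drop one redundant conservation row

# min |x|_inf s.t. Ax = b as an LP over (x, t): minimize t subject to
# Ax = b and -t <= x_j <= t for every edge j.
c = np.zeros(m + 1)
c[-1] = 1.0
A_ub = np.block([[ np.eye(m), -np.ones((m, 1))],
                 [-np.eye(m), -np.ones((m, 1))]])
b_ub = np.zeros(2 * m)
A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * (m + 1))
print("optimal congestion:", res.fun)  # 0.5: the unit splits over two paths
```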

  19. Benchmark: Optimization on Graphs. Q: Are these problems really that hard?

  22. Benchmark: Optimization on Graphs. Q: Are these problems really that hard?
  First order methods (gradient descent): ➜ running time strongly depends on the matrix structure ➜ in general, takes time at least Ω(m^{1.5}/poly(ε)).
  Second order methods (Newton's method, IRLS): ➜ interior point methods need Õ(m^{1/2}) linear system solves ➜ this can be brought down to Õ(n^{1/2}) with a lot of work [Lee-Sidford '14].
  "Hybrid" method: ➜ [Christiano-Kelner-Madry-Spielman-Teng '11] needs Õ(m^{1/3}/ε^{11/3}) linear system solves ➜ ~30 pages of description and proofs for a complicated method.

  23. This work. The natural IRLS method runs in Õ(m^{1/3}/ε^{2/3} + 1/ε²) iterations.

  24. This work. The natural IRLS method runs in Õ(m^{1/3}/ε^{2/3} + 1/ε²) iterations, no matter what the structure of the underlying matrix is.

  25. This work. Running example: min |x|∞ subject to Ax = b, seeking a solution with |x|∞ ≤ OPT. (Figure: s-t graph.)

  26. This work. Guess the OPT value (.5 in the example).

  28. This work. Initialize r = 1. (Figure: every edge carries weight 1.)

  29. This work. Solve the least squares problem min ∑ r_i x_i² subject to Ax = b. (Figure: the resulting flow has values between .2 and .6; the most congested edges carry .6.)

  30. This work. Update r: r_i ← r_i · max{(x_i/OPT)², 1}. (Figure: the weights of the edges carrying .6 increase to 1.44 = (.6/.5)²; the rest stay at 1.)

  32. This work. (Figure: the updated weights r, with 1.44 on the previously congested edges and 1 elsewhere.)

  33. This work. Solve the least squares problem again under the new weights. (Figure: the flow rebalances, with values between .11 and .55; the maximum congestion drops to .55.)

  34. This work. Update r again: the congested edges' weights grow to 1.75 ≈ 1.44 · (.55/.5)².

  35. This work. (Figure: the current weights r, now up to 1.75.)

  36. This work. The full loop: guess the OPT value; initialize r = 1; repeat: solve min ∑ r_i x_i² subject to Ax = b, then update r_i ← r_i · max{(x_i/OPT)², 1}.

  37. This work. (Figure: after further iterations, the bottleneck edges' weights reach 2.)

  38. This work. The flow converges to the optimum: congestion .5 on every used edge, matching the guessed OPT value (a code sketch of the loop follows).
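
A compact sketch of the loop traced in these frames, assuming numpy; the function name irls_linf and the fixed iteration count are our illustrative choices, not the paper's exact algorithm (which also needs a stopping rule, and in practice the OPT guess is searched over):

```python
import numpy as np

def irls_linf(A, b, opt_guess, num_iters=100):
    """IRLS for min |x|_inf s.t. Ax = b, as on the slides:
    initialize r = 1; repeat: solve the weighted least squares
    problem, then set r_i <- r_i * max{(x_i / OPT)^2, 1}."""
    m = A.shape[1]
    r = np.ones(m)                               # Initialize r = 1
    for _ in range(num_iters):
        # Solve min sum_i r_i x_i^2 s.t. Ax = b via the closed form
        # x = R^{-1} A^T (A R^{-1} A^T)^{-1} b from slide 3.
        Rinv = 1.0 / r
        M = A @ (Rinv[:, None] * A.T)
        lam = np.linalg.lstsq(M, b, rcond=None)[0]
        x = Rinv * (A.T @ lam)
        # Update r: coordinates exceeding the guess get heavier weights.
        r = r * np.maximum((x / opt_guess) ** 2, 1.0)
    return x
```

On the s-t example above, with opt_guess = 0.5, the returned flow's maximum entry should approach .5, as in slide 38.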

  41. Nonstandard Optimization Primitive. ➜ The objective function is max_{r≥0} min_{Ax=b} ∑ r_i x_i² / ∑ r_i. ➜ The analysis is similar to packing/covering LPs [Young '01]. ➜ The ℓ1 version is a type of "slime mold dynamics" [Straszak-Vishnoi '16, '17].
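
To see why this is a sensible objective, note (this reasoning is ours, not spelled out on the slide): for any r ≥ 0 the inner minimum is at most OPT², because the optimal x* satisfies |x*_i| ≤ OPT for every i, so ∑ r_i x*_i² ≤ OPT² ∑ r_i. The weight updates can thus be read as ascending this lower bound. A quick numeric check of the inequality on the earlier example graph:

```python
import numpy as np

def primitive_value(A, b, r):
    """min_{Ax=b} sum_i r_i x_i^2 / sum_i r_i for fixed weights r > 0."""
    Rinv = 1.0 / r
    M = A @ (Rinv[:, None] * A.T)
    lam = np.linalg.lstsq(M, b, rcond=None)[0]
    x = Rinv * (A.T @ lam)
    return float((r * x**2).sum() / r.sum())

# Same tiny s-t graph as before; OPT (the minimum congestion) is 0.5.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
A = np.zeros((4, len(edges)))
for j, (u, v) in enumerate(edges):
    A[u, j], A[v, j] = 1.0, -1.0
A, b = A[:-1], np.array([1.0, 0.0, 0.0])  # drop a redundant row

# Whatever r >= 0 we pick, the value never exceeds OPT^2 = 0.25.
rng = np.random.default_rng(0)
for _ in range(5):
    r = rng.uniform(0.1, 10.0, size=len(edges))
    assert primitive_value(A, b, r) <= 0.25 + 1e-9
```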
