Algorithms with provable guarantees for clustering problems, Ola Svensson

  1. Algorithms with provable guarantees for clustering problems Ola Svensson

  2. Where to place rescue centers? Build k centers so as to minimize sum of travel distances

  3. Where to place rescue centers? Build k centers so as to optimize some objective, such as minimizing the sum of travel distances.

  4. Median and Center. MEDIAN: Open a point/facility on the real line so as to minimize the sum of distances from clients.

  7. Median and Center. MEDIAN: Open a point/facility on the real line so as to minimize the sum of distances from clients. Moving the facility one way: decrease distance for 3 clients, increase distance for 6 clients.

  8. Median and Center. MEDIAN: Open a point/facility on the real line so as to minimize the sum of distances from clients. Moving the facility one way: decrease distance for 3 clients, increase distance for 6 clients. Moving it the other way: decrease distance for 6 clients, increase distance for 3 clients.

  10. Median and Center. MEDIAN: Open a point/facility on the real line so as to minimize the sum of distances from clients. CENTER: Open a point/facility on the real line so as to minimize the max distance over all clients.
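
The two one-dimensional objectives above can be compared on a small example. This is a minimal sketch, not taken from the slides: the client positions and the function names (`median_cost`, `center_cost`, `best_median`, `best_center`) are made up for illustration. It relies on two standard facts: a median of the client positions minimizes the sum of distances, and the midpoint of the two extreme clients minimizes the maximum distance.

```python
def median_cost(clients, facility):
    """MEDIAN objective: sum of distances from all clients to the facility."""
    return sum(abs(c - facility) for c in clients)


def center_cost(clients, facility):
    """CENTER objective: maximum distance from any client to the facility."""
    return max(abs(c - facility) for c in clients)


def best_median(clients):
    """A median of the positions minimizes the sum of distances."""
    ordered = sorted(clients)
    return ordered[len(ordered) // 2]


def best_center(clients):
    """The midpoint of the extreme clients minimizes the max distance."""
    return (min(clients) + max(clients)) / 2


# Hypothetical instance: 9 clients on the real line.
clients = [1, 2, 3, 10, 11, 12, 13, 14, 30]
m = best_median(clients)
c = best_center(clients)
print("median facility", m, "cost", median_cost(clients, m))
print("center facility", c, "cost", center_cost(clients, c))
```

The single far-away client at 30 pulls the center solution toward it but barely affects the median solution, which is the qualitative difference between the two objectives.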

  13. K-Median and K-Center. K-MEDIAN: Open k points/facilities in a metric space so as to minimize the sum of distances from clients.

  16. K-Median and K-Center. K-MEDIAN: Open k points/facilities in a metric space so as to minimize the sum of distances from clients. K-CENTER: Open k points/facilities in a metric space so as to minimize the max distance over all clients.

  18. Mathematical formulation of objective functions

  19. Mathematical formulation of objective functions. General problem parameterized by $p \ge 1$: find a set $S$ of $k$ points/facilities in a metric space so as to minimize $\left( \sum_{j \in \text{clients}} d(j, S)^p \right)^{1/p}$.

  20. Mathematical formulation of objective functions. General problem parameterized by $p \ge 1$: find a set $S$ of $k$ points/facilities in a metric space so as to minimize $\left( \sum_{j \in \text{clients}} d(j, S)^p \right)^{1/p}$, where $d(j, S)$ is the distance from client $j$ to the closest facility in $S$. K-MEDIAN: $p = 1$. K-CENTER: $p = \infty$. K-MEANS: $p = 2$ (actually, $\sum_{j \in \text{clients}} d(j, S)^2$ and the Euclidean metric).
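
To make the parameterized objective concrete, here is a minimal sketch that evaluates it for $p = 1$, $p = 2$ and $p = \infty$ on one instance. The function names, the `dist` parameter and the sample points are assumptions made for illustration; they do not come from the slides.

```python
import math


def dist_to_closest(client, facilities, dist):
    """d(j, S): distance from a client to the closest facility in S."""
    return min(dist(client, f) for f in facilities)


def clustering_cost(clients, facilities, p, dist):
    """(sum_j d(j, S)^p)^(1/p); for p = infinity this is max_j d(j, S)."""
    ds = [dist_to_closest(j, facilities, dist) for j in clients]
    if math.isinf(p):
        return max(ds)
    return sum(d ** p for d in ds) ** (1.0 / p)


# Hypothetical instance in the Euclidean plane.
clients = [(0, 0), (0, 3), (4, 0), (10, 0)]
S = [(0, 0), (10, 0)]

k_median = clustering_cost(clients, S, 1, math.dist)
k_means_root = clustering_cost(clients, S, 2, math.dist)
k_center = clustering_cost(clients, S, math.inf, math.dist)
print(k_median, k_means_root, k_center)
```

With $p = 2$ the function returns the square root of the usual k-means cost; since the square root is monotone, both versions are minimized by the same set $S$.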

  21. Facility Location. FACILITY LOCATION: Open facilities in a metric space so as to minimize the sum of distances from clients + opening costs.
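
The facility location objective trades connection cost against opening cost. The sketch below evaluates it and, for a tiny instance only, tries every non-empty set of facilities. The instance, the names `facility_location_cost` and `brute_force`, and the exhaustive search are illustrative assumptions, not a method from the slides; the search takes exponential time in general.

```python
from itertools import combinations


def facility_location_cost(clients, opened, opening_cost, dist):
    """Sum of client distances to the closest open facility + opening costs."""
    connection = sum(min(dist(j, i) for i in opened) for j in clients)
    opening = sum(opening_cost[i] for i in opened)
    return connection + opening


def brute_force(clients, opening_cost, dist):
    """Try every non-empty subset of sites (exponential; tiny inputs only)."""
    sites = sorted(opening_cost)
    best = None
    for r in range(1, len(sites) + 1):
        for opened in combinations(sites, r):
            cost = facility_location_cost(clients, opened, opening_cost, dist)
            if best is None or cost < best[0]:
                best = (cost, opened)
    return best


# Hypothetical instance on the real line: site position -> opening cost.
clients = [0, 1, 9, 10]
opening_cost = {0: 5, 5: 1, 10: 5}
line_dist = lambda a, b: abs(a - b)

best_cost, best_sites = brute_force(clients, opening_cost, line_dist)
print(best_cost, best_sites)
```

Unlike k-median, the number of open facilities is not fixed in advance: here the cheap middle site alone costs 19, while paying for both end sites brings the total down to 12.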

  22. ALL THESE PROBLEMS ARE INTRACTABLE (NP-HARD) IN THE WORST CASE

  23. Solving intractable problems. Heuristics: good for "typical" instances; bad instances do not happen too often. [Figure: timeline from the 50's to the 00's with a vertical axis from 1 to 16384. Dantzig, Fulkerson, and Johnson solve a 49-city instance to optimality. Applegate, Bixby, Chvatal, Cook, and Helsgaun solve a 24978-city instance. Sweden has only 9 million inhabitants: ≈ 360 persons/city!]

  24. Solving intractable problems. Approximation Algorithms: Perhaps we can efficiently find a reasonably good solution? Approximation ratio: worst case over all instances. α = 1 is an exact polynomial time algorithm. α = 1.01 means the algorithm finds a solution with at most 1% higher cost.
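
The slide's formula for the approximation ratio did not survive extraction. For a minimization problem the standard definition can be written as follows, where $\mathrm{ALG}(I)$ and $\mathrm{OPT}(I)$ are notation assumed here for the cost of the algorithm's solution and of an optimal solution on instance $I$:

```latex
\alpha \;=\; \sup_{\text{instances } I} \frac{\mathrm{ALG}(I)}{\mathrm{OPT}(I)},
\qquad\text{equivalently}\qquad
\mathrm{ALG}(I) \;\le\; \alpha \cdot \mathrm{OPT}(I) \quad \text{for every instance } I .
```

With $\alpha = 1.01$ this reads $\mathrm{ALG}(I) \le 1.01 \cdot \mathrm{OPT}(I)$, which is the "at most 1% higher cost" statement above.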

  25. GOAL: Complete understanding of worst case behavior

  26. State of the Art (best known approximation ratio / hardness of approximation)
      Facility Location: approximation 1.488 [Li'11] / hardness 1.463 [Guha & Khuller'98]
      K-Center: approximation 2 [Gonzalez'85, Hochbaum & Shmoys'85] / hardness 2 [Hsu & Nemhauser'79]
      K-Median: approximation 2.67 [Byrka et al.'15] / hardness 1+2/e [Jain et al.'02]
      K-Means: approximation 9 [Kanungo et al.'04] / hardness 1.0013 [Lee, Schmidt, Wright'15]
      Even better: approximation algorithms (can be) achieved by standard LP relaxations, and techniques transfer between problems.

  27. A 2-APPROXIMATION ALGORITHM FOR K-CENTER

  28. Greedy K-Center. Open any point. For i = 2, …, k: open the point farthest away from the already opened points.

  32. Analysis (of Greedy K-Center). Consider an optimal solution and the corresponding Voronoi diagram.

  33. Analysis. Case 1: We opened one point in each cell.

  35. Analysis. Case 1: We opened one point in each cell. Within a cell, a client and the opened point are each within distance ≤ OPT of the cell's optimal center.

  36. Analysis. Case 1: We opened one point in each cell. A client and the opened point in its cell are each within distance ≤ OPT of the cell's optimal center, so by the triangle inequality they are within distance ≤ 2 · OPT of each other. In this case any client is connected within distance ≤ 2 · OPT.

  37. Analysis. Case 2: We did not open one point in each cell.

  38. Analysis. Case 2: We opened two points in a single cell.

  39. Analysis. Case 2: We opened two points in a single cell. Each of the two points is within distance ≤ OPT of the cell's optimal center.

  40. Analysis. Case 2: We opened two points in a single cell. Each of the two points is within distance ≤ OPT of the cell's optimal center, so they are within distance ≤ 2 · OPT of each other. The later of the two was the point farthest from the already opened points when it was chosen, so at that time every client was already within distance ≤ 2 · OPT of an opened point. Also in this case any client is connected within distance ≤ 2 · OPT.

  41. Greedy K-Center. Open any point. For i = 2, …, k: open the point farthest away from the already opened points. THEOREM [Gonzalez'85, Hochbaum & Shmoys'85]: The above greedy algorithm is a 2-approximation for k-Center.
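
The greedy rule above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: "open any point" is resolved by taking the first input point, and the function name `greedy_k_center`, the `dist` parameter and the sample instance are made up for this example.

```python
def greedy_k_center(points, k, dist):
    """Farthest-first traversal: returns (opened centers, covering radius)."""
    centers = [points[0]]  # "open any point": here, simply the first one
    # d[i] = distance from points[i] to its closest opened center
    d = [dist(p, centers[0]) for p in points]
    for _ in range(1, k):
        # Open the point farthest away from the already opened points.
        far = max(range(len(points)), key=lambda i: d[i])
        centers.append(points[far])
        d = [min(d[i], dist(points[i], points[far])) for i in range(len(points))]
    return centers, max(d)


# Hypothetical instance on the real line.
points = [0, 1, 2, 10, 11, 12, 20]
centers, radius = greedy_k_center(points, 3, lambda a, b: abs(a - b))
print(centers, radius)
```

On this instance the greedy solution has radius 2, while opening 1, 11 and 20 gives radius 1, so the factor of 2 in the theorem is attained here. Keeping the array `d` up to date makes each round linear in the number of points, for O(nk) distance evaluations in total.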

  42. ALGORITHMS FOR FACILITY LOCATION AND K-MEDIAN

  43. LINEAR PROGRAMMING RELAXATION

  44. LP Relaxation for Facility Location. LINEAR PROGRAM: • $y_i$ takes value 1 if $i$ is opened and 0 otherwise • $x_{ij}$ takes value 1 if $j$ is connected to $i$ and 0 otherwise

  45. LP Relaxation for Facility Location. LINEAR PROGRAM: • $y_i$ takes value 1 if $i$ is opened and 0 otherwise • $x_{ij}$ takes value 1 if $j$ is connected to $i$ and 0 otherwise
      minimize $\sum_{i \in F} f_i y_i + \sum_{i \in F, j \in C} d_{ij} x_{ij}$ (first term: opening cost; second term: connection cost)
      subject to $\sum_{i \in F} x_{ij} = 1$ for $j \in C$
                 $x_{ij} \le y_i$ for $i \in F, j \in C$
                 $x_{ij}, y_i \in [0,1]$ for $i \in F, j \in C$

  46. LP Relaxation for Facility Location. The constraint $\sum_{i \in F} x_{ij} = 1$ for $j \in C$ says: every client is connected.

  47. LP Relaxation for Facility Location. The constraint $x_{ij} \le y_i$ for $i \in F, j \in C$ says: clients are connected to open facilities.
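
The sketch below checks the LP constraints and evaluates the LP objective for a given assignment of the variables, without calling an LP solver. The three-facility instance and the helper names `lp_objective` and `is_feasible` are assumptions made for illustration. It compares one integral solution with one fractional solution in which every facility is half open.

```python
def lp_objective(f, d, x, y):
    """Opening cost sum_i f_i*y_i plus connection cost sum_{i,j} d_ij*x_ij."""
    opening = sum(f[i] * y[i] for i in f)
    connection = sum(d[i, j] * x[i, j] for (i, j) in d)
    return opening + connection


def is_feasible(F, C, x, y, eps=1e-9):
    """Check sum_i x_ij = 1, x_ij <= y_i, and x_ij, y_i in [0, 1]."""
    for j in C:
        if abs(sum(x[i, j] for i in F) - 1) > eps:
            return False  # some client is not fully connected
    for i in F:
        if not (-eps <= y[i] <= 1 + eps):
            return False
        for j in C:
            if x[i, j] < -eps or x[i, j] > y[i] + eps:
                return False  # connected beyond how far the facility is open
    return True


# Hypothetical instance: each facility is close (1) to two clients, far (10) from one.
F, C = ["a", "b", "c"], [1, 2, 3]
close = {"a": (1, 2), "b": (2, 3), "c": (3, 1)}
f = {i: 4 for i in F}
d = {(i, j): (1 if j in close[i] else 10) for i in F for j in C}

# Integral solution: open a and b; clients 1 and 2 use a, client 3 uses b.
y_int = {"a": 1, "b": 1, "c": 0}
x_int = {(i, j): 0 for i in F for j in C}
x_int["a", 1] = x_int["a", 2] = x_int["b", 3] = 1

# Fractional solution: every facility half open, clients split between close ones.
y_frac = {i: 0.5 for i in F}
x_frac = {(i, j): (0.5 if j in close[i] else 0.0) for i in F for j in C}

print(lp_objective(f, d, x_int, y_int), lp_objective(f, d, x_frac, y_frac))
```

Both solutions satisfy the constraints, but the fractional one costs 9 against 11 for the integral one, so the LP value can be strictly smaller than the cost of integral solutions. Turning a fractional solution back into an integral one is the rounding step that LP-based approximation algorithms have to provide.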
