Algorithms with provable guarantees for clustering problems Ola Svensson
Where to place rescue centers? Optimize some objective: build k centers so as to minimize the sum of travel distances.
Median and Center
MEDIAN: Open a point/facility on the real line so as to minimize the sum of distances from clients. Moving the facility toward the side with more clients lowers the cost: a move that decreases the distance for 6 clients and increases it for 3 clients is an improvement, while a move that decreases the distance for 3 clients and increases it for 6 clients is not.
CENTER: Open a point/facility on the real line so as to minimize the maximum distance over all clients.
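The two one-dimensional objectives can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: the function names and the client positions are hypothetical, and the facility may be placed at any point of the line.

```python
def median_point(clients):
    """Return a point minimizing the SUM of distances to the clients."""
    pts = sorted(clients)
    # A (lower) median client is an optimal location on the real line.
    return pts[(len(pts) - 1) // 2]


def center_point(clients):
    """Return the point minimizing the MAX distance to any client."""
    # The midpoint between the two extreme clients is optimal.
    return (min(clients) + max(clients)) / 2


def sum_cost(x, clients):
    return sum(abs(x - c) for c in clients)


def max_cost(x, clients):
    return max(abs(x - c) for c in clients)


# Hypothetical instance: 6 clients on the left, 3 on the right.
clients = [1, 2, 3, 4, 5, 6, 20, 21, 22]
m = median_point(clients)
c = center_point(clients)
print("median:", m, "sum of distances:", sum_cost(m, clients))
print("center:", c, "max distance:", max_cost(c, clients))
```

On this instance the median sits among the six left clients, while the center is pulled to the midpoint of the two extremes, which shows how differently the two objectives treat outlying clients.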
K-Median and K-Center
K-MEDIAN: Open k points/facilities in a metric space so as to minimize the sum of distances from clients.
K-CENTER: Open k points/facilities in a metric space so as to minimize the maximum distance over all clients.
Mathematical formulation of objective functions
General problem parameterized by $p \ge 1$: find a set $S$ of $k$ points/facilities in a metric space so as to minimize
$$\Big( \sum_{j \in \text{clients}} d(j, S)^p \Big)^{1/p},$$
where $d(j, S)$ is the distance from client $j$ to the closest facility in $S$.
K-MEDIAN: $p = 1$
K-CENTER: $p = \infty$
K-MEANS: $p = 2$ (actually, $\sum_{j \in \text{clients}} d(j, S)^2$ and the Euclidean metric)
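As a rough sketch of the parameterized objective, the following Python function evaluates the cost of a given facility set for any p, including p = ∞. The names `clustering_cost` and `dist_to_set` and the sample points are assumptions made for illustration only.

```python
import math


def dist_to_set(j, S, d):
    """d(j, S): distance from client j to the closest facility in S."""
    return min(d(j, i) for i in S)


def clustering_cost(clients, S, d, p):
    """(sum_j d(j, S)^p)^(1/p); p = math.inf gives the max distance."""
    dists = [dist_to_set(j, S, d) for j in clients]
    if math.isinf(p):
        return max(dists)
    return sum(x ** p for x in dists) ** (1.0 / p)


# Hypothetical instance in the Euclidean plane with one open facility.
clients = [(0, 0), (3, 4), (10, 0)]
S = [(0, 0)]

k_median_cost = clustering_cost(clients, S, math.dist, 1)         # p = 1
k_center_cost = clustering_cost(clients, S, math.dist, math.inf)  # p = infinity
k_means_root = clustering_cost(clients, S, math.dist, 2)          # p = 2
print(k_median_cost, k_center_cost, k_means_root)
```

Note that for p = 2 this returns the square root of the usual k-means cost; minimizing either quantity selects the same set S.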
Facility Location
FACILITY LOCATION: Open facilities in a metric space so as to minimize the sum of distances from clients plus the opening costs.
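To make the trade-off between opening costs and connection costs concrete, here is a small brute-force sketch in Python. It tries every nonempty set of facilities, so it only works on tiny instances. The helper names, the positions on the line, and the opening costs are all hypothetical.

```python
from itertools import combinations


def facility_location_cost(opened, clients, open_cost, d):
    """Opening costs plus each client's distance to its closest open facility."""
    opening = sum(open_cost[i] for i in opened)
    connection = sum(min(d(i, j) for i in opened) for j in clients)
    return opening + connection


def brute_force_facility_location(facilities, clients, open_cost, d):
    """Try all nonempty facility subsets (exponential time; illustration only)."""
    best = None
    for r in range(1, len(facilities) + 1):
        for opened in combinations(facilities, r):
            cost = facility_location_cost(opened, clients, open_cost, d)
            if best is None or cost < best[0]:
                best = (cost, opened)
    return best


# Hypothetical instance on the real line: facilities are named by position.
facilities = [0, 10]
open_cost = {0: 3, 10: 4}
clients = [1, 2, 9]
line_dist = lambda a, b: abs(a - b)

best = brute_force_facility_location(facilities, clients, open_cost, line_dist)
print(best)
```

Unlike k-median, the number of open facilities is not fixed in advance: opening a second facility here pays off because its opening cost is smaller than the connection cost it saves.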
ALL THESE PROBLEMS ARE INTRACTABLE (NP-HARD) IN THE WORST CASE
Solving intractable problems
• Heuristics
  • good for "typical" instances
  • bad instances do not happen too often
[Figure: size of traveling salesman instances solved to optimality, by decade (50's to 00's), on a logarithmic axis from 1 to 16384 cities. Dantzig, Fulkerson, and Johnson solve a 49-city instance to optimality. Applegate, Bixby, Chvatal, Cook, and Helsgaun solve a 24978-city instance. Sweden has only 9 million inhabitants, i.e. about 360 persons/city!]
Solving intractable problems
• Approximation Algorithms
  • Perhaps we can efficiently find a reasonably good solution?
Approximation ratio $\alpha$: the worst case, over all instances, of the cost of the solution found divided by the cost of an optimal solution.
• $\alpha = 1$ is an exact polynomial-time algorithm
• $\alpha = 1.01$: the algorithm finds a solution with at most 1% higher cost
GOAL: Complete understanding of worst case behavior
State of the Art
Problem             Approximation                            Hardness
Facility Location   1.488 [Li'11]                            1.463 [Guha & Khuller'98]
K-Center            2 [Gonzalez'85, Hochbaum & Shmoys'85]    2 [Hsu & Nemhauser'79]
K-Median            2.67 [Byrka et al.'15]                   1+2/e [Jain et al.'02]
K-Means             9 [Kanungo et al.'04]                    1.0013 [Lee, Schmidt, Wright'15]
Even better: the approximation algorithms (can be) achieved by standard LP relaxations, and techniques transfer between problems.
A 2-APPROXIMATION ALGORITHM FOR K-CENTER
Greedy K-Center
Open any point
For i = 2, …, k
  Open the point farthest away from the already opened points
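The greedy rule translates almost line by line into Python. The sketch below is an illustration under stated assumptions: "any point" is taken to be the first input point, centers are chosen among the input points, k is at most the number of points, and the brute-force `optimal_radius` helper is a hypothetical addition used only to compare against the optimum on a tiny instance.

```python
from itertools import combinations


def greedy_k_center(points, k, d):
    """Farthest-first traversal: returns (centers, covering radius)."""
    centers = [points[0]]  # "open any point": here simply the first one
    # dist[j] = distance from points[j] to its closest opened center
    dist = [d(p, centers[0]) for p in points]
    for _ in range(2, k + 1):
        far = max(range(len(points)), key=lambda j: dist[j])
        centers.append(points[far])
        dist = [min(dist[j], d(points[j], points[far]))
                for j in range(len(points))]
    return centers, max(dist)


def k_center_radius(points, centers, d):
    return max(min(d(p, c) for c in centers) for p in points)


def optimal_radius(points, k, d):
    """Exhaustive search over all k-subsets (tiny instances only)."""
    return min(k_center_radius(points, cs, d)
               for cs in combinations(points, k))


# Hypothetical instance on the real line.
points = [0, 1, 2, 10, 11, 20]
line_dist = lambda a, b: abs(a - b)

centers, radius = greedy_k_center(points, 3, line_dist)
opt = optimal_radius(points, 3, line_dist)
print(centers, radius, opt)
```

Maintaining the `dist` array means each round costs one pass over the points, so the whole procedure uses on the order of n·k distance evaluations.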
Analysis
Consider an optimal solution and the corresponding Voronoi diagram. Every point in a cell is within distance $\le \mathrm{OPT}$ of that cell's optimal center.
Case I: We opened one point in each cell. A client and the point we opened in its cell are both within distance $\le \mathrm{OPT}$ of the cell's optimal center, so by the triangle inequality they are within distance $\le 2 \cdot \mathrm{OPT}$ of each other. In this case any client is connected within distance $\le 2 \cdot \mathrm{OPT}$.
Case II: We did not open one point in each cell, so we opened two points in a single cell. Both are within distance $\le \mathrm{OPT}$ of that cell's optimal center, hence within distance $\le 2 \cdot \mathrm{OPT}$ of each other. The later of the two was, when it was opened, the point farthest away from the already opened points, so every client was already within distance $\le 2 \cdot \mathrm{OPT}$ of an opened point. Also in this case any client is connected within distance $\le 2 \cdot \mathrm{OPT}$.
THEOREM [Gonzalez'85, Hochbaum & Shmoys'85]: The above greedy algorithm (open any point; for i = 2, …, k, open the point farthest away from the already opened points) is a 2-approximation for k-Center.
ALGORITHMS FOR FACILITY LOCATION AND K-MEDIAN
LINEAR PROGRAMMING RELAXATION
LP Relaxation for Facility Location
LINEAR PROGRAM:
• $y_i$ takes value 1 if $i$ is opened and 0 otherwise
• $x_{ij}$ takes value 1 if $j$ is connected to $i$ and 0 otherwise
Here $F$ is the set of facilities, $C$ the set of clients, $f_i$ the opening cost of facility $i$, and $d_{ij}$ the distance between $i$ and $j$.
$$\text{minimize} \quad \sum_{i \in F} f_i y_i + \sum_{i \in F,\, j \in C} d_{ij} x_{ij} \qquad \text{(opening cost + connection cost)}$$
$$\text{subject to} \quad \sum_{i \in F} x_{ij} = 1 \quad \forall j \in C \qquad \text{(every client is connected)}$$
$$x_{ij} \le y_i \quad \forall i \in F,\, j \in C \qquad \text{(clients are connected to open facilities)}$$
$$x_{ij},\, y_i \in [0, 1] \quad \forall i \in F,\, j \in C$$
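The constraints of the relaxation can be checked mechanically. The following Python sketch evaluates the LP objective and tests feasibility of a candidate (x, y); it does not solve the LP. The dictionary layout, the function names, and the two-facility instance are assumptions made for illustration.

```python
def lp_objective(f, d, x, y):
    """Opening cost sum_i f_i*y_i plus connection cost sum_ij d_ij*x_ij."""
    opening = sum(f[i] * y[i] for i in f)
    connection = sum(d[i][j] * x[i][j] for i in d for j in d[i])
    return opening + connection


def is_lp_feasible(f, clients, x, y, tol=1e-9):
    """Check the three constraint families of the relaxation."""
    for j in clients:
        # Every client is (fractionally) connected to total extent 1.
        if abs(sum(x[i][j] for i in f) - 1.0) > tol:
            return False
    for i in f:
        if not (-tol <= y[i] <= 1 + tol):
            return False
        for j in clients:
            # 0 <= x_ij <= y_i: connect only to the opened fraction of i.
            if not (-tol <= x[i][j] <= y[i] + tol):
                return False
    return True


# Hypothetical instance: two facilities, two clients.
f = {"a": 4, "b": 4}
clients = ["u", "v"]
d = {"a": {"u": 1, "v": 3}, "b": {"u": 3, "v": 1}}

# Integral solution: open only facility "a".
y_int = {"a": 1, "b": 0}
x_int = {"a": {"u": 1, "v": 1}, "b": {"u": 0, "v": 0}}

# Fractional solution: open both facilities to extent 1/2.
y_frac = {"a": 0.5, "b": 0.5}
x_frac = {"a": {"u": 0.5, "v": 0.5}, "b": {"u": 0.5, "v": 0.5}}

print(is_lp_feasible(f, clients, x_int, y_int), lp_objective(f, d, x_int, y_int))
print(is_lp_feasible(f, clients, x_frac, y_frac), lp_objective(f, d, x_frac, y_frac))
```

Every integral solution is feasible for the relaxation, which is why the LP optimum is a lower bound on the cost of an optimal facility location solution; the fractional point above is feasible for the LP but corresponds to no actual set of opened facilities.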