

  1. Lecture 7: Arora-Rao-Vazirani

  2. Lecture Outline
• Part I: Semidefinite Programming Relaxation for Sparsest Cut
• Part II: Combining Approaches
• Part III: Arora-Rao-Vazirani Analysis Overview
• Part IV: Analyzing Matchings of Close Points
• Part V: Reduction to the Well-Separated Case
• Part VI: Open Problems

  3. Part I: Semidefinite Programming Relaxation for Sparsest Cut

  4. Problem Reformulation
• Reformulation: Want to minimize Σ_{i,j: i<j, (i,j)∈E(G)} (x_j − x_i)² over all cut pseudo-metrics, normalized so that Σ_{i,j: i<j} (x_j − x_i)² = 1
• More precisely, take d²(i, j) = (x_j − x_i)² and minimize Σ_{i,j: i<j, (i,j)∈E(G)} d²(i, j) subject to:
1. ∃c: ∀i, x_i ∈ {−c, +c}
2. Σ_{i,j: i<j} d²(i, j) = 1
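
A quick check of why this ratio is the sparsity (a step the slide leaves implicit): if S is a cut and we set x_i = +c for i ∈ S and x_i = −c otherwise, then (x_j − x_i)² = 4c² exactly when (i, j) crosses the cut and 0 otherwise. The edge sum divided by the all-pairs sum is therefore E(S, S̄) / (|S| ⋅ |S̄|), the sparsity of S; the normalization constraint just fixes the scale c.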

  5. Problem Relaxation
• Reformulation: Minimize Σ_{i,j: i<j, (i,j)∈E(G)} (x_i² − 2x_i x_j + x_j²) subject to:
1. ∃c: ∀i, x_i ∈ {−c, +c}
2. Σ_{i,j: i<j} (x_i² − 2x_i x_j + x_j²) = 1
• Relaxation: Minimize Σ_{i,j: i<j, (i,j)∈E(G)} (M_ii − 2M_ij + M_jj) subject to:
1. ∀i, j, M_ii = M_jj
2. Σ_{i,j: i<j} (M_ii − 2M_ij + M_jj) = 1
3. M ⪰ 0

  6. Bad Example: The Cycle
• Consider the cycle of length n. The semidefinite program can place the cycle on the unit circle and assign each x_i the corresponding vector v_i.
[Figure: the 5-cycle G, with vertices 1, …, 5 mapped to vectors v_1, …, v_5 spaced around the unit circle.]

  7. Bad Example: The Cycle
• Σ_{i,j: i<j} d²(i, j) = Θ(n²)
• Σ_{i,j: i<j, (i,j)∈E(G)} d²(i, j) = Θ(n ⋅ 1/n²) = Θ(1/n)
• Gives sparsity Θ(1/n³), but the true value is Θ(1/n²)
• Gap is Ω(n), which is horrible!
[Figure: the same 5-cycle embedding on the unit circle.]
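
A minimal numpy sanity check of these asymptotics, assuming the SDP places the n-cycle evenly on the unit circle as above (the script is illustrative, not from the lecture):

```python
import numpy as np

# Place the n-cycle evenly on the unit circle, as the (unconstrained) SDP may do.
n = 1000
angles = 2 * np.pi * np.arange(n) / n
v = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# Squared distances between all pairs and between cycle neighbors.
diff = v[:, None, :] - v[None, :, :]
d2 = (diff ** 2).sum(axis=2)
total = d2[np.triu_indices(n, k=1)].sum()             # Theta(n^2)
edge_sum = sum(d2[i, (i + 1) % n] for i in range(n))  # Theta(1/n)

print(edge_sum / total)   # SDP sparsity, Theta(1/n^3)
print(1 / n ** 2)         # true sparsest cut value, Theta(1/n^2): an Omega(n) gap
```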

  8. Part II: Combining Approaches

  9. Adding the Triangle Inequalities
• Why did the semidefinite program do so much worse than the linear program?
• Missing: triangle inequalities d²(i, k) ≤ d²(i, j) + d²(j, k)
• What happens if we add the triangle inequalities to the semidefinite program?

  10. Geometric Picture
• Let Θ be the angle between v_i − v_j and v_k − v_j
• ‖v_k − v_i‖² = ‖v_j − v_i‖² + ‖v_k − v_j‖² if Θ = π/2
• ‖v_k − v_i‖² > ‖v_j − v_i‖² + ‖v_k − v_j‖² if Θ > π/2
• ‖v_k − v_i‖² < ‖v_j − v_i‖² + ‖v_k − v_j‖² if Θ < π/2
• Triangle inequalities ⟺ no obtuse angles
[Figure: vectors v_i, v_j, v_k, with the angle Θ at v_j.]
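
The equivalence comes from a one-line identity, checked numerically below with random points in ℝ³ (an illustrative script, not from the lecture):

```python
import numpy as np

# d^2(i,k) - [d^2(i,j) + d^2(j,k)] = -2 (v_i - v_j) . (v_k - v_j), so the
# l2^2 triangle inequality at (i,j,k) holds iff the angle at v_j is not obtuse.
rng = np.random.default_rng(0)
for _ in range(10_000):
    vi, vj, vk = rng.normal(size=(3, 3))
    gap = np.sum((vk - vi) ** 2) - np.sum((vj - vi) ** 2) - np.sum((vk - vj) ** 2)
    assert np.isclose(gap, -2 * np.dot(vi - vj, vk - vj))
```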

  11. Fixing Cycle Example
• Putting n > 4 vectors evenly on a circle violates the triangle inequality, so the semidefinite program no longer behaves badly on the cycle. In fact, it gets very close to the right answer.
[Figure: three of the vectors v_i, v_j, v_k, with the angle Θ at v_j.]

  12. Goemans-Linial Relaxation
• Semidefinite program (proposed by Goemans and Linial): Minimize Σ_{i,j: i<j, (i,j)∈E(G)} (M_ii − 2M_ij + M_jj) subject to:
1. ∀i, j, M_ii = M_jj
2. ∀i, j, k, d²(i, k) ≤ d²(i, j) + d²(j, k), where d²(i, j) = M_ii − 2M_ij + M_jj
3. Σ_{i,j: i<j} (M_ii − 2M_ij + M_jj) = 1
4. M ⪰ 0
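
A compact sketch of this relaxation in cvxpy (this assumes cvxpy with an SDP solver such as SCS is installed; the 5-cycle is just an example input):

```python
import cvxpy as cp

n = 5
edges = [(i, (i + 1) % n) for i in range(n)]  # the 5-cycle
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

M = cp.Variable((n, n), PSD=True)             # Gram matrix, constrained PSD
d2 = {(i, j): M[i, i] - 2 * M[i, j] + M[j, j] for (i, j) in pairs}
d2s = lambda i, j: d2[(min(i, j), max(i, j))]

constraints = [M[i, i] == M[0, 0] for i in range(1, n)]  # equal norms
constraints += [d2s(i, k) <= d2s(i, j) + d2s(j, k)       # triangle inequalities
                for i in range(n) for j in range(n) for k in range(n)
                if len({i, j, k}) == 3]
constraints += [sum(d2[p] for p in pairs) == 1]          # normalization

prob = cp.Problem(cp.Minimize(sum(d2s(i, j) for (i, j) in edges)), constraints)
prob.solve()
print(prob.value)  # SDP lower bound on the normalized sparsity
```

Dropping the triangle-inequality constraints recovers the weaker relaxation from slide 5.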

  13. Arora-Rao-Vazirani Theorem
• Theorem [ARV]: The Goemans-Linial relaxation for sparsest cut gives an O(√(log n))-approximation and has a polynomial time rounding algorithm.

  14. ℓ₂² Metric Spaces
• Also called metrics of negative type
• Definition: A metric d is an ℓ₂² metric if it is possible to assign a vector v_x to every point x such that d(x, y) = ‖v_y − v_x‖²
• Last time: general metrics can be embedded into ℓ₁ with O(log n) distortion
• Theorem [ALN08]: Any ℓ₂² metric embeds into ℓ₁ with O(√(log n) ⋅ log log n) distortion
• [ARV] analyzes the algorithm more directly

  15. Goemans-Linial Relaxation and SOS
• Degree 4 SOS captures the triangle inequality: if x_i² = x_j² = x_k² then
x_i² (x_k − x_i)² ≤ x_i² (x_j − x_i)² + x_i² (x_k − x_j)²
⟺ 2x_i² (x_i² − x_i x_k) ≤ 2x_i² (2x_i² − x_i x_j − x_j x_k)
• Proof: (x_i − x_j)² (x_j − x_k)² = 4(x_i² − x_i x_j)(x_i² − x_j x_k) = 4x_i² (x_i² − x_i x_j − x_j x_k + x_i x_k) ≥ 0, where both equalities use x_i² = x_j² = x_k². Rearranging gives exactly the inequality above.
• Thus, degree 4 SOS captures the Goemans-Linial relaxation
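
A small sympy check of the identity used in this proof, on the hypercube case x_j, x_k ∈ {±x_i} where the constraints x_i² = x_j² = x_k² hold automatically (illustrative only):

```python
import itertools
import sympy as sp

x = sp.symbols('x_i', real=True)
for s, t in itertools.product([-1, 1], repeat=2):
    xj, xk = s * x, t * x                    # forces x_j^2 = x_k^2 = x_i^2
    square = (x - xj) ** 2 * (xj - xk) ** 2  # manifestly a square, hence >= 0
    target = 4 * x**2 * (x**2 - x*xj - xj*xk + x*xk)
    assert sp.expand(square - target) == 0
```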

  16. Part III: Arora-Rao-Vazirani Analysis Overview

  17. Well-Spread Case
• The semidefinite program gives us one vector v_i for each vertex i.
• We first consider the case when these vectors are spread out.
• Definition: We say that a set of n vectors {v_i} is well-spread if it can be scaled so that:
1. ∀i, ‖v_i‖ ≤ 1
2. (1/n²) Σ_{i,j: i<j} d²_ij is Ω(1) (the average squared distance between vectors is constant)
• We will assume we are using this scaling.

  18. Structure Theorem
• Theorem: Given a set of n vectors {v_i} which are well-spread and obey the triangle inequality, there exist well-separated subsets X and Y of these vectors of linear size. In other words, there exist X, Y such that:
1. X and Y are Δ far apart (i.e. ∀v_i ∈ X, v_j ∈ Y, d²_ij ≥ Δ), where Δ is Ω(1/√(log n))
2. |X| and |Y| are both Ω(n)

  19. Finding a Sparse Cut
• Idea: If we have well-separated subsets X, Y, take a random cut of the form (S_r, S̄_r) where S_r = {i: d²(v_i, X) ≤ r}, d²(v_i, X) = min_{j: v_j ∈ X} d²_ij, and r ∈ [0, Δ]
• Each (i, j) ∈ E(G) contributes at most d²_ij/Δ to the expected number of edges cut (the edge is cut only when r lands between d²(v_i, X) and d²(v_j, X), which differ by at most d²_ij by the triangle inequality) and d²_ij to Σ_{i,j: i<j, (i,j)∈E(G)} d²_ij (the number of edges the SDP “thinks” are cut)
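
A sketch of this rounding step in Python, derandomized by sweeping the threshold r over [0, Δ] instead of sampling it (names and defaults here are illustrative, not from [ARV]):

```python
import numpy as np

def sweep_cut(v, X, edges, Delta, num_thresholds=200):
    """Return the sparsest cut of the form S_r = {i : d^2(v_i, X) <= r}, r in [0, Delta]."""
    n = len(v)
    # d^2(v_i, X) = min over v_j in X of ||v_i - v_j||^2
    dist2 = np.array([min(np.sum((v[i] - v[j]) ** 2) for j in X) for i in range(n)])
    best = (np.inf, None)
    for r in np.linspace(0, Delta, num_thresholds):
        S = dist2 <= r
        k = int(S.sum())
        if 0 < k < n:
            cut_edges = sum(1 for (i, j) in edges if S[i] != S[j])
            sparsity = cut_edges / (k * (n - k))
            if sparsity < best[0]:
                best = (sparsity, S.copy())
    return best
```

Taking the best r can only improve on a random r, so the expectation bound on the next slide still applies.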

  20. Finding a Sparse Cut Continued
• Since X, Y have size Ω(n) and are always on opposite sides of the cut, we always have that |S_r| ⋅ |S̄_r| is Θ(n²). This matches Σ_{i,j: i<j} d²_ij up to a constant factor. (This is why we need X and Y to have linear size!)
• Thus, the expected ratio of the sparsity to the SDP value is at most 1/Δ = O(√(log n)), as needed.

  21. Tight Example: Hypercube
• Take the hypercube {−1/√(log₂ n), +1/√(log₂ n)}^(log₂ n)
• X = {x: Σ_i x_i ≤ −1} and Y = {y: Σ_i y_i ≥ 1} have the following properties:
1. X and Y have linear size
2. ∀x ∈ X, y ∈ Y: x and y differ in ≥ √(log₂ n) coordinates. Thus, d²(x, y) ≥ √(log₂ n) ⋅ (2/√(log₂ n))² = 4/√(log₂ n)
• So linear-size sets here are only O(1/√(log n)) apart: the separation Δ = Ω(1/√(log n)) in the structure theorem cannot be improved
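
A quick numerical look at this example, sampling hypercube vertices rather than enumerating them (m stands for log₂ n; the script is illustrative):

```python
import numpy as np

m = 64                                           # m = log2(n)
rng = np.random.default_rng(0)
pts = rng.choice([-1.0, 1.0], size=(50_000, m)) / np.sqrt(m)

s = pts.sum(axis=1)                              # approximately N(0, 1)
X, Y = pts[s <= -1], pts[s >= 1]
print(len(X) / len(pts), len(Y) / len(pts))      # both tails have Omega(1) mass

# Cross pairs differ in >= sqrt(m) coordinates, each contributing (2/sqrt(m))^2,
# so d^2(x, y) >= 4/sqrt(m). Check on a subsample:
d2 = ((X[:200, None, :] - Y[None, :200, :]) ** 2).sum(axis=2)
print(d2.min(), 4 / np.sqrt(m))
```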

  22. Finding Well-Separated Sets
• Let d be the dimension such that ∀i, v_i ∈ ℝ^d.
• Algorithm (parameters σ > 0, Δ, d):
1. Choose a random u ∈ ℝ^d.
2. Find a value a such that there are Ω(n) vectors v_i with v_i ⋅ u ≤ a and Ω(n) vectors v_j with v_j ⋅ u ≥ a + σ/√d. Let X′ and Y′ be these two sets of vectors.
3. As long as there is a pair x ∈ X′, y ∈ Y′ such that d²(x, y) < Δ, delete x from X′ and y from Y′. The resulting sets will be the desired X, Y.
• Need to show: P[X, Y have size Ω(n)] is Ω(1)
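
A direct (quadratic-time) transcription of this procedure in Python; taking a fixed fraction of points on each side of the projection stands in for the choice of a and is a placeholder, not the value from the analysis:

```python
import numpy as np

def well_separated_sets(v, Delta, frac=0.25, seed=0):
    """Steps 1-3 above: project on a random direction, split, prune close pairs."""
    rng = np.random.default_rng(seed)
    n, d = v.shape
    u = rng.normal(size=d)                 # step 1: random direction
    u /= np.linalg.norm(u)
    order = np.argsort(v @ u)              # step 2: sort by projection value
    m = int(frac * n)
    X = set(order[:m].tolist())            # v_i . u small
    Y = set(order[-m:].tolist())           # v_j . u large (gap should be ~ sigma/sqrt(d))
    pruned = True
    while pruned:                          # step 3: delete close cross pairs
        pruned = False
        for i in list(X):
            j = next((j for j in Y if np.sum((v[i] - v[j]) ** 2) < Delta), None)
            if j is not None:
                X.discard(i)
                Y.discard(j)
                pruned = True
    return X, Y
```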

  23. Finding Well-Separated Sets
• Will first explain why steps 1, 2 succeed with probability 2δ > 0.
• Will then show that the probability that step 3 deletes a linear number of points is ≤ δ.
• Together, this implies that the entire algorithm succeeds with probability at least δ > 0.

  24. Behavior of Gaussian Projections
• What happens if we project a vector v of length l in a random direction in ℝ^d?
• Without loss of generality, assume v = l ⋅ e₁
• To pick a random unit vector in ℝ^d, choose each coordinate according to N(0, 1/d) (the normal distribution with mean 0 and standard deviation 1/√d), then rescale.
• If d is not too small, w.h.p. very little rescaling will be needed.

  25. Behavior of Gaussian Projections
• What happens if we project a vector of length l in a random direction in ℝ^d?
• The resulting value has a distribution which is ≈ a normal distribution with mean 0 and standard deviation l/√d (the difference comes from the rescaling step)
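
A numpy check of this distributional claim (the dimension, length, and trial count are arbitrary):

```python
import numpy as np

d, l, trials = 400, 3.0, 20_000
rng = np.random.default_rng(0)
w = np.zeros(d)
w[0] = l                                        # WLOG the vector is l * e_1

u = rng.normal(size=(trials, d))                # Gaussian coordinates...
u /= np.linalg.norm(u, axis=1, keepdims=True)   # ...rescaled to unit length

proj = u @ w                                    # one projection per random direction
print(proj.mean(), proj.std(), l / np.sqrt(d))  # std is close to l/sqrt(d)
```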

  26. Success of Steps 1, 2
• If we take a random u ∈ ℝ^d, with probability Ω(1), Σ_{i,j: i<j} ((v_j − v_i) ⋅ u)² is Ω(n²/d)
• Note: this can fail with non-negligible probability; consider the case when ∀i, v_i = ±v. If u is orthogonal to v then everything is projected to 0.
• For arbitrarily small η > 0, with very high probability, |v_i ⋅ u| is O(1/√d) for (1 − η)n of the i ∈ [1, n]
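
The failure mode in the second bullet is easy to see concretely (a toy instance, not from the lecture):

```python
import numpy as np

d = 100
w = np.zeros(d)
w[0] = 1.0
v = np.array([w, -w] * 25)   # every v_i is +w or -w

u = np.zeros(d)
u[1] = 1.0                   # a direction orthogonal to w
print(np.abs(v @ u).max())   # 0.0: all n points project to the same value
```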
