
Discrepancy and SDPs - Nikhil Bansal (TU Eindhoven, Netherlands)



  1. Discrepancy and SDPs Nikhil Bansal (TU Eindhoven, Netherlands )

  2. Outline
Discrepancy theory
• What is it
• Basic results (non-constructive)
SDP connection
• Algorithms for discrepancy
• New methods in discrepancy (upper/lower bounds)
• Approximation

  3. Discrepancy: Example
Input: n points placed arbitrarily in a grid. Color them red/blue such that each axis-parallel rectangle is colored as evenly as possible.
Discrepancy: max over rectangles R of | #red in R − #blue in R |.
A random coloring gives about O(n^{1/2} log^{1/2} n); one can achieve O(log^{2.5} n).
Why do we care?
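As a toy illustration of the quantity on this slide, the discrepancy of a coloring can be computed by brute force over all axis-parallel rectangles. This is a sketch on a hypothetical tiny instance; the grid size, point set, and function names are ours, not from the talk.

```python
import itertools
import random

# Hypothetical small instance: n random points in a g x g grid.
random.seed(0)
g, n = 8, 16
pts = random.sample([(x, y) for x in range(g) for y in range(g)], n)

def rect_discrepancy(coloring):
    """Max over axis-parallel rectangles of |#red - #blue| among covered points."""
    best = 0
    for x1, x2 in itertools.combinations_with_replacement(range(g), 2):
        for y1, y2 in itertools.combinations_with_replacement(range(g), 2):
            s = sum(c for (x, y), c in zip(pts, coloring)
                    if x1 <= x <= x2 and y1 <= y <= y2)
            best = max(best, abs(s))
    return best

random_coloring = [random.choice([-1, 1]) for _ in pts]
print(rect_discrepancy(random_coloring))
```

A monochromatic coloring attains the worst possible value n (the full-grid rectangle covers every point), while a good red/blue split keeps every rectangle nearly balanced.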

  4. Combinatorial Discrepancy
Universe: U = {1, …, n}. Subsets: S_1, S_2, …, S_m.
Find χ : [n] → {−1, +1} to minimize disc(χ) = max_S | ∑_{i ∈ S} χ(i) |.
If A is the m × n incidence matrix: disc(A) = min_{x ∈ {−1,1}^n} ‖Ax‖_∞.
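The matrix form disc(A) = min over x in {−1,1}^n of ‖Ax‖_∞ can be checked by exhaustive search on tiny instances. A minimal sketch (our own helper, feasible only for very small n):

```python
from itertools import product

def disc(A):
    """disc(A) = min over x in {-1,1}^n of ||A x||_inf (brute force, tiny n only)."""
    n = len(A[0])
    return min(
        max(abs(sum(a * x for a, x in zip(row, xs))) for row in A)
        for xs in product([-1, 1], repeat=n)
    )

# Incidence matrix of the system {1,2}, {2,3}, {1,3} over universe {1,2,3}:
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(disc(A))  # 2: with two colors, some pair is always monochromatic
```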

  5. Applications CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine learning, Complexity, Pseudo-Randomness, … Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

  6. Hereditary Discrepancy
Discrepancy is a useful measure of the complexity of a set system, but not so robust: duplicate the universe to {1, …, n, 1′, …, n′} and replace each set A_i by S_i = A_i ∪ A_i′. Pairing each element with its copy (+1 on one, −1 on the other) gives discrepancy 0, no matter how complex the original system was.
Hereditary discrepancy: herdisc(U, S) = max_{U′ ⊆ U} disc(U′, S|_{U′}).
A robust version of discrepancy. (How to certify herdisc < D? Is it in NP?)
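The duplication trick and the hereditary fix can both be checked by brute force on a toy system. The instance and helper names are ours; the disc helper re-does the exhaustive search over colorings from the earlier slide.

```python
from itertools import product, combinations

def disc(A):
    """Brute-force disc(A) = min over x in {-1,1}^n of ||A x||_inf."""
    n = len(A[0])
    return min(max(abs(sum(r[j] * x[j] for j in range(n))) for r in A)
               for x in product([-1, 1], repeat=n))

def herdisc(A):
    """herdisc(A) = max over column subsets U' of disc of A restricted to U'."""
    n = len(A[0])
    best = 0
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            sub = [[row[j] for j in cols] for row in A]
            best = max(best, disc(sub))
    return best

# Duplicated system as on the slide: columns are 1, 1', 2, 2'.
# disc = 0 (pair each element with its copy), yet herdisc is positive.
A = [[1, 1, 0, 0],   # S_1 = {1, 1'}
     [1, 1, 1, 1]]   # S_2 = {1, 1', 2, 2'}
print(disc(A), herdisc(A))
```

Restricting to the sub-universe {1, 2} already forces discrepancy 1, which is what herdisc detects and plain disc does not.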

  7. Some Applications

  8. Rounding
Lovász-Spencer-Vesztergombi’86: Given any matrix A and x ∈ R^n, one can round x to x̄ ∈ Z^n such that ‖Ax − Ax̄‖_∞ < herdisc(A).
Proof: Round the bits of x one by one, least significant first. At each bit level, the elements whose current bit is 1 form a sub-system; a low-discrepancy coloring χ of that sub-system decides whether each bit is rounded up (+1) or down (−1).
Error ≤ herdisc(A) · (1/2^k + 1/2^{k−1} + … + 1/2) < herdisc(A).
Key point: a low-discrepancy coloring guides our updates!
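The bit-by-bit rounding in the proof can be sketched concretely. Since the low-discrepancy coloring is only guaranteed to exist, this toy version simply brute-forces the best coloring of the active columns at each bit level; the tiny instance and all names are ours.

```python
from fractions import Fraction
from itertools import product

def best_coloring(A, active):
    """Brute-force a minimum-discrepancy +-1 coloring of the active columns
    (a stand-in for the existential herdisc guarantee)."""
    best, best_chi = None, None
    for signs in product([-1, 1], repeat=len(active)):
        chi = {j: s for j, s in zip(active, signs)}
        d = max(abs(sum(row[j] * chi[j] for j in active)) for row in A)
        if best is None or d < best:
            best, best_chi = d, chi
    return best_chi

def lsv_round(A, x, bits):
    """Round x (with `bits` fractional bits) to integers, one bit level at a time."""
    x = [Fraction(v) for v in x]
    for k in range(bits, 0, -1):          # least significant bit first
        step = Fraction(1, 2 ** k)
        active = [j for j, v in enumerate(x) if (v / step) % 2 == 1]
        if not active:
            continue
        chi = best_coloring(A, active)
        for j in active:                  # clear bit k by moving +- 2^-k
            x[j] += chi[j] * step
    return x

A = [[1, 1, 0], [0, 1, 1]]
x = [Fraction(3, 8), Fraction(5, 8), Fraction(1, 2)]
xr = lsv_round(A, x, 3)
print(xr)
```

Each level contributes at most 2^{-k} times the discrepancy of the active sub-system, so the total error telescopes to below herdisc(A), exactly as in the slide's geometric sum.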

  9. Rounding
The LSV’86 result guarantees the existence of a good rounding. How to find one efficiently?
Thm [B’10]: Can round efficiently, with error ≤ O(√(log m log n)) · herdisc(A).
Uses SDPs; the basic method.

  10. Refinements
Spencer’85: Any n × n 0-1 matrix has disc ≤ 6√n. Non-constructive: the entropy method (a very powerful technique).
B.’10: Algorithmic O(√n) (SDP + entropy method).
Lovett-Meka’12: much simpler; a better variant of the entropy method that extends iterated rounding.
Bin packing: Rothvoss’13: Alg ≤ LP + O(log OPT · log log OPT), improving Karmarkar-Karp’82: Alg ≤ LP + O(log² OPT).

  11. Dynamic Data Structures
N weighted points in a 2-d region; the weights are updated over time.
Query: given an axis-parallel rectangle R, determine the total weight of the points in R.
Goal: preprocess into a data structure with 1) low query time and 2) low update time (upon a weight change).

  12. Example
Line: interval queries.
Trivial: query time O(n), update time O(1).
Table of all interval sums W[a, b]: query time O(1), update time O(n²).
A tree of dyadic intervals: query O(log n), update O(log n).
Recursively for 2-d: O(log² n) query, O(log² n) update.
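The O(log n)/O(log n) trade-off for interval sums on a line can be realized with a Fenwick (binary indexed) tree; a standard self-contained sketch, not from the talk:

```python
class Fenwick:
    """Binary indexed tree: point update and prefix sum in O(log n)."""
    def __init__(self, n):
        self.t = [0] * (n + 1)

    def add(self, i, delta):          # 1-indexed point update
        while i < len(self.t):
            self.t[i] += delta
            i += i & (-i)

    def prefix(self, i):              # sum of w[1..i]
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & (-i)
        return s

    def query(self, a, b):            # interval sum w[a..b]
        return self.prefix(b) - self.prefix(a - 1)

f = Fenwick(8)
for i, w in enumerate([5, 3, 7, 9, 6, 4, 1, 2], start=1):
    f.add(i, w)
print(f.query(3, 6))  # 7 + 9 + 6 + 4 = 26
f.add(4, -9)          # weight update
print(f.query(3, 6))  # 26 - 9 = 17
```

Building one such tree per dyadic strip of rows gives the recursive 2-d structure with O(log² n) query and update.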

  13. What about other queries?
Circles? Arbitrary rectangles? Aligned triangles?
Turns out t_q · t_u ≥ n^{1/2} / log² n.
Reason: the set system S formed by the query sets and the points has large discrepancy (about n^{1/4}).
Larsen-Green’11: t_q · t_u ≥ disc(S)² / log² n.

  14. Lower Bounds
Various methods: spectral, Fourier-analytic, …
Determinant lower bound: detlb(A) ≤ herdisc(A) [Lovász et al.’86].
herdisc(A) ≤ polylog(n, m) · detlb(A) [Matoušek’11] (via SDP duality).
Polylog approximation for herdisc(A) [Nikolov, Talwar, Zhang’13].

  15. SDP Connection

  16. Vector Discrepancy
Exact program: min t s.t. −t ≤ ∑_j a_ij x_j ≤ t for all rows i; x_j ∈ {−1, 1} for each j.
SDP relaxation, vecdisc(A): min t s.t. |∑_j a_ij v_j|² ≤ t² for all rows i; |v_j|² = 1 for each j.
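One direction of the relaxation is mechanical to verify: any ±1 coloring x yields a feasible vector solution v_j = x_j · e for a fixed unit vector e, with the same objective value, so vecdisc(A) ≤ disc(A). A small sketch (function name and instance are ours):

```python
def vector_solution_from_coloring(A, x):
    """Embed a +-1 coloring as unit vectors v_j = x_j * e along a fixed
    direction e; then |sum_j a_ij v_j| equals |sum_j a_ij x_j|, so the
    SDP objective matches the coloring's discrepancy."""
    n = len(x)
    e = [1.0] + [0.0] * (n - 1)                     # fixed unit vector
    V = [[xj * c for c in e] for xj in x]
    norms = [sum(c * c for c in v) for v in V]      # must all be 1
    row_vals = []
    for row in A:
        s = [sum(row[j] * V[j][d] for j in range(n)) for d in range(n)]
        row_vals.append(sum(c * c for c in s) ** 0.5)
    return norms, row_vals

A = [[1, 1, 0], [0, 1, 1]]
x = [1, -1, 1]                                      # a zero-discrepancy coloring
norms, row_vals = vector_solution_from_coloring(A, x)
print(norms, row_vals)
```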

  17. Is vecdisc a good relaxation?
Not directly: vecdisc(A) = 0 is possible even when disc(A) is very large, and it is NP-hard to distinguish disc(A) = 0 from disc(A) very large [Charikar, Newman, Nikolov’11].
Let hervecdisc(A) = max_S vecdisc(A|_S).
Thm [B’10]: disc(A) = O(√(log m log n)) · hervecdisc(A).
Proof: an algorithm.

  18. Algorithm (at a high level)
Cube {−1, +1}^n: each dimension is an element, each vertex is a coloring. Start at the center, finish at a vertex.
Algorithm: a “sticky” random walk, each step generated by rounding a suitable SDP. Moves in the various dimensions are correlated, e.g. δ_t(1) + δ_t(2) ≈ 0.
Analysis: few steps suffice to reach a vertex (the walk has high variance), while disc(S_i) does a random walk with low variance.

  19. An SDP
Hereditary discrepancy ≤ λ ⇒ the following SDP is feasible.
SDP (low discrepancy): |∑_j a_ij v_j|² ≤ λ² for each row i; |v_j|² = 1 for each element j. Obtain v_j ∈ R^n.
Perhaps v_j can guide how to update the color of element j? Trouble: v_j is a vector; we need a real number.
Project on a random vector g: η_j = g · v_j.
Seems promising: ∑_j a_ij η_j = g · (∑_j a_ij v_j).

  20. Properties of the Rounding
Lemma: If g ∈ R^n is a random Gaussian, then for any v ∈ R^n, g · v is distributed as N(0, |v|²).
1. Since |v_j|² = 1, each η_j ~ N(0, 1).
2. For each row i, |∑_j a_ij v_j|² ≤ λ², so ∑_j a_ij η_j = g · (∑_j a_ij v_j) ~ N(0, ≤ λ²) (std deviation ≤ λ).
The η’s will guide our updates to x.
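The lemma can be checked empirically with a small Monte Carlo experiment; the sample size and tolerances below are our choices:

```python
import random

random.seed(1)

def project(v, trials=20000):
    """Empirical distribution of g.v for a standard Gaussian g:
    the mean should be ~0 and the variance ~|v|^2."""
    dim = len(v)
    samples = []
    for _ in range(trials):
        g = [random.gauss(0, 1) for _ in range(dim)]
        samples.append(sum(gi * vi for gi, vi in zip(g, v)))
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    return mean, var

v = [0.6, 0.8]                 # a unit vector, so g.v ~ N(0, 1)
mean, var = project(v)
print(round(mean, 2), round(var, 2))
```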

  21. Algorithm Overview
Construct the coloring iteratively. Initially: start with coloring x_0 = (0, 0, …, 0) at t = 0.
At time t: update the coloring as x_t = x_{t−1} + γ(η_1^t, …, η_n^t) (γ tiny, say 1/n).
So x_t(j) = γ(η_j^1 + η_j^2 + … + η_j^t): the color of element j does a random walk over time with step size ≈ γ · N(0,1), and is fixed once it reaches −1 or +1.
Disc(row i): ∑_j a_ij x_t(j) does a random walk with steps γ · N(0, ≤ λ²).
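A deliberately simplified version of this walk is easy to run. Here each coordinate takes plain independent Gaussian steps, so the freezing behaviour at ±1 is visible, but the correlation coming from the SDP rounding (and hence the row-discrepancy guarantee) is not reproduced; all names are ours.

```python
import random

random.seed(7)

def sticky_walk(n, gamma=0.05):
    """Toy sticky walk: each coordinate takes Gaussian steps of size gamma
    and freezes once it hits -1 or +1.  (The real algorithm draws the step
    from an SDP solution so that row discrepancies also stay small.)"""
    x = [0.0] * n
    frozen = [False] * n
    steps = 0
    while not all(frozen):
        for j in range(n):
            if not frozen[j]:
                x[j] += gamma * random.gauss(0, 1)
                if abs(x[j]) >= 1:
                    x[j] = 1.0 if x[j] > 0 else -1.0
                    frozen[j] = True
        steps += 1
    return x, steps

coloring, steps = sticky_walk(16)
print(coloring, steps)
```

Consistent with the analysis on the next slide, the number of steps observed is on the order of (log n)/γ².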

  22. Analysis
At time T = O(1/γ²): 1) With probability ½, an element reaches −1 or +1. 2) Each row has discrepancy λ in expectation.
At time T = O((log n)/γ²): 1) Most likely all elements are fixed. 2) Expected discrepancy of a row is λ√(log n). (By Chernoff, all rows have discrepancy O(λ √(log n log m)).)

  23. New Entropy Method

  24. Entropy Method
Very powerful method to prove discrepancy upper bounds [Beck, Spencer ’80s]: Given an m × n matrix A, there is a partial coloring x satisfying |a_i · x| ≤ λ_i ‖a_i‖_2 for every row i, provided ∑_i g(λ_i) ≤ n/5, where g(λ) ≈ e^{−λ²} if λ ≥ 1 and g(λ) ≈ ln(1/λ) if λ < 1.
E.g. one can ask for a partial coloring with 0 discrepancy on n/log n rows, and a reasonable amount on the others.
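The feasibility condition ∑_i g(λ_i) ≤ n/5 is easy to evaluate. A small checker using the penalty function as stated on the slide (the instance below, trading a few near-zero rows against many generous ones, is our own):

```python
import math

def g(lam):
    """Entropy penalty of demanding |a_i . x| <= lam * ||a_i||_2 (as on the
    slide): exponentially cheap for large lam, logarithmic for tiny lam."""
    return math.exp(-lam ** 2) if lam >= 1 else math.log(1 / lam)

def partial_coloring_feasible(lams, n):
    """Beck-Spencer condition: a partial coloring with the requested per-row
    bounds exists provided the total penalty is at most n/5."""
    return sum(g(l) for l in lams) <= n / 5

# n = 1000: demand near-zero discrepancy (lam = 0.01) on 20 rows
# and a generous bound (lam = 3) on the remaining 980.
n = 1000
lams = [0.01] * 20 + [3.0] * 980
print(partial_coloring_feasible(lams, n))
```

The logarithmic blow-up of g near 0 is what limits how many rows can be given near-zero discrepancy, matching the n/log n figure on the slide.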

  25. Lovett-Meka Algorithm
Do a sticky random walk starting at x = 0. If some row a_i gets tight (disc(a_i) = λ_i ‖a_i‖_2), restrict further steps to the subspace a_i · x = 0.
Progress continues as long as the dimension of the allowed subspace is Ω(n), which holds provided ∑_i exp(−λ_i²) ≤ n/2 (better than the entropy method).
Guarantees a partial coloring even if n/4 of the λ_i’s are 0.
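The geometric step here, projecting each Gaussian increment orthogonal to the tight rows so that their discrepancies stop moving, can be sketched directly. This is a toy version of that single step, not the full Lovett-Meka algorithm; all names are ours.

```python
import random

random.seed(3)

def orthogonalize(rows):
    """Gram-Schmidt an orthonormal spanning set of the tight-row normals."""
    basis = []
    for r in rows:
        v = list(r)
        for b in basis:
            c = sum(x * y for x, y in zip(v, b))
            v = [x - c * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis

def constrained_step(n, tight_rows, gamma=0.05):
    """Gaussian step projected onto the subspace orthogonal to all tight
    rows, so their discrepancies are frozen from this point on."""
    basis = orthogonalize(tight_rows)
    step = [gamma * random.gauss(0, 1) for _ in range(n)]
    for b in basis:
        c = sum(x * y for x, y in zip(step, b))
        step = [x - c * y for x, y in zip(step, b)]
    return step

a1 = [1, 1, 0, 0]                 # pretend this row has become tight
s = constrained_step(4, [a1])
print(sum(x * y for x, y in zip(s, a1)))   # ~0: row a1 no longer moves
```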

  26. Comparison with iterated rounding
Fact: with n variables but ≤ n/2 constraints Ax = b (and 0 ≤ x ≤ 1 otherwise), there exists a basic feasible solution with ≥ n/2 variables at 0 or 1.
Iterated rounding: LP with m constraints, drop all but n/2; no control on the dropped constraints (a_i · x ≤ b_i), so the error can be up to ‖a_i‖_1.
Lovett-Meka lemma: can find a solution with ≥ n/2 integral variables and error ≤ λ_i ‖a_i‖_2. E.g. can set n/10 constraints to have 0 error, with controlled bounds on the others.

  27. Lower Bounds

  28. Discrepancy
If disc(A) > D, then ‖Ax‖_∞ ≥ D for all x ∈ {−1, 1}^n (say A is an n × n matrix, for convenience).
If σ_min(A) ≥ D: ‖Ax‖_2 ≥ σ_min(A) ‖x‖_2 = σ_min(A) √n, so ‖Ax‖_∞ ≥ σ_min(A), i.e. disc(A) ≥ σ_min(A).
This can be a very weak bound. One can instead consider σ_min(PA) for a diagonal P with tr(P) = n.

  29. Determinant Lower Bound
detlb(A) = max_k max_{k × k submatrices B of A} |det(B)|^{1/k}.
Thm (Lovász-Spencer-Vesztergombi’86): herdisc(A) ≥ detlb(A) (simple geometric argument).
Conjecture (LSV’86): herdisc ≤ O(1) · detlb.
Remark: for TU matrices, herdisc(A) = 1 and detlb = 1 (every square submatrix has determinant −1, 0, or +1).
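detlb can be computed by brute force on tiny matrices, enumerating every square submatrix; an exponential-time sketch for illustration only (names and instance ours):

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion (tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([r[:c] + r[c + 1:] for r in M[1:]])
               for c in range(len(M)))

def detlb(A):
    """detlb(A) = max over k and k x k submatrices B of |det B|^(1/k)."""
    m, n = len(A), len(A[0])
    best = 0.0
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                B = [[A[i][j] for j in cols] for i in rows]
                best = max(best, abs(det(B)) ** (1.0 / k))
    return best

# Totally unimodular example (consecutive-ones intervals): every square
# submatrix has determinant in {-1, 0, 1}, so detlb = 1, matching herdisc = 1.
A = [[1, 1, 0],
     [0, 1, 1]]
print(detlb(A))  # 1.0
```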

  30. Hoffman’s example
Hoffman: there is A with detlb(A) ≤ 2 but herdisc(A) ≥ log n / log log n. Pálvölgyi’11: an Ω(log n) gap.
T: the k-ary tree of depth k, with ≈ k^k nodes. S: all sets of edges out of a node. S′: all leaf-to-root paths. Both S and S′ are TU.
Claim: detlb(S ∪ S′) ≤ 2 (expand the determinant), yet herdisc(S ∪ S′) = k.

  31. Matousek’11: herdisc(A) ≤ O(log n √(log m)) · detlb(A).
Idea: SDP duality gives a dual witness for large herdisc(A); the dual witness yields a submatrix with large determinant.
Other implications: herdisc(A_1 ∪ ⋯ ∪ A_t) ≤ O(log n √(log m)) · √t · max_i herdisc(A_i).

  32. Matousek’s result
Thm: herdisc(A) = O(log n √(log m)) · detlb(A).
Proof: Recall disc(A) = O(√(log m log n)) · hervecdisc(A). So for some S, vecdisc(A|_S) ≥ herdisc(A) / O(√(log m log n)).
Will show: vecdisc(A|_S) ≤ O(√(log n)) · detlb(A|_S). (Below, write A for A|_S.)
