
Discrepancy and SDPs - Nikhil Bansal (TU Eindhoven, Netherlands)



  1. Discrepancy and SDPs Nikhil Bansal (TU Eindhoven, Netherlands )

  2. Outline
Discrepancy theory
• What is it
• Basic results (non-constructive)
SDP connection
• Algorithms for discrepancy
• New methods in discrepancy (upper/lower bounds)
• Approximation

  3. Discrepancy: Example
Input: n points placed arbitrarily in a grid. Color them red/blue such that each axis-parallel rectangle is colored as evenly as possible.
Discrepancy: max over rectangles R of | #red in R − #blue in R |.
A random coloring gives about O(n^{1/2} log^{1/2} n); one can achieve O(log^{2.5} n).
Why do we care?
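As a toy illustration of the quantity on this slide, the discrepancy of a coloring can be computed by brute force over all axis-parallel rectangles. This is a sketch on a hypothetical tiny instance; the grid size, point set, and function names are ours, not from the talk.

```python
import itertools
import random

# Hypothetical small instance: n random points in a g x g grid.
random.seed(0)
g, n = 8, 16
pts = random.sample([(x, y) for x in range(g) for y in range(g)], n)

def rect_discrepancy(coloring):
    """Max over axis-parallel rectangles of |#red - #blue| among covered points."""
    best = 0
    for x1, x2 in itertools.combinations_with_replacement(range(g), 2):
        for y1, y2 in itertools.combinations_with_replacement(range(g), 2):
            s = sum(c for (x, y), c in zip(pts, coloring)
                    if x1 <= x <= x2 and y1 <= y <= y2)
            best = max(best, abs(s))
    return best

random_coloring = [random.choice([-1, 1]) for _ in pts]
print(rect_discrepancy(random_coloring))
```

A monochromatic coloring attains the worst possible value n (the full-grid rectangle covers every point), while a good red/blue split keeps every rectangle nearly balanced.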

  4. Combinatorial Discrepancy
Universe: U = {1, …, n}. Subsets: S_1, S_2, …, S_m.
Find χ : [n] → {−1, +1} to minimize disc(χ) = max_S | ∑_{i ∈ S} χ(i) |.
If A is the m × n incidence matrix: disc(A) = min_{x ∈ {−1,1}^n} ‖Ax‖_∞.
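The matrix form disc(A) = min over x in {−1,1}^n of ‖Ax‖_∞ can be checked by exhaustive search on tiny instances. A minimal sketch (our own helper, feasible only for very small n):

```python
from itertools import product

def disc(A):
    """disc(A) = min over x in {-1,1}^n of ||A x||_inf (brute force, tiny n only)."""
    n = len(A[0])
    return min(
        max(abs(sum(a * x for a, x in zip(row, xs))) for row in A)
        for xs in product([-1, 1], repeat=n)
    )

# Incidence matrix of the system {1,2}, {2,3}, {1,3} over universe {1,2,3}:
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(disc(A))  # 2: with two colors, some pair is always monochromatic
```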

  5. Applications CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine learning, Complexity, Pseudo-Randomness, … Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

  6. Hereditary Discrepancy
Discrepancy is a useful measure of the complexity of a set system, but not so robust: duplicate the universe to {1, …, n, 1′, …, n′} and replace each set A_i by S_i = A_i ∪ A_i′. Pairing each element with its copy (+1 on one, −1 on the other) gives discrepancy 0, no matter how complex the original system was.
Hereditary discrepancy: herdisc(U, S) = max_{U′ ⊆ U} disc(U′, S|_{U′}).
A robust version of discrepancy. (How to certify herdisc < D? Is it in NP?)
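The duplication trick and the hereditary fix can both be checked by brute force on a toy system. The instance and helper names are ours; the disc helper re-does the exhaustive search over colorings from the earlier slide.

```python
from itertools import product, combinations

def disc(A):
    """Brute-force disc(A) = min over x in {-1,1}^n of ||A x||_inf."""
    n = len(A[0])
    return min(max(abs(sum(r[j] * x[j] for j in range(n))) for r in A)
               for x in product([-1, 1], repeat=n))

def herdisc(A):
    """herdisc(A) = max over column subsets U' of disc of A restricted to U'."""
    n = len(A[0])
    best = 0
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            sub = [[row[j] for j in cols] for row in A]
            best = max(best, disc(sub))
    return best

# Duplicated system as on the slide: columns are 1, 1', 2, 2'.
# disc = 0 (pair each element with its copy), yet herdisc is positive.
A = [[1, 1, 0, 0],   # S_1 = {1, 1'}
     [1, 1, 1, 1]]   # S_2 = {1, 1', 2, 2'}
print(disc(A), herdisc(A))
```

Restricting to the sub-universe {1, 2} already forces discrepancy 1, which is what herdisc detects and plain disc does not.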

  7. Some Applications

  8. Rounding
Lovász-Spencer-Vesztergombi’86: Given any matrix A and x ∈ R^n, one can round x to x̄ ∈ Z^n such that ‖Ax − Ax̄‖_∞ < herdisc(A).
Proof: Round the bits of x one by one, least significant first. At each bit level, the elements whose current bit is 1 form a sub-system; a low-discrepancy coloring χ of that sub-system decides whether each bit is rounded up (+1) or down (−1).
Error ≤ herdisc(A) · (1/2^k + 1/2^{k−1} + … + 1/2) < herdisc(A).
Key point: a low-discrepancy coloring guides our updates!
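The bit-by-bit rounding in the proof can be sketched concretely. Since the low-discrepancy coloring is only guaranteed to exist, this toy version simply brute-forces the best coloring of the active columns at each bit level; the tiny instance and all names are ours.

```python
from fractions import Fraction
from itertools import product

def best_coloring(A, active):
    """Brute-force a minimum-discrepancy +-1 coloring of the active columns
    (a stand-in for the existential herdisc guarantee)."""
    best, best_chi = None, None
    for signs in product([-1, 1], repeat=len(active)):
        chi = {j: s for j, s in zip(active, signs)}
        d = max(abs(sum(row[j] * chi[j] for j in active)) for row in A)
        if best is None or d < best:
            best, best_chi = d, chi
    return best_chi

def lsv_round(A, x, bits):
    """Round x (with `bits` fractional bits) to integers, one bit level at a time."""
    x = [Fraction(v) for v in x]
    for k in range(bits, 0, -1):          # least significant bit first
        step = Fraction(1, 2 ** k)
        active = [j for j, v in enumerate(x) if (v / step) % 2 == 1]
        if not active:
            continue
        chi = best_coloring(A, active)
        for j in active:                  # clear bit k by moving +- 2^-k
            x[j] += chi[j] * step
    return x

A = [[1, 1, 0], [0, 1, 1]]
x = [Fraction(3, 8), Fraction(5, 8), Fraction(1, 2)]
xr = lsv_round(A, x, 3)
print(xr)
```

Each level contributes at most 2^{-k} times the discrepancy of the active sub-system, so the total error telescopes to below herdisc(A), exactly as in the slide's geometric sum.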

  9. Rounding
The LSV’86 result guarantees the existence of a good rounding. How to find one efficiently?
Thm [B’10]: Can round efficiently, with error ≤ O(√(log m log n)) · herdisc(A).
Uses SDPs; the basic method.

  10. Refinements
Spencer’85: Any n × n 0-1 matrix has disc ≤ 6√n. Non-constructive: the entropy method (a very powerful technique).
B.’10: Algorithmic O(√n) (SDP + entropy method).
Lovett-Meka’12: much simpler; a better variant of the entropy method that extends iterated rounding.
Bin packing: Rothvoss’13: Alg ≤ LP + O(log OPT · log log OPT), improving Karmarkar-Karp’82: Alg ≤ LP + O(log² OPT).

  11. Dynamic Data Structures
N weighted points in a 2-d region; the weights are updated over time.
Query: given an axis-parallel rectangle R, determine the total weight of the points in R.
Goal: preprocess into a data structure with 1) low query time and 2) low update time (upon a weight change).

  12. Example
Line: interval queries.
Trivial: query time O(n), update time O(1).
Table of all interval sums W[a, b]: query time O(1), update time O(n²).
A tree of dyadic intervals: query O(log n), update O(log n).
Recursively for 2-d: O(log² n) query, O(log² n) update.
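The O(log n)/O(log n) trade-off for interval sums on a line can be realized with a Fenwick (binary indexed) tree; a standard self-contained sketch, not from the talk:

```python
class Fenwick:
    """Binary indexed tree: point update and prefix sum in O(log n)."""
    def __init__(self, n):
        self.t = [0] * (n + 1)

    def add(self, i, delta):          # 1-indexed point update
        while i < len(self.t):
            self.t[i] += delta
            i += i & (-i)

    def prefix(self, i):              # sum of w[1..i]
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & (-i)
        return s

    def query(self, a, b):            # interval sum w[a..b]
        return self.prefix(b) - self.prefix(a - 1)

f = Fenwick(8)
for i, w in enumerate([5, 3, 7, 9, 6, 4, 1, 2], start=1):
    f.add(i, w)
print(f.query(3, 6))  # 7 + 9 + 6 + 4 = 26
f.add(4, -9)          # weight update
print(f.query(3, 6))  # 26 - 9 = 17
```

Building one such tree per dyadic strip of rows gives the recursive 2-d structure with O(log² n) query and update.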

  13. What about other queries?
Circles? Arbitrary rectangles? Aligned triangles?
Turns out t_q · t_u ≥ n^{1/2} / log² n.
Reason: the set system S formed by the query sets and the points has large discrepancy (about n^{1/4}).
Larsen-Green’11: t_q · t_u ≥ disc(S)² / log² n.

  14. Lower Bounds
Various methods: spectral, Fourier-analytic, …
Determinant lower bound: detlb(A) ≤ herdisc(A) [Lovász et al.’86].
herdisc(A) ≤ polylog(n, m) · detlb(A) [Matoušek’11] (via SDP duality).
Polylog approximation for herdisc(A) [Nikolov, Talwar, Zhang’13].

  15. SDP Connection

  16. Vector Discrepancy
Exact program: min t s.t. −t ≤ ∑_j a_ij x_j ≤ t for all rows i; x_j ∈ {−1, 1} for each j.
SDP relaxation, vecdisc(A): min t s.t. |∑_j a_ij v_j|² ≤ t² for all rows i; |v_j|² = 1 for each j.
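One direction of the relaxation is mechanical to verify: any ±1 coloring x yields a feasible vector solution v_j = x_j · e for a fixed unit vector e, with the same objective value, so vecdisc(A) ≤ disc(A). A small sketch (function name and instance are ours):

```python
def vector_solution_from_coloring(A, x):
    """Embed a +-1 coloring as unit vectors v_j = x_j * e along a fixed
    direction e; then |sum_j a_ij v_j| equals |sum_j a_ij x_j|, so the
    SDP objective matches the coloring's discrepancy."""
    n = len(x)
    e = [1.0] + [0.0] * (n - 1)                     # fixed unit vector
    V = [[xj * c for c in e] for xj in x]
    norms = [sum(c * c for c in v) for v in V]      # must all be 1
    row_vals = []
    for row in A:
        s = [sum(row[j] * V[j][d] for j in range(n)) for d in range(n)]
        row_vals.append(sum(c * c for c in s) ** 0.5)
    return norms, row_vals

A = [[1, 1, 0], [0, 1, 1]]
x = [1, -1, 1]                                      # a zero-discrepancy coloring
norms, row_vals = vector_solution_from_coloring(A, x)
print(norms, row_vals)
```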

  17. Is vecdisc a good relaxation?
Not directly: vecdisc(A) = 0 is possible even when disc(A) is very large, and it is NP-hard to distinguish disc(A) = 0 from disc(A) very large [Charikar, Newman, Nikolov’11].
Let hervecdisc(A) = max_S vecdisc(A|_S).
Thm [B’10]: disc(A) = O(√(log m log n)) · hervecdisc(A).
Proof: an algorithm.

  18. Algorithm (at a high level)
Cube {−1, +1}^n: each dimension is an element, each vertex is a coloring. Start at the center, finish at a vertex.
Algorithm: a “sticky” random walk, each step generated by rounding a suitable SDP. Moves in the various dimensions are correlated, e.g. δ_t(1) + δ_t(2) ≈ 0.
Analysis: few steps suffice to reach a vertex (the walk has high variance), while disc(S_i) does a random walk with low variance.

  19. An SDP
Hereditary discrepancy ≤ λ ⇒ the following SDP is feasible.
SDP (low discrepancy): |∑_j a_ij v_j|² ≤ λ² for each row i; |v_j|² = 1 for each element j. Obtain v_j ∈ R^n.
Perhaps v_j can guide how to update the color of element j? Trouble: v_j is a vector; we need a real number.
Project on a random vector g: η_j = g · v_j.
Seems promising: ∑_j a_ij η_j = g · (∑_j a_ij v_j).

  20. Properties of the Rounding
Lemma: If g ∈ R^n is a random Gaussian, then for any v ∈ R^n, g · v is distributed as N(0, |v|²).
1. Since |v_j|² = 1, each η_j ~ N(0, 1).
2. For each row i, |∑_j a_ij v_j|² ≤ λ², so ∑_j a_ij η_j = g · (∑_j a_ij v_j) ~ N(0, ≤ λ²) (std deviation ≤ λ).
The η’s will guide our updates to x.
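The lemma can be checked empirically with a small Monte Carlo experiment; the sample size and tolerances below are our choices:

```python
import random

random.seed(1)

def project(v, trials=20000):
    """Empirical distribution of g.v for a standard Gaussian g:
    the mean should be ~0 and the variance ~|v|^2."""
    dim = len(v)
    samples = []
    for _ in range(trials):
        g = [random.gauss(0, 1) for _ in range(dim)]
        samples.append(sum(gi * vi for gi, vi in zip(g, v)))
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    return mean, var

v = [0.6, 0.8]                 # a unit vector, so g.v ~ N(0, 1)
mean, var = project(v)
print(round(mean, 2), round(var, 2))
```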

  21. Algorithm Overview
Construct the coloring iteratively. Initially: start with coloring x_0 = (0, 0, …, 0) at t = 0.
At time t: update the coloring as x_t = x_{t−1} + γ(η_1^t, …, η_n^t) (γ tiny, say 1/n).
So x_t(j) = γ(η_j^1 + η_j^2 + … + η_j^t): the color of element j does a random walk over time with step size ≈ γ · N(0,1), and is fixed once it reaches −1 or +1.
Disc(row i): ∑_j a_ij x_t(j) does a random walk with steps γ · N(0, ≤ λ²).
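A deliberately simplified version of this walk is easy to run. Here each coordinate takes plain independent Gaussian steps, so the freezing behaviour at ±1 is visible, but the correlation coming from the SDP rounding (and hence the row-discrepancy guarantee) is not reproduced; all names are ours.

```python
import random

random.seed(7)

def sticky_walk(n, gamma=0.05):
    """Toy sticky walk: each coordinate takes Gaussian steps of size gamma
    and freezes once it hits -1 or +1.  (The real algorithm draws the step
    from an SDP solution so that row discrepancies also stay small.)"""
    x = [0.0] * n
    frozen = [False] * n
    steps = 0
    while not all(frozen):
        for j in range(n):
            if not frozen[j]:
                x[j] += gamma * random.gauss(0, 1)
                if abs(x[j]) >= 1:
                    x[j] = 1.0 if x[j] > 0 else -1.0
                    frozen[j] = True
        steps += 1
    return x, steps

coloring, steps = sticky_walk(16)
print(coloring, steps)
```

Consistent with the analysis on the next slide, the number of steps observed is on the order of (log n)/γ².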

  22. Analysis
At time T = O(1/γ²): 1) With probability ½, an element reaches −1 or +1. 2) Each row has discrepancy λ in expectation.
At time T = O((log n)/γ²): 1) Most likely all elements are fixed. 2) Expected discrepancy of a row is λ√(log n). (By Chernoff, all rows have discrepancy O(λ √(log n log m)).)

  23. New Entropy Method

  24. Entropy Method
Very powerful method to prove discrepancy upper bounds [Beck, Spencer ’80s]: Given an m × n matrix A, there is a partial coloring x satisfying |a_i · x| ≤ λ_i ‖a_i‖_2 for every row i, provided ∑_i g(λ_i) ≤ n/5, where g(λ) ≈ e^{−λ²} if λ ≥ 1 and g(λ) ≈ ln(1/λ) if λ < 1.
E.g. one can ask for a partial coloring with 0 discrepancy on n/log n rows, and a reasonable amount on the others.
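The feasibility condition ∑_i g(λ_i) ≤ n/5 is easy to evaluate. A small checker using the penalty function as stated on the slide (the instance below, trading a few near-zero rows against many generous ones, is our own):

```python
import math

def g(lam):
    """Entropy penalty of demanding |a_i . x| <= lam * ||a_i||_2 (as on the
    slide): exponentially cheap for large lam, logarithmic for tiny lam."""
    return math.exp(-lam ** 2) if lam >= 1 else math.log(1 / lam)

def partial_coloring_feasible(lams, n):
    """Beck-Spencer condition: a partial coloring with the requested per-row
    bounds exists provided the total penalty is at most n/5."""
    return sum(g(l) for l in lams) <= n / 5

# n = 1000: demand near-zero discrepancy (lam = 0.01) on 20 rows
# and a generous bound (lam = 3) on the remaining 980.
n = 1000
lams = [0.01] * 20 + [3.0] * 980
print(partial_coloring_feasible(lams, n))
```

The logarithmic blow-up of g near 0 is what limits how many rows can be given near-zero discrepancy, matching the n/log n figure on the slide.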

  25. Lovett-Meka Algorithm
Do a sticky random walk starting at x = 0. If some row a_i gets tight (disc(a_i) = λ_i ‖a_i‖_2), restrict further steps to the subspace a_i · x = 0.
Progress continues as long as the dimension of the allowed subspace is Ω(n), which holds provided ∑_i exp(−λ_i²) ≤ n/2 (better than the entropy method).
Guarantees a partial coloring even if n/4 of the λ_i’s are 0.
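The geometric step here, projecting each Gaussian increment orthogonal to the tight rows so that their discrepancies stop moving, can be sketched directly. This is a toy version of that single step, not the full Lovett-Meka algorithm; all names are ours.

```python
import random

random.seed(3)

def orthogonalize(rows):
    """Gram-Schmidt an orthonormal spanning set of the tight-row normals."""
    basis = []
    for r in rows:
        v = list(r)
        for b in basis:
            c = sum(x * y for x, y in zip(v, b))
            v = [x - c * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis

def constrained_step(n, tight_rows, gamma=0.05):
    """Gaussian step projected onto the subspace orthogonal to all tight
    rows, so their discrepancies are frozen from this point on."""
    basis = orthogonalize(tight_rows)
    step = [gamma * random.gauss(0, 1) for _ in range(n)]
    for b in basis:
        c = sum(x * y for x, y in zip(step, b))
        step = [x - c * y for x, y in zip(step, b)]
    return step

a1 = [1, 1, 0, 0]                 # pretend this row has become tight
s = constrained_step(4, [a1])
print(sum(x * y for x, y in zip(s, a1)))   # ~0: row a1 no longer moves
```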

  26. Comparison with iterated rounding
Fact: with n variables but ≤ n/2 constraints Ax = b (and 0 ≤ x ≤ 1 otherwise), there exists a basic feasible solution with ≥ n/2 variables at 0 or 1.
Iterated rounding: LP with m constraints, drop all but n/2; no control on the dropped constraints (a_i · x ≤ b_i), so the error can be up to ‖a_i‖_1.
Lovett-Meka lemma: can find a solution with ≥ n/2 integral variables and error ≤ λ_i ‖a_i‖_2. E.g. can set n/10 constraints to have 0 error, with controlled bounds on the others.

  27. Lower Bounds

  28. Discrepancy
If disc(A) > D, then ‖Ax‖_∞ ≥ D for all x ∈ {−1, 1}^n (say A is an n × n matrix, for convenience).
If σ_min(A) ≥ D: ‖Ax‖_2 ≥ σ_min(A) ‖x‖_2 = σ_min(A) √n, so ‖Ax‖_∞ ≥ σ_min(A), i.e. disc(A) ≥ σ_min(A).
This can be a very weak bound. One can instead consider σ_min(PA) for a diagonal P with tr(P) = n.

  29. Determinant Lower Bound
detlb(A) = max_k max_{k × k submatrices B of A} |det(B)|^{1/k}.
Thm (Lovász-Spencer-Vesztergombi’86): herdisc(A) ≥ detlb(A) (simple geometric argument).
Conjecture (LSV’86): herdisc ≤ O(1) · detlb.
Remark: for TU matrices, herdisc(A) = 1 and detlb = 1 (every square submatrix has determinant −1, 0, or +1).
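detlb can be computed by brute force on tiny matrices, enumerating every square submatrix; an exponential-time sketch for illustration only (names and instance ours):

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion (tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([r[:c] + r[c + 1:] for r in M[1:]])
               for c in range(len(M)))

def detlb(A):
    """detlb(A) = max over k and k x k submatrices B of |det B|^(1/k)."""
    m, n = len(A), len(A[0])
    best = 0.0
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                B = [[A[i][j] for j in cols] for i in rows]
                best = max(best, abs(det(B)) ** (1.0 / k))
    return best

# Totally unimodular example (consecutive-ones intervals): every square
# submatrix has determinant in {-1, 0, 1}, so detlb = 1, matching herdisc = 1.
A = [[1, 1, 0],
     [0, 1, 1]]
print(detlb(A))  # 1.0
```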

  30. Hoffman’s example
Hoffman: there is A with detlb(A) ≤ 2 but herdisc(A) ≥ log n / log log n. Pálvölgyi’11: an Ω(log n) gap.
T: the k-ary tree of depth k, with ≈ k^k nodes. S: all sets of edges out of a node. S′: all leaf-to-root paths. Both S and S′ are TU.
Claim: detlb(S ∪ S′) ≤ 2 (expand the determinant), yet herdisc(S ∪ S′) = k.

  31. Matousek’11: herdisc(A) ≤ O(log n √(log m)) · detlb(A).
Idea: SDP duality gives a dual witness for large herdisc(A); the dual witness yields a submatrix with large determinant.
Other implications: herdisc(A_1 ∪ ⋯ ∪ A_t) ≤ O(log n √(log m)) · √t · max_i herdisc(A_i).

  32. Matousek’s result
Thm: herdisc(A) = O(log n √(log m)) · detlb(A).
Proof: Recall disc(A) = O(√(log m log n)) · hervecdisc(A). So for some S, vecdisc(A|_S) ≥ herdisc(A) / O(√(log m log n)).
Will show: vecdisc(A|_S) ≤ O(√(log n)) · detlb(A|_S). (Below, write A for A|_S.)
