Introduction to Sum-of-Squares
Ankur Moitra (MIT)
Robust Statistics Summer School
A CLASSIC HARD PROBLEM: MAXCUT
Goal: given a graph G = (V, E), find a cut (U, V \ U) that maximizes the number of crossing edges.
NP-hard to maximize exactly; one of [Karp, '72]'s 21 problems.
How well can we approximate MAXCUT?
Simple ½-approximation algorithm: choose U randomly. Each edge crosses the cut with probability ½, so in expectation half the edges are cut, while the maximum cut has at most all the edges.
But can we do better?
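As a quick sanity check of the ½-approximation, here is a minimal sketch, assuming a hypothetical example graph (the 4-cycle, which is not from the slides): a uniformly random side assignment cuts each edge with probability ½.

```python
import random

def random_cut_value(edges, side):
    """Count the edges crossing the cut given 0/1 side assignments."""
    return sum(1 for (i, j) in edges if side[i] != side[j])

# 4-cycle: the maximum cut has 4 edges. A uniformly random cut crosses
# each edge with probability 1/2, so it cuts 2 edges in expectation.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rng = random.Random(0)
trials = 5000
avg = sum(random_cut_value(edges, [rng.randint(0, 1) for _ in range(4)])
          for _ in range(trials)) / trials
```

Averaged over many trials, `avg` concentrates near 2, i.e. half of the optimum of 4.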
MAXCUT AS A QUADRATIC PROGRAM
Alternatively we can write

	max sum over (i,j) in E of (x_i - x_j)^2   subject to   x_i in {0, 1} for all i

The x_i's are 0/1 valued, indicating which side of the cut each vertex is on, and the objective counts the number of edges crossing the cut: the term (x_i - x_j)^2 is 1 exactly when x_i ≠ x_j.
Now we can leverage the Sum-of-Squares (SOS) Hierarchy…
We will utilize an alternative view based on the notion of a pseudo-expectation…
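The quadratic objective can be checked directly. A minimal sketch, again using a hypothetical 4-cycle example (not from the slides):

```python
def cut_objective(x, edges):
    """MAXCUT objective: sum over edges of (x_i - x_j)^2 for 0/1-valued x.
    Each term contributes 1 exactly when the edge crosses the cut."""
    return sum((x[i] - x[j]) ** 2 for (i, j) in edges)

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
# x = (1, 0, 1, 0) is the alternating cut on the 4-cycle: all 4 edges cross.
# x = (1, 1, 0, 0) cuts only the 2 edges between the two halves.
```

So the quadratic program's value on an integral 0/1 assignment is exactly the number of crossing edges.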
AN ALTERNATIVE VIEW OF SOS
Pseudo-expectation [informally]: an operator Ẽ, mapping degree ≤ d polynomials in n variables to real numbers, that behaves like an expectation over a distribution on solutions.
This formulation is the starting point for state-of-the-art algorithms for quantum separability, tensor completion, tensor PCA, finding a planted sparse vector in a subspace, the best separable state problem, …
Let's see what it looks like for MAXCUT…
Degree d relaxation for MAXCUT:

	max Ẽ[ sum over (i,j) in E of (x_i - x_j)^2 ]

over operators Ẽ such that:
(1) Ẽ[1] = 1
(2) Ẽ[p^2] ≥ 0 for all deg(p) ≤ d/2
(3) Ẽ is linear
(4) Ẽ[x_i^2 · p] = Ẽ[x_i · p] for all i and all deg(p) ≤ d - 2

(1)–(3) are the usual constraints that say Ẽ behaves like it is taking the expectation under some distribution on assignments to the variables.
(4) is because we want the distribution to be supported on 0/1-valued assignments: x_i^2 = x_i holds exactly when x_i is 0 or 1.
But why is this a relaxation for MAXCUT?
Claim: If there is a cut that has at least k edges crossing, there is a feasible solution to (1)–(4) with objective value ≥ k.
Proof: if a_1, a_2, …, a_n is the indicator vector of the cut U, set Ẽ[p] = p(a_1, a_2, …, a_n). Point evaluation at a 0/1 vector satisfies (1)–(4), and the objective value is exactly the number of crossing edges.
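The proof of the claim can be made concrete. A minimal sketch, assuming a hypothetical dictionary representation of Ẽ on monomials (my choice, not the slides'): point evaluation at a 0/1 vector a satisfies (1)–(4), and in particular (4) holds because a_i^2 = a_i.

```python
from itertools import combinations_with_replacement

def point_pseudoexpectation(a, d):
    """Ẽ[p] = p(a): return Ẽ on all monomials of degree <= d as a dict
    mapping a sorted tuple of variable indices to the product of a_i's."""
    n = len(a)
    E = {(): 1.0}  # Ẽ[1] = 1, constraint (1)
    for deg in range(1, d + 1):
        for mono in combinations_with_replacement(range(n), deg):
            v = 1.0
            for i in mono:
                v *= a[i]
            E[mono] = v
    return E

# Alternating cut on a 4-cycle (hypothetical example graph).
a = (1, 0, 1, 0)
E = point_pseudoexpectation(a, 4)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Objective: Ẽ[(x_i - x_j)^2] = Ẽ[x_i^2] - 2 Ẽ[x_i x_j] + Ẽ[x_j^2] per edge.
obj = sum(E[(i, i)] - 2 * E[tuple(sorted((i, j)))] + E[(j, j)]
          for (i, j) in edges)
```

Here `obj` recovers the true cut value, so the relaxation's optimum is at least the maximum cut.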
Can we efficiently solve this relaxation?
Theorem: There is an n^O(d)-time algorithm for finding such an operator, if it exists.
It is a semidefinite program on an n^O(d) × n^O(d) matrix whose entries are the pseudo-expectation applied to monomials; constraint (2) says exactly that this moment matrix is positive semidefinite.
How well does SOS approximate MAXCUT?
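To see the SDP connection at degree 2, here is a minimal sketch (my construction, not from the slides): the degree-2 moment matrix M = E[(1, x)(1, x)^T] of any true distribution over 0/1 assignments is positive semidefinite, which is what constraint (2) enforces for a pseudo-expectation.

```python
import numpy as np

def moment_matrix(dist):
    """Degree-2 moment matrix M = E[(1, x)(1, x)^T] for a distribution
    given as a list of (probability, 0/1 assignment) pairs. For any true
    distribution M is PSD; constraint (2) demands the same of Ẽ."""
    n = len(dist[0][1])
    M = np.zeros((n + 1, n + 1))
    for prob, a in dist:
        v = np.concatenate(([1.0], np.array(a, dtype=float)))
        M += prob * np.outer(v, v)
    return M

# Uniform mixture of a cut and its complement on a 4-cycle (hypothetical example).
dist = [(0.5, (1, 0, 1, 0)), (0.5, (0, 1, 0, 1))]
M = moment_matrix(dist)
```

The full degree-d SDP indexes rows and columns by all monomials of degree ≤ d/2, giving the n^O(d) size above.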
APPROXIMATION ALGORITHMS FOR MAXCUT
Revolutionary work of [Goemans, Williamson]:
Theorem: There is a 0.878-approximation algorithm for MAXCUT.
We will give an alternate proof by rounding the degree-two Sum-of-Squares relaxation.
Main Question: How do you round a pseudo-expectation to find a cut? I.e., if I give you Ẽ satisfying (1)–(4), how do you find a cut with at least 0.878 times its objective value edges crossing (in expectation)?
Main Idea: Use a sample from a Gaussian distribution whose moments match the pseudo-moments.
Aside: Rounding higher-degree relaxations is much harder, because you cannot necessarily find a random variable whose moments match the pseudo-moments.
Claim: Without loss of generality, we can assume Ẽ[x_i] = 1/2 for all i.
Intuition: You can always change U to V \ U without changing the value of the cut, so WLOG x_i has probability 1/2 of being in U. (Formally: average Ẽ with the pseudo-expectation obtained by substituting 1 - x_i for each x_i; this preserves feasibility and the objective value.)
GAUSSIAN ROUNDING
Let y be a Gaussian vector with mean μ and covariance Σ, for μ_i = Ẽ[x_i] = 1/2 and Σ_{i,j} = Ẽ[x_i x_j] - Ẽ[x_i] Ẽ[x_j]. (Constraint (2) guarantees Σ is positive semidefinite, so such a Gaussian exists.)
Now set x_i = 1 if y_i ≥ 1/2, and x_i = 0 otherwise.
We will show that for each edge (i, j) we have Pr[x_i ≠ x_j] ≥ 0.878 · Ẽ[(x_i - x_j)^2], which, by linearity of expectation, will complete the proof.
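The rounding step can be sketched directly. A minimal sketch, assuming the pseudo-moments of the (symmetrized) alternating cut on a hypothetical 4-cycle example, where the rounding recovers an optimal cut:

```python
import numpy as np

def gaussian_round(mean, cov, trials=4000, seed=0):
    """Sample y ~ N(mean, cov) and threshold at 1/2: vertex i goes to U
    iff y_i >= 1/2 (the mean, since Ẽ[x_i] = 1/2)."""
    rng = np.random.default_rng(seed)
    y = rng.multivariate_normal(mean, cov, size=trials)
    return (y >= 0.5).astype(int)

# Degree-2 pseudo-moments of the alternating cut on the 4-cycle, symmetrized
# with its complement so that Ẽ[x_i] = 1/2 for all i.
mean = np.full(4, 0.5)
Exx = 0.5 * (np.outer([1, 0, 1, 0], [1, 0, 1, 0])
             + np.outer([0, 1, 0, 1], [0, 1, 0, 1]))
np.fill_diagonal(Exx, 0.5)       # Ẽ[x_i^2] = Ẽ[x_i] = 1/2 by constraint (4)
cov = Exx - 0.25                 # Σ_ij = Ẽ[x_i x_j] - Ẽ[x_i] Ẽ[x_j]
sides = gaussian_round(mean, cov)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
avg_cut = sum((sides[:, i] != sides[:, j]).mean() for (i, j) in edges)
```

Because these pseudo-moments come from an actual mixture of cuts, the Gaussian is perfectly correlated along the cut and the rounding cuts all 4 edges.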
For each edge (i, j), calculate its contribution to the objective value:

	Ẽ[(x_i - x_j)^2] = Ẽ[x_i^2] - 2 Ẽ[x_i x_j] + Ẽ[x_j^2] = 1 - 2 Ẽ[x_i x_j] = (1 - ρ)/2

for ρ = 4 Ẽ[x_i x_j] - 1, the correlation of y_i - 1/2 and y_j - 1/2 (each has variance Ẽ[x_i^2] - Ẽ[x_i]^2 = 1/4).

And its contribution to the expected number of edges crossing:

	Pr[x_i ≠ x_j] = Pr[sign(y_i - 1/2) ≠ sign(y_j - 1/2)]

and we can write y_i - 1/2 = (1/2) g_1 and y_j - 1/2 = (1/2)(ρ g_1 + sqrt(1 - ρ^2) g_2) for independent std Gaussians g_1, g_2.

Now we can compute:

	Pr[x_i ≠ x_j] = arccos(ρ)/π
Putting it all together, we have for every edge (i, j):

	arccos(ρ)/π ≥ 0.878 · (1 - ρ)/2

(the minimum over ρ in [-1, 1] of the ratio of the two sides is the Goemans-Williamson constant ≈ 0.878), which completes the proof.
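The final inequality is a one-variable fact and is easy to verify numerically. A minimal sketch checking it on a grid:

```python
import numpy as np

# Verify arccos(rho)/pi >= 0.878 * (1 - rho)/2 for rho in [-1, 1]:
# the left side is Pr[edge (i,j) is cut] under Gaussian rounding,
# the right side is 0.878 times the edge's SOS objective contribution.
rho = np.linspace(-1.0, 1.0, 100001)
lhs = np.arccos(rho) / np.pi
rhs = (1.0 - rho) / 2.0
ratio = lhs[rhs > 0] / rhs[rhs > 0]
alpha = ratio.min()   # the worst-case ratio, the Goemans-Williamson constant
```

The minimum `alpha` is about 0.8786, attained at an interior value of rho, so the 0.878 bound holds for every edge.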