Denote also by $q_k = \{q_{k\alpha}\}_{\alpha \in \mathbb{N}^n}$ the vector of coefficients of the polynomial $q_k$ in the basis $v_d(X)$, that is,
$$q_k(X) \;=\; \langle q_k, v_d(X)\rangle \;=\; \sum_{|\alpha| \le d} q_{k\alpha}\, X^\alpha,$$
and define the real symmetric matrix $Q := \sum_{k=1}^{s} q_k q_k^T \succeq 0$. Then
$$\langle v_d(X),\, Q\, v_d(X)\rangle \;=\; \sum_{k=1}^{s} \langle q_k, v_d(X)\rangle^2 \;=\; \sum_{k=1}^{s} q_k(X)^2 \;=\; f(X).$$
Conversely, let $Q \succeq 0$ be a real $s(d) \times s(d)$ positive semidefinite symmetric matrix ($s(d)$ is the dimension of the vector space $\mathbb{R}[X]_d$). As $Q \succeq 0$, write $Q = \sum_{k=1}^{s} q_k q_k^T$, so that
$$f(X) \;:=\; \langle v_d(X),\, Q\, v_d(X)\rangle \;=\; \sum_{k=1}^{s} \langle q_k, v_d(X)\rangle^2 \;=\; \sum_{k=1}^{s} q_k(X)^2$$
is SOS.
Jean B. Lasserre — semidefinite characterization
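As a small numerical sketch (in Python, purely illustrative — the slides contain no code, and the helper names `vd`, `outer`, `quad_form` are my own), one can check in the univariate case $d = 2$ that the Gram matrix $Q = \sum_k q_k q_k^T$ built from coefficient vectors reproduces $\sum_k q_k(t)^2$ as the quadratic form $\langle v_d(t), Q\, v_d(t)\rangle$:

```python
def vd(t):                      # monomial basis v_2(t) = (1, t, t^2)
    return [1.0, t, t * t]

def outer(q):                   # rank-one matrix q q^T
    return [[a * b for b in q] for a in q]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def quad_form(v, Q):            # <v, Q v>
    n = len(v)
    return sum(v[i] * Q[i][j] * v[j] for i in range(n) for j in range(n))

qs = [[1.0, 1.0, 0.0],          # q_1(t) = 1 + t
      [0.0, 1.0, 1.0]]          # q_2(t) = t + t^2
Q = [[0.0] * 3 for _ in range(3)]
for q in qs:
    Q = mat_add(Q, outer(q))    # Q = sum_k q_k q_k^T

# <v_2(t), Q v_2(t)> agrees with sum_k q_k(t)^2 at sample points
for t in (-1.0, 0.0, 0.5, 2.0):
    sos_val = sum((q[0] + q[1] * t + q[2] * t * t) ** 2 for q in qs)
    assert abs(quad_form(vd(t), Q) - sos_val) < 1e-9
```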
Next, write the matrix $v_d(X)\, v_d(X)^T$ as
$$v_d(X)\, v_d(X)^T \;=\; \sum_{\alpha \in \mathbb{N}^n_{2d}} B_\alpha\, X^\alpha$$
for some real symmetric matrices $(B_\alpha)$. Checking whether
$$f(X) \;\Big(= \sum_\alpha f_\alpha X^\alpha\Big) \;=\; \langle v_d(X),\, Q\, v_d(X)\rangle \;=\; \langle Q,\, v_d(X)\, v_d(X)^T\rangle \;=\; \sum_{\alpha \in \mathbb{N}^n_{2d}} \langle Q, B_\alpha\rangle\, X^\alpha$$
for some $Q \succeq 0$ reduces to checking feasibility of the LMI
$$\langle B_\alpha, Q\rangle = f_\alpha, \quad \alpha \in \mathbb{N}^n,\ |\alpha| \le 2d; \qquad Q \succeq 0.$$
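A concrete sketch of the matrices $B_\alpha$ (Python, illustrative): for the univariate basis $v_2(t) = (1, t, t^2)$, the entry $(i, j)$ of $v\,v^T$ is $t^{i+j}$, so $B_\alpha[i][j] = 1$ iff $i + j = \alpha$, and $\langle Q, B_\alpha\rangle$ recovers the coefficients of $f$. Here `Q` is the matrix from the worked example later in the slides (the case $c = -4$):

```python
# B_alpha for v_2(t) = (1, t, t^2): (v v^T)[i][j] = t^(i+j)
B = [[[1 if i + j == a else 0 for j in range(3)] for i in range(3)]
     for a in range(5)]          # alpha = 0, ..., 2d with d = 2

def inner(Q, A):                 # Frobenius inner product <Q, A>
    return sum(Q[i][j] * A[i][j] for i in range(3) for j in range(3))

# Gram matrix of f(t) = 6 + 4t + 9t^2 - 4t^3 + 6t^4 (slides' example, c = -4)
Q = [[6, 2, -4], [2, 17, -2], [-4, -2, 6]]
coeffs = [inner(Q, B[a]) for a in range(5)]
assert coeffs == [6, 4, 9, -4, 6]    # the LMI <B_alpha, Q> = f_alpha holds
```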
Example. Let $t \mapsto f(t) = 6 + 4t + 9t^2 - 4t^3 + 6t^4$. Is $f$ an SOS? Do we have
$$f(t) \;=\; \begin{pmatrix} 1 \\ t \\ t^2 \end{pmatrix}^{\!T} \underbrace{\begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}}_{Q \,\succeq\, 0} \begin{pmatrix} 1 \\ t \\ t^2 \end{pmatrix}$$
for some $Q \succeq 0$? Matching coefficients, we must have
$$a = 6; \quad 2b = 4; \quad d + 2c = 9; \quad 2e = -4; \quad f = 6.$$
And so we must find a scalar $c$ such that
$$Q \;=\; \begin{pmatrix} 6 & 2 & c \\ 2 & 9 - 2c & -2 \\ c & -2 & 6 \end{pmatrix} \;\succeq\; 0.$$
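The search for a suitable scalar $c$ can be sketched numerically (Python, illustrative). For a $3 \times 3$ matrix, positive leading principal minors (Sylvester's criterion) certify positive definiteness, which is more than enough here:

```python
def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def Qc(c):                       # the one-parameter family from the slide
    return [[6, 2, c], [2, 9 - 2 * c, -2], [c, -2, 6]]

def is_pd(M):                    # Sylvester's criterion (positive definite)
    m1 = M[0][0]
    m2 = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return m1 > 0 and m2 > 0 and det3(M) > 0

# scan integer values of c; c = -4 (used on the next slide) is among them
pd_values = [c for c in range(-10, 11) if is_pd(Qc(c))]
assert -4 in pd_values
```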
With $c = -4$ we have
$$Q \;=\; \begin{pmatrix} 6 & 2 & -4 \\ 2 & 17 & -2 \\ -4 & -2 & 6 \end{pmatrix} \;\succeq\; 0,$$
and
$$Q \;=\; 2 \begin{pmatrix} \sqrt{2}/2 \\ 0 \\ \sqrt{2}/2 \end{pmatrix} \begin{pmatrix} \sqrt{2}/2 \\ 0 \\ \sqrt{2}/2 \end{pmatrix}^{\!T} + 9 \begin{pmatrix} 2/3 \\ -1/3 \\ -2/3 \end{pmatrix} \begin{pmatrix} 2/3 \\ -1/3 \\ -2/3 \end{pmatrix}^{\!T} + 18 \begin{pmatrix} 1/\sqrt{18} \\ 4/\sqrt{18} \\ -1/\sqrt{18} \end{pmatrix} \begin{pmatrix} 1/\sqrt{18} \\ 4/\sqrt{18} \\ -1/\sqrt{18} \end{pmatrix}^{\!T}.$$
And so
$$f(t) \;=\; (1 + t^2)^2 + (2 - t - 2t^2)^2 + (1 + 4t - t^2)^2,$$
which is an SOS polynomial.
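This decomposition is easy to double-check numerically (Python, illustrative): the three squares read off from the weighted rank-one terms must reproduce $f$ identically:

```python
def f(t):
    return 6 + 4 * t + 9 * t**2 - 4 * t**3 + 6 * t**4

def sos(t):
    # the three squares extracted from the rank-one decomposition of Q
    return ((1 + t**2) ** 2
            + (2 - t - 2 * t**2) ** 2
            + (1 + 4 * t - t**2) ** 2)

# two quartics agreeing at 5+ points are identical
for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert abs(f(t) - sos(t)) < 1e-9
```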
SUCH POSITIVITY CERTIFICATES allow one to infer GLOBAL properties of FEASIBILITY and OPTIMALITY ... the analogues of well-known earlier ones valid in the CONVEX CASE ONLY:
Farkas Lemma → Krivine-Stengle
KKT-optimality conditions → Schmüdgen-Putinar
In addition, polynomials NONNEGATIVE ON A SET $K \subset \mathbb{R}^n$ are ubiquitous. They also appear in many important applications (outside optimization) modeled as particular instances of the so-called Generalized Moment Problem, among them: probability, optimal and robust control, game theory, signal processing, multivariate integration, etc.
The Generalized Moment Problem
$$(\mathrm{GMP}): \quad \inf_{\mu_i \in M(K_i)} \Big\{ \sum_{i=1}^{s} \int_{K_i} f_i\, d\mu_i \;:\; \sum_{i=1}^{s} \int_{K_i} h_{ij}\, d\mu_i \;\le\; b_j,\ j \in J \Big\}$$
with $M(K_i)$ the space of Borel measures on $K_i \subset \mathbb{R}^{n_i}$, $i = 1, \dots, s$.
Global OPTIMIZATION,
$$\inf_{\mu \in M(K)} \Big\{ \int_K f\, d\mu \;:\; \int_K 1\, d\mu = 1 \Big\},$$
is the simplest instance of the GMP!
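A toy illustration of the last point (Python, purely illustrative; the grid and the test polynomial are my own choices): over probability measures supported on a finite grid, $\inf_\mu \int f\, d\mu$ subject to $\int 1\, d\mu = 1$ is attained at a Dirac measure at a grid minimizer, so the optimal value is simply the smallest grid value of $f$ — no mixture of points can do better:

```python
def f(x):
    return (x**2 - 1) ** 2 + 0.1 * x   # nonconvex test polynomial

grid = [i / 1000 - 2 for i in range(4001)]       # grid on [-2, 2]

# Dirac measure at a grid argmin: the optimal "measure"
best = min(f(x) for x in grid)

# any other probability measure, e.g. the uniform one, does no better
avg = sum(f(x) for x in grid) / len(grid)
assert best <= avg
```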
For instance, one may also want:
• To approximate sets defined with QUANTIFIERS, e.g.,
$$R_f := \{ x \in B : f(x, y) \le 0 \text{ for all } y \text{ such that } (x, y) \in K \}$$
$$D_f := \{ x \in B : f(x, y) \le 0 \text{ for some } y \text{ such that } (x, y) \in K \}$$
where $f \in \mathbb{R}[x, y]$ and $B$ is a simple set (box, ellipsoid).
• To compute convex polynomial underestimators $p \le f$ of a polynomial $f$ on a box $B \subset \mathbb{R}^n$. (Very useful in MINLP.)
The moment-LP and moment-SOS approaches consist of using a certain type of positivity certificate (Krivine-Vasilescu-Handelman's or Putinar's certificate) in potentially any application where such a characterization is needed. (Global optimization is only one example.) In many situations this amounts to solving a HIERARCHY of LINEAR PROGRAMS or SEMIDEFINITE PROGRAMS ... of increasing size!
LP- and SDP-hierarchies for optimization
Replace $f^* = \sup_{\lambda} \{ \lambda : f(x) - \lambda \ge 0 \ \forall x \in K \}$ with:
The SDP-hierarchy indexed by $d \in \mathbb{N}$:
$$f^*_d \;=\; \sup_{\lambda, \sigma_j} \Big\{ \lambda \;:\; f - \lambda = \underbrace{\sigma_0}_{\text{SOS}} + \sum_{j=1}^{m} \underbrace{\sigma_j}_{\text{SOS}}\, g_j; \ \deg(\sigma_j g_j) \le 2d \Big\}$$
or, the LP-hierarchy indexed by $d \in \mathbb{N}$:
$$\theta_d \;=\; \sup_{\lambda, c} \Big\{ \lambda \;:\; f - \lambda = \sum_{\alpha, \beta} \underbrace{c_{\alpha\beta}}_{\ge 0} \prod_{j=1}^{m} g_j^{\alpha_j} (1 - g_j)^{\beta_j}; \ |\alpha + \beta| \le 2d \Big\}.$$
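A minimal sketch of what an LP-hierarchy certificate looks like (Python, illustrative; the example $K = [0,1]$, $f(x) = x^2$, and the helper names are my own): representing $f - \lambda$ as a nonnegative combination of products $g^\alpha (1-g)^\beta$ is a statement about polynomial coefficients, checkable with elementary polynomial arithmetic:

```python
# polynomials as coefficient lists, lowest degree first
def pmul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def ppow(p, k):
    r = [1]
    for _ in range(k):
        r = pmul(r, p)
    return r

g = [0, 1]                 # g(x) = x,   so K = [0, 1] via g >= 0, 1 - g >= 0
one_minus_g = [1, -1]      # 1 - g(x)

# Krivine/Handelman-type certificate for f(x) = x^2, lambda = 0 on [0, 1]:
# f - 0 = 1 * g^2 * (1 - g)^0, a single nonnegative coefficient c_{2,0} = 1
cert = ppow(g, 2)
assert cert == [0, 0, 1]   # coefficientwise equal to x^2
```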
Theorem. Both sequences $(f^*_d)$ and $(\theta_d)$, $d \in \mathbb{N}$, are MONOTONE NONDECREASING, and when $K$ is compact (and satisfies a technical Archimedean assumption) then:
$$f^* \;=\; \lim_{d \to \infty} f^*_d \;=\; \lim_{d \to \infty} \theta_d.$$
• What makes this approach exciting is that it is at the crossroads of several disciplines/applications:
commutative, non-commutative, and non-linear ALGEBRA;
real algebraic geometry and functional analysis;
optimization and convex analysis;
computational complexity in computer science;
which all BENEFIT from the interactions!
• As mentioned ... potential applications are ENDLESS!
• It has already proved useful and successful in applications of modest problem size, notably in optimization, control, robust control, optimal control, estimation, computer vision, etc. (If sparsity is present, then problems of larger size can be addressed.)
• It HAS initiated and stimulated new research issues:
in Convex Algebraic Geometry (e.g., semidefinite representation of convex sets, algebraic degree of semidefinite programming and polynomial optimization);
in Computational Algebra (e.g., for solving polynomial equations via SDP and border bases);
in Computational Complexity, where LP- and SDP-HIERARCHIES have become an important tool to analyze Hardness of Approximation for 0/1 combinatorial problems (→ links with quantum computing).
Recall that both LP- and SDP-hierarchies are GENERAL PURPOSE METHODS ... NOT TAILORED to solving specific hard problems!
A remarkable property of the SOS hierarchy: I
When solving the optimization problem
$$P: \quad f^* = \min \{ f(x) : g_j(x) \ge 0,\ j = 1, \dots, m \},$$
one does NOT distinguish between CONVEX, CONTINUOUS NON-CONVEX, and 0/1 (and DISCRETE) problems! A boolean variable $x_i$ is modelled via the equality constraint $x_i^2 - x_i = 0$.
In Non-Linear Programming (NLP), modeling a 0/1 variable with the polynomial equality constraint $x_i^2 - x_i = 0$ and applying a standard descent algorithm would be considered "stupid"! Each class of problems has its own ad hoc tailored algorithms.
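To make the modelling trick concrete (Python, purely illustrative; the tiny objective is my own): $x^2 - x = 0$ holds exactly for $x \in \{0, 1\}$, so adding one such constraint per variable turns a polynomial program into a 0/1 program, here solved by brute force:

```python
from itertools import product

def feasible(x):
    # the polynomial constraints x_i^2 - x_i = 0 force x_i in {0, 1}
    return all(abs(v * v - v) < 1e-12 for v in x)

def f(x):
    # a tiny quadratic 0/1 objective
    return 3 * x[0] + 2 * x[1] - 4 * x[0] * x[1]

# brute force over candidate points satisfying the polynomial constraints
best = min(f(x) for x in product([0.0, 1.0], repeat=2) if feasible(x))
assert best == 0.0   # attained at (0, 0); (1, 1) gives 1, (0, 1) gives 2
```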
Even though the moment-SOS approach DOES NOT SPECIALIZE to each class of problems:
It recognizes the class of (easy) SOS-convex problems, as FINITE CONVERGENCE occurs at the FIRST relaxation in the hierarchy.
Finite convergence also occurs for general convex problems and generically for non-convex problems → (NOT true for the LP-hierarchy.)
The SOS-hierarchy dominates other lift-and-project hierarchies (i.e., provides the best lower bounds) for hard 0/1 combinatorial optimization problems!
→ The Theoretical Computer Science community speaks of a META-algorithm ...
A remarkable property: II
FINITE CONVERGENCE of the SOS-hierarchy is GENERIC! ... and provides a GLOBAL OPTIMALITY CERTIFICATE, the analogue for the NON-CONVEX CASE of the KKT-optimality conditions in the CONVEX CASE!
Theorem (Marshall, Nie)
Let $x^* \in K$ be a global minimizer of
$$P: \quad f^* = \min \{ f(x) : g_j(x) \ge 0,\ j = 1, \dots, m \},$$
and assume that:
(i) the gradients $\{\nabla g_j(x^*)\}$ are linearly independent;
(ii) strict complementarity holds ($\lambda^*_j\, g_j(x^*) = 0$ for all $j$);
(iii) second-order sufficiency conditions hold at $(x^*, \lambda^*) \in K \times \mathbb{R}^m_+$.
Then
$$f(x) - f^* \;=\; \sigma^*_0(x) + \sum_{j=1}^{m} \sigma^*_j(x)\, g_j(x), \quad \forall x \in \mathbb{R}^n,$$
for some SOS polynomials $\{\sigma^*_j\}$. Moreover, conditions (i)-(ii)-(iii) HOLD GENERICALLY!
Certificates of positivity already exist in convex optimization:
$$f^* = f(x^*) = \min \{ f(x) : g_j(x) \ge 0,\ j = 1, \dots, m \}$$
when $f$ and $-g_j$ are CONVEX. Indeed, if Slater's condition holds, there exist nonnegative KKT-multipliers $\lambda^* \in \mathbb{R}^m_+$ such that:
$$\nabla f(x^*) - \sum_{j=1}^{m} \lambda^*_j\, \nabla g_j(x^*) = 0; \qquad \lambda^*_j\, g_j(x^*) = 0,\ j = 1, \dots, m.$$
... and so ... the Lagrangian
$$L_{\lambda^*}(x) \;:=\; f(x) - f^* - \sum_{j=1}^{m} \lambda^*_j\, g_j(x)$$
satisfies $L_{\lambda^*}(x^*) = 0$ and $L_{\lambda^*}(x) \ge 0$ for all $x$. Therefore:
$$L_{\lambda^*}(x) \ge 0 \;\Rightarrow\; f(x) \ge f^* \quad \forall x \in K!$$
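These KKT conditions are easy to verify on a one-dimensional convex toy problem (Python, illustrative; the instance is my own): $f(x) = x^2$, $K = \{x : g(x) = x - 1 \ge 0\}$, with $x^* = 1$ and $\lambda^* = 2$, where the Lagrangian collapses to $(x - 1)^2 \ge 0$:

```python
def f(x):
    return x * x

def g(x):
    return x - 1.0

x_star, lam = 1.0, 2.0
f_star = f(x_star)

# stationarity: grad f(x*) - lambda* grad g(x*) = 2*x* - lambda* = 0
assert abs(2 * x_star - lam * 1.0) < 1e-12
# complementarity: lambda* g(x*) = 0
assert abs(lam * g(x_star)) < 1e-12
# global nonnegativity of L(x) = f(x) - f* - lambda* g(x) = (x - 1)^2
for x in (-3.0, 0.0, 0.5, 1.0, 4.0):
    L = f(x) - f_star - lam * g(x)
    assert L >= -1e-12
```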
In summary:

KKT-OPTIMALITY (when $f$ and $-g_j$ are CONVEX):
$$\nabla f(x^*) - \sum_{j=1}^{m} \lambda^*_j\, \nabla g_j(x^*) = 0,$$
$$f(x) - f^* - \sum_{j=1}^{m} \lambda^*_j\, g_j(x) \;\ge\; 0 \quad \text{for all } x \in \mathbb{R}^n.$$

PUTINAR's CERTIFICATE (in the non-convex case):
$$\nabla f(x^*) - \sum_{j=1}^{m} \sigma^*_j(x^*)\, \nabla g_j(x^*) = 0,$$
$$f(x) - f^* - \sum_{j=1}^{m} \sigma^*_j(x)\, g_j(x) \;\big(= \sigma^*_0(x)\big) \;\ge\; 0 \quad \text{for all } x \in \mathbb{R}^n,$$
for some SOS $\{\sigma^*_j\}$, with $\sigma^*_j(x^*) = \lambda^*_j$.
So even though both LP- and SDP-relaxations were not designed for solving specific hard problems ... The SDP-relaxations behave reasonably well ("efficiently"?) as they provide the BEST LOWER BOUNDS in very different contexts (in contrast to LP-relaxations).
→ The Theoretical Computer Science (TCS) community even speaks of a META-ALGORITHM,
→ ... considered the most promising tool to prove/disprove the Unique Games Conjecture (UGC).
A Lagrangian interpretation of LP-relaxations
Consider the optimization problem
$$P: \quad f^* = \min \{ f(x) : x \in K \},$$
where $K$ is the compact basic semi-algebraic set
$$K := \{ x \in \mathbb{R}^n : g_j(x) \ge 0;\ j = 1, \dots, m \}.$$
Assume that:
• For every $j = 1, \dots, m$ (and possibly after scaling), $g_j(x) \le 1$ for all $x \in K$.
• The family $\{g_j, 1 - g_j\}$ generates $\mathbb{R}[x]$.
Lagrangian relaxation
The dual method of multipliers, or Lagrangian relaxation, consists of solving
$$\rho := \max_{u} \{ G(u) : u \ge 0 \}, \qquad \text{with } u \mapsto G(u) := \min_x\ f(x) - \sum_{j=1}^{m} u_j\, g_j(x).$$
Equivalently:
$$\rho \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{j=1}^{m} u_j\, g_j(x) \ge \lambda, \ \forall x \Big\}.$$
In general there is a DUALITY GAP, i.e., $\rho < f^*$, except in the CONVEX case where $f$ and $-g_j$ are all convex (and under some conditions).
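The weak-duality side of this, $G(u) \le f^*$ for every $u \ge 0$, can be sketched numerically (Python, illustrative; the instance and the grid approximation of $\min_x$ are my own): take $f(x) = x^4 - x^2$ on $K = \{x : g(x) = 1 - x^2 \ge 0\}$, where $f^* = -1/4$:

```python
def f(x):
    return x**4 - x**2

def g(x):
    return 1 - x**2

xs = [i / 1000 - 3 for i in range(6001)]        # x-grid on [-3, 3]
f_star = min(f(x) for x in xs if g(x) >= 0)     # constrained minimum on K

def G(u):
    # dual function: unconstrained min of f - u*g, approximated on the grid
    return min(f(x) - u * g(x) for x in xs)

# every dual value is a lower bound on f* (weak duality)
for u in (0.0, 0.5, 1.0, 2.0):
    assert G(u) <= f_star + 1e-9
```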
With $d \in \mathbb{N}$ fixed, consider the new optimization problem $P_d$:
$$f^*_d \;=\; \min_x \Big\{ f(x) \;:\; \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} \ge 0, \quad \forall \alpha, \beta:\ |\alpha + \beta| = \sum_j (\alpha_j + \beta_j) \le 2d \Big\}.$$
Of course $P$ and $P_d$ are equivalent, and so $f^*_d = f^*$ ... because $P_d$ is just $P$ with additional redundant constraints!
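The redundancy is immediate to check on a toy set (Python, illustrative; $K = [0,1]$ via $g(x) = x$ is my own choice): on $K$, where $0 \le g \le 1$, every product $g^a (1-g)^b$ is automatically nonnegative, so $P_d$ cuts nothing away:

```python
def products_nonneg(x, dmax=3):
    # all products g^a * (1-g)^b with g(x) = x, on K = [0, 1]
    return all((x**a) * ((1 - x)**b) >= 0
               for a in range(dmax + 1) for b in range(dmax + 1))

# every point of K satisfies all the "redundant" product constraints
for x in (0.0, 0.25, 0.5, 1.0):
    assert products_nonneg(x)
```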
The Lagrangian relaxation of $P_d$ consists of solving
$$\rho_d \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} \;\ge\; \lambda, \quad \forall x \Big\}.$$
Theorem. $\rho_d \le f^*$ for all $d \in \mathbb{N}$, and if $K$ is compact and the family of polynomials $\{g_j, 1 - g_j\}$ generates $\mathbb{R}[x]$, then:
$$\lim_{d \to \infty} \rho_d = f^*.$$
The previous theorem provides a rationale for the well-known fact that adding redundant constraints to $P$ helps when doing relaxations! On the other hand ... we don't know HOW TO COMPUTE $\rho_d$!
The LP-hierarchy may be viewed as the BRUTE FORCE SIMPLIFICATION of
$$\rho_d \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} \ge \lambda, \ \forall x \Big\}$$
to ...
$$\theta_d \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} - \lambda = 0, \ \forall x \Big\}.$$
And indeed ... with $|\alpha + \beta| \le 2d$, the set of $(u, \lambda)$ such that $u \ge 0$ and
$$f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} - \lambda = 0, \quad \forall x,$$
is a CONVEX POLYTOPE! And so computing $\theta_d$ amounts to solving a Linear Program, and one has $f^* \ge \rho_d \ge \theta_d$ for all $d$.
However, as already mentioned:
For most easy convex problems (except LP), finite convergence is impossible!
Other obstructions to exactness occur. Typically, if $K$ is the polytope $\{x : g_j(x) \ge 0,\ j = 1, \dots, m\}$ and $f^* = f(x^*)$ with $g_j(x^*) = 0$, $j \in J(x^*)$, then finite convergence is impossible as soon as there exists $x \neq x^*$ with $J(x) = J(x^*)$ ($x$ not necessarily in $K$).
A less brutal simplification
With $k \ge 1$ FIXED, consider the LESS BRUTAL SIMPLIFICATION of
$$\rho_d \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} \ge \lambda, \ \forall x \Big\}$$
to ...
$$\rho^k_d \;=\; \max_{u \ge 0,\, \lambda} \Big\{ \lambda \;:\; f(x) - \sum_{\substack{\alpha, \beta \\ |\alpha + \beta| \le 2d}} u_{\alpha\beta} \prod_{j=1}^{m} g_j(x)^{\alpha_j} (1 - g_j(x))^{\beta_j} - \lambda = \sigma(x), \ \sigma \text{ SOS of degree at most } 2k \Big\}.$$
Why such a simplification?
With $k$ fixed, $\rho^k_d \to f^*$ as $d \to \infty$.
Computing $\rho^k_d$ now means solving an SDP (and not an LP any more!). However, the size of the LMI constraint of this SDP is $\binom{n + k}{n}$ (fixed) and does not depend on $d$!
For convex problems where $f$ and $-g_j$ are SOS-CONVEX polynomials, the first relaxation in the hierarchy is exact, that is, $\rho^k_1 = f^*$ (never the case for the LP-hierarchy).
• A polynomial $f$ is SOS-CONVEX if its Hessian $\nabla^2 f(x)$ factors as $L(x)\, L(x)^T$ for some polynomial matrix $L(x)$. For instance, separable polynomials $f(x) = \sum_{i=1}^{n} f_i(x_i)$ with convex $f_i$'s are SOS-CONVEX.
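The Hessian factorization behind SOS-convexity can be checked on a separable example (Python, illustrative; the polynomial $f(x) = x_1^4 + x_2^2$ and the factor $L$ are my own choices): here $\nabla^2 f = \operatorname{diag}(12x_1^2,\, 2) = L(x) L(x)^T$ with $L(x) = \operatorname{diag}(2\sqrt{3}\,x_1,\, \sqrt{2})$:

```python
import math

def hessian(x1):
    # Hessian of f(x) = x1^4 + x2^2 (depends only on x1)
    return [[12 * x1 * x1, 0.0], [0.0, 2.0]]

def L(x1):
    # polynomial matrix factor: (2*sqrt(3)*x1)^2 = 12*x1^2, (sqrt(2))^2 = 2
    return [[2 * math.sqrt(3) * x1, 0.0], [0.0, math.sqrt(2)]]

def factorization_error(x1):
    Lm = L(x1)
    LLT = [[sum(Lm[i][k] * Lm[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
    H = hessian(x1)
    return max(abs(LLT[i][j] - H[i][j]) for i in range(2) for j in range(2))

# H(x) = L(x) L(x)^T at several points
for x1 in (-2.0, 0.0, 1.5):
    assert factorization_error(x1) < 1e-9
```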
An alternative moment-approach
So far we have considered LP- and SDP-moment approaches based on CERTIFICATES of POSITIVITY on $K$. That is, one approximates FROM INSIDE the convex cone $C_d(K)$ of polynomials nonnegative on $K$: for instance, if $K = \{x : g_j(x) \ge 0,\ j = 1, \dots, m\}$, by the convex cones
$$C^k_d(K) \;=\; \Big\{ \underbrace{\sigma_0}_{\text{SOS}} + \sum_{j=1}^{m} \underbrace{\sigma_j}_{\text{SOS}}\, g_j \;:\; \deg(\sigma_j g_j) \le 2k \Big\} \cap \mathbb{R}[x]_d$$
$$\Gamma^k_d(K) \;=\; \Big\{ \sum_{(\alpha, \beta) \in \mathbb{N}^{2m}_{2k}} \underbrace{c_{\alpha\beta}}_{\ge 0} \prod_{j=1}^{m} g_j^{\alpha_j} (1 - g_j)^{\beta_j} \Big\} \cap \mathbb{R}[x]_d.$$
An alternative is to try to approximate $C_d(K)$ FROM OUTSIDE!
Given a sequence $y = (y_\alpha)$, $\alpha \in \mathbb{N}^n$:
• Let $L_y : \mathbb{R}[x] \to \mathbb{R}$ be the Riesz linear functional:
$$g \;\Big(= \sum_\beta g_\beta\, x^\beta\Big) \;\mapsto\; L_y(g) \;:=\; \sum_\beta g_\beta\, y_\beta.$$
• Define the localizing matrix $M_k(g\, y)$ with respect to $y$ and $g \in \mathbb{R}[x]$ as the real symmetric matrix with rows and columns indexed by $\alpha \in \mathbb{N}^n_k$ and with entries
$$M_k(g\, y)[\alpha, \beta] \;=\; L_y(x^{\alpha + \beta}\, g), \qquad \alpha, \beta \in \mathbb{N}^n_k.$$
⋆ If $y$ comes from a measure $\mu$, then
$$L_y(x^{\alpha + \beta}\, g) \;=\; \int x^{\alpha + \beta}\, g(x)\, d\mu.$$
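A tiny worked instance (Python, illustrative; the choice of Lebesgue measure on $[0,1]$ and the helper name `riesz` are my own): with moments $y_k = \int_0^1 x^k\, dx = 1/(k+1)$, the localizing matrix $M_1(g\, y)$ for $g(x) = x$ has entries $L_y(x^{i+j} \cdot x) = y_{i+j+1}$, and it is PSD because $y$ does come from a measure supported where $g \ge 0$:

```python
from fractions import Fraction as F

# moments of Lebesgue measure on [0, 1]: y_k = 1 / (k + 1)
y = [F(1, k + 1) for k in range(5)]

def riesz(coeffs):
    # L_y(sum_b coeffs[b] x^b) = sum_b coeffs[b] y_b
    return sum(c * y[b] for b, c in enumerate(coeffs))

# localizing matrix M_1(g y) for g(x) = x: entries L_y(x^(i+j) * x)
M = [[riesz([0] * (i + j + 1) + [1]) for j in range(2)] for i in range(2)]
assert M == [[F(1, 2), F(1, 3)], [F(1, 3), F(1, 4)]]

# 2x2 PSD check: nonnegative diagonal and determinant
assert M[0][0] >= 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] >= 0
```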