
Some results on polynomial optimization problems, Daniel Bienstock


  1. Some results on polynomial optimization problems Daniel Bienstock, Columbia University

  2. QCQP:
     $\min f_0(x)$ s.t. $f_i(x) \ge 0$, $1 \le i \le m$, $x \in \mathbb{R}^n$
     Here, $f_i(x) = x^T M_i x + c_i^T x + d_i$. Each $M_i$ is $n \times n$, wlog symmetric.
     Folklore result: QCQP is strongly NP-hard.

  3. A simple example
     $\max x_2$
     s.t. $(x_1 - 1)^2 + x_2^2 \ge 3$
          $(x_1 + 1)^2 + x_2^2 \ge 3$
          $x_1^2/10 + x_2^2 \le 2$

  4. A simple example
     $\max x_2$
     s.t. $(x_1 - 1)^2 + x_2^2 \ge 3$
          $(x_1 + 1)^2 + x_2^2 \ge 3$
          $x_1^2/10 + x_2^2 \le 2$
     Points shown: $(0, \sqrt{2})$ and $(0, -\sqrt{2})$
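
The example can be checked numerically. The sketch below assumes the constraints read $(x_1 \mp 1)^2 + x_2^2 \ge 3$ and $x_1^2/10 + x_2^2 \le 2$ and that the two marked points are $(0, \pm\sqrt{2})$; the function name `feasible` and the tolerance are illustrative choices, not part of the slides. It confirms that both points are feasible and that small perturbations of the upper point are not, which suggests the maximizer of $x_2$ is an isolated feasible point.

```python
import math

def feasible(x1, x2, tol=1e-9):
    """Check the three constraints of the example (as reconstructed here)."""
    return (
        (x1 - 1) ** 2 + x2 ** 2 >= 3 - tol
        and (x1 + 1) ** 2 + x2 ** 2 >= 3 - tol
        and x1 ** 2 / 10 + x2 ** 2 <= 2 + tol
    )

top = (0.0, math.sqrt(2))
bottom = (0.0, -math.sqrt(2))

# Both marked points satisfy all three constraints (each constraint is tight there).
print(feasible(*top), feasible(*bottom))

# The eight grid neighbors of the upper point at distance ~1e-3.
eps = 1e-3
neighbors = [
    (top[0] + dx, top[1] + dy)
    for dx in (-eps, 0.0, eps)
    for dy in (-eps, 0.0, eps)
    if (dx, dy) != (0.0, 0.0)
]
print(any(feasible(*p) for p in neighbors))
```

The neighbor check is only a spot check on a small grid, not a proof of isolation.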

  5. CDT (Celis-Dennis-Tapia) problem
     $\min x^T Q_0 x + c_0^T x$
     s.t. $x^T Q_1 x + c_1^T x + d_1 \le 0$
          $x^T Q_2 x + c_2^T x + d_2 \le 0$
     where $Q_1 \succ 0$, $Q_2 \succ 0$

  6. CDT (Celis-Dennis-Tapia) problem
     $\min x^T Q_0 x + c_0^T x$
     s.t. $x^T Q_1 x + c_1^T x + d_1 \le 0$
          $x^T Q_2 x + c_2^T x + d_2 \le 0$
     where $Q_1 \succ 0$, $Q_2 \succ 0$
     Generalization of the trust-region subproblem:
     $\min x^T Q x + c^T x$ s.t. $\|x - \mu\|^2 \le r^2$
     which is solvable using many techniques
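
One of the many techniques for the trust-region subproblem is bisection on the Lagrange multiplier after an eigendecomposition. The sketch below is one such approach, not necessarily the one the speaker had in mind. It assumes a symmetric $Q$, ignores the degenerate "hard case", and the function name `trust_region` and the sample data are hypothetical.

```python
import numpy as np

def trust_region(Q, c, mu, r, iters=100):
    """Minimize x^T Q x + c^T x over ||x - mu|| <= r (Q symmetric).

    Sketch only: the degenerate "hard case" is not handled.
    """
    # Shift z = x - mu: the objective becomes z^T Q z + g^T z + const.
    g = c + 2.0 * Q @ mu
    w, V = np.linalg.eigh(Q)      # Q = V diag(w) V^T, w ascending
    gt = V.T @ g                  # g expressed in the eigenbasis

    def z_of(lam):
        # Stationarity in the eigenbasis: 2 (w + lam) z = -gt.
        return -0.5 * gt / (w + lam)

    # Interior solution if Q is positive definite and the unconstrained
    # minimizer already lies inside the ball.
    if w[0] > 0 and np.linalg.norm(z_of(0.0)) <= r:
        return mu + V @ z_of(0.0)

    # Otherwise the constraint is active: find lam > max(0, -w_min) with
    # ||z(lam)|| = r. The norm decreases in lam, so bisection applies.
    lo = max(0.0, -w[0])
    hi = lo + 1.0
    while np.linalg.norm(z_of(hi)) > r:
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(z_of(mid)) > r:
            lo = mid
        else:
            hi = mid
    return mu + V @ z_of(hi)

# Hypothetical nonconvex instance.
Q = np.array([[1.0, 0.0], [0.0, -2.0]])
c = np.array([1.0, 1.0])
mu = np.array([0.5, -0.5])
r = 1.0
x_star = trust_region(Q, c, mu, r)
print(x_star, np.linalg.norm(x_star - mu))
```

Global optimality here rests on the standard fact that a KKT point with $\lambda \ge 0$ and $Q + \lambda I \succeq 0$ solves the trust-region subproblem even when $Q$ is indefinite.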

  7. Theorem (Barvinok, 1993) For each fixed integer $p$ there is a polynomial-time algorithm that, given a system
     $x^T M_i x = 0$, $1 \le i \le p$, $x \in \mathbb{R}^n$, $\|x\| = 1$,
     correctly determines feasibility.
     → nonconstructive.

  8. Weakening of Barvinok's theorem
     For each fixed $p \ge 1$, there is an algorithm that, given a system
     $x^T M_i x = 0$, $1 \le i \le p$, $x \in \mathbb{R}^n$, $\|x\| = 1$,
     and given $0 < \epsilon < 1$, either
     • proves that the system is infeasible, or
     • proves that it is $\epsilon$-feasible,
     in time polynomial in the data and in $\log \epsilon^{-1}$.
     (so still nonconstructive)

  9. Theorem (SIOPT, forthcoming). For each fixed $m \ge 1$ there is an algorithm that, given
     $\min f_0(x) \doteq x^T A_0 x + c_0^T x$
     s.t. $x^T A_i x + c_i^T x + d_i \le 0$, $1 \le i \le m$,
     where $A_1 \succ 0$, and $0 < \epsilon < 1$, either
     (1) proves that the problem is infeasible, or
     (2) computes an $\epsilon$-feasible vector $\hat{x}$ such that there exists no feasible $x \in \mathbb{R}^n$ with $f_0(x) < f_0(\hat{x}) - \epsilon$,
     in time polynomial in the number of bits in the data and $\log \epsilon^{-1}$.

  10. Sketch: Given a system
      $x^T A_i x + c_i^T x + d_i \le 0$, $1 \le i \le m$,
      where $A_1 \succ 0$, how to prove infeasibility or feasibility?
      Assume $x^T A_1 x + c_1^T x + d_1 = \|x\|^2 - 1$, and $|f_i(x)| \le U_i$.

  11. Sketch: Given a system
      $x^T A_i x + c_i^T x + d_i \le 0$, $1 \le i \le m$,
      with $x^T A_1 x + c_1^T x + d_1 = \|x\|^2 - 1$, and $|f_i(x)| \le U_i$.
      $x^T A_i x + c_i^T v_0 x + d_i v_0^2 + s_i^2 = 0$, $1 \le i \le m$, (1a)
      $\frac{s_i^2 + w_i^2}{U_i} - v_0^2 = 0$, $2 \le i \le m$, (1b)
      $\|x\|^2 + s_1^2 + \sum_{i=2}^m \frac{s_i^2 + w_i^2}{U_i} + v_0^2 = m + 1$. (1c)

  12. $x^T A_i x + c_i^T v_0 x + d_i v_0^2 + s_i^2 = 0$, $1 \le i \le m$, (2a)
      $\frac{s_i^2 + w_i^2}{U_i} - v_0^2 = 0$, $2 \le i \le m$, (2b)
      $\|x\|^2 + s_1^2 + \sum_{i=2}^m \frac{s_i^2 + w_i^2}{U_i} + v_0^2 = m + 1$. (2c)
      → (2a) for $i = 1$ is $\|x\|^2 - v_0^2 + s_1^2 = 0$. Adding it and all of (2b) yields
      $\|x\|^2 + s_1^2 + \sum_{i=2}^m \frac{s_i^2 + w_i^2}{U_i} - m v_0^2 = 0$
      Together with (2c) this implies $v_0^2 = 1$.
      If $v_0 = 1$ then (2a) means that $x$ is feasible.
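
The converse direction can be illustrated numerically: a feasible $x$ extends to a solution of the homogeneous system with $v_0 = 1$, $s_i^2 = -f_i(x)$ and $w_i^2 = U_i - s_i^2$. The sketch below uses hypothetical random data, picks $U_i$ as a crude bound on $|f_i|$ over the unit ball, and uses 0-based indices (index 0 is the constraint $\|x\|^2 - 1 \le 0$).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4

x = rng.standard_normal(n)
x *= 0.5 / np.linalg.norm(x)          # a point with ||x|| = 0.5 < 1

# Index 0 is ||x||^2 - 1 <= 0; the other constraints are random quadratics
# whose constants d_i are chosen so that x is feasible.
A, c, d = [np.eye(n)], [np.zeros(n)], [-1.0]
for _ in range(m - 1):
    B = rng.standard_normal((n, n))
    Ai, ci = (B + B.T) / 2, rng.standard_normal(n)
    A.append(Ai)
    c.append(ci)
    d.append(-(x @ Ai @ x + ci @ x) - rng.uniform(0.1, 1.0))

def f(i, z):
    return z @ A[i] @ z + c[i] @ z + d[i]

# U_i bounds |f_i| on the unit ball: spectral norm + ||c_i|| + |d_i|.
U = [np.linalg.norm(A[i], 2) + np.linalg.norm(c[i]) + abs(d[i]) for i in range(m)]

v0 = 1.0
s2 = [-f(i, x) for i in range(m)]         # s_i^2 = -f_i(x) >= 0
w2 = [U[i] - s2[i] for i in range(m)]     # w_i^2 = U_i - s_i^2 (used for i >= 1)

# Residuals of the three groups of equations.
res_a = [x @ A[i] @ x + (c[i] @ x) * v0 + d[i] * v0**2 + s2[i] for i in range(m)]
res_b = [(s2[i] + w2[i]) / U[i] - v0**2 for i in range(1, m)]
res_c = (x @ x + s2[0]
         + sum((s2[i] + w2[i]) / U[i] for i in range(1, m))
         + v0**2) - (m + 1)
print(max(map(abs, res_a)), max(map(abs, res_b)), abs(res_c))
```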

  13. New result on "true" version of CDT problem
      $\min x^T Q_0 x + c_0^T x$
      s.t. $x^T Q_i x + c_i^T x + d_i \le 0$, $i = 1, 2$
      where $Q_1 \succ 0$, $Q_2 \succ 0$.
      Sakaue, Nakatsukasa, Takeda, Iwata (2015); "simple" algorithm.
      Assume KKT conditions hold.

  14. $H(\lambda_1, \lambda_2)\, x = y$
      $x^T Q_i x + c_i^T x + d_i \le 0$, $i = 1, 2$
      $\lambda_i (x^T Q_i x + c_i^T x + d_i) = 0$, $i = 1, 2$
      $\lambda_i \ge 0$, $i = 1, 2$
      Here $H \doteq Q_0 + \lambda_1 Q_1 + \lambda_2 Q_2$, $y \doteq -(c_0 + \lambda_1 c_1 + \lambda_2 c_2)$
      1. Compute a polynomially large set of candidates for $\lambda_1, \lambda_2$.
      2. Given $\lambda_1, \lambda_2$, solve $Hx = y$ to obtain $x$.

  15. $\lambda_i (x^T Q_i x + c_i^T x + d_i) = 0$, $i = 1, 2$
      is equivalent to
      $\lambda_i \det \begin{pmatrix} Q_i & -H & c_i \\ -H & 0 & y \\ c_i^T & y^T & d_i \end{pmatrix} = 0$
      So, two determinantal equations
      $\lambda_1 \det M_1(\lambda_1, \lambda_2) = \lambda_2 \det M_2(\lambda_1, \lambda_2) = 0$.

  16. $\lambda_i (x^T Q_i x + c_i^T x + d_i) = 0$, $i = 1, 2$
      is equivalent to
      $\lambda_i \det \begin{pmatrix} Q_i & -H & c_i \\ -H & 0 & y \\ c_i^T & y^T & d_i \end{pmatrix} = 0$
      So, two determinantal equations
      $\lambda_1 \det M_1(\lambda_1, \lambda_2) = \lambda_2 \det M_2(\lambda_1, \lambda_2) = 0$.
      Recall $H = Q_0 + \lambda_1 Q_1 + \lambda_2 Q_2$, $y = -(c_0 + \lambda_1 c_1 + \lambda_2 c_2)$

  17. $\lambda_i (x^T Q_i x + c_i^T x + d_i) = 0$, $i = 1, 2$
      is equivalent to
      $\lambda_i \det \begin{pmatrix} Q_i & -H & c_i \\ -H & 0 & y \\ c_i^T & y^T & d_i \end{pmatrix} = 0$
      So, two determinantal equations
      $\lambda_1 \det M_1(\lambda_1, \lambda_2) = \lambda_2 \det M_2(\lambda_1, \lambda_2) = 0$.
      Theorem: If the two equations hold then $\det B(\lambda_1) = 0$.
      Here, $B$, of the form $\lambda_1 E + F$, is the Bézoutian. $B$ is $n^2 \times n^2$.
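
The determinantal reformulation can be sanity-checked with a Schur-complement computation. The sketch below assumes the linear terms are scaled as $2 c_i^T x$, i.e. $g_i(x) = x^T Q_i x + 2 c_i^T x + d_i$, which is the scaling under which $Hx = y$ is exactly the stationarity condition and under which the identity $\det M_i = (-1)^n \det(H)^2\, g_i(x)$ holds for $x = H^{-1} y$. The data, the multipliers and the helper names `M` and `g` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

def sym(B):
    return (B + B.T) / 2

# Hypothetical data: Q0 is scaled down so that H is comfortably nonsingular,
# Q1 and Q2 are positive definite.
Qs = [0.2 * sym(rng.standard_normal((n, n)))]
for _ in range(2):
    B = rng.standard_normal((n, n))
    Qs.append(B @ B.T + np.eye(n))
cs = [rng.standard_normal(n) for _ in range(3)]
ds = [0.0, -1.0, -2.0]

lam = (0.7, 1.3)   # arbitrary multipliers, not claimed to be KKT multipliers
H = Qs[0] + lam[0] * Qs[1] + lam[1] * Qs[2]
y = -(cs[0] + lam[0] * cs[1] + lam[1] * cs[2])
x = np.linalg.solve(H, y)

def M(i):
    """The (2n+1) x (2n+1) block matrix [[Q_i, -H, c_i], [-H, 0, y], [c_i^T, y^T, d_i]]."""
    Z = np.zeros((n, n))
    return np.block([
        [Qs[i], -H, cs[i][:, None]],
        [-H, Z, y[:, None]],
        [cs[i][None, :], y[None, :], np.array([[ds[i]]])],
    ])

def g(i):
    """Constraint value under the assumed 2*c_i scaling."""
    return x @ Qs[i] @ x + 2 * cs[i] @ x + ds[i]

scale = (-1) ** n * np.linalg.det(H) ** 2
for i in (1, 2):
    print(np.linalg.det(M(i)), scale * g(i))
```

Since the scale factor is nonzero whenever $H$ is invertible, $\det M_i = 0$ holds exactly when constraint $i$ is tight at $x = H^{-1} y$, which is how complementarity becomes a condition on $(\lambda_1, \lambda_2)$ alone.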

  18. Smale's 17th problem
      Can a zero of $n$ polynomial equations on $n$ unknowns be found approximately, on the average in polynomial time?
      • Beltrán and Pardo (2009): a randomized (Las Vegas) uniform algorithm that computes an approximate zero in expected polynomial time
      • Bürgisser, Cucker (2012): a deterministic $O(n^{\log \log n})$ (uniform) algorithm for computing approximate zeros
      • Techniques: Homotopy (path-following method solving a sequence of problems), Newton's method

  19. Smale's 17th problem
      Can a zero of $n$ polynomial equations on $n$ unknowns be found approximately, on the average in polynomial time? (abridged; and we are cheating)
      • Beltrán and Pardo (2009): a randomized (Las Vegas) uniform algorithm that computes an approximate zero in expected polynomial time
      • Bürgisser, Cucker (2012): a deterministic $O(n^{\log \log n})$ (uniform) algorithm for computing approximate zeros
      • Techniques: Homotopy (path-following method solving a sequence of problems), Newton's method
      But we are cheating: All of this is over $\mathbb{C}^n$, not $\mathbb{R}^n$

  20. Smale's 17th problem
      Can a zero of $n$ polynomial equations on $n$ unknowns be found approximately, on the average in polynomial time? (abridged; and we are cheating)
      • Beltrán and Pardo (2009): a randomized (Las Vegas) uniform algorithm that computes an approximate zero in expected polynomial time
      • Bürgisser, Cucker (2012): a deterministic $O(n^{\log \log n})$ (uniform) algorithm for computing approximate zeros
      • Techniques: Homotopy (path-following method solving a sequence of problems), Newton's method
      But we are cheating: All of this is over $\mathbb{C}^n$, not $\mathbb{R}^n$
      So what can be done over the reals?
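
To make the homotopy-plus-Newton idea concrete, here is a toy sketch for a single univariate polynomial over $\mathbb{C}$. It is not the algorithm of either cited paper: it deforms the start system $z^d - 1$ into the target along $H(z, t) = (1 - t)\gamma(z^d - 1) + t f(z)$ and corrects with a few Newton steps at each value of $t$. The function name `track_roots`, the constant `gamma`, the step counts and the target polynomial are all assumptions made for illustration.

```python
import cmath

def track_roots(coeffs, steps=400, newton_iters=5, gamma=0.6 + 0.8j):
    """Track the d roots of z^d - 1 to roots of the target polynomial.

    coeffs: target coefficients, highest degree first.
    Returns the tracked roots and the target as a callable.
    """
    d = len(coeffs) - 1

    def f(z):       # Horner evaluation of the target
        v = 0j
        for a in coeffs:
            v = v * z + a
        return v

    def df(z):      # Horner evaluation of the target's derivative
        v = 0j
        for k, a in enumerate(coeffs[:-1]):
            v = v * z + (d - k) * a
        return v

    # Start at the d-th roots of unity, the zeros of H(., 0).
    roots = [cmath.exp(2j * cmath.pi * k / d) for k in range(d)]
    for step in range(1, steps + 1):
        t = step / steps
        for k, z in enumerate(roots):
            # Newton correction on H(., t), warm-started from the previous t.
            for _ in range(newton_iters):
                h = (1 - t) * gamma * (z ** d - 1) + t * f(z)
                dh = (1 - t) * gamma * d * z ** (d - 1) + t * df(z)
                z -= h / dh
            roots[k] = z
    return roots, f

# Hypothetical target: z^3 - 2z + 2 (one real root, one complex pair).
roots, f = track_roots([1, 0, -2, 2])
for z in roots:
    print(z, abs(f(z)))
```

The complex constant `gamma` is there so that the paths avoid singular points for real $t$; this trick relies on working over $\mathbb{C}$, which is the sense in which the slide says "we are cheating".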

  21. ACOPF
      Input: an undirected graph $G$.
      • For every vertex $i$, two variables: $e_i$ and $f_i$
      • For every edge $\{k, m\}$, four (specific) quadratics:
        $H^P_{k,m}(e_k, f_k, e_m, f_m)$, $H^Q_{k,m}(e_k, f_k, e_m, f_m)$
        $H^P_{m,k}(e_k, f_k, e_m, f_m)$, $H^Q_{m,k}(e_k, f_k, e_m, f_m)$
      (figure: edge $\{k, m\}$ with variables $e_k, f_k$ at vertex $k$ and $e_m, f_m$ at vertex $m$)

  22. $\min \sum_k w_k$
      s.t. $L^P_k \le \sum_{\{k,m\} \in \delta(k)} H^P_{k,m}(e_k, f_k, e_m, f_m) \le U^P_k$ $\forall k$
           $L^Q_k \le \sum_{\{k,m\} \in \delta(k)} H^Q_{k,m}(e_k, f_k, e_m, f_m) \le U^Q_k$ $\forall k$
           $V^L_k \le \|(e_k, f_k)\| \le V^U_k$ $\forall k$
           $v_k = \sum_{\{k,m\} \in \delta(k)} H^P_{k,m}(e_k, f_k, e_m, f_m)$ $\forall k$
           $w_k = F_k(v_k)$

  23. Complexity
      Theorem (2011) Lavaei and Low: OPF is (weakly) NP-hard on trees.
      Theorem (2014) van Hentenryck et al: OPF is (weakly) NP-hard on trees.
      Theorem (2007) B. and Verma (2009): OPF is strongly NP-hard on general graphs.
      Recent insight: use the SDP relaxation (Lavaei and Low, 2009 + many others)
      SDP Relaxation of OPF:
      Fact: The SDP relaxation sometimes has a rank-1 solution!!
      Fact: And when not, sometimes it gives a good bound.
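
The formulation shown on this slide is not recoverable from the text. As a stand-in, the following is the generic SDP (Shor-type) lifting of a QCQP written in the notation $f_i(x) = x^T M_i x + c_i^T x + d_i$ used earlier in the talk; it should be read as a sketch of the idea rather than the exact OPF relaxation on the slide. Here $\langle M, X \rangle = \operatorname{trace}(M X)$ and $X$ stands in for $x x^T$.

```latex
\begin{aligned}
\min_{x,\,X}\quad & \langle M_0, X\rangle + c_0^T x + d_0 \\
\text{s.t.}\quad & \langle M_i, X\rangle + c_i^T x + d_i \ge 0, \quad 1 \le i \le m, \\
& \begin{pmatrix} 1 & x^T \\ x & X \end{pmatrix} \succeq 0 .
\end{aligned}
```

If an optimal solution satisfies $X = x x^T$ (the rank-1 case), then $x$ is feasible and optimal for the original QCQP; otherwise the optimal value is only a lower bound.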

  24. But: the SDP relaxation is always slow on large graphs
      • Real-life grids → $> 10^4$ vertices
      • SDP relaxation of OPF does not terminate
      But...

  25. But: the SDP relaxation is always slow on large graphs
      • Real-life grids → $> 10^4$ vertices
      • SDP relaxation of OPF does not terminate
      But...
      Fact? Real-life grids have small tree-width
      Definition 1: A graph has treewidth $\le w$ if it has a chordal supergraph with clique number $\le w + 1$
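
Definition 1 suggests a simple way to certify an upper bound on treewidth: any vertex elimination order produces a chordal supergraph (the original edges plus fill-in edges), and the largest neighborhood met during elimination bounds the clique number minus one. The sketch below uses the greedy minimum-degree order, a common heuristic that gives an upper bound but not always the exact treewidth. The function name and the adjacency-dictionary input format are illustrative assumptions.

```python
def treewidth_upper_bound(adj):
    """Greedy min-degree elimination.

    adj: dict mapping each vertex to an iterable of its neighbors.
    Returns (width, fill_edges), where treewidth <= width and adding
    fill_edges to the graph yields a chordal supergraph.
    """
    g = {v: set(nb) for v, nb in adj.items()}
    width, fill = 0, []
    while g:
        # Eliminate a vertex of minimum current degree.
        v = min(g, key=lambda u: len(g[u]))
        nbrs = g.pop(v)
        width = max(width, len(nbrs))      # v plus nbrs form a clique
        for a in nbrs:
            g[a].discard(v)
        # Make the remaining neighbors pairwise adjacent (fill-in).
        nb = sorted(nbrs)
        for i, a in enumerate(nb):
            for b in nb[i + 1:]:
                if b not in g[a]:
                    g[a].add(b)
                    g[b].add(a)
                    fill.append((a, b))
    return width, fill

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(treewidth_upper_bound(path)[0], treewidth_upper_bound(cycle)[0])
```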

  26. But: the SDP relaxation is always slow on large graphs
      • Real-life grids → $> 10^4$ vertices
      • SDP relaxation of OPF does not terminate
      But...
      Fact? Real-life grids have small tree-width
      Definition 2: A graph has treewidth $\le w$ if it is a subgraph of an intersection graph of subtrees of a tree, with $\le w + 1$ subtrees overlapping at any vertex

  27. But: the SDP relaxation is always slow on large graphs
      • Real-life grids → $> 10^4$ vertices
      • SDP relaxation of OPF does not terminate
      But...
      Fact? Real-life grids have small tree-width
      Definition 2: A graph has treewidth $\le w$ if it is a subgraph of an intersection graph of subtrees of a tree, with $\le w + 1$ subtrees overlapping at any vertex
      (Seymour and Robertson, early 1980s)

  28. But: the SDP relaxation is always slow on large graphs
      • Real-life grids → $> 10^4$ vertices
      • SDP relaxation of OPF does not terminate
      But...
      Fact? Real-life grids have small tree-width
      Matrix-completion Theorem gives fast SDP implementations:
      Real-life grids with $\approx 3 \times 10^3$ vertices: → 20 minutes runtime

  29. Much previous work using treewidth
      • Bienstock and Özbay (Sherali-Adams + treewidth)
      • Wainwright and Jordan (Sherali-Adams + treewidth)
      • Grimm, Netzer, Schweighofer
      • Laurent (Sherali-Adams + treewidth)
      • Lasserre et al (moment relaxation + treewidth)
      • Waki, Kim, Kojima, Muramatsu
      older work ...
      • Lauritzen (1996): tree-junction theorem
      • Bertele and Brioschi (1972) (Nemhauser 1960s): nonserial dynamic programming
      • Bounded tree-width in combinatorial optimization (early 1980s) (Arnborg et al plus too many authors)
