sparsity and decomposition in semidefinite optimization
  1. Sparsity and decomposition in semidefinite optimization

     Lieven Vandenberghe
     Electrical and Computer Engineering, UCLA
     Joint work with Martin S. Andersen, Joachim Dahl, Xin Jiang, Yifan Sun
     MIDAS Seminar, April 6, 2018

  2. Semidefinite program (SDP)

     minimize    tr(CX)
     subject to  tr(A_i X) = b_i,  i = 1, ..., m
                 X ⪰ 0

  • the variable is an n × n symmetric matrix X
  • tr(YX) = Σ_ij Y_ij X_ij is the standard matrix inner product
  • the inequality X ⪰ 0 means X is positive semidefinite

  Applications
  • matrix inequalities arise naturally in many areas (for example, control, statistics)
  • relaxations of nonconvex quadratic and polynomial optimization
  • used in convex modeling systems (CVX, YALMIP, CVXPY, ...)
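The two facts in the bullets above (the trace inner product equals the elementwise sum, and X ⪰ 0 can be checked via eigenvalues) are easy to verify numerically; a minimal numpy sketch with arbitrary illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((4, 4)); Y = Y + Y.T   # arbitrary symmetric matrix
X = rng.standard_normal((4, 4)); X = X @ X.T   # PSD by construction

# tr(YX) = sum_ij Y_ij X_ij for symmetric Y, X
lhs = np.trace(Y @ X)
rhs = (Y * X).sum()

# X >= 0 (positive semidefinite) iff all eigenvalues are nonnegative
min_eig = np.linalg.eigvalsh(X).min()
```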

  3. Sparse semidefinite optimization

     minimize    tr(CX)
     subject to  tr(A_i X) = b_i,  i = 1, ..., m
                 X ⪰ 0

  • large SDPs often have sparse coefficient matrices C, A_i
  • the solution X is usually dense, even when C and the A_i are very sparse

  This talk
  • structure in the solution X that results from sparsity in the coefficients A_i, C
  • applications to different types of SDP algorithms

  4. Band structure

     [figure: time per iteration (seconds) versus n, on log-log axes, for SDPT3 and
     SeDuMi solving an SDP with banded matrices (bandwidth 11, m = 100 constraints);
     both curves grow faster than the O(n^2) reference line]

  • for bandwidth 1 (a linear program), the cost per iteration is linear in n
  • for bandwidth > 1, the cost grows as n^2 or faster

  [Andersen, Dahl, Vandenberghe 2010]

  5. Power flow optimization

  an optimization problem with non-convex quadratic constraints

  Variables
  • complex voltage v_i at each node (bus) of the network
  • complex power flow s_ij entering the link (line) from node i to node j

  Non-convex constraints
  • bounds on voltage magnitudes: v_min ≤ |v_i| ≤ v_max
  • flow balance equations: s_ij + s_ji = ḡ_ij |v_i − v_j|^2,
    where g_ij is the admittance of the line from node i to node j

     [figure: a line between bus i and bus j, with flows s_ij and s_ji entering
     from either end and line admittance g_ij]

  6. Semidefinite relaxation of optimal power flow problem

  • introduce a matrix variable X = Re(vv^H), i.e., with elements X_ij = Re(v_i v̄_j)
  • voltage bounds and flow balance equations are convex in X:

        v_min ≤ |v_i| ≤ v_max              →   v_min^2 ≤ X_ii ≤ v_max^2
        s_ij + s_ji = ḡ_ij |v_i − v_j|^2   →   s_ij + s_ji = ḡ_ij (X_ii + X_jj − 2X_ij)

  • replace the constraint X = Re(vv^H) with the weaker constraint X ⪰ 0
  • the relaxation is exact if the optimal X has rank two

  Sparsity in the SDP relaxation: the off-diagonal entry X_ij appears in the
  constraints only if there is a line between buses i and j

  [Jabr 2006] [Bai et al. 2008] [Lavaei and Low 2012] [Molzahn et al. 2013] ...
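The identity behind the convexified flow balance constraint, namely that |v_i − v_j|^2 is a linear function of X = Re(vv^H), can be checked numerically; a sketch with random voltages used purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)  # bus voltages
X = np.real(np.outer(v, np.conj(v)))                      # X_ij = Re(v_i conj(v_j))

i, j = 0, 3
lhs = abs(v[i] - v[j]) ** 2             # |v_i - v_j|^2, quadratic in v
rhs = X[i, i] + X[j, j] - 2 * X[i, j]   # the same quantity, linear in X
```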

  7. Outline

  1. Chordal graphs and sparse matrices
  2. Decomposition of sparse matrix cones
  3. Multifrontal algorithms for logarithmic barrier functions
  4. Minimum rank positive semidefinite completion

  8. Sparsity graph

     A = [ A11  A21  A31   0   A51
           A21  A22   0   A42   0
           A31   0   A33   0   A53
            0   A42   0   A44  A54
           A51   0   A53  A54  A55 ]

     [figure: sparsity graph with vertices 1, ..., 5 and edges {1,2}, {1,3}, {1,5},
     {2,4}, {3,5}, {4,5}]

  • the sparsity pattern of a symmetric n × n matrix is a set of 'nonzero' positions
        E ⊆ {{i, j} | i, j ∈ {1, 2, ..., n}}
  • A has sparsity pattern E if A_ij = 0 whenever i ≠ j and {i, j} ∉ E
  • notation: A ∈ S^n_E
  • represented by an undirected graph (V, E) with edges E and vertices V = {1, ..., n}
  • a clique (maximal complete subgraph) forms a maximal 'dense' principal submatrix
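For a pattern as small as the 5 × 5 example above, the cliques (maximal complete subgraphs) can be enumerated by brute force; a pure-Python sketch with 1-based vertex labels matching the slide:

```python
from itertools import combinations

# edges of the sparsity graph of the 5x5 example
edges = {frozenset(e) for e in [(1, 2), (1, 3), (1, 5), (2, 4), (3, 5), (4, 5)]}
V = range(1, 6)

def is_complete(S):
    # a vertex set is complete if every pair of its vertices is an edge
    return all(frozenset(p) in edges for p in combinations(S, 2))

# all complete subsets of size >= 2, then keep only the maximal ones
complete = [set(S) for r in range(2, 6) for S in combinations(V, r) if is_complete(S)]
cliques = [S for S in complete if not any(S < T for T in complete)]
```

Each clique corresponds to a maximal dense principal submatrix of A.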


  10. Chordal graph

  • an undirected graph G = (V, E) with vertex set V and edge set E ⊆ {{v, w} | v, w ∈ V}
  • a chord of a cycle is an edge between non-consecutive vertices of the cycle
  • G is chordal if every cycle of length greater than three has a chord

     [figure: two graphs on vertices a, ..., f; the first, containing a chordless
     cycle, is not chordal; the second is chordal]

  also known as a triangulated, decomposable, or rigid circuit graph, ...
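Chordality can be tested with maximum cardinality search (MCS): the reverse of an MCS visit order is a perfect elimination ordering exactly when the graph is chordal. A pure-Python sketch; the two test graphs (a chordless 4-cycle and the same cycle with a chord) are illustrative, not the exact graphs drawn on the slide:

```python
from itertools import combinations

def is_chordal(adj):
    """adj: dict mapping each vertex to the set of its neighbors."""
    # maximum cardinality search: repeatedly visit the vertex with the
    # largest number of already-visited neighbors
    weight = {v: 0 for v in adj}
    remaining = set(adj)
    visited = []
    while remaining:
        v = max(remaining, key=weight.get)
        visited.append(v)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                weight[w] += 1
    order = visited[::-1]                       # candidate elimination ordering
    pos = {v: i for i, v in enumerate(order)}
    # perfect elimination check: the later-ordered neighbors of each vertex
    # must form a clique; an MCS order passes iff the graph is chordal
    for v in order:
        later = [w for w in adj[v] if pos[w] > pos[v]]
        for x, y in combinations(later, 2):
            if y not in adj[x]:
                return False
    return True

# a chordless 4-cycle (not chordal) and the same cycle with chord {a, c}
c4 = {'a': {'b', 'd'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'a', 'c'}}
chorded = {'a': {'b', 'c', 'd'}, 'b': {'a', 'c'},
           'c': {'a', 'b', 'd'}, 'd': {'a', 'c'}}
```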

  11. History

  chordal graphs have been studied in many disciplines since the 1960s
  • combinatorial optimization (a class of perfect graphs)
  • linear algebra (sparse factorization, completion problems)
  • database theory
  • machine learning (graphical models, probabilistic networks)
  • nonlinear optimization (partial separability)

  first used in semidefinite optimization by Fujisawa, Kojima, Nakata (1997)

  12. Chordal sparsity and Cholesky factorization

  Cholesky factorization of a positive definite A ∈ S^n_E:

        P A P^T = L D L^T

  with P a permutation, L unit lower triangular, D positive diagonal

  • if E is chordal, then there exists a permutation P for which
        P^T (L + L^T) P ∈ S^n_E
    i.e., A has a 'zero fill' Cholesky factorization
  • if E is not chordal, then for every P there exist positive definite A ∈ S^n_E
    for which P^T (L + L^T) P ∉ S^n_E

  [Rose 1970]
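The dichotomy shows up already on two small patterns: a tridiagonal pattern (a path graph, hence chordal) factors with zero fill in the natural ordering, while a 4-cycle pattern (not chordal) produces a fill entry. A numpy sketch; the diagonal is shifted to make both matrices positive definite:

```python
import numpy as np

def pattern(M, tol=1e-12):
    # boolean mask of the 'nonzero' positions of M
    return np.abs(M) > tol

# tridiagonal pattern: path graph, chordal -> zero fill
T = 4 * np.eye(4) + np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
LT = np.linalg.cholesky(T)
fill_T = pattern(LT) & ~pattern(np.tril(T))   # entries of L outside the pattern of T

# 4-cycle pattern on vertices 0-1-2-3-0: not chordal -> fill entry appears
C = 4 * np.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    C[i, j] = C[j, i] = 1.0
LC = np.linalg.cholesky(C)   # eliminating vertex 0 connects 1 and 3: fill at (3, 1)
```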

  13. Examples

     [figure: simple chordal sparsity patterns; then the sparsity pattern of a
     Cholesky factor of a non-chordal pattern, showing the edges of the original
     pattern together with the fill entries introduced by the factorization; the
     union of the two is a chordal extension of the non-chordal pattern]

  14. Supernodal elimination tree (clique tree)

     [figure: supernodal elimination tree for a 17-vertex chordal graph; each tree
     vertex is a clique, e.g. {13,14,15,16,17} at the root, with cliques such as
     {14,15,17}, {16,17}, {8,9}, {5,6,7} below it]

  • the vertices of the tree are the cliques of the chordal sparsity graph
  • the top row of each block is the intersection of the clique with its parent clique
  • the bottom rows are (maximal) supernodes; the supernodes form a partition
    of {1, 2, ..., n}
  • for each vertex v, the cliques that contain v form a subtree of the elimination tree


  16. Outline

  1. Chordal graphs and sparse matrices
  2. Decomposition of sparse matrix cones
  3. Multifrontal algorithms for logarithmic barrier functions
  4. Minimum rank positive semidefinite completion

  17. Sparse matrix cones

  we define two matrix cones in S^n_E (symmetric n × n matrices with pattern E)

  • positive semidefinite matrices with sparsity pattern E:
        S^n_+ ∩ S^n_E = {X ∈ S^n_E | X ⪰ 0}
  • matrices with sparsity pattern E that have a positive semidefinite completion:
        Π_E(S^n_+) = {Π_E(X) | X ⪰ 0}
    where Π_E is the projection on S^n_E

  Properties
  • the two cones are convex
  • closed, pointed, with nonempty interior (relative to S^n_E)
  • they form a pair of dual cones (for the trace inner product)
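The duality of the two cones rests on a simple observation: for S ∈ S^n_E, the trace inner product only sees the entries of X inside the pattern, i.e. tr(S Π_E(X)) = tr(S X). A numpy sketch with Π_E implemented as entrywise masking (the pattern and data are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# symmetric 0/1 mask of a sparsity pattern E (diagonal included)
mask = np.array([[1, 1, 0],
                 [1, 1, 1],
                 [0, 1, 1]], dtype=float)

def proj_E(X):
    # Pi_E: zero out the entries outside the pattern
    return X * mask

S = proj_E(rng.standard_normal((3, 3)))       # S has sparsity pattern E
S = (S + S.T) / 2                             # symmetrize (mask is symmetric)
X = rng.standard_normal((3, 3)); X = X + X.T  # arbitrary symmetric X

lhs = np.trace(S @ proj_E(X))
rhs = np.trace(S @ X)
```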

  18. Positive semidefinite matrices with chordal sparsity pattern

  S ∈ S^n_E is positive semidefinite if and only if it can be expressed as

        S = Σ_i P_{γi}^T H_i P_{γi}   with H_i ⪰ 0

  where the sum is over the cliques γ_i (for an index set β, P_β is the 0-1 matrix
  of size |β| × n with P_β x = x_β for all x)

     [figure: S ⪰ 0 decomposed as P_{γ1}^T H_1 P_{γ1} + P_{γ2}^T H_2 P_{γ2} +
     P_{γ3}^T H_3 P_{γ3} with H_1, H_2, H_3 ⪰ 0]

  [Griewank and Toint 1984] [Agler, Helton, McCullough, Rodman 1988]
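The easy direction of the theorem can be checked on a small chordal pattern: embedding a PSD block H_i on each clique and summing yields a PSD matrix with the pattern. A numpy sketch with a tridiagonal (path-graph) pattern, cliques {0, 1} and {1, 2}, and arbitrary illustrative blocks:

```python
import numpy as np

def embed(H, clique, n):
    # P_gamma^T H P_gamma: place the dense block H on the index set `clique`
    S = np.zeros((n, n))
    S[np.ix_(clique, clique)] = H
    return S

n = 3
H1 = np.array([[2.0, 1.0], [1.0, 2.0]])    # PSD block for clique {0, 1}
H2 = np.array([[3.0, -1.0], [-1.0, 1.0]])  # PSD block for clique {1, 2}
S = embed(H1, [0, 1], n) + embed(H2, [1, 2], n)

min_eig = np.linalg.eigvalsh(S).min()      # >= 0: a sum of PSD matrices is PSD
```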

  19. Decomposition from Cholesky factorization

  • example with two cliques:

     [figure: a sparse matrix written as the sum H_1 + H_2 of two embedded
     dense blocks]

  • H_1 and H_2 follow by combining columns in the Cholesky factorization

     [figure: the columns of the Cholesky factor grouped into the two terms
     of the sum]

  • readily computed from the update matrices in a multifrontal Cholesky
    factorization

  20. PSD completable matrices with chordal sparsity

  X ∈ S^n_E has a positive semidefinite completion if and only if

        X_{γi γi} ⪰ 0   for all cliques γ_i

  this follows from duality and the clique decomposition of the positive
  semidefinite cone

  Example (three cliques γ_1, γ_2, γ_3):

     [figure: a sparse X is PSD completable if and only if the three clique
     submatrices X_{γ1 γ1}, X_{γ2 γ2}, X_{γ3 γ3} are all PSD]

  [Grone, Johnson, Sá, Wolkowicz 1984]
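A minimal instance of the theorem, on a tridiagonal pattern with cliques {0, 1} and {1, 2}: both clique submatrices are PSD, so a PSD completion exists; filling the missing entry with X01·X12/X11 (one valid choice for this particular pattern) indeed produces one. A numpy sketch with illustrative numbers:

```python
import numpy as np

X = np.array([[1.0,    0.9, np.nan],
              [0.9,    1.0, 0.9],
              [np.nan, 0.9, 1.0]])   # entry (0, 2) lies outside the pattern

# cliques of the tridiagonal pattern: {0, 1} and {1, 2}
cliques_psd = all(np.linalg.eigvalsh(X[np.ix_(c, c)]).min() >= -1e-12
                  for c in ([0, 1], [1, 2]))

# fill in the missing entry; X01 * X12 / X11 gives a PSD completion here
Xc = X.copy()
Xc[0, 2] = Xc[2, 0] = X[0, 1] * X[1, 2] / X[1, 1]
min_eig = np.linalg.eigvalsh(Xc).min()
```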

  21. Example: sparse nearest matrix problems

  • find the nearest sparse PSD-completable matrix with a given sparsity pattern:

        minimize    ||X − A||_F^2
        subject to  X ∈ Π_E(S^n_+)

  • find the nearest sparse PSD matrix with a given sparsity pattern:

        minimize    ||S + A||_F^2
        subject to  S ∈ S^n_+ ∩ S^n_E

  these two problems are duals

     [figure: decomposition of A with respect to the cone K = Π_E(S^n_+) and its
     dual K* = S^n_+ ∩ S^n_E; X⋆ is the projection of A onto K, −S⋆ lies in the
     polar cone −K* = −(S^n_+ ∩ S^n_E), and A = X⋆ − S⋆]

  22. Decomposition methods

  if E is chordal, the two problems can be written as

     Primal:  minimize    ||X − A||_F^2
              subject to  X_{γi γi} ⪰ 0 for all cliques γ_i

     Dual:    minimize    ||A + Σ_i P_{γi}^T H_i P_{γi}||_F^2
              subject to  H_i ⪰ 0 for all cliques γ_i

  Algorithms
  • Dykstra's algorithm (dual block coordinate ascent)
  • (accelerated) dual projected gradient algorithm (FISTA)
  • Douglas-Rachford splitting, ADMM

  each requires a sequence of projections on PSD cones of order |γ_i|
  (eigenvalue decomposition)
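All three algorithms share the same subroutine: the Euclidean projection of a symmetric matrix onto the PSD cone of order |γ_i|, computed by clipping negative eigenvalues to zero. A minimal numpy sketch; the 2 × 2 example matrix is arbitrary:

```python
import numpy as np

def proj_psd(A):
    # Frobenius-norm projection onto the PSD cone:
    # eigendecompose and clip the negative eigenvalues to zero
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)) @ V.T

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])   # eigenvalues 3 and -1
P = proj_psd(A)              # keeps only the eigenvalue-3 component
```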
