On Recent Improvements in the Interior-Point Optimizer in MOSEK
ISMP2015 – 14 July 2015, Pittsburgh (US)
Andrea Cassioli, PhD
andrea.cassioli@mosek.com
www.mosek.com

Overview
1. A few words about MOSEK
2. New features in the upcoming v8
3. QCQP to COP automatic conversion
4. Pitfalls in PSD detection
5. Some computational experience

A few words about MOSEK
MOSEK is one of the leading providers of high-quality optimization software worldwide.
Problem classes shown on the slide: General, LP, Convex QP, CQP, SDP, MIP.

Version 8 - work in progress
1. Improved presolve.
   • Faster.
   • Eliminator uses much less space.
   • Eliminator puts more emphasis on stability.
   • Added some conic presolve.
2. Revised scaling procedure for conic problems:
   • Emphasizes accuracy of the unscaled solution.
   • Scales semidefinite problems too.
3. Automatic dualizer for conic problems (no matrix variables).
4. Rewritten interior-point optimizer for conic problems.
   • Emphasizes numerical stability for semidefinite problems.
5. QCQPs internally reformulated to conic form.

Convex Quadratic vs. Conic Quadratic
From our practical experience, the conic model is:
• numerically more robust,
• easier to exploit duality with,
• better when quadratic constraints are present,
• better for primal infeasible problems,
• a more general framework.
However, users are still very much used to QCQP formulations. Therefore MOSEK will:
• Convert (QO) to conic form (CQO).
• Map the primal and dual solutions back.

From QCQP to CQO
The quadratic optimization model:
\[
\begin{array}{lll}
\text{minimize}   & \tfrac{1}{2} x^T Q_0 x + c^T x & \\
\text{subject to} & \tfrac{1}{2} x^T Q_i x + a_{i:} x \le b_i, & i = 1, \ldots, m.
\end{array}
\tag{QO}
\]
Assumptions:
• Symmetry: $Q_i = Q_i^T$, $i = 0, \ldots, m$.
• Convexity: $Q_i \succeq 0$, i.e. each $Q_i$ should be positive semidefinite.

The conic optimization model
\[
\begin{array}{ll}
\text{minimize}   & c^T x \\
\text{subject to} & A x = b, \\
                  & x \in K,
\end{array}
\tag{CQO}
\]
where $K = K_1 \times K_2 \times \cdots$. Each $K_k$ can have the form
• Linear: $\{ x \in R^{n_k} \mid x \ge 0 \}$.
• Quadratic: $\{ x \in R^{n_k} \mid x_1 \ge \| x_{2:n_k} \| \}$.
• Rotated quadratic: $\{ x \in R^{n_k} \mid 2 x_1 x_2 \ge \| x_{3:n_k} \|^2,\ x_1, x_2 \ge 0 \}$.
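
The cone definitions above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the tolerance `tol` are illustrative assumptions and are not part of MOSEK's API.

```python
import math

def in_quadratic_cone(x, tol=1e-12):
    """True if x_1 >= ||x_{2:n}||_2, up to the tolerance tol."""
    return x[0] + tol >= math.sqrt(sum(v * v for v in x[1:]))

def in_rotated_quadratic_cone(x, tol=1e-12):
    """True if 2 x_1 x_2 >= ||x_{3:n}||^2 and x_1, x_2 >= 0, up to tol."""
    return (x[0] >= -tol and x[1] >= -tol
            and 2.0 * x[0] * x[1] + tol >= sum(v * v for v in x[2:]))

# Illustrative points (not taken from the slides).
print(in_quadratic_cone([5.0, 3.0, 4.0]))          # boundary: 5 = ||(3, 4)||
print(in_rotated_quadratic_cone([1.0, 2.0, 2.0]))  # 2*1*2 = 4 >= 2^2
print(in_rotated_quadratic_cone([1.0, 1.0, 2.0]))  # 2*1*1 = 2 <  2^2
```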

The separable reformulation
If factors $L_i$ such that $L_i L_i^T = Q_i$ are known, then the separable equivalent is
\[
\begin{array}{lll}
\text{minimize}   & \tfrac{1}{2} f_0^T f_0 + c^T x & \\
\text{subject to} & \tfrac{1}{2} f_i^T f_i + a_{i:} x \le b_i, & i = 1, \ldots, m, \\
                  & L_i^T x - f_i = 0. &
\end{array}
\tag{SQO}
\]
• The separable problem formulation is (much) bigger.
• But the sparse representation may require much less storage if $Q_i$ is dense but low rank.
• $L_i$ does not have to be lower triangular.

Conic reformulation
From (QO) to (CQO):
\[
\begin{array}{lll}
\text{minimize}   & t_0 + c^T x & \\
\text{subject to} & t_i + a_{i:} x = b_i, & i = 1, \ldots, m, \\
                  & L_i^T x - f_i = 0, & i = 0, \ldots, m, \\
                  & z_i = 1, & \\
                  & 2 z_i t_i \ge \| f_i \|^2. &
\end{array}
\tag{CQO}
\]
• Theory:
  • Both problems can be solved with the same worst-case complexity using an interior-point method.
  • No bad duality states are introduced by the conic reformulation (Andersen, Roos, and Terlaky [1]).
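
As a small numerical illustration of the lifting used in the conic reformulation: given a factor $L$ with $Q = L L^T$, setting $f = L^T x$, $z = 1$ and $t = \tfrac{1}{2}\|f\|^2$ reproduces $\tfrac{1}{2} x^T Q x$ and lies on the boundary of the rotated quadratic cone. The sketch below uses plain Python; the helper names and the data are illustrative assumptions, not taken from the slides.

```python
def matvec_T(L, x):
    """Return L^T x for a matrix L stored as a list of rows."""
    rows, cols = len(L), len(L[0])
    return [sum(L[r][c] * x[r] for r in range(rows)) for c in range(cols)]

def quad_form(Q, x):
    """Return x^T Q x."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

def conic_lift(L, x):
    """Return (f, t, z) with f = L^T x, z = 1 and the smallest t
    satisfying the rotated cone constraint 2 z t >= ||f||^2."""
    f = matvec_T(L, x)
    z = 1.0
    t = 0.5 * sum(v * v for v in f)
    return f, t, z

# Illustrative data: Q = L L^T.
L = [[2.0, 0.0], [1.0, 1.0]]
Q = [[4.0, 2.0], [2.0, 2.0]]
x = [1.0, -3.0]

f, t, z = conic_lift(L, x)
print(f, t, 0.5 * quad_form(Q, x))  # t agrees with (1/2) x^T Q x
```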

Conic Reformulation
Converting (QO) to (CQO) is a trivial procedure once the $L_i$'s are known.
So who should do that? The user!
• The factorization may already be available.
• The user has better control over how the $Q_i$'s are factorized.
However, MOSEK v8 will make the conversion automatically.

Quadratic PSD form characterization
The following statements are equivalent:
i) $Q_i \succeq 0$.
ii) $\lambda_{\min}(Q_i) \ge 0$.
iii) $\exists L_i \mid Q_i = L_i L_i^T$.
iv) $v^T Q_i v \ge 0,\ \forall v$.
Practical observation:
• How does the modeler know (QO) is convex?
• Claim: The modeler knows $L_i$!

Automatic conversion implemented in MOSEK (I)
The purpose is to compute $L$ such that $Q = L L^T$, or in practice $Q \approx L L^T$ when rounding errors are taken into account.
Assumptions about the users:
• Users apply this to (near) positive semidefinite problems.
• Users prefer a false positive to a false negative.

How to deal with factorizations?
Motivating example:
\[
\begin{array}{ll}
\text{minimize}   & -x_1 - x_2 \\
\text{subject to} & (x_1 - x_2)^2 \le 0, \\
                  & 0 \le x_1, x_2 \le 1.
\end{array}
\]
In practice the quadratic constraint is often affected by a small error $\epsilon$, i.e.
\[
x^T \begin{pmatrix} 1 & -1 \\ -1 & 1 + \epsilon \end{pmatrix} x \le 0.
\]
Typical error sources:
• Introduced by the user.
• Finite-precision floating-point computations.

Practicalities of the conversion
Observe:
• $\epsilon < 0$: The problem is not convex.
• $\epsilon = 0$: $x_1^* = x_2^* = 1$.
• $\epsilon > 0$: $x_1^* = x_2^* = 0$.
Conclusions:
• It is hard to produce a 100% automatic, foolproof conversion.
• Conversion should be done at the modelling stage!
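
The three cases can be reproduced numerically for the perturbed matrix $\begin{pmatrix} 1 & -1 \\ -1 & 1+\epsilon \end{pmatrix}$. The sketch below, in plain Python with illustrative helper names and an arbitrarily chosen perturbation size of $10^{-6}$, prints the smallest eigenvalue and the constraint value at $x = (1, 1)$.

```python
import math

def min_eig_2x2(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b * b)

def constraint_at_ones(eps):
    """Value of x^T Q x at x = (1, 1) for Q = [[1, -1], [-1, 1 + eps]]."""
    Q = [[1.0, -1.0], [-1.0, 1.0 + eps]]
    x = [1.0, 1.0]
    return sum(x[i] * Q[i][j] * x[j] for i in range(2) for j in range(2))

for eps in (-1e-6, 0.0, 1e-6):
    # Negative eigenvalue: not convex. Positive constraint value: (1, 1) is cut off.
    print(eps, min_eig_2x2(1.0, -1.0, 1.0 + eps), constraint_at_ones(eps))
```

For $\epsilon > 0$ the matrix is positive definite, so the constraint $x^T Q x \le 0$ leaves only $x = 0$, while for $\epsilon = 0$ the whole line $x_1 = x_2$ stays feasible.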

Automatic conversion implemented in MOSEK (II)
Lemma. If $Q$ is symmetric positive semidefinite, then
\[
e_1^T Q e_1 = Q_{11} \ge 0
\quad \text{and} \quad
Q_{11} = 0 \Rightarrow Q_{1:} = Q_{:1} = 0.
\]
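
Automatic conversion implemented in MOSEK (III)
Lemma. If $Q$ is symmetric positive semidefinite and $Q_{11} > 0$, then
\[
Q = E\, Q^{1} E^T,
\qquad
Q^{1} = \begin{pmatrix} 1 & 0 \\ 0 & Q_{22} - \dfrac{Q_{21} Q_{21}^T}{Q_{11}} \end{pmatrix},
\]
where
\[
E = \begin{pmatrix} \sqrt{Q_{11}} & 0 \\ Q_{21} / \sqrt{Q_{11}} & I \end{pmatrix}.
\]
Moreover, $Q_{22} - \dfrac{Q_{21} Q_{21}^T}{Q_{11}}$ will be positive semidefinite.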
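
One elimination step of this lemma can be verified on a small matrix. The sketch below is plain Python; the function names and the 3x3 example are illustrative assumptions. It builds $E$ and the Schur complement $Q_{22} - Q_{21} Q_{21}^T / Q_{11}$, then checks that multiplying the factors back together recovers $Q$.

```python
import math

def eliminate_first(Q):
    """One elimination step for a symmetric Q with Q[0][0] > 0.
    Returns (E, S), where S is the Schur complement Q22 - Q21 Q21^T / Q11."""
    n = len(Q)
    q11 = Q[0][0]
    if q11 <= 0.0:
        raise ValueError("this step requires Q11 > 0")
    root = math.sqrt(q11)
    E = [[0.0] * n for _ in range(n)]
    E[0][0] = root
    for i in range(1, n):
        E[i][0] = Q[i][0] / root   # first column: Q21 / sqrt(Q11)
        E[i][i] = 1.0              # identity block
    S = [[Q[i][j] - Q[i][0] * Q[j][0] / q11 for j in range(1, n)]
         for i in range(1, n)]
    return E, S

def matmul(A, B):
    """Dense matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def rebuild(E, S):
    """Return E * diag(1, S) * E^T."""
    n = len(E)
    M = [[0.0] * n for _ in range(n)]
    M[0][0] = 1.0
    for i in range(1, n):
        for j in range(1, n):
            M[i][j] = S[i - 1][j - 1]
    Et = [list(col) for col in zip(*E)]
    return matmul(matmul(E, M), Et)

# Illustrative positive definite matrix (not from the slides).
Q = [[4.0, 2.0, -2.0],
     [2.0, 2.0, 0.0],
     [-2.0, 0.0, 3.0]]
E, S = eliminate_first(Q)
print(S)                   # Schur complement
print(rebuild(E, S) == Q)  # the factors reproduce Q
```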

Automatic conversion implemented in MOSEK (IV)
Hence, if $Q$ is positive definite then $Q = L L^T$ where
\[
L = E_1 E_2 \cdots E_n.
\]
Fact: $L$ will be lower triangular.
But what if $Q_{11} \approx 0$?

Automatic conversion implemented in MOSEK (V)
• If $Q_{11} \le -\varepsilon$, then $Q$ is declared NOT positive semidefinite.
• If $-\varepsilon < Q_{11} \le \varepsilon$, then
  • replace $Q_{11}$ by $\varepsilon$;
  • if the complete $Q$ is determined to be PSD, replace $L_{:1}$ by 0 in the final result.
• Default value: $\varepsilon = 10^{-10}$.
The procedure will detect that
\[
\begin{pmatrix} 0 & 1 \\ 1 & 10^{8} \end{pmatrix}
\]
is not positive semidefinite.

Automatic conversion implemented in MOSEK (VI)
Note that the procedure is applied to a scaled $Q$, i.e. $S Q S^T$ where $S = \mathrm{diag}(s)$ and all diagonal elements of $S Q S^T$ belong to $\{-1, 0, 1\}$.
This makes the use of an absolute constant sensible.
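
A diagonal scaling with this property can be sketched as follows. For a nonzero diagonal entry the choice $s_i = 1 / \sqrt{|Q_{ii}|}$ is forced by the requirement; using $s_i = 1$ for a zero diagonal entry is an assumption of this sketch, as is the function name. The slides do not describe how MOSEK computes $s$.

```python
import math

def scale_to_unit_diagonal(Q):
    """Return (s, SQS) with S = diag(s) such that the diagonal of
    S Q S^T lies in {-1, 0, 1}. Zero diagonal entries keep s_i = 1."""
    n = len(Q)
    s = [1.0 / math.sqrt(abs(Q[i][i])) if Q[i][i] != 0.0 else 1.0
         for i in range(n)]
    SQS = [[s[i] * Q[i][j] * s[j] for j in range(n)] for i in range(n)]
    return s, SQS

# Illustrative matrix with badly balanced diagonal entries.
Q = [[4.0, 2.0, 0.0],
     [2.0, 16.0, 0.0],
     [0.0, 0.0, 0.0]]
s, SQS = scale_to_unit_diagonal(Q)
print(s)
print([SQS[i][i] for i in range(3)])
```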

MOSEK results
The MOSEK procedure produces on our example:
\[
L = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}.
\]
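
The rules stated on the preceding slides (reject a pivot below $-\varepsilon$, replace a pivot in $(-\varepsilon, \varepsilon]$ by $\varepsilon$, and zero the corresponding columns of $L$ at the end) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not MOSEK's implementation: the function name is hypothetical, no scaling is applied, and dense lists of rows are used.

```python
import math

def psd_factor(Q, eps=1e-10):
    """Return L with Q ~ L L^T, or None if Q is judged not PSD.
    Pivots in (-eps, eps] are replaced by eps during elimination and
    their columns of L are set to zero in the final result."""
    n = len(Q)
    A = [row[:] for row in Q]          # working copy, overwritten by Schur complements
    L = [[0.0] * n for _ in range(n)]
    tiny = []                          # indices of pivots treated as zero
    for k in range(n):
        d = A[k][k]
        if d <= -eps:
            return None                # clearly negative pivot: not PSD
        if d <= eps:
            d = eps
            tiny.append(k)
        root = math.sqrt(d)
        L[k][k] = root
        for i in range(k + 1, n):
            L[i][k] = A[i][k] / root
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= L[i][k] * L[j][k]
    for k in tiny:
        for i in range(n):
            L[i][k] = 0.0
    return L

print(psd_factor([[1.0, -1.0], [-1.0, 1.0]]))  # the motivating example
print(psd_factor([[0.0, 1.0], [1.0, 1e8]]))    # an indefinite matrix
```

With these rules a perturbation of the (2, 2) entry far smaller than $\varepsilon$ gives the same factor as the unperturbed example, which matches the stated preference for a false positive over a false negative.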
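
An alternative procedure
• If $Q_{11} \le -\varepsilon$, then $Q$ is declared NOT positive semidefinite.
• If $-\varepsilon < Q_{11} \le \varepsilon$, then replace $Q_{11}$ by $\varepsilon$.
Take a look at the example
\[
Q = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
\quad \text{and hence} \quad
L = \begin{pmatrix} 1 & 0 \\ -1 & 10^{-10} \end{pmatrix},
\]
which most likely is not what the user intended, because $L$ is now nonsingular and the constraint $x^T L L^T x \le 0$ implies $x = 0$.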

Discussion
• The procedure can be fooled.
• Alternative approaches:
  • Revised Schnabel and Eskow approach [5].
  • Rank-revealing Cholesky [4]. (Pivoting required!)
• The alternatives are computationally more complicated or (much more) expensive.

Preliminary computational results
Setting:
• 64-bit Linux.
• 1 thread only.
• v7.1 vs. v8.
• Public and customer-provided models.
Problem classes by solution time:
  Small    ≤ 6 s
  Medium   ≤ 60 s
  Large    > 60 s
An optimizer $o$ is declared a winner if
\[
t_o \le \max(t_{\min} + 0.01,\ 1.005\, t_{\min}).
\]
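
The winner rule can be written as a short function. The sketch below is plain Python with a hypothetical function name, and the timings are made up for illustration only; they are not measurements from the talk.

```python
def winners(times):
    """Return the sorted names of all optimizers o with
    t_o <= max(t_min + 0.01, 1.005 * t_min)."""
    t_min = min(times.values())
    threshold = max(t_min + 0.01, 1.005 * t_min)
    return sorted(name for name, t in times.items() if t <= threshold)

# Tiny model: the absolute slack of 0.01 s dominates, so both count as winners.
print(winners({"v7.1": 0.105, "v8": 0.100}))
# Larger model: the relative slack of 0.5% dominates, so only the faster one wins.
print(winners({"v7.1": 10.2, "v8": 10.0}))
```

The absolute term keeps timing noise on very fast solves from deciding the winner, while the relative term does the same for long solves.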

Algorithms in MOSEK
• (QO): Solves a homogenized KKT system using a nonsymmetric primal-dual algorithm [3].
• (CQO): Symmetric primal-dual algorithm based on the Nesterov-Todd direction (Andersen, Roos, and Terlaky [2]).

Quadratic problems (linear constraints only)
              small             medium            large
              7.1      8.0      7.1      8.0      7.1      8.0
Num.          220      220      10       10       1        1
Firsts        187      158      2        8        0        1
Total time    128.41   56.20    359.13   311.56   444.28   244.01

ParamILS instances
Available at www.cs.ubc.ca/labs/beta/Projects/ParamILS/ .
              7.1       8.0
Num.          100       100
Firsts        0         100
Total time    917.955   90.179

Quadratically constrained problems
              small               medium
              7.1       8.0       7.1        8.0
Num.          239       239       8          8
Firsts        161       150       3          5
Total time    350.790   94.290    1360.417   213.454

Discussion
• The conic reformulation wins because
  • it requires fewer iterations,
  • dualization sometimes leads to huge wins,
  • it employs better linear algebra (newer code path).
However, for smallish models the nonconic formulation is better.

Summary
• MOSEK version 8 will internally solve quadratic and quadratically constrained problems in conic form.
  • This improves robustness
  • and, on average, solution speed.
• Checking positive semidefiniteness is tricky.
• It is recommended to formulate the problem in conic form
  • or as a separable problem.

Thank you!
Andrea Cassioli, PhD
andrea.cassioli@mosek.com
www.mosek.com

References
[1] E. D. Andersen, C. Roos, and T. Terlaky. Notes on duality in second order and p-order cone optimization. Optimization, 51(4):627–643, 2002.
[2] E. D. Andersen, C. Roos, and T. Terlaky. On implementing a primal-dual interior-point method for conic quadratic optimization. Math. Programming, 95(2), February 2003.
[3] E. D. Andersen and Y. Ye. On a homogeneous algorithm for the monotone complementarity problem. Math. Programming, 84(2):375–399, February 1999.
[4] M. Gu and L. Miranian. Strong rank revealing Cholesky factorization. Electronic Transactions on Numerical Analysis, 17:76–92, 2004.
[5] R. B. Schnabel and E. Eskow. A revised modified Cholesky factorization algorithm. SIAM J. on Optim., 9(4):1135–1148, 1999.