A Rank-Constrained Optimization Approach: Application to Factor Analysis
Ramón A. Delgado, Juan C. Agüero, Graham C. Goodwin
School of Electrical Engineering and Computer Science, The University of Newcastle, Australia
1 Handling the Rank Constraints
2 Some Existing Representations
3 A new representation of rank constraints
4 Rank-Constrained Optimisation
5 Application to Factor Analysis
6 Conclusions
Delgado, Agüero, Goodwin (UoN), IFAC 2014, Cape Town
Motivation
Why include rank constraints?
◮ Complexity of a model (e.g. the rank of a Hankel matrix).
◮ Low-rank decomposition (Principal Component Analysis).
◮ Recent interest in sparse representations.
Some difficulties with rank constraints
\[ \operatorname{rank}\{X\} \le r \]
◮ Non-differentiable.
◮ Nonlinear.
◮ Combinatorial nature in optimization.
Goal: find a differentiable representation of the rank constraint and, if possible, reduce the number of nonlinearities.
Some existing representations: by construction
\[ \operatorname{rank}\{X\} \le r \iff X = A \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} B, \]
\[ \operatorname{rank}\{X\} \le r \iff X = UV, \]
where $A \in \mathbb{R}^{m \times m}$, $B \in \mathbb{R}^{n \times n}$, $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}$.
Disadvantage: these representations already assign a structure to $X$.
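A minimal numerical sketch of the two factorisations above, assuming NumPy and randomly drawn factors (the sizes m, n, r and the seed are illustrative choices, not values from the talk). It only checks that both constructions produce a matrix of rank at most r.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2  # illustrative sizes (assumption)

# X = U V with U (m x r) and V (r x n): rank(X) <= r by construction.
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))
X_uv = U @ V

# X = A [[I_r, 0], [0, 0]] B with square A (m x m) and B (n x n).
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
E = np.zeros((m, n))
E[:r, :r] = np.eye(r)
X_ab = A @ E @ B

rank_uv = np.linalg.matrix_rank(X_uv)
rank_ab = np.linalg.matrix_rank(X_ab)
print(rank_uv, rank_ab)
```

Both constructions fix how X is parameterised, which is the disadvantage noted on the slide: a structural requirement on X itself (e.g. Hankel) is hard to combine with them.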
Some existing representations: using the characteristic polynomial
Let $c_i(X)$, $i = 1, \dots, n$, be the coefficients of the characteristic polynomial of $X$.
◮ [Helmersson 2009] For $X \in \mathbb{S}^n_+$:
\[ \operatorname{rank}\{X\} \le r \iff c_{n-r-1}(-X) = 0. \]
◮ [d’Aspremont 2003] For $X \in \mathbb{S}^n_+$:
\[ \operatorname{rank}\{X\} = \min_{v \in \mathbb{R}^n} \sum_{i=1}^{n} v_i \quad \text{s.t.} \quad c_i(X)(1 - v_i) = 0, \; v_i \ge 0, \; i = 1, \dots, n. \]
Disadvantages:
◮ Only valid for $X \in \mathbb{S}^n_+$.
◮ $c_i(X)$ is, in general, nonlinear.
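The link between rank and the characteristic polynomial can be checked numerically. The sketch below assumes NumPy and a randomly generated positive semidefinite matrix; it uses the standard polynomial det(λI − X) with coefficients ordered from the highest power down, which is not necessarily the indexing or sign convention behind $c_i(\cdot)$ on the slide. For a PSD matrix of rank r, the polynomial has λ = 0 as a root of multiplicity n − r, so its n − r lowest-order coefficients vanish.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 5, 2  # illustrative sizes (assumption)

# Random PSD matrix of rank r (with probability one): X = B B^T, B is n x r.
B = rng.standard_normal((n, r))
X = B @ B.T

# Coefficients of det(lambda*I - X), highest power first, built from the
# eigenvalues of the symmetric matrix X.
eigenvalues = np.linalg.eigvalsh(X)
coeffs = np.poly(eigenvalues)

# coeffs[k] multiplies lambda^(n-k). The last n - r entries (powers
# lambda^(n-r-1), ..., lambda^0) vanish up to rounding, while coeffs[r]
# (power lambda^(n-r)) is close to the product of the r nonzero eigenvalues.
trailing = coeffs[r + 1:]
print(np.max(np.abs(trailing)), abs(coeffs[r]))
```

The coefficients are polynomials of degree up to n in the entries of X, which illustrates why this family of representations is nonlinear in general.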
Some existing representations used in optimization (closely related to our representation)
◮ I. Markovsky: for $X \in \mathbb{R}^{m \times n}$,
\[ \operatorname{rank}\{X\} \le r \iff \exists\, R \in \mathbb{R}^{(m-r) \times m} \text{ such that } RX = 0_{(m-r) \times n} \text{ and } R \text{ is full row rank}. \]
  Valid for general $X \in \mathbb{R}^{m \times n}$, but a rank constraint is now imposed on the auxiliary matrix $R$.
◮ J. Dattorro: for $X \in \mathbb{S}^n_+$,
\[ \operatorname{rank}\{X\} \le r \iff \exists\, W \in \Phi_{n,r} \text{ such that } \operatorname{trace}(WX) = 0, \]
  where $\Phi_{n,r} = \{ W \in \mathbb{S}^n : 0 \preceq W \preceq I, \; \operatorname{trace}(W) = n - r \}$.
  Only valid for $X \in \mathbb{S}^n_+$.
Main result (submitted for publication)
Theorem. Let $X \in \mathbb{R}^{m \times n}$. Then
\[ \operatorname{rank}\{X\} \le r \iff \exists\, W \in \Phi_{n,r} \text{ such that } XW = 0_{m \times n}, \]
where $\Phi_{n,r} = \{ W \in \mathbb{S}^n : 0 \preceq W \preceq I, \; \operatorname{trace}(W) = n - r \}$.
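One direction of the theorem can be illustrated constructively, reading the condition as $XW = 0_{m \times n}$ with $W \in \mathbb{S}^n$. The sketch below assumes NumPy; the sizes, the seed and the choice of W as an orthogonal projector are illustrative assumptions, not the authors' algorithm. If rank{X} ≤ r, the null space of X has dimension at least n − r, and the projector onto any (n − r)-dimensional subspace of it belongs to $\Phi_{n,r}$ and satisfies XW = 0. Conversely, any $W \in \Phi_{n,r}$ has eigenvalues in [0, 1] summing to n − r, hence rank at least n − r, so XW = 0 forces rank{X} ≤ r.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 4, 6, 2  # illustrative sizes (assumption)

# A random m x n matrix of rank at most r.
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Full SVD: the last n - r right singular vectors lie in the null space of X.
_, _, Vt = np.linalg.svd(X, full_matrices=True)
N = Vt[r:].T      # n x (n - r), orthonormal columns
W = N @ N.T       # orthogonal projector onto span(N)

# Membership in Phi_{n,r}: symmetric, eigenvalues in [0, 1], trace n - r.
eigs = np.linalg.eigvalsh(W)
in_phi = bool(
    np.allclose(W, W.T)
    and eigs.min() > -1e-10
    and eigs.max() < 1 + 1e-10
    and np.isclose(np.trace(W), n - r)
)

residual = np.linalg.norm(X @ W)  # numerically zero
print(in_phi, residual)
```

Because W is an auxiliary variable, X itself stays free, so a structure such as Hankel can still be imposed on it.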
Advantages
◮ Differentiable.
◮ Freedom to impose a desired structure on $X$ (e.g. Hankel).
◮ Generalisation of Dattorro’s result.
◮ Avoids some difficulties in Markovsky’s and d’Aspremont’s results.
Disadvantage
◮ We still have a bilinear condition $XW = 0_{m \times n}$.
Rank-Constrained Optimisation
An equivalent representation for rank-constrained optimization problems:
\[ P_{rco}: \quad \min_{\theta \in \mathbb{R}^p} f(\theta) \quad \text{s.t.} \quad \theta \in \Omega, \; \operatorname{rank}\{X(\theta)\} \le r \]
is equivalent to
\[ P_{bi}: \quad \min_{\theta \in \mathbb{R}^p} \min_{W \in \mathbb{S}^n} f(\theta) \quad \text{s.t.} \quad \theta \in \Omega, \; X(\theta) W = 0_{m \times n}, \; W \in \Phi_{n,r}. \]
Rank-Constrained Optimisation
◮ We have developed a local optimisation method.
◮ For the case $X(\theta) \in \mathbb{S}^n_+$, we have also developed a global optimisation method.
The bilinear constraint can be imposed in several ways:
◮ For $X(\theta) \in \mathbb{R}^{m \times n}$: $\| X(\theta) W \| = 0 \iff X(\theta) W = 0_{m \times n}$.
◮ For $X(\theta) \in \mathbb{S}^n_+$: $\operatorname{trace}(X(\theta) W) = 0 \iff X(\theta) W = 0_{n \times n}$.
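A small hand-picked example, assuming NumPy, of why the scalar trace condition is reserved for the positive semidefinite case. For PSD X and W, trace(XW) equals the squared Frobenius norm of $W^{1/2} X^{1/2}$, so it is nonnegative and vanishes only when XW = 0. For an indefinite X the trace can vanish through cancellation. The 2 × 2 matrices below are illustrative assumptions.

```python
import numpy as np

# W = diag(1/2, 1/2) lies in Phi_{2,1}: symmetric, 0 <= W <= I, trace(W) = 1.
W_half = np.diag([0.5, 0.5])

# Symmetric but indefinite X: the trace vanishes while the product does not.
X_indef = np.diag([1.0, -1.0])
trace_indef = np.trace(X_indef @ W_half)         # 0.5 - 0.5 = 0.0
norm_indef = np.linalg.norm(X_indef @ W_half)    # sqrt(0.5), not zero

# PSD X with the same W: the trace is strictly positive, so XW != 0 is detected.
X_psd = np.diag([1.0, 0.0])
trace_psd_half = np.trace(X_psd @ W_half)        # 0.5

# PSD X with W the projector onto its null space: trace and product both vanish.
W_proj = np.diag([0.0, 1.0])
trace_psd_proj = np.trace(X_psd @ W_proj)        # 0.0
norm_psd_proj = np.linalg.norm(X_psd @ W_proj)   # 0.0

print(trace_indef, norm_indef, trace_psd_half, trace_psd_proj, norm_psd_proj)
```

For a general rectangular X, the norm condition replaces the trace because no sign information is available.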
Factor Analysis
Consider a measured output $y_k \in \mathbb{R}^N$, factors $f_k \in \mathbb{R}^r$, idiosyncratic noise $v_k \in \mathbb{R}^N$, and the model
\[ y_k = A f_k + v_k, \tag{1} \]
where $A \in \mathbb{R}^{N \times r}$ is a tall matrix, and
\[ f_k \sim \mathcal{N}(0, \Phi), \tag{2} \]
\[ v_k \sim \mathcal{N}(0, \Psi). \tag{3} \]
Then $y_k \sim \mathcal{N}(0, \Sigma)$, where $\Sigma$ is given by
\[ \Sigma = A \Phi A^\top + \Psi. \]
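A simulation sketch of the factor model, assuming NumPy, independent f_k and v_k, an identity factor covariance and a diagonal noise covariance (all illustrative assumptions; only N = 20, r = 3 and T = 100 echo the sizes used later in the talk). It shows that Σ − Ψ = AΦAᵀ has rank at most r, which is the quantity the rank constraint acts on.

```python
import numpy as np

rng = np.random.default_rng(2)
N, r, T = 20, 3, 100

A = rng.standard_normal((N, r))            # tall loading matrix (assumption)
Phi = np.eye(r)                            # factor covariance (assumption)
Psi = np.diag(rng.uniform(0.5, 1.5, N))    # noise covariance (assumption)

# Population covariance of y_k and its low-rank part.
Sigma = A @ Phi @ A.T + Psi
low_rank_part = Sigma - Psi
rank_low = np.linalg.matrix_rank(low_rank_part, tol=1e-8)

# Draw T samples of y_k = A f_k + v_k, one sample per row.
F = rng.multivariate_normal(np.zeros(r), Phi, size=T)
V = rng.multivariate_normal(np.zeros(N), Psi, size=T)
Y = F @ A.T + V

# Sample covariance (zero-mean model), an estimate of Sigma.
S = Y.T @ Y / T
print(rank_low, S.shape)
```

In practice only the sample covariance is available, so the split into a low-rank part and a noise part has to be estimated.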
Sparse Noise Covariance
Existing approaches require $\Psi$ to be diagonal:
◮ $\Psi = \sigma^2 I$: there is a closed-form solution, e.g. PCA.
◮ $\Psi$ diagonal: e.g. Maximum Likelihood (EM algorithm).
Motivated by the advances in sparse representations, we propose to assume instead that $\Psi$ is sparse:
\[ P_{rcofa}: \quad \min_{\Psi \in \mathbb{S}^N} \| \Psi \|_1 \quad \text{subject to} \quad \operatorname{rank}\{\Sigma - \Psi\} \le r, \; \Psi \succeq 0, \; \Sigma - \Psi \succeq 0. \]
Numerical Example: local optimization method
Consider $r = 3$ factors, $N = 20$ measured outputs and $T = 100$ samples, with $\Psi_{ij} = (0.7)^{|i-j|}$, and the performance index
\[ d(P_m) = 1 - \frac{\operatorname{trace}(A P_m A^\top)}{\operatorname{trace}(A A^\top)}. \]
Method   d(·)     Total time [s]
PCA      0.1992   0.0136
RCO      0.1002   274.6732
EM       0.1330   0.0198
Table: Mean value of d(·) over $N_{mc} = 100$ Monte Carlo simulations.
Conclusions
We have developed a new representation of rank constraints:
◮ Second-order differentiable.
◮ Avoids several nonlinearities, except for a bilinear constraint.
We have developed two optimization algorithms:
◮ Local optimisation.
◮ Global optimisation.
We have applied the method to:
◮ Factor Analysis with correlated errors.
◮ Sparse control (tomorrow, FrA04.2).
Thanks for your attention! Any questions?
Global Optimization Example
Consider $r = 1$ factor, $N = 3$ measured outputs and $T = 100$ samples.
Objective $\| \hat{\Psi} \|_1$:
Method   d(·)    $\| \hat{\Psi} \|_1$   time [s]
RCO      0.066   4.46                   23.83
RCO-G    0.066   4.46                   310.19
Objective $\| \operatorname{vec}(\hat{\Psi}) \|_1$:
Method   d(·)    $\| \operatorname{vec}(\hat{\Psi}) \|_1$   time [s]
RCO      0.165   11.12                                      43.20
RCO-G    0.218   9.17                                       1768.31
Solving the Optimisation Problem
Proposed method: solve $P_{bi}$ iteratively. Each iteration solves a feasibility problem that deals with the bilinear constraint. Given the estimate $\theta_m \in \Omega$ at iteration $m$, solve
\[ \min_{\theta \in \mathbb{R}^p} \min_{W \in \mathbb{S}^n} \| X(\theta) W \| \quad \text{subject to} \quad f(\theta) \le f(\theta_m) - \eta_m, \; \theta \in \Omega, \; W \in \Phi_{n,r}. \]
Branch and Bound
[Figure: branch-and-bound illustration with labels "1st", "2nd" and "Global Optimum".]
$\ell_0$-norm versus $\ell_1$-norm
$\ell_0$-norm                                            | $\ell_1$-norm
Computationally more expensive.                          | Computationally efficient.
$\ell_0$-norm constraints have a clear interpretation.   | Unintuitive to choose the sparse parameters.
Can handle group constraints.                            | Poor handling of group constraints.
$\ell_0$-norm equivalence:
\[ P_{\ell_0}: \quad \min_{\theta \in \mathbb{R}^p} f(\theta) \quad \text{s.t.} \quad \theta \in \Omega, \; \| \theta \|_0 \le r \]
is equivalent to
\[ P_{eq}: \quad \min_{\theta \in \mathbb{R}^p} \min_{w \in \mathbb{R}^p} f(\theta) \quad \text{s.t.} \quad \theta \in \Omega, \; \theta \circ w = 0, \; 0 \le w \le 1, \; \mathbf{1}^\top w = p - r. \]
References: Delgado, Agüero & Goodwin 2014 (IFAC 2014). Delgado, Agüero & Goodwin, submitted to Automatica.
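The ℓ0 equivalence can be checked on a small vector. The sketch below assumes NumPy; the helper name `complementary_weights`, the tolerance and the test vector are hypothetical choices made for illustration. Given θ with at most r nonzero entries, it builds a weight vector w that is feasible for the relaxed problem: w is 1 on p − r of the zero entries of θ and 0 elsewhere. In the other direction, any w with entries in [0, 1] summing to p − r has at least p − r nonzero entries, so θ ∘ w = 0 forces at least p − r entries of θ to vanish.

```python
import numpy as np


def complementary_weights(theta, r, tol=1e-12):
    """Return w with theta * w = 0, 0 <= w <= 1 and sum(w) = p - r.

    Hypothetical helper: raises ValueError if theta has more than r
    entries whose magnitude exceeds tol.
    """
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    zero_idx = np.flatnonzero(np.abs(theta) <= tol)
    if zero_idx.size < p - r:
        raise ValueError("theta has more than r nonzero entries")
    w = np.zeros(p)
    w[zero_idx[: p - r]] = 1.0  # put weight one on p - r zero entries
    return w


theta = np.array([0.0, 2.5, 0.0, -1.0, 0.0])  # two nonzero entries
r = 3
w = complementary_weights(theta, r)
print(w)  # [1. 0. 1. 0. 0.]
```

The rank result plays the same role for matrices: w acts on the entries of θ the way W acts on the singular directions of X.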
Local Optimization
[Figure: $\Psi$ and its estimates; panels (d) True, (e) PCA, (f) EM, (g) RCO.]
Concave Minimisation
Concave minimisation may lead to a combinatorial problem. For example, the matrix-rank function is quasi-concave on the positive semidefinite cone, which leads to concave minimisation and “reverse-convex” constraints.