
SI231 Matrix Computations Lecture 6: Positive Semidefinite Matrices



  1. SI231 Matrix Computations Lecture 6: Positive Semidefinite Matrices Ziping Zhao Fall Term 2020–2021 School of Information Science and Technology ShanghaiTech University, Shanghai, China

  2. Lecture 6: Positive Semidefinite Matrices
     • positive semidefinite matrices
     • application: subspace method for super-resolution spectral analysis
     • application: Euclidean distance matrices
     • matrix inequalities

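  3. Highlights
     • a matrix $\mathbf{A} \in \mathbb{S}^n$ is said to be positive semidefinite (PSD) if $\mathbf{x}^T \mathbf{A} \mathbf{x} \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$, and positive definite (PD) if $\mathbf{x}^T \mathbf{A} \mathbf{x} > 0$ for all $\mathbf{x} \in \mathbb{R}^n$ with $\mathbf{x} \neq \mathbf{0}$
     • a matrix $\mathbf{A} \in \mathbb{S}^n$ is PSD (resp. PD)
       – if and only if its eigenvalues are all non-negative (resp. positive);
       – if and only if it can be factored as $\mathbf{A} = \mathbf{B}^T \mathbf{B}$ for some $\mathbf{B} \in \mathbb{R}^{m \times n}$ (resp. for some $\mathbf{B}$ with full column rank)
     • in this lecture we deal with real symmetric matrices; the Hermitian case follows along the same lines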

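  4. Quadratic Form
     Let $\mathbf{A} \in \mathbb{S}^n$. For $\mathbf{x} \in \mathbb{R}^n$, the matrix product $\mathbf{x}^T \mathbf{A} \mathbf{x}$ is called a quadratic form.
     • some basic facts (try to verify):
       – for $\mathbf{A} \in \mathbb{S}^n$: $\mathbf{x}^T \mathbf{A} \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j a_{ij} = \sum_{i=1}^{n} a_{ii} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 a_{ij} x_i x_j$
       – for general $\mathbf{A} \in \mathbb{R}^{n \times n}$: $\mathbf{x}^T \mathbf{A} \mathbf{x} = \sum_{i=1}^{n} a_{ii} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} (a_{ij} + a_{ji}) x_i x_j$, so there may exist $\mathbf{A}_1 \neq \mathbf{A}_2$ such that $\mathbf{x}^T \mathbf{A}_1 \mathbf{x} = \mathbf{x}^T \mathbf{A}_2 \mathbf{x}$ for all $\mathbf{x}$
         ∗ it suffices to consider the unique symmetric matrix $\tfrac{1}{2}(\mathbf{A} + \mathbf{A}^T)$ in place of a general $\mathbf{A} \in \mathbb{R}^{n \times n}$, since $\mathbf{x}^T \mathbf{A} \mathbf{x} = \mathbf{x}^T \left[ \tfrac{1}{2} (\mathbf{A} + \mathbf{A}^T) \right] \mathbf{x}$
       – complex case:
         ∗ the quadratic form is defined as $\mathbf{x}^H \mathbf{A} \mathbf{x}$, where $\mathbf{x} \in \mathbb{C}^n$
         ∗ for $\mathbf{A} \in \mathbb{H}^n$, $\mathbf{x}^H \mathbf{A} \mathbf{x}$ is real for any $\mathbf{x} \in \mathbb{C}^n$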
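
The basic facts on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the random matrix, the seed, and the variable names are illustrative choices, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # a general (non-symmetric) real matrix
A_sym = 0.5 * (A + A.T)           # its symmetric part
x = rng.standard_normal(n)

# Quadratic form computed three ways.
q_general = x @ A @ x
q_sym = x @ A_sym @ x
q_sum = sum(A[i, j] * x[i] * x[j] for i in range(n) for j in range(n))

# Diagonal terms plus paired off-diagonal terms (a_ij + a_ji) x_i x_j.
q_split = sum(A[i, i] * x[i] ** 2 for i in range(n)) + sum(
    (A[i, j] + A[j, i]) * x[i] * x[j]
    for i in range(n - 1)
    for j in range(i + 1, n)
)

print(np.isclose(q_general, q_sym), np.isclose(q_general, q_sum),
      np.isclose(q_general, q_split))
```

All four values agree up to rounding, which is why only symmetric matrices need to be considered when studying quadratic forms.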

  5. Positive Semidefinite Matrices
     A matrix $\mathbf{A} \in \mathbb{S}^n$ is said to be
     • positive semidefinite (PSD) if $\mathbf{x}^T \mathbf{A} \mathbf{x} \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$
     • positive definite (PD) if $\mathbf{x}^T \mathbf{A} \mathbf{x} > 0$ for all $\mathbf{x} \in \mathbb{R}^n$ with $\mathbf{x} \neq \mathbf{0}$
     • indefinite if neither $\mathbf{A}$ nor $-\mathbf{A}$ is PSD
     Notation:
     • $\mathbf{A} \succeq 0$ means that $\mathbf{A}$ is PSD
     • $\mathbf{A} \succ 0$ means that $\mathbf{A}$ is PD
     • $\mathbf{A} \nsucceq 0$ means that $\mathbf{A}$ is indefinite
     • if $\mathbf{A}$ is PD, then it is also PSD
     • the concepts negative semidefinite and negative definite may be defined by reversing the inequalities or, equivalently, by saying that $-\mathbf{A}$ is PSD or PD, respectively

  6. Example: Covariance Matrices
     • let $\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_{T-1} \in \mathbb{R}^n$ be a sequence of multi-dimensional data samples
       – examples: patches in image processing, multi-channel signals in signal processing, history of returns of assets in finance [Brodie-Daubechies-et al.'09], ...
     • sample mean: $\hat{\boldsymbol{\mu}}_y = \frac{1}{T} \sum_{t=0}^{T-1} \mathbf{y}_t$
     • sample covariance: $\hat{\mathbf{C}}_y = \frac{1}{T} \sum_{t=0}^{T-1} (\mathbf{y}_t - \hat{\boldsymbol{\mu}}_y)(\mathbf{y}_t - \hat{\boldsymbol{\mu}}_y)^T$
     • a sample covariance is PSD: $\mathbf{x}^T \hat{\mathbf{C}}_y \mathbf{x} = \frac{1}{T} \sum_{t=0}^{T-1} |(\mathbf{y}_t - \hat{\boldsymbol{\mu}}_y)^T \mathbf{x}|^2 \ge 0$
     • the (statistical) covariance of $\mathbf{y}_t$ is also PSD
       – to put into context, assume that $\mathbf{y}_t$ is a wide-sense stationary random process
       – the covariance, defined as $\mathbf{C}_y = \mathrm{E}[(\mathbf{y}_t - \boldsymbol{\mu}_y)(\mathbf{y}_t - \boldsymbol{\mu}_y)^T]$ where $\boldsymbol{\mu}_y = \mathrm{E}[\mathbf{y}_t]$, can be shown to be PSD
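
As an illustration of the sample-covariance argument, here is a small sketch assuming NumPy and synthetic Gaussian data. The helper name `sample_covariance` is hypothetical; it uses the 1/T normalization from the slide.

```python
import numpy as np

def sample_covariance(Y):
    """Sample covariance with 1/T normalization; Y has shape (T, n), one sample per row."""
    mu = Y.mean(axis=0)          # sample mean
    D = Y - mu                   # centered samples
    return D.T @ D / Y.shape[0]

rng = np.random.default_rng(1)
Y = rng.standard_normal((50, 3))   # T = 50 synthetic samples in R^3
C_hat = sample_covariance(Y)

# x^T C_hat x equals the average of squared projections, hence is non-negative.
x = rng.standard_normal(3)
quad = x @ C_hat @ x
avg_sq_proj = np.mean(((Y - Y.mean(axis=0)) @ x) ** 2)

min_eig = np.linalg.eigvalsh(C_hat).min()
print(np.isclose(quad, avg_sq_proj), min_eig >= -1e-12)
```

The same matrix is returned by `np.cov(Y, rowvar=False, bias=True)`, which also normalizes by T.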

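  7. Example: Hessian
     • let $f : \mathbb{R}^n \to \mathbb{R}$ be a twice differentiable function
     • the Hessian of $f$, denoted by $\nabla^2 f(\mathbf{x}) \in \mathbb{S}^n$, is a matrix whose $(i, j)$th entry is given by $[\nabla^2 f(\mathbf{x})]_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j}$
     • Fact: $f$ is convex if and only if $\nabla^2 f(\mathbf{x}) \succeq 0$ for all $\mathbf{x}$ in the problem domain
     • example: consider the quadratic function $f(\mathbf{x}) = \tfrac{1}{2} \mathbf{x}^T \mathbf{R} \mathbf{x} + \mathbf{q}^T \mathbf{x} + c$ with $\mathbf{R} \in \mathbb{S}^n$. It can be verified that $\nabla^2 f(\mathbf{x}) = \mathbf{R}$. Thus, $f$ is convex if and only if $\mathbf{R} \succeq 0$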
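
The claim that the Hessian of the quadratic function equals R can be verified entrywise. The derivation below is a sketch that assumes R is symmetric; the entries $r_{ij}$ and $q_i$ of R and q are notation introduced here for the calculation. Without symmetry, the last line would give $\tfrac{1}{2}(\mathbf{R} + \mathbf{R}^T)$ instead.

```latex
% Assumption: R is symmetric, i.e. r_{kl} = r_{lk}.
\begin{align*}
f(\mathbf{x}) &= \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} r_{ij} x_i x_j
                 + \sum_{i=1}^{n} q_i x_i + c, \\
\frac{\partial f}{\partial x_k} &= \tfrac{1}{2} \sum_{j=1}^{n} (r_{kj} + r_{jk}) x_j + q_k
                 = \sum_{j=1}^{n} r_{kj} x_j + q_k, \\
\frac{\partial^2 f}{\partial x_k \partial x_l} &= \tfrac{1}{2} (r_{kl} + r_{lk}) = r_{kl}.
\end{align*}
```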

  8. Illustration of Quadratic Functions
     [Figure: surface plots of $f(\mathbf{x})$ over $(x_1, x_2)$ for (a) PSD $\mathbf{A}$ and (b) indefinite $\mathbf{A}$.]

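  9. PSD Matrix Inequalities
     • the notion of PSD matrices can be used to define inequalities for matrices
     • PSD matrix inequalities are frequently used in topics like semidefinite programming
     • definition:
       – $\mathbf{A} \succeq \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is PSD
       – $\mathbf{A} \succ \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is PD
       – $\mathbf{A} \nsucceq \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is indefinite
     • results that immediately follow from the definition: let $\mathbf{A}, \mathbf{B}, \mathbf{C} \in \mathbb{S}^n$.
       – $\mathbf{A} \succeq 0$, $\alpha \ge 0$ (resp. $\mathbf{A} \succ 0$, $\alpha > 0$) $\Longrightarrow$ $\alpha \mathbf{A} \succeq 0$ (resp. $\alpha \mathbf{A} \succ 0$)
       – $\mathbf{A}, \mathbf{B} \succeq 0$ (resp. $\mathbf{A} \succeq 0$, $\mathbf{B} \succ 0$) $\Longrightarrow$ $\mathbf{A} + \mathbf{B} \succeq 0$ (resp. $\mathbf{A} + \mathbf{B} \succ 0$)
       – $\mathbf{A} \succeq \mathbf{B}$, $\mathbf{B} \succeq \mathbf{C}$ (resp. $\mathbf{A} \succeq \mathbf{B}$, $\mathbf{B} \succ \mathbf{C}$) $\Longrightarrow$ $\mathbf{A} \succeq \mathbf{C}$ (resp. $\mathbf{A} \succ \mathbf{C}$)
       – $\mathbf{A} \nsucceq \mathbf{B}$ does not imply $\mathbf{B} \succeq \mathbf{A}$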
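
The last point says the PSD order is only a partial order: two matrices may be incomparable. A minimal sketch, assuming NumPy; the helper names `is_psd` and `succeq` and the tolerance are illustrative, not from the lecture.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def succeq(A, B, tol=1e-10):
    """A >= B in the PSD order, i.e. A - B is PSD."""
    return is_psd(A - B, tol)

A = np.diag([1.0, 0.0])
B = np.diag([0.0, 1.0])

# A - B = diag(1, -1) is indefinite, so neither A >= B nor B >= A holds.
print(succeq(A, B), succeq(B, A))   # False False

# Adding a PSD matrix moves up in the order: A + B >= A.
print(succeq(A + B, A))             # True
```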

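  10. PSD Matrix Inequalities
     • more results: let $\mathbf{A}, \mathbf{B} \in \mathbb{S}^n$.
       – $\mathbf{A} \succeq \mathbf{B}$ $\Longrightarrow$ $\lambda_k(\mathbf{A}) \ge \lambda_k(\mathbf{B})$ for all $k$; the converse is not always true
       – $\mathbf{A} \succeq \mathbf{I}$ (resp. $\mathbf{A} \succ \mathbf{I}$) $\iff$ $\lambda_k(\mathbf{A}) \ge 1$ for all $k$ (resp. $\lambda_k(\mathbf{A}) > 1$ for all $k$)
       – $\mathbf{I} \succeq \mathbf{A}$ (resp. $\mathbf{I} \succ \mathbf{A}$) $\iff$ $\lambda_k(\mathbf{A}) \le 1$ for all $k$ (resp. $\lambda_k(\mathbf{A}) < 1$ for all $k$)
       – if $\mathbf{A}, \mathbf{B} \succ 0$, then $\mathbf{A} \succeq \mathbf{B}$ $\iff$ $\mathbf{B}^{-1} \succeq \mathbf{A}^{-1}$
     • some results as consequences of the above results:
       – for $\mathbf{A} \succeq \mathbf{B} \succeq 0$, $\det(\mathbf{A}) \ge \det(\mathbf{B})$
       – for $\mathbf{A} \succeq \mathbf{B}$, $\mathrm{tr}(\mathbf{A}) \ge \mathrm{tr}(\mathbf{B})$
       – for $\mathbf{A} \succeq \mathbf{B} \succ 0$, $\mathrm{tr}(\mathbf{A}^{-1}) \le \mathrm{tr}(\mathbf{B}^{-1})$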
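
Two of these results are easy to see on small examples. The sketch below assumes NumPy; the matrices are hand-picked illustrations and `is_psd` is a hypothetical helper. Part (1) shows that eigenvalue dominance does not imply the matrix inequality; part (2) checks the inverse, determinant, and trace consequences for one pair with A ⪰ B ≻ 0.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

# (1) Sorted eigenvalues of A1 dominate those of B1, yet A1 - B1 is not PSD.
A1 = np.diag([3.0, 1.0])
B1 = np.diag([0.5, 2.0])
eig_A1 = np.sort(np.linalg.eigvalsh(A1))[::-1]   # descending order
eig_B1 = np.sort(np.linalg.eigvalsh(B1))[::-1]
eigs_dominate = bool(np.all(eig_A1 >= eig_B1))
a1_geq_b1 = is_psd(A1 - B1)                      # A1 - B1 = diag(2.5, -1)

# (2) A pair with A2 >= B2 > 0: the order reverses under inversion.
A2 = np.array([[2.0, 0.5], [0.5, 1.5]])
B2 = np.diag([1.0, 0.5])
a2_geq_b2 = is_psd(A2 - B2)
inv_reversed = is_psd(np.linalg.inv(B2) - np.linalg.inv(A2))
det_ok = np.linalg.det(A2) >= np.linalg.det(B2)
tr_ok = np.trace(A2) >= np.trace(B2)
tr_inv_ok = np.trace(np.linalg.inv(A2)) <= np.trace(np.linalg.inv(B2))

print(eigs_dominate, a1_geq_b1)
print(a2_geq_b2, inv_reversed, det_ok, tr_ok, tr_inv_ok)
```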

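  11. PSD Matrix Inequalities
     • the Schur complement: let
       $$\mathbf{X} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^T & \mathbf{C} \end{bmatrix},$$
       where $\mathbf{A} \in \mathbb{S}^m$, $\mathbf{B} \in \mathbb{R}^{m \times n}$, $\mathbf{C} \in \mathbb{S}^n$ with $\mathbf{C} \succ 0$. Let $\mathbf{S} = \mathbf{A} - \mathbf{B} \mathbf{C}^{-1} \mathbf{B}^T$, which is called the Schur complement of $\mathbf{C}$.
     • we have $\mathbf{X} \succeq 0$ (resp. $\mathbf{X} \succ 0$) $\iff$ $\mathbf{S} \succeq 0$ (resp. $\mathbf{S} \succ 0$)
       – example: let $\mathbf{C}$ be PD. By the Schur complement, $\mathbf{C} - \mathbf{b}\mathbf{b}^T \succeq 0 \iff 1 - \mathbf{b}^T \mathbf{C}^{-1} \mathbf{b} \ge 0$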
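
The Schur complement test can be tried on small blocks. This is a sketch assuming NumPy; the block values, the vectors `b_in` and `b_out`, and the helper names are illustrative choices made here.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def schur_complement(A, B, C):
    """S = A - B C^{-1} B^T, assuming C is positive definite."""
    return A - B @ np.linalg.solve(C, B.T)

B = np.array([[1.0], [0.0]])            # m = 2, n = 1
C = np.array([[1.0]])                   # C is PD

# Case 1: S = [[1, 1], [1, 2]] is PD, so X is PSD.
A_good = np.array([[2.0, 1.0], [1.0, 2.0]])
X_good = np.block([[A_good, B], [B.T, C]])
S_good = schur_complement(A_good, B, C)

# Case 2: S = diag(-0.5, 1) is not PSD, so X is not PSD either.
A_bad = np.array([[0.5, 0.0], [0.0, 1.0]])
X_bad = np.block([[A_bad, B], [B.T, C]])
S_bad = schur_complement(A_bad, B, C)

print(is_psd(X_good), is_psd(S_good))   # both True
print(is_psd(X_bad), is_psd(S_bad))     # both False

# Slide example: C2 - b b^T >= 0  <=>  1 - b^T C2^{-1} b >= 0.
C2 = np.diag([2.0, 2.0])
def rank_one_test(b):
    lhs = is_psd(C2 - np.outer(b, b))
    rhs = bool(1.0 - b @ np.linalg.solve(C2, b) >= 0)
    return lhs, rhs

b_in = np.array([1.0, 0.5])    # b^T C2^{-1} b = 0.625
b_out = np.array([1.5, 1.0])   # b^T C2^{-1} b = 1.625
print(rank_one_test(b_in), rank_one_test(b_out))
```

Using `np.linalg.solve` instead of forming the inverse of C explicitly is a common numerical choice; the result is the same up to rounding.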

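  12. PSD Matrices and Eigenvalues
     Theorem 5.1. Let $\mathbf{A} \in \mathbb{S}^n$, and let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $\mathbf{A}$. We have
     1. $\mathbf{A} \succeq 0 \iff \lambda_i \ge 0$ for $i = 1, \ldots, n$
     2. $\mathbf{A} \succ 0 \iff \lambda_i > 0$ for $i = 1, \ldots, n$
     • proof: let $\mathbf{A} = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^T$ be the eigendecomposition of $\mathbf{A}$. Then
       $\mathbf{A} \succeq 0 \iff \mathbf{x}^T \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^T \mathbf{x} \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$
       $\iff \mathbf{z}^T \boldsymbol{\Lambda} \mathbf{z} \ge 0$ for all $\mathbf{z} \in \mathcal{R}(\mathbf{V}^T) = \mathbb{R}^n$
       $\iff \sum_{i=1}^{n} \lambda_i |z_i|^2 \ge 0$ for all $\mathbf{z} \in \mathbb{R}^n$
       $\iff \lambda_i \ge 0$ for all $i$.
       The PD case is proven in the same manner.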
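
Theorem 5.1 suggests a direct numerical test of definiteness through the eigenvalues. Below is a minimal sketch assuming NumPy; the function name `classify`, the label strings, and the tolerance `tol` are hypothetical choices, and the tolerance is needed because computed eigenvalues of a singular PSD matrix may come out as tiny negative numbers.

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a real symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(A)      # real eigenvalues, ascending order
    if lam.min() > tol:
        return "PD"
    if lam.min() >= -tol:
        return "PSD"
    if lam.max() < -tol:
        return "ND"
    if lam.max() <= tol:
        return "NSD"
    return "indefinite"

print(classify(np.diag([2.0, 1.0])))                  # PD
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))   # PSD (eigenvalues 0 and 2)
print(classify(np.diag([1.0, -1.0])))                 # indefinite
```

For large matrices, attempting a Cholesky factorization is a common alternative test for positive definiteness; the eigenvalue route is shown here because it mirrors the theorem.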

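  13. Example: Ellipsoid
     • an ellipsoid of $\mathbb{R}^n$ centered at $\mathbf{0}$ is defined as $\mathcal{E} = \{ \mathbf{x} \in \mathbb{R}^n \mid \mathbf{x}^T \mathbf{P}^{-1} \mathbf{x} \le 1 \}$, for some PD $\mathbf{P} \in \mathbb{S}^n$
     [Figure: ellipse centered at $\mathbf{0}$ with semi-axes $\boldsymbol{\ell}_1$ and $\boldsymbol{\ell}_2$.]
     • let $\mathbf{P} = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^T$ be the eigendecomposition
       – $\mathbf{V}$ determines the directions of the semi-axes
       – $\lambda_1, \ldots, \lambda_n$ determine the lengths of the semi-axes
       – $\boldsymbol{\ell}_i = \lambda_i^{1/2} \mathbf{v}_i$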

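  14. Example: Ellipsoid
     • an ellipsoid of $\mathbb{R}^n$ centered at $\mathbf{0}$ is defined as $\mathcal{E} = \{ \mathbf{x} \in \mathbb{R}^n \mid \mathbf{x}^T \mathbf{P}^{-1} \mathbf{x} \le 1 \}$, for some PD $\mathbf{P} \in \mathbb{S}^n$
     • note (with the eigenvalues of $\mathbf{P}$ ordered as $\lambda_1 \ge \cdots \ge \lambda_n$ and $\mathbf{v}_i$ the corresponding unit eigenvectors):
       – in direction $\mathbf{v}_1$, $\mathbf{x}^T \mathbf{P}^{-1} \mathbf{x}$ is small, hence the ellipsoid is fat in direction $\mathbf{v}_1$
       – in direction $\mathbf{v}_n$, $\mathbf{x}^T \mathbf{P}^{-1} \mathbf{x}$ is large, hence the ellipsoid is thin in direction $\mathbf{v}_n$
       – $\sqrt{\lambda_{\max} / \lambda_{\min}}$ gives the maximum eccentricity
     • let $\tilde{\mathcal{E}} = \{ \mathbf{x} \in \mathbb{R}^n \mid \mathbf{x}^T \mathbf{Q}^{-1} \mathbf{x} \le 1 \}$, for some PD $\mathbf{Q} \in \mathbb{S}^n$; then $\mathcal{E} \supseteq \tilde{\mathcal{E}} \iff \mathbf{P} \succeq \mathbf{Q}$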
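
The geometry of the ellipsoid can be read off from an eigendecomposition of P. The following sketch assumes NumPy; the 2-by-2 matrices `P` and `Q` are illustrative values chosen here, not taken from the slides.

```python
import numpy as np

P = np.array([[5.0, 3.0], [3.0, 5.0]])   # PD, eigenvalues 2 and 8

# eigh returns ascending eigenvalues and orthonormal eigenvectors (as columns).
lam, V = np.linalg.eigh(P)
semi_axes = V * np.sqrt(lam)             # column i is sqrt(lam[i]) * V[:, i]

# Each semi-axis endpoint lies on the boundary: l^T P^{-1} l = 1.
boundary = np.array([l @ np.linalg.solve(P, l) for l in semi_axes.T])

# Ratio of the longest to the shortest semi-axis length.
eccentricity = np.sqrt(lam.max() / lam.min())

# Containment: with Q = I, P - Q is PSD, so the unit disk lies inside the ellipse.
Q = np.eye(2)
p_geq_q = bool(np.linalg.eigvalsh(P - Q).min() >= -1e-10)
angles = np.linspace(0.0, 2.0 * np.pi, 100)
circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # boundary of the Q-ellipsoid
contained = all(x @ np.linalg.solve(P, x) <= 1.0 + 1e-12 for x in circle)

print(boundary, eccentricity, p_geq_q, contained)
```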

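  15. Example: Multivariate Gaussian Distribution
     • probability density function for a Gaussian-distributed vector $\mathbf{x} \in \mathbb{R}^n$:
       $$p(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} (\det(\boldsymbol{\Sigma}))^{1/2}} \exp\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right)$$
       where $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are the mean and covariance of $\mathbf{x}$, respectively
       – $\boldsymbol{\Sigma}$ is PD
       – $\boldsymbol{\Sigma}$ determines how $\mathbf{x}$ is spread, in the same way as in the ellipsoid example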

  16. Example: Multivariate Gaussian Distribution
     [Figure: surface plots of the density $f(\mathbf{x})$ over $(x_1, x_2)$ for (a) $\boldsymbol{\mu} = \mathbf{0}$, $\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and (b) $\boldsymbol{\mu} = \mathbf{0}$, $\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}$.]
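
The density formula can be evaluated directly for the two covariance matrices of panels (a) and (b). This is a sketch assuming NumPy; the function name `gaussian_pdf` is a hypothetical helper written for this illustration, and it assumes the covariance passed in is PD.

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate Gaussian density at x, assuming Sigma is positive definite."""
    n = mu.shape[0]
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)              # (x - mu)^T Sigma^{-1} (x - mu)
    norm = (2.0 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

mu = np.zeros(2)
Sigma_a = np.array([[1.0, 0.0], [0.0, 1.0]])
Sigma_b = np.array([[1.0, 0.8], [0.8, 1.0]])

# Peak values at the mean: 1 / (2*pi*sqrt(det(Sigma))).
peak_a = gaussian_pdf(mu, mu, Sigma_a)
peak_b = gaussian_pdf(mu, mu, Sigma_b)

# Under Sigma_b, a point along (1, 1) is more likely than one along (1, -1).
along = gaussian_pdf(np.array([1.0, 1.0]), mu, Sigma_b)
across = gaussian_pdf(np.array([1.0, -1.0]), mu, Sigma_b)

print(peak_a, peak_b, along > across)
```

The correlated case has the higher peak because det(Σ) = 0.36 is smaller, and its mass is stretched along the eigenvector (1, 1) of Σ with the larger eigenvalue 1.8.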

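  17. Some Properties of PSD Matrices
     • it can be directly seen from the definition that
       – $\mathbf{A} \succeq 0 \Longrightarrow a_{ii} \ge 0$ for all $i$
       – $\mathbf{A} \succ 0 \Longrightarrow a_{ii} > 0$ for all $i$
     • if $\mathbf{A}$ is PSD, then for any $\mathbf{x}$, $\mathbf{x}^T \mathbf{A} \mathbf{x} = 0 \iff \mathbf{A}\mathbf{x} = \mathbf{0}$ (hence a PSD $\mathbf{A}$ is PD $\iff$ $\mathbf{A}$ is nonsingular)
     • extension (also direct): partition $\mathbf{A}$ as
       $$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix}.$$
       Then, $\mathbf{A} \succeq 0 \Longrightarrow \mathbf{A}_{11} \succeq 0$, $\mathbf{A}_{22} \succeq 0$. Also, $\mathbf{A} \succ 0 \Longrightarrow \mathbf{A}_{11} \succ 0$, $\mathbf{A}_{22} \succ 0$
     • further extension:
       – a principal submatrix of $\mathbf{A}$, denoted by $\mathbf{A}_{\mathcal{I}}$, where $\mathcal{I} = \{ i_1, \ldots, i_m \} \subseteq \{ 1, \ldots, n \}$, $m < n$, is a submatrix obtained by keeping only the rows and columns indicated by $\mathcal{I}$; i.e., $[\mathbf{A}_{\mathcal{I}}]_{jk} = a_{i_j, i_k}$ for all $j, k \in \{ 1, \ldots, m \}$
       – if $\mathbf{A}$ is PSD (resp. PD), then any principal submatrix of $\mathbf{A}$ is PSD (resp. PD)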
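
These properties can be spot-checked on a randomly generated singular PSD matrix. A minimal sketch assuming NumPy; the helper names `is_psd` and `principal_submatrix`, the seed, and the sizes are illustrative choices.

```python
import itertools
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def principal_submatrix(A, idx):
    """Keep only the rows and columns of A listed in idx."""
    idx = list(idx)
    return A[np.ix_(idx, idx)]

rng = np.random.default_rng(2)
G = rng.standard_normal((3, 4))
A = G.T @ G                     # 4 x 4, PSD, rank at most 3 (so not PD)
n = A.shape[0]

# Diagonal entries are non-negative.
diag_ok = bool(np.all(np.diag(A) >= 0))

# Every principal submatrix with m < n is PSD.
all_psd = all(
    is_psd(principal_submatrix(A, idx))
    for m in range(1, n)
    for idx in itertools.combinations(range(n), m)
)

# x^T A x = 0 goes together with A x = 0: use an eigenvector for the zero eigenvalue.
lam, V = np.linalg.eigh(A)
x0 = V[:, 0]                    # eigenvector of the smallest eigenvalue (about 0)
quad0 = x0 @ A @ x0
Ax0_norm = np.linalg.norm(A @ x0)

print(diag_ok, all_psd, quad0, Ax0_norm)
```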

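  18. Some Properties of PSD Matrices
     Property 5.1. Let $\mathbf{A} \in \mathbb{S}^n$, $\mathbf{B} \in \mathbb{R}^{n \times m}$, and $\mathbf{C} = \mathbf{B}^T \mathbf{A} \mathbf{B}$. We have the following properties:
     1. $\mathbf{A} \succeq 0 \Longrightarrow \mathbf{C} \succeq 0$ (in particular, $\mathbf{A} \succ 0 \Longrightarrow \mathbf{C} \succeq 0$)
     2. suppose $\mathbf{A} \succ 0$. It holds that $\mathbf{C} \succ 0 \iff \mathbf{B}$ has full column rank
     3. suppose $\mathbf{B}$ is nonsingular. It holds that $\mathbf{A} \succ 0 \iff \mathbf{C} \succ 0$, and that $\mathbf{A} \succeq 0 \iff \mathbf{C} \succeq 0$
     • proof sketch: the 1st property is trivial. For the 2nd property, observe that
       $\mathbf{C} \succ 0 \iff \mathbf{z}^T \mathbf{A} \mathbf{z} > 0$ for all $\mathbf{z} = \mathbf{B}\mathbf{x}$ with $\mathbf{x} \neq \mathbf{0}$. $(\ast)$
       If $\mathbf{A} \succ 0$, $(\ast)$ reduces to $\mathbf{C} \succ 0 \iff \mathbf{B}\mathbf{x} \neq \mathbf{0}$ for all $\mathbf{x} \neq \mathbf{0}$ (i.e., $\mathbf{B}$ has full column rank). The 3rd property is proven in a similar manner.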
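
The second property can be illustrated with one full-rank and one rank-deficient B. This sketch assumes NumPy; the matrices and the helper names `is_psd` and `is_pd` are illustrative choices made here.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def is_pd(M, tol=1e-10):
    """True if every eigenvalue of the symmetric matrix M exceeds tol."""
    return bool(np.linalg.eigvalsh(M).min() > tol)

A = np.diag([1.0, 2.0, 3.0])                             # PD, n = 3

B_full = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # rank 2 (full column rank)
B_def = np.array([[1.0, 2.0], [0.0, 0.0], [1.0, 2.0]])   # rank 1 (second column = 2 x first)

C_full = B_full.T @ A @ B_full    # [[4, 3], [3, 5]]
C_def = B_def.T @ A @ B_def       # [[4, 8], [8, 16]], singular

print(np.linalg.matrix_rank(B_full), is_pd(C_full))               # 2 True
print(np.linalg.matrix_rank(B_def), is_pd(C_def), is_psd(C_def))  # 1 False True
```

In the rank-deficient case C is still PSD, in line with the first property, but it cannot be PD because Bx = 0 for some nonzero x.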
