
Stiefel Manifolds and their Applications, Pierre-Antoine Absil



  1. Stiefel Manifolds and their Applications. Pierre-Antoine Absil (UCLouvain). CESAME seminar, 22 September 2009.

  2. Structure ◮ Definition and visualization ◮ A glimpse of applications ◮ Geometry of the Stiefel manifolds ◮ Applications

  3. Collaborations ◮ Chris Baker (Sandia) ◮ Thomas Cason (UCLouvain) ◮ Kyle Gallivan (Florida State University) ◮ Damien Laurent (UCLouvain) ◮ Rob Mahony (Australian National University) ◮ Chafik Samir (U Clermont-Ferrand) ◮ Rodolphe Sepulchre (U of Liège) ◮ Fabian Theis (TU Munich) ◮ Paul Van Dooren (UCLouvain) ◮ ...

  4. Definition Stiefel manifold: Definition The (compact) Stiefel manifold $V_{n,p}$ is the set of all $p$-tuples $(x_1, \ldots, x_p)$ of orthonormal vectors in $\mathbb{R}^n$. If we turn $p$-tuples into $n \times p$ matrices via $(x_1, \ldots, x_p) \mapsto \begin{bmatrix} x_1 & \cdots & x_p \end{bmatrix}$, the definition becomes $V_{n,p} = \{ X \in \mathbb{R}^{n \times p} : X^T X = I_p \}$.
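A minimal NumPy sketch of the matrix definition above. The helper names `random_stiefel_point` and `is_on_stiefel` are hypothetical (they do not come from the slides); the sketch builds a point by taking the Q factor of a reduced QR factorization, which is one convenient way among several to obtain orthonormal columns.

```python
import numpy as np

def random_stiefel_point(n, p, seed=0):
    """Return an n-by-p matrix with orthonormal columns (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # The Q factor of a reduced QR factorization has orthonormal columns.
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    return Q

def is_on_stiefel(X, tol=1e-10):
    """Check the defining constraint X^T X = I_p up to a tolerance."""
    p = X.shape[1]
    return np.linalg.norm(X.T @ X - np.eye(p)) < tol

X = random_stiefel_point(5, 2)
print(X.shape, is_on_stiefel(X))
```

Scaling `X` by any factor other than ±1 breaks the constraint, so `is_on_stiefel(2 * X)` is false.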

  5. Definition Visualization: an element of $V_{3,2}$ [figure]

  6. Definition Stiefel manifold: (very unfaithful) artist view [figure, labeled $V_{n,p}$]

  7. Definition Stiefel manifold: optimization problems [figure, labeled $\mathbb{R}$, $f$, $V_{n,p}$]

  8. Definition Stiefel manifold: optimization algorithms [figure, labeled $\mathbb{R}$, $x$, $f$, $V_{n,p}$]

  9. Definition Stiefel manifold: Extensions ◮ Recall: Real case: $V_p(\mathbb{R}^n) = \{ X \in \mathbb{R}^{n \times p} : X^T X = I_p \} =: V_{n,p}$. ◮ Complex case: $V_p(\mathbb{C}^n) = \{ X \in \mathbb{C}^{n \times p} : X^H X = I_p \}$. ◮ Quaternion case: $V_p(\mathbb{H}^n) = \{ X \in \mathbb{H}^{n \times p} : X^* X = I_p \}$. ◮ If $M$ is a Riemannian manifold, one can define $V_p(TM) = \{ (\xi_1, \ldots, \xi_p) \mid \exists x \in M : \xi_i \in T_x M,\ \langle \xi_i, \xi_j \rangle = \delta_{ij} \}$.

  10. Definition Stiefel manifold: Particular cases ◮ Recall: Real case: $V_p(\mathbb{R}^n) = \{ X \in \mathbb{R}^{n \times p} : X^T X = I_p \} =: V_{n,p}$. ◮ $p = 1$: the sphere $V_{n,1} = \{ x \in \mathbb{R}^n : x^T x = 1 \}$. ◮ $p = n$: the orthogonal group $V_{n,n} = O_n = \{ X \in \mathbb{R}^{n \times n} : X^T X = I_n \}$.

  11. Definition Notation ◮ E. Stiefel (1935): $V_{n,m}$ (compact), $V^*_{n,m}$ (noncompact). ◮ I. M. James (1976): $O_{n,k}$ (compact) Stiefel manifold, $O^*_{n,k}$ noncompact Stiefel manifold, $V_{n,k}$ in the real case, $W_{n,k}$ in the complex case, $X_{n,k}$ in the quaternion case. ◮ Helmke & Moore (1994): $\mathrm{St}(k,n)$ compact Stiefel manifold, $\mathrm{ST}(k,n)$ noncompact Stiefel manifold. ◮ Edelman, Arias, & Smith (1998): $V_{n,p}$. ◮ Bridges & Reich (2001): $V_k(\mathbb{R}^n)$. ◮ Bloch et al. (2006): $V(n,N) = \{ Q \in \mathbb{R}^{nN} ;\ QQ^T = I_n \}$.

  12. Glimpse of applications A glimpse of applications ◮ Principal component analysis ◮ Lyapunov exponents of a dynamical system ◮ Procrustes problem ◮ Blind Source Separation - soft dimension reduction

  13. Geometry Geometry ◮ Dimension ◮ Tangent spaces ◮ Projection onto tangent spaces ◮ Geodesics

  14. Geometry Stiefel manifold: dimension Dimension of $V_{n,p}$: ◮ 1st vector: one unit-norm constraint: $n - 1$ DOF. ◮ 2nd vector: unit-norm and orthogonal to 1st: $n - 2$ DOF. ◮ ... ◮ $p$th vector: $n - p$ DOF. Total: $\dim(V_{n,p}) = pn - (1 + 2 + \cdots + p) = pn - p(p+1)/2 = p(n-p) + p(p-1)/2$.
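The count above can be checked on the particular cases from the earlier slide. The function name `stiefel_dim` is a hypothetical helper introduced only for this illustration.

```python
def stiefel_dim(n, p):
    """Dimension of V_{n,p}, computed as pn - p(p+1)/2."""
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= n")
    return p * n - p * (p + 1) // 2

# Sphere in R^3 (p = 1): dimension n - 1 = 2.
assert stiefel_dim(3, 1) == 2
# Orthogonal group O_4 (p = n): dimension n(n-1)/2 = 6.
assert stiefel_dim(4, 4) == 6
# The two closed forms of the total agree.
n, p = 7, 3
assert stiefel_dim(n, p) == p * (n - p) + p * (p - 1) // 2

print(stiefel_dim(3, 2))
```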

  15. Geometry Stiefel manifold: tangent space [figure: a curve $Y(t)$ on $V_{n,p}$ through $X$, its velocity $\dot Y(0)$, and the tangent space $T_X V_{n,p}$]

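  16. Geometry Stiefel manifold: tangent space Let $X \in V_{n,p}$ and let $Y(t)$ be a curve on $V_{n,p}$ with $Y(0) = X$. Then $\dot Y(0)$ is a tangent vector to $V_{n,p}$ at $X$. The set of all such vectors is the tangent space to $V_{n,p}$ at $X$. We have $Y(t)^T Y(t) = I_p$ for all $t$ $\Rightarrow$ $\frac{d}{dt}\big(Y(t)^T Y(t)\big) = 0$ for all $t$ $\Rightarrow$ $\dot Y(0)^T Y(0) + Y(0)^T \dot Y(0) = 0$ $\Rightarrow$ $X^T \dot Y(0)$ is skew $\Rightarrow$ $\dot Y(0) = X\Omega + X_\perp K$ with $\Omega^T = -\Omega$, where $X_\perp \in \mathbb{R}^{n \times (n-p)}$ is such that $\begin{bmatrix} X & X_\perp \end{bmatrix}$ is orthogonal. Hence $T_X V_{n,p} = \{ X\Omega + X_\perp K : \Omega^T = -\Omega,\ K \in \mathbb{R}^{(n-p) \times p} \}$.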
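A small numerical check of the parametrization above, as a sketch. It assumes $X_\perp$ is taken as the last $n - p$ columns of an orthogonal matrix whose first $p$ columns are $X$. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2

# Full orthogonal matrix: the first p columns give X, the rest give X_perp.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
X, X_perp = Q[:, :p], Q[:, p:]

# Arbitrary skew-symmetric Omega (p x p) and arbitrary K ((n-p) x p).
M = rng.standard_normal((p, p))
Omega = M - M.T
K = rng.standard_normal((n - p, p))

xi = X @ Omega + X_perp @ K  # candidate tangent vector at X

# Tangency condition from the derivation: X^T xi must be skew-symmetric.
S = X.T @ xi
print(np.allclose(S + S.T, 0.0))
```

The free parameters add up as expected: $p(p-1)/2$ for $\Omega$ plus $(n-p)p$ for $K$, which matches the dimension of $V_{n,p}$.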

  17. Geometry Stiefel manifold: projection onto the tangent space [figure: a point $Z$ and its projection $P_{T_X V_{n,p}}(Z)$ onto the tangent space $T_X V_{n,p}$, with the curve $Y(t)$ and velocity $\dot Y(0)$ at $X$]

  18. Geometry Stiefel manifold: projection onto the tangent space ◮ Tangent space: $T_X V_{n,p} = \{ X\Omega + X_\perp K : \Omega^T = -\Omega,\ K \in \mathbb{R}^{(n-p) \times p} \}$. ◮ Normal space: $N_X V_{n,p} = \{ XS : S^T = S \}$. ◮ Projection onto the tangent space: $P_{T_X V_{n,p}}(Z) = Z - X \operatorname{sym}(X^T Z) = (I - XX^T) Z + X \operatorname{skew}(X^T Z)$, where $\operatorname{sym}(M) = \tfrac{1}{2}(M + M^T)$ and $\operatorname{skew}(M) = \tfrac{1}{2}(M - M^T)$.
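The two expressions for the projection can be compared numerically. This is a sketch using the formulas on the slide; the function name `project_tangent` is a hypothetical choice.

```python
import numpy as np

def sym(M):
    return 0.5 * (M + M.T)

def skew(M):
    return 0.5 * (M - M.T)

def project_tangent(X, Z):
    """Projection of Z onto T_X V_{n,p}, first form: Z - X sym(X^T Z)."""
    return Z - X @ sym(X.T @ Z)

rng = np.random.default_rng(1)
n, p = 5, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # a point on V_{n,p}
Z = rng.standard_normal((n, p))                   # arbitrary ambient matrix

P = project_tangent(X, Z)
# Second form from the slide.
P_alt = (np.eye(n) - X @ X.T) @ Z + X @ skew(X.T @ Z)

print(np.allclose(P, P_alt))
```

The removed component `Z - P` equals `X @ sym(X.T @ Z)`, which lies in the normal space $\{XS : S^T = S\}$.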

  19. Geometry Stiefel manifold: geodesics [figure: a geodesic through $X$ on $V_{n,p}$]

  20. Geometry Stiefel manifold: geodesics A curve $X(t)$ on $V_{n,p}$ is a geodesic if, for all $t$, $\ddot X(t) \in N_{X(t)} V_{n,p}$. Ross Lippert showed that $X(t) = \begin{bmatrix} X(0) & \dot X(0) \end{bmatrix} \exp\left( t \begin{bmatrix} X(0)^T \dot X(0) & -\dot X(0)^T \dot X(0) \\ I & X(0)^T \dot X(0) \end{bmatrix} \right) I_{2p,p}\, e^{-t X(0)^T \dot X(0)}$.
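A NumPy sketch of the closed-form geodesic above. It assumes that $I_{2p,p}$ denotes the first $p$ columns of $I_{2p}$. The helper `expm` is a simple scaling-and-squaring Taylor approximation written for this illustration (a library routine such as `scipy.linalg.expm` could be substituted), and `stiefel_geodesic` is a hypothetical name.

```python
import numpy as np

def expm(M, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor series
    (illustrative helper, adequate for small well-scaled matrices)."""
    norm = np.linalg.norm(M, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A = M / 2**s
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

def stiefel_geodesic(X0, V0, t):
    """Closed-form geodesic with X(0) = X0 and velocity V0 tangent at X0."""
    p = X0.shape[1]
    A = X0.T @ V0                     # X(0)^T Xdot(0), skew for tangent V0
    S = V0.T @ V0                     # Xdot(0)^T Xdot(0)
    block = np.block([[A, -S], [np.eye(p), A]])
    I2pp = np.eye(2 * p, p)           # assumed meaning of I_{2p,p}
    return np.hstack([X0, V0]) @ expm(t * block) @ I2pp @ expm(-t * A)

rng = np.random.default_rng(2)
n, p = 6, 2
X0, _ = np.linalg.qr(rng.standard_normal((n, p)))
Z = rng.standard_normal((n, p))
V0 = Z - X0 @ (0.5 * (X0.T @ Z + Z.T @ X0))  # project Z onto the tangent space

Xt = stiefel_geodesic(X0, V0, 0.7)
print(np.allclose(Xt.T @ Xt, np.eye(p)))  # checks that X(t) stays on V_{n,p}
```

For $p = 1$ the formula reduces to a great circle on the sphere, $x(t) = x \cos(st) + (v/s) \sin(st)$ with $s = \|v\|$, which gives a quick sanity check.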

  21. Geometry Stiefel manifold: quotient geodesics Bijection between $V_{n,p}$ and $O_n / O_{n-p}$: $V_{n,p} \ni X \leftrightarrow \{ U = \begin{bmatrix} X & X_\perp \end{bmatrix} : U^T U = I_n \} \in O_n / O_{n-p}$. Quotient geodesics: If $U(t) = U(0) \exp\left( t \begin{bmatrix} A & -B^T \\ B & 0 \end{bmatrix} \right)$, then $U_{:,1:p}(t) \in V_{n,p}$ follows a quotient geodesic.
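A sketch of the quotient geodesic formula above. It assumes $A$ is a skew-symmetric $p \times p$ matrix and $B$ is $(n-p) \times p$ (the slide does not state these shapes explicitly); under that assumption the generator is skew-symmetric and $U(t)$ stays orthogonal. The helper `expm` is an illustrative Taylor-based routine and `quotient_geodesic` is a hypothetical name.

```python
import numpy as np

def expm(M, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor series
    (illustrative helper, adequate for small well-scaled matrices)."""
    norm = np.linalg.norm(M, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A = M / 2**s
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

def quotient_geodesic(U0, A, B, t):
    """First p columns of U(t) = U0 exp(t [[A, -B^T], [B, 0]])."""
    p = A.shape[0]
    m = B.shape[0]  # m = n - p
    G = np.block([[A, -B.T], [B, np.zeros((m, m))]])
    return (U0 @ expm(t * G))[:, :p]

rng = np.random.default_rng(4)
n, p = 5, 2
U0, _ = np.linalg.qr(rng.standard_normal((n, n)))  # U(0) = [X  X_perp]
M = rng.standard_normal((p, p))
A = M - M.T                                        # assumed skew-symmetric
B = rng.standard_normal((n - p, p))

Xt = quotient_geodesic(U0, A, B, 0.5)
print(np.allclose(Xt.T @ Xt, np.eye(p)))  # checks that X(t) stays on V_{n,p}
```

The initial velocity is $X A + X_\perp B$, which has exactly the tangent-vector form $X\Omega + X_\perp K$ with $\Omega = A$ and $K = B$.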

  22. Applications Applications ◮ Principal component analysis ◮ Lyapunov exponents of a dynamical system ◮ Procrustes problem ◮ Blind Source Separation - soft dimension reduction

  23. Applications Principal component analysis ◮ Let $A = A^T \in \mathbb{R}^{n \times n}$. ◮ Goal: Compute the $p$ dominant eigenvectors of $A$. ◮ Principle: Let $N = \operatorname{diag}(p, p-1, \ldots, 1)$ and solve $\max_{X^T X = I_p} \operatorname{tr}(X^T A X N)$. The columns of the maximizer $X$ are the $p$ dominant eigenvectors of $A$. ◮ A basic method: steepest ascent on $V_{n,p}$. ◮ Let $f : \mathbb{R}^{n \times p} \to \mathbb{R} : X \mapsto \operatorname{tr}(X^T A X N)$. ◮ We have $\tfrac{1}{2} \operatorname{grad} f(X) = AXN$. ◮ Thus $\tfrac{1}{2} \operatorname{grad} f|_{V_{n,p}}(X) = P_{T_X V_{n,p}}(AXN) = AXN - X \operatorname{sym}(X^T A X N)$, where $\operatorname{sym}(Z) := (Z + Z^T)/2$. ◮ Basic algorithm: Follow $\dot X = \operatorname{grad} f|_{V_{n,p}}(X)$.
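The slide specifies only the continuous-time flow. The sketch below is one possible discretization, not the method from the talk: an explicit Euler step along the projected gradient, followed by a QR factorization to return to $V_{n,p}$. The step size, iteration count, and the names `riemannian_half_grad` and `pca_ascent` are assumptions made for this illustration.

```python
import numpy as np

def sym(M):
    return 0.5 * (M + M.T)

def riemannian_half_grad(A, X, N):
    """Half the projected gradient: AXN - X sym(X^T A X N)."""
    G = A @ X @ N
    return G - X @ sym(X.T @ G)

def pca_ascent(A, p, step=0.02, iters=5000, seed=0):
    """Discretized gradient ascent on V_{n,p} (assumed scheme: Euler + QR)."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((n, p)))
    N = np.diag(np.arange(p, 0, -1.0))  # N = diag(p, p-1, ..., 1)
    for _ in range(iters):
        X = X + step * riemannian_half_grad(A, X, N)
        # Re-orthonormalize; flip signs so the Q factor stays close to X.
        X, R = np.linalg.qr(X)
        X = X * np.sign(np.diag(R))
    return X

# Symmetric test matrix with a known spectrum 6, 5, ..., 1.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag([6.0, 5.0, 4.0, 3.0, 2.0, 1.0]) @ Q.T

X = pca_ascent(A, p=2)
# At convergence X^T A X should be close to diagonal, with the two
# largest eigenvalues of A on the diagonal in decreasing order.
print(np.round(X.T @ A @ X, 6))
```

The distinct weights in $N$ are what single out individual eigenvectors, ordered by eigenvalue. With $N = I$ the cost would only determine the dominant subspace.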


  29. Applications Computing Lyapunov exponents: a method on the Stiefel manifold ◮ Ref: T. Bridges and S. Reich, Computing Lyapunov exponents on a Stiefel manifold, Physica D 156, pp. 219–238, 2001. ◮ Dynamical system: $\dot x = f(x)$. ◮ Nominal trajectory: $x^*(t)$. ◮ Goal: Describe the behavior of nearby trajectories.

