PCA on manifolds: application to spaces of landmarks

1. PCA on manifolds: application to spaces of landmarks
   Dr Sergey Kushnarev, Singapore University of Technology and Design
   Infinite-dimensional Riemannian geometry with applications to image matching and shape analysis.
   ESI, Vienna, January 16, 2015

2. Principal Component Analysis
   Goal of PCA: find the sequence of linear subspaces $V_k$ that best represent the variability of the data.

3. Principal Component Analysis
   Goal of PCA: find the sequence of linear subspaces $V_k$ that best represent the variability of the data.
   Data $x_1, \dots, x_n \in \mathbb{R}^d$; $V_k = \mathrm{span}\{v_1, v_2, \dots, v_k\}$.
   ◮ $v_1 = \arg\max_{\|v\|=1} \sum_{i=1}^n (v \cdot x_i)^2$
   ◮ $v_2 = \arg\max_{\|v\|=1,\ v \perp V_1} \sum_{i=1}^n \big[ (v_1 \cdot x_i)^2 + (v \cdot x_i)^2 \big]$
   ◮ $\dots$
   ◮ $v_k = \arg\max_{\|v\|=1,\ v \perp V_{k-1}} \sum_{i=1}^n \Big[ \sum_{j=1}^{k-1} (v_j \cdot x_i)^2 + (v \cdot x_i)^2 \Big]$
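
A minimal NumPy sketch of the Euclidean case (not from the slides): the successive maximizers $v_k$ above are the top eigenvectors of the sample covariance matrix, so PCA reduces to an eigendecomposition. The data and function names below are illustrative.

```python
import numpy as np

def euclidean_pca(X):
    """X: (n, d) data matrix. Returns eigenvalues and principal directions v_k."""
    Xc = X - X.mean(axis=0)            # center the data
    C = Xc.T @ Xc / len(Xc)            # sample covariance
    w, V = np.linalg.eigh(C)           # eigenvalues in ascending order
    order = np.argsort(w)[::-1]        # reorder: largest variance first
    return w[order], V[:, order]       # column k is v_{k+1}

# Check: the projected variance along v_1 equals the largest eigenvalue.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.3])
w, V = euclidean_pca(X)
Xc = X - X.mean(axis=0)
print(np.allclose(w[0], np.mean((Xc @ V[:, 0]) ** 2)))   # True
```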

8. Principal Component Analysis
   Maximizing projected variance:
   $v_k = \arg\max_{\|v\|=1,\ v \perp V_{k-1}} \sum_{i=1}^n \Big[ \sum_{j=1}^{k-1} (v_j \cdot x_i)^2 + (v \cdot x_i)^2 \Big]$
   Or, equivalently, minimizing residuals:
   $v_k = \arg\min_{\|v\|=1,\ v \perp V_{k-1}} \sum_{i=1}^n \Big[ \sum_{j=1}^{k-1} \|x_i - (v_j \cdot x_i)\,v_j\|^2 + \|x_i - (v \cdot x_i)\,v\|^2 \Big]$
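
A one-line justification of the stated equivalence (not spelled out on the slide): for $\|v\| = 1$ one has $\|x_i - (v \cdot x_i)\,v\|^2 = \|x_i\|^2 - (v \cdot x_i)^2$, so minimizing the total residual is the same as maximizing the total squared projection; the two objectives differ only by a constant multiple of $\sum_i \|x_i\|^2$, which does not depend on $v$.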

9. PCA on a manifold?
   [Figure: a curved manifold $M$ with data points $x_k$.]

10. Tangent PCA
    Karcher mean: $\mu = \arg\min_{y \in M} \sum_{k=1}^N d(x_k, y)^2$
    [Figure: manifold $M$ with data points $x_k$ and the Karcher mean $\mu$.]
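
A hedged sketch (not from the slides) of computing the Karcher mean on the unit sphere $S^{d-1}$, using the standard fixed-point iteration $\mu \leftarrow \mathrm{Exp}_\mu\big(\tfrac{1}{N}\sum_k \mathrm{Log}_\mu(x_k)\big)$; the function names are illustrative.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere at p; v is tangent at p (v . p = 0)."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return p
    return np.cos(nv) * p + np.sin(nv) * v / nv

def sphere_log(p, q):
    """Log map at p: tangent vector pointing toward q with length d(p, q)."""
    cos_t = np.clip(p @ q, -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(p)
    return theta * (q - cos_t * p) / np.linalg.norm(q - cos_t * p)

def karcher_mean(points, iters=50):
    """Gradient descent on y -> sum_k d(x_k, y)^2, starting from the first point."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        g = np.mean([sphere_log(mu, x) for x in points], axis=0)
        mu = sphere_exp(mu, g)
    return mu
```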

11. Tangent PCA
    Karcher mean: $\mu = \arg\min_{y \in M} \sum_{k=1}^N d(x_k, y)^2$
    [Figure: as before, with tangent vectors $v_k$ in the tangent space $T_\mu M$ at $\mu$.]

12. Tangent PCA
    Shape space linearization via $v_k \in T_\mu M$.
    [Figure: manifold $M$, data points $x_k$, tangent vectors $v_k$ at $\mu$, tangent space $T_\mu M$.]
    Tangent PCA: finding the sequence of linear subspaces $V_k \subset T_\mu M$ that best represent the variability of the data.
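
A sketch of tangent PCA on the sphere (not from the slides), reusing sphere_log and karcher_mean from the sketch after slide 10: linearize the data at the Karcher mean via the log map, then run ordinary PCA in $T_\mu M$.

```python
import numpy as np

def tangent_pca(points):
    """points: list of unit vectors on the sphere. Returns (mu, eigenvalues, directions)."""
    mu = karcher_mean(points)                            # from the sketch after slide 10
    A = np.stack([sphere_log(mu, x) for x in points])    # rows live in T_mu M
    C = A.T @ A / len(A)                                 # no centering: mean of logs is ~0 at mu
    w, V = np.linalg.eigh(C)
    order = np.argsort(w)[::-1]
    return mu, w[order], V[:, order]                     # columns are the tangent directions v_k
```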

14. Problems with linearization
    $\mathrm{dist}(s_1, s_2)^2 = \mathrm{dist}(\exp_\mu(v_1), \exp_\mu(v_2))^2 = \|v_1 - v_2\|^2 - \tfrac{1}{3} K(v_1, v_2) + o(\|v\|^4)$.
    If $K(v_1, v_2) < 0$, then $\mathrm{dist}(\exp_\mu(v_1), \exp_\mu(v_2)) > \|v_1 - v_2\|$.
    If $K(v_1, v_2) > 0$, then $\mathrm{dist}(\exp_\mu(v_1), \exp_\mu(v_2)) < \|v_1 - v_2\|$.
    [Figure: three panels, $K = 0$, $K > 0$, $K < 0$, each showing $\mu$, tangent vectors $v_1, v_2$, and the points $s_1 = \exp_\mu(v_1)$, $s_2 = \exp_\mu(v_2)$.]
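
A small numerical illustration (not from the slides) of the second inequality on the unit sphere, where $K = 1 > 0$; it reuses sphere_exp from the sketch after slide 10.

```python
import numpy as np

mu = np.array([0.0, 0.0, 1.0])
v1 = np.array([0.6, 0.0, 0.0])        # tangent vectors at mu (orthogonal to mu)
v2 = np.array([0.0, 0.6, 0.0])
s1, s2 = sphere_exp(mu, v1), sphere_exp(mu, v2)
geodesic = np.arccos(np.clip(s1 @ s2, -1.0, 1.0))
print(geodesic, np.linalg.norm(v1 - v2))   # 0.82... < 0.84...: dist(exp(v1), exp(v2)) < ||v1 - v2||
```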

15. PCA on a manifold, effects of curvature
    $M = \mathbb{R}^2$, $ds^2 = (1 + y^2)(dx^2 + dy^2)$, $K = \dfrac{y^2 - 1}{(y^2 + 1)^3}$.
    Points uniformly distributed in $T_0 M$.
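
A quick symbolic check (not from the slides) of the stated curvature, using the conformal-metric formula $K = -\frac{1}{2\lambda}\Delta \log\lambda$ for $ds^2 = \lambda\,(dx^2 + dy^2)$ with $\lambda = 1 + y^2$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
lam = 1 + y**2
K = -(sp.diff(sp.log(lam), x, 2) + sp.diff(sp.log(lam), y, 2)) / (2 * lam)
print(sp.simplify(K - (y**2 - 1) / (y**2 + 1)**3))   # 0: matches the slide's formula
```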

16. PCA on a manifold, effects of curvature
    $ds^2 = (1 + y^2)(dx^2 + dy^2)$, $K = \dfrac{y^2 - 1}{(y^2 + 1)^3}$.
    Points uniformly distributed in $M$.

17. Geodesic PCA (Fletcher)
    Geodesic PCA: finding the sequence of geodesic subspaces $S_k \subset M$ that best represent the variability of the data.
    [Figure: manifold $M$ with data points $x_k$, mean $\mu$, and geodesic subspaces $S_1$, $S_2$.]

18. Geodesic PCA (Fletcher)
    Projection $\pi : M \to H$, $\pi_H(x) = \arg\min_{y \in H} d(x, y)^2$.
    [Figure: a submanifold $H$ containing $\mu$ and $y$, a point $x$ off $H$ with its projection $\pi(x)$, and the vectors $\mathrm{Log}_x y$, $\mathrm{Log}_\mu x$, $\mathrm{Log}_\mu y$.]

19. Geodesic PCA (Fletcher)
    Geodesic subspace $S = \mathrm{Exp}_\mu V$, $V \subset T_\mu M$.
    Projection $\pi : M \to H$, $\pi_H(x) = \arg\min_{y \in H} d(x, y)^2$.
    ◮ $v_1 = \arg\max_{\|v\|=1} \sum_{i=1}^n d(\mu, \pi_{S_v}(x_i))^2$
    ◮ $v_2 = \arg\max_{\|v\|=1,\ v \perp V_1} \sum_{i=1}^n \big[ d(\mu, \pi_{S_{v_1}}(x_i))^2 + d(\mu, \pi_{S_v}(x_i))^2 \big]$
    ◮ $\dots$
    ◮ $v_k = \arg\max_{\|v\|=1,\ v \perp V_{k-1}} \sum_{i=1}^n \Big[ \sum_{j=1}^{k-1} d(\mu, \pi_{S_{v_j}}(x_i))^2 + d(\mu, \pi_{S_v}(x_i))^2 \Big]$
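
A hedged sketch (not from the slides) of the projection $\pi_{S_v}$ onto a one-dimensional geodesic subspace $S_v = \mathrm{Exp}_\mu(t v)$ on the unit sphere, found by minimizing the squared geodesic distance over the parameter $t$; it reuses sphere_exp from the sketch after slide 10, and SciPy's scalar minimizer stands in for whatever solver one prefers.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def project_to_geodesic(x, mu, v):
    """x: point on the sphere; v: unit tangent vector at mu. Returns (t*, pi_{S_v}(x))."""
    def sq_dist(t):
        y = sphere_exp(mu, t * v)
        return np.arccos(np.clip(x @ y, -1.0, 1.0)) ** 2     # d(x, y)^2
    res = minimize_scalar(sq_dist, bounds=(-np.pi, np.pi), method='bounded')
    return res.x, sphere_exp(mu, res.x * v)
```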

24. Variance vs residuals
    Maximizing variance:
    $v_i = \arg\max_{\|v\|=1,\ v \in V_{i-1}^{\perp}} \sum_{j=1}^N d(\mu, \pi_{S_v}(x_j))^2$
    Minimizing residuals:
    $v_i = \arg\min_{\|v\|=1,\ v \in V_{i-1}^{\perp}} \sum_{j=1}^N d(x_j, \pi_{S_v}(x_j))^2$
    (Unlike in the Euclidean case, these two criteria are no longer equivalent on a curved manifold, since the Pythagorean identity fails; the following slides quantify the discrepancy in terms of curvature.)

25. What the PCA?!
    ◮ Tangent PCA (linearization)
    ◮ Geodesic PCA (computationally intensive)
    ◮ Let's try: Curvature PCA (quadratic PCA?)

28. Dealing with curvature
    Sphere of radius $1/\sqrt{K}$ (sectional curvature is $K$).
    $\cos\beta = \dfrac{\tan(\sqrt{K}\,c)}{\tan(\sqrt{K}\,a)}$, $\qquad \sin\beta = \dfrac{\sin(\sqrt{K}\,b)}{\sin(\sqrt{K}\,a)}$
    Note: if $K < 0$, then $\cos\beta = \dfrac{\tanh(\sqrt{-K}\,c)}{\tanh(\sqrt{-K}\,a)}$, which is the same formula as above, since $\tanh(ix) = i \tan x$.
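
A numerical check (not from the slides) of these right-triangle relations on the unit sphere ($K = 1$): with hypotenuse $a$ and angle $\beta$ at $\mu$, the foot of the projection lies at arc length $c = \arctan(\tan a \cos\beta)$ along $S_v$, and its distance to the data point is $b = \arcsin(\sin a \sin\beta)$. It reuses sphere_exp from the sketch after slide 10.

```python
import numpy as np

mu = np.array([0.0, 0.0, 1.0])
v = np.array([1.0, 0.0, 0.0])                    # direction of the geodesic S_v at mu
a, beta = 0.8, 0.5                               # hypotenuse length and angle at mu
u = np.cos(beta) * v + np.sin(beta) * np.array([0.0, 1.0, 0.0])
x = sphere_exp(mu, a * u)                        # data point at distance a from mu

c = np.arctan(np.tan(a) * np.cos(beta))          # projected arc length
b = np.arcsin(np.sin(a) * np.sin(beta))          # residual arc length
foot = sphere_exp(mu, c * v)
print(np.isclose(np.arccos(np.clip(x @ foot, -1.0, 1.0)), b))   # True
```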

30. Dealing with curvature
    Projected variance: $c = \dfrac{1}{\sqrt{K}} \arctan\!\big(\tan(\sqrt{K}\,a)\,\cos\beta\big)$
    Residual: $b = \dfrac{1}{\sqrt{K}} \arcsin\!\big(\sin(\sqrt{K}\,a)\,\sin\beta\big)$
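
A quick numerical sanity check (not from the slides): as $K \to 0$ these formulas reduce to the flat values $c = a\cos\beta$ and $b = a\sin\beta$.

```python
import numpy as np

a, beta = 0.8, 0.5
for K in (1.0, 0.1, 0.001):
    c = np.arctan(np.tan(np.sqrt(K) * a) * np.cos(beta)) / np.sqrt(K)
    b = np.arcsin(np.sin(np.sqrt(K) * a) * np.sin(beta)) / np.sqrt(K)
    print(K, c, b)
print(a * np.cos(beta), a * np.sin(beta))   # Euclidean limits
```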

31. Dealing with curvature
    Looking infinitesimally:
    Projected variance: $c^2 = a^2 \cos^2\beta + \dfrac{K}{6}\,a^4 \sin^2(2\beta) + o(K)$
    Residual: $b^2 = a^2 \sin^2\beta - \dfrac{K}{3}\,a^4 \sin^4\beta + o(K)$

32. For tangent vectors $a_1, a_2, \dots, a_n \in T_p M$ representing the data $x_1, x_2, \dots, x_n$, we seek the direction $v$ such that the projected variance is maximized:
    $v_1 = \arg\max_{\|v\|=1} \sum_{k=1}^n \Big[ \|a_k\|^2 \cos^2\beta_k + \dfrac{K(a_k, v)}{6}\,\|a_k\|^4 \sin^2(2\beta_k) \Big]$,
    where $\cos\beta_k = \langle a_k, v\rangle / (\|a_k\|\,\|v\|)$.
    Or, not equivalently, such that the residuals are minimized:
    $v_1 = \arg\min_{\|v\|=1} \sum_{k=1}^n \Big[ \|a_k\|^2 \sin^2\beta_k - \dfrac{K(a_k, v)}{3}\,\|a_k\|^4 \sin^4\beta_k \Big]$
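
A hedged sketch (not from the slides) of evaluating the curvature-corrected projected-variance objective for a candidate unit direction $v$. The callable sectional_curvature(a, v), standing for $K(a_k, v)$, is a placeholder that is not defined on the slides and must be supplied for the manifold at hand; maximizing this objective over unit vectors $v$ (e.g. with an optimizer constrained to the unit sphere) would give $v_1$.

```python
import numpy as np

def curvature_pca_objective(A, v, sectional_curvature):
    """A: (n, d) array of tangent vectors a_k at the mean; v: unit vector in T_mu M."""
    total = 0.0
    for a in A:
        na = np.linalg.norm(a)
        if na < 1e-12:
            continue
        cos_b = np.clip(a @ v / na, -1.0, 1.0)          # cos(beta_k), using ||v|| = 1
        sin_2b = 2.0 * cos_b * np.sqrt(1.0 - cos_b**2)  # sin(2 beta_k)
        K = sectional_curvature(a, v)                   # placeholder for K(a_k, v)
        total += na**2 * cos_b**2 + (K / 6.0) * na**4 * sin_2b**2
    return total
```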
