Faculty of Science
Diffusion Processes and Dimensionality Reduction on Manifolds
ESI, Vienna, Feb. 2015
Stefan Sommer
Department of Computer Science, University of Copenhagen
February 23, 2015
Slide 1/22
Outline
• Dimensionality Reduction
• Diffusion PCA
• Development and Anisotropic Diffusions
• Examples
Stefan Sommer (sommer@diku.dk), Department of Computer Science, University of Copenhagen
Slide 2/22
Dimensionality Reduction in Non-Linear Manifolds
• dimensionality reduction and linearizations: mappings from non-linear manifolds to low-dimensional Euclidean space that preserve the structure of the data
• Non-Euclidean generalizations of PCA:
  • Principal Geodesic Analysis (PGA, Fletcher et al., ’04)
  • Geodesic PCA (GPCA, Huckemann et al., ’10)
  • Horizontal Component Analysis (HCA, Sommer, ’13)
  • Principal Nested Spheres ((C)PNS, Jung et al., ’12)
  • Barycentric Subspaces (BS, Pennec, ’15)
Slide 3/22
PGA: analysis relative to the data mean (Slide 3/22, animation steps)
• data points on non-linear manifold
• intrinsic mean $\mu$
• tangent space $T_\mu M$
• projection of data point to $T_\mu M$
• Euclidean PCA in tangent space
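The PGA steps listed on this slide (intrinsic mean, projection to the tangent space at the mean, Euclidean PCA there) can be sketched on the unit sphere $S^2$. This is a minimal illustration, not the implementation behind the talk: the choice of manifold, the fixed-point iteration for the mean, the synthetic data, and all function names are assumptions made for the example.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p with initial velocity v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def sphere_log(p, q):
    """Log map: tangent vector at p pointing towards q, with length d(p, q)."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_theta * p                      # part of q orthogonal to p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * w / norm_w

def intrinsic_mean(points, iters=100, tol=1e-10):
    """Fixed-point iteration: move mu along the average of the log-mapped data."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        step = np.mean([sphere_log(mu, q) for q in points], axis=0)
        mu = sphere_exp(mu, step)
        if np.linalg.norm(step) < tol:
            break
    return mu

def tangent_pca(points):
    """Euclidean PCA of the data projected to the tangent space at the intrinsic mean."""
    mu = intrinsic_mean(points)
    logs = np.array([sphere_log(mu, q) for q in points])   # rows lie in T_mu M
    cov = logs.T @ logs / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]                      # largest variance first
    return mu, eigvals[order], eigvecs[:, order]

# Synthetic data (assumption): points near the north pole, spread mostly along x.
rng = np.random.default_rng(0)
north = np.array([0.0, 0.0, 1.0])
velocities = np.column_stack([0.5 * rng.standard_normal(200),
                              0.05 * rng.standard_normal(200),
                              np.zeros(200)])
data = np.array([sphere_exp(north, v) for v in velocities])
mu, variances, directions = tangent_pca(data)
```

All quantities are computed relative to the single point `mu`, which is the sense in which the slide calls PGA an analysis relative to the data mean.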
[Figure: panels labeled PGA, GPCA, and HCA]
PGA, GPCA, HCA, PNS, ...
• search for explicitly constructed parametric subspaces: geodesic sprays, geodesics, iterated development, ...
• in general manifolds, these subspaces are not totally geodesic
• projections to subspaces are problematic: geodesics may be dense on tori
Slide 4/22
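The remark about tori can be made concrete on the flat torus $\mathbb{R}^2 / \mathbb{Z}^2$: a geodesic with irrational slope comes arbitrarily close to every point as it is extended, so the distance from a data point to the geodesic shrinks towards zero and the projection carries little information. The sketch below is only an illustration; the target point, the slope $\sqrt{2}$, the sampling density, and the function names are assumptions.

```python
import numpy as np

def torus_distance(p, q):
    """Geodesic distance on the flat torus R^2 / Z^2 (coordinates taken modulo 1)."""
    d = np.abs(p - q)
    d = np.minimum(d, 1.0 - d)          # shorter way around in each coordinate
    return np.linalg.norm(d, axis=-1)

def distance_to_geodesic(target, slope, t_max, samples_per_unit=1000):
    """Smallest sampled distance from target to the geodesic t -> (t, slope * t) mod 1, 0 <= t <= t_max."""
    t = np.linspace(0.0, t_max, int(t_max * samples_per_unit) + 1)
    curve = np.stack([t, slope * t], axis=-1) % 1.0
    return torus_distance(curve, target).min()

target = np.array([0.3, 0.8])           # arbitrary data point (assumption)
slope = np.sqrt(2.0)                    # irrational slope, so the geodesic never closes up
gaps = [distance_to_geodesic(target, slope, t_max) for t_max in (1, 10, 100, 1000)]
```

The entries of `gaps` do not increase as `t_max` grows, which is the numerical counterpart of the density statement on the slide.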
Generalizing Linear Statistics
Euclidean              Riemannian
norm $\|x - y\|$       distances $d(x, y)$
vectors                $v_0$ for geodesics
linear subspaces       geodesic sprays
...                    ...
Why are geodesics fundamental when estimating covariance?
• Euclidean space analogies can lead to non-local constructions
• the goal of this talk is to get closer to constructions defined “infinitesimally”
Slide 5/22
Euclidean PCA
Usual formulation:
• eigendecomposition $(u_1, \lambda_1), \ldots, (u_d, \lambda_d)$ of the sample covariance matrix $C$
• principal components: $x_n = U^T (y_n - \mu)$
Probabilistic interpretation (Tipping, Bishop, ’99):
• latent variable model $y = W x + \mu + \varepsilon$, with $x \sim N(0, I)$ and $\varepsilon \sim N(0, \sigma^2 I)$
• marginal distribution $y \sim N(\mu, C_\sigma)$, $C_\sigma = W W^T + \sigma^2 I$
• MLE of $W$: $W_{ML} = U (\Lambda - \sigma^2 I)^{1/2}$ up to rotation, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$
• principal components: $E[x_n \mid y_n] = (W_{ML}^T W_{ML} + \sigma^2 I)^{-1} W_{ML}^T (y_n - \mu)$
Slide 6/22
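The closed-form estimates on this slide can be checked numerically. The following is a minimal NumPy sketch under stated assumptions: it keeps only $q < d$ latent dimensions and estimates $\sigma^2$ as the mean of the discarded eigenvalues, both taken from the Tipping and Bishop formulation rather than from the slide, and it fixes the free rotation to the identity. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def ppca_mle(Y, q):
    """Closed-form ML estimates for probabilistic PCA with q latent dimensions (q < d)."""
    mu = Y.mean(axis=0)
    centered = Y - mu
    C = centered.T @ centered / len(Y)            # sample covariance (1/N normalization)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]             # sort eigenpairs by decreasing eigenvalue
    lam, U = eigvals[order], eigvecs[:, order]
    sigma2 = lam[q:].mean()                       # ML noise variance: mean of discarded eigenvalues
    W = U[:, :q] @ np.diag(np.sqrt(lam[:q] - sigma2))   # W_ML with the rotation set to the identity
    return mu, W, sigma2

def posterior_mean(Y, mu, W, sigma2):
    """E[x_n | y_n] = (W^T W + sigma^2 I)^{-1} W^T (y_n - mu), one row per data point."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (Y - mu).T).T

# Synthetic data drawn from the latent variable model itself (assumption).
rng = np.random.default_rng(0)
W_true = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
mu_true = np.array([1.0, -1.0, 0.5])
X = rng.standard_normal((5000, 2))
Y = X @ W_true.T + mu_true + 0.1 * rng.standard_normal((5000, 3))

mu_hat, W_hat, sigma2_hat = ppca_mle(Y, q=2)
C_model = W_hat @ W_hat.T + sigma2_hat * np.eye(3)    # fitted marginal covariance C_sigma
latent = posterior_mean(Y, mu_hat, W_hat, sigma2_hat)
```

With $q = d - 1$, as here, the fitted $C_\sigma$ reproduces the sample covariance exactly, which makes the relation $C_\sigma = W W^T + \sigma^2 I$ easy to verify. No subspace projection appears anywhere; the principal components come out of the posterior of the fitted model.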
Diffusion PCA
• probabilistic PCA does not explicitly use subspaces
• on Riemannian manifolds, the Eells-Elworthy-Malliavin construction gives a map $\int_{\mathrm{Diff}} : FM \to \mathrm{Dens}(M)$
• $\Gamma \subset \mathrm{Dens}(M)$: the image $\int_{\mathrm{Diff}}(FM)$, the set of (normalized) densities resulting from diffusions in $FM$
• $\mu \in \Gamma$ ≈ anisotropic normal distribution
• with $\mu = \int_{\mathrm{Diff}}(x, X_\alpha) = p_\mu \mu_0$, define the log-likelihood
  $\ln L(x, X_\alpha) = \ln L(\mu) = \sum_{i=1}^{N} \ln p_\mu(y_i)$
• Diffusion PCA: maximize $\ln L(x, X_\alpha)$ for $(x, X_\alpha) \in FM$
• MLE of data $y_i$ under the assumption $y \sim \mu \in \Gamma$
Slide 7/22
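As a sanity check, the construction can be worked through in the flat special case $M = \mathbb{R}^d$, which the slide does not treat. There the diffusion started at $x$ with frame $X_\alpha$ is $dY_t = X_\alpha \, dW_t$, and if the density is taken at time $t = 1$ (an assumption made here) it is the normal distribution $N(x, X_\alpha X_\alpha^T)$. The log-likelihood is then Gaussian and is maximized by the sample mean together with any frame whose Gram matrix equals the sample covariance. The sketch simulates the diffusion with an Euler-Maruyama scheme and evaluates $\ln L$; all names and parameter values are hypothetical, and on a curved manifold $p_\mu$ has no such closed form.

```python
import numpy as np

def simulate_endpoints(x, X, n_paths, n_steps, rng):
    """Euler-Maruyama for dY_t = X dW_t, Y_0 = x, on [0, 1]; returns Y_1 for every path."""
    dt = 1.0 / n_steps
    Y = np.tile(x, (n_paths, 1))
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal((n_paths, X.shape[1]))
        Y = Y + dW @ X.T                      # each row is updated by X applied to its increment
    return Y

def log_likelihood(x, X, data):
    """ln L(x, X) = sum_i ln p(y_i), with p the N(x, X X^T) density (flat case only)."""
    d = len(x)
    Sigma = X @ X.T
    diff = data - x
    _, logdet = np.linalg.slogdet(Sigma)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(Sigma), diff)
    return -0.5 * np.sum(maha + logdet + d * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0])
X_true = np.array([[2.0, 0.0], [1.0, 0.5]])   # frame with anisotropic covariance X X^T
data = simulate_endpoints(x_true, X_true, n_paths=20000, n_steps=50, rng=rng)

# Maximum likelihood in the flat case: sample mean and a square root of the sample covariance.
x_hat = data.mean(axis=0)
Sigma_hat = np.cov(data.T, bias=True)
X_hat = np.linalg.cholesky(Sigma_hat)         # one of many frames with X X^T = Sigma_hat
```

The frame is only determined up to rotation, mirroring the rotation ambiguity of $W_{ML}$ in probabilistic PCA.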