Nonlinear Methods



  1. Nonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve, aka a manifold.

  2. Laplacian Eigenmaps. Linear methods – a lower-dimensional linear projection that preserves distances between all points. Laplacian Eigenmaps (key idea) – preserve local information only: construct a graph from the data points (to capture local information), then project the points into a low-dimensional space using “eigenvectors of the graph”.

  3. Step 1 – Graph Construction. Similarity graphs model local neighborhood relations between data points: G(V, E), with V – vertices (the data points) and E – edges. (1) ε-neighborhood graph: edge between x_i and x_j if ||x_i – x_j|| ≤ ε. (2) k-NN graph: k nearest neighbors yield a directed graph; connect A with B if A → B OR A ← B (symmetric kNN graph), or if A → B AND A ← B (mutual kNN graph). [Figure: directed nearest-neighbor graph, symmetric kNN graph, mutual kNN graph.]
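As a concrete illustration of these two constructions, here is a minimal sketch using scikit-learn's neighbor-graph helpers; the data is a random stand-in for points sampled from a manifold, and the values of ε and k are arbitrary placeholders.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

# Toy data standing in for points sampled from a manifold.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# (1) epsilon-neighborhood graph: edge if ||x_i - x_j|| <= eps.
eps = 1.0  # illustrative value; must be tuned per dataset
A_eps = radius_neighbors_graph(X, radius=eps, mode='connectivity')

# (2) k-NN graph: directed "i -> j" edges to each point's k nearest neighbors.
k = 10
A_dir = kneighbors_graph(X, n_neighbors=k, mode='connectivity')

# Symmetric kNN graph: connect A and B if A -> B OR A <- B.
A_sym = ((A_dir + A_dir.T) > 0).astype(float)

# Mutual kNN graph: connect A and B if A -> B AND A <- B.
A_mut = A_dir.multiply(A_dir.T)
```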

  4. Step 1 – Graph Construction. Choice of ε and k: chosen so that neighborhoods on the graph represent neighborhoods on the manifold (no “shortcuts” connecting different arms of the swiss roll). In practice the choice is mostly ad hoc.

  5. Step 1 – Graph Construction. G(V, E, W), with V – vertices (data points), E – edges (nearest neighbors), and W – edge weights. E.g. weight 1 if connected and 0 otherwise (adjacency graph), or the Gaussian kernel similarity function (aka heat kernel) W_ij = exp(−||x_i − x_j||² / (2σ²)); letting σ² → ∞ recovers the adjacency graph.
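A small sketch of the weighting step, assuming a dense 0/1 adjacency matrix from the previous construction; the bandwidth value is illustrative.

```python
import numpy as np

def heat_kernel_weights(X, A, sigma2=1.0):
    """Weight existing edges with the Gaussian (heat) kernel.

    A is a dense 0/1 adjacency matrix; sigma2 is the kernel bandwidth.
    As sigma2 grows, the weights approach 1, recovering the plain
    adjacency graph mentioned on the slide.
    """
    # Pairwise squared Euclidean distances.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2.0 * sigma2))
    return W * A  # keep weights only on edges of the neighborhood graph

# Toy usage on an epsilon-neighborhood adjacency matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
A = (np.linalg.norm(X[:, None] - X[None, :], axis=-1) <= 1.0).astype(float)
np.fill_diagonal(A, 0.0)
W = heat_kernel_weights(X, A, sigma2=1.0)
```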

  6. Step 2 – Embed using Graph Laplacian. Graph Laplacian (unnormalized version): L = D – W, where W is the weight matrix and D = diag(d_1, …, d_n) is the degree matrix with d_i = Σ_j W_ij. Note: if the graph is connected, the all-ones vector 1 is an eigenvector of L with eigenvalue 0.
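A quick numerical check of this slide: build L = D − W for a tiny weighted graph and confirm that the all-ones vector has eigenvalue 0 (every row of L sums to zero).

```python
import numpy as np

def unnormalized_laplacian(W):
    """L = D - W, with D = diag(d_1, ..., d_n) and d_i = sum_j W_ij."""
    d = W.sum(axis=1)
    return np.diag(d) - W

# Tiny weighted graph as a sanity check.
W = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
L = unnormalized_laplacian(W)

# The all-ones vector is an eigenvector with eigenvalue 0.
print(np.allclose(L @ np.ones(3), 0.0))  # True
```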

  7. Step 2 – Embed using Graph Laplacian. With L = D – W, solve the generalized eigenvalue problem L f = λ D f and order the eigenvalues 0 = λ_1 ≤ λ_2 ≤ λ_3 ≤ … ≤ λ_n. To embed the data points in a d-dimensional space, use the eigenvectors associated with λ_2, λ_3, …, λ_{d+1}; the 1st eigenvector is ignored because it gives the same embedding for all points. Original representation: data point x_i, a D-dimensional vector. Transformed representation: x_i → (f_2(i), …, f_{d+1}(i)), a d-dimensional vector.
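A minimal sketch of this embedding step, assuming a symmetric weight matrix W for a connected graph; scipy.linalg.eigh is used because it solves the symmetric generalized eigenvalue problem and returns eigenvalues in ascending order.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(W, d):
    """Embed points via the generalized eigenproblem L f = lambda D f.

    W: symmetric (n x n) weight matrix of a connected graph.
    d: target dimension. Returns an (n x d) matrix whose i-th row is
    (f_2(i), ..., f_{d+1}(i)); the first eigenvector is skipped because
    it is constant and maps every point to the same place.
    """
    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # unnormalized graph Laplacian
    eigvals, eigvecs = eigh(L, D)  # generalized problem, ascending eigenvalues
    return eigvecs[:, 1:d + 1]
```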

  8. Understanding Laplacian Eigenmaps. The best projection onto a 1-dim space, taken literally, puts all points in one place (the 1st eigenvector, all 1s). The useful requirement is that if two points are close on the graph, their embeddings are close, i.e. their eigenvector values are similar; this is captured by the smoothness of the eigenvectors. [Figure: Laplacian eigenvectors of the swiss roll example, for a large number of data points.]
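One way to see the smoothness intuition without the full swiss roll is a path (chain) graph, the discrete analogue of a 1-D manifold: the first Laplacian eigenvector is constant, while the second varies monotonically along the chain, so graph neighbors receive nearby embedding values. This toy graph is my own illustration, not from the slides.

```python
import numpy as np

# Path ("chain") graph on n nodes: node i is connected to node i+1.
n = 20
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

eigvals, eigvecs = np.linalg.eigh(L)   # ascending eigenvalues

# 1st eigenvector is constant (maps all points to one place);
# the 2nd varies smoothly along the chain.
f2 = eigvecs[:, 1]
print(np.allclose(np.std(eigvecs[:, 0]), 0.0))             # constant vector
print(np.all(np.diff(f2) > 0) or np.all(np.diff(f2) < 0))  # monotone along the chain
```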

  9. Step 2 – Embed using Graph Laplacian. Justification: points connected on the graph should stay as close as possible after embedding, i.e. minimize the weighted sum of squared embedding gaps. This objective is exactly the Laplacian quadratic form: ½ Σ_ij W_ij (f(i) − f(j))² = f^T (D − W) f = f^T D f − f^T W f = f^T L f.
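A short numerical verification of this identity on a random symmetric weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
# Random symmetric non-negative weights as a stand-in graph.
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

f = rng.normal(size=n)
lhs = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)  # weighted embedding gaps
rhs = f @ L @ f                                          # Laplacian quadratic form
print(np.allclose(lhs, rhs))  # True
```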

  10. Step 2 – Embed using Graph Laplacian. Justification, continued: minimize f^T L f subject to a constraint that removes the arbitrary scaling factor in the embedding (f^T D f = 1). Wrapping the constraint into the Lagrangian of the objective function yields the generalized eigenvalue problem from slide 7.
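Spelled out, the optimization this slide describes looks roughly like the following; this is a sketch of the standard derivation, using the scaling constraint f^T D f = 1 that matches the generalized eigenproblem on slide 7.

```latex
% Objective: keep points that are connected on the graph close after embedding
\min_{f}\; f^{\top} L f \;=\; \tfrac{1}{2}\sum_{i,j} W_{ij}\,\bigl(f(i)-f(j)\bigr)^{2}
\qquad \text{subject to} \qquad f^{\top} D f = 1

% Wrap the constraint into the Lagrangian of the objective:
\mathcal{L}(f,\lambda) \;=\; f^{\top} L f \;-\; \lambda\,\bigl(f^{\top} D f - 1\bigr)

% Setting the gradient with respect to f to zero gives the
% generalized eigenvalue problem used in Step 2:
\nabla_{f}\,\mathcal{L} = 0 \;\;\Longrightarrow\;\; L f = \lambda\, D f
```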

  11. Example – Unrolling the swiss roll. N = number of nearest neighbors, t = the heat kernel parameter (Belkin & Niyogi ’03).
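For reference, scikit-learn's SpectralEmbedding gives a ready-made way to reproduce this kind of unrolling; note that it uses a normalized graph Laplacian internally, a common variant of the unnormalized construction on these slides, and the neighbor count below is just an illustrative choice.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

# Sample points from the swiss roll and "unroll" it into 2-D.
X, color = make_swiss_roll(n_samples=2000, random_state=0)

emb = SpectralEmbedding(n_components=2,
                        affinity='nearest_neighbors',
                        n_neighbors=10,      # the "N" on the slide; illustrative value
                        random_state=0)
Y = emb.fit_transform(X)
print(Y.shape)  # (2000, 2)
```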

  12. Example – Understanding the syntactic structure of words. The 300 most frequent words of the Brown corpus; each word is represented by the frequencies of its left and right neighbors (a 600-dimensional space). The algorithm was run with N = 14, t = 1. [Figure: verbs and prepositions form distinct clusters in the embedding.]
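A rough, hedged reconstruction of this experiment, assuming NLTK's copy of the Brown corpus is available; the exact feature construction and the kNN affinity below (used in place of the heat kernel with t = 1) are my own simplifications, so results will only approximate the original figure.

```python
from collections import Counter

import numpy as np
import nltk
from nltk.corpus import brown
from sklearn.manifold import SpectralEmbedding

nltk.download('brown', quiet=True)  # fetch the corpus if not already present

words = [w.lower() for w in brown.words()]
top = [w for w, _ in Counter(words).most_common(300)]
idx = {w: i for i, w in enumerate(top)}

# 600-dim representation: counts of top-300 words seen immediately to the
# left (dims 0..299) and to the right (dims 300..599) of each target word.
X = np.zeros((300, 600))
for left, right in zip(words, words[1:]):
    if left in idx and right in idx:
        X[idx[right], idx[left]] += 1        # `left` occurs to the left of `right`
        X[idx[left], 300 + idx[right]] += 1  # `right` occurs to the right of `left`

emb = SpectralEmbedding(n_components=2, affinity='nearest_neighbors',
                        n_neighbors=14, random_state=0)  # N = 14 as on the slide
Y = emb.fit_transform(X)
print(Y.shape)  # (300, 2); words with similar syntactic roles tend to cluster
```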

  13. PCA vs. Laplacian Eigenmaps.
  PCA: linear embedding, based on the largest eigenvectors of the D x D correlation matrix S = XX^T (between features). The eigenvectors give latent features; to get the embedding of a point, project it onto the latent features: x_i → [v_1^T x_i, v_2^T x_i, …, v_d^T x_i]^T (D x 1 → d x 1).
  Laplacian Eigenmaps: nonlinear embedding, based on the smallest eigenvectors of the n x n Laplacian matrix L = D – W (between data points). The eigenvectors directly give the embedding of the data points: x_i → [f_2(i), …, f_{d+1}(i)]^T (D x 1 → d x 1).
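The contrast in this comparison can be made concrete with a few lines of NumPy/SciPy: PCA projects points onto the top eigenvectors of a D x D feature-by-feature matrix, while Laplacian Eigenmaps reads the embedding directly off the bottom nontrivial eigenvectors of the n x n Laplacian. A sketch, with centering for PCA and graph sparsification omitted for brevity:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n, D, d = 200, 5, 2
X = rng.normal(size=(n, D))   # rows are data points

# PCA: largest eigenvectors of the D x D feature-by-feature matrix
# (X.T @ X here, since rows are data points); points are then *projected*
# onto those eigenvectors.
S = X.T @ X
_, V = eigh(S)                 # ascending order
V_top = V[:, -d:]              # the d largest eigenvectors (D x d)
Y_pca = X @ V_top              # x_i -> [v_1^T x_i, ..., v_d^T x_i]

# Laplacian Eigenmaps: smallest eigenvectors of the n x n Laplacian L = D - W
# (between data points); the eigenvector entries *are* the embedding.
dist2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-dist2 / 2.0)       # heat-kernel weights on a complete graph
np.fill_diagonal(W, 0.0)
Deg = np.diag(W.sum(axis=1))
L = Deg - W
_, F = eigh(L, Deg)            # generalized problem, ascending eigenvalues
Y_le = F[:, 1:d + 1]           # x_i -> [f_2(i), ..., f_{d+1}(i)]

print(Y_pca.shape, Y_le.shape) # both (n, d)
```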

  14. Dimensionality Reduction Methods. Feature selection – only a few features are relevant to the learning task; score features (mutual information, prediction accuracy, domain knowledge) or use regularization. Latent features – some linear/nonlinear combination of features provides a more efficient representation than the observed features. Linear: low-dimensional linear subspace projection – PCA (Principal Component Analysis), MDS (Multi-Dimensional Scaling), Factor Analysis, ICA (Independent Component Analysis). Nonlinear: low-dimensional nonlinear projection that preserves local information along the manifold – Laplacian Eigenmaps, ISOMAP, Kernel PCA, LLE (Locally Linear Embedding), and many, many more.
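Most of the latent-feature methods listed here have off-the-shelf implementations in scikit-learn; a small sketch applying several of them to the same dataset (parameter values are illustrative):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA, FastICA, KernelPCA
from sklearn.manifold import MDS, Isomap, LocallyLinearEmbedding, SpectralEmbedding

X, _ = make_swiss_roll(n_samples=500, random_state=0)

methods = {
    'PCA': PCA(n_components=2),
    'ICA': FastICA(n_components=2, random_state=0),
    'MDS': MDS(n_components=2, random_state=0),
    'Kernel PCA': KernelPCA(n_components=2, kernel='rbf'),
    'Isomap': Isomap(n_components=2, n_neighbors=10),
    'LLE': LocallyLinearEmbedding(n_components=2, n_neighbors=10),
    'Laplacian Eigenmaps': SpectralEmbedding(n_components=2, n_neighbors=10),
}
for name, est in methods.items():
    Y = est.fit_transform(X)   # each method maps the 3-D swiss roll to 2-D
    print(f'{name}: {Y.shape}')
```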
