A Mathematical Theory of Dimensionality Reduction, Abbas Kazemipour

A Mathematical Theory of Dimensionality Reduction
Abbas Kazemipour
Druckmann Lab Meeting, October 26, 2018

Introduction: Decoding and Dimensionality Reduction
Autoencoders perform well in most applications, but not always:
§ They don't generalize easily
§ They are hard to understand theoretically
[Figure: autoencoder network with an Encoder stage followed by a Decoder stage]

Introduction: Decoding and Dimensionality Reduction
Our focus today: the decoder side
§ Observed data: $y_t \in \mathbb{R}^p$, $t = 1, 2, \cdots, T$
§ Common latents $x_t \in \mathbb{R}^m$ generate the data in an unknown nonlinear fashion: $y_{it} = f_i(x_t) + n_{it}$
[Figure: decoder network mapping the latents $x_t$ to the observations $y_t$]

Introduction: Role of Dynamics
Dynamics play an important role in solving the inverse problem: $x_t = g(x_{t-1})$, $y_{it} = f_i(x_t) + n_{it}$
The inverse problem is still ill-posed!
§ Latents can only be identified up to an isomorphism
[Figure: decoder network mapping the latents $x_t$ to the observations $y_t$]

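Koopman Theory Resolves this Ambiguity
Generalizes eigenfunctions/eigenvalues to nonlinear dynamics $x_t = g(x_{t-1})$
§ Koopman operator: linear, infinite-dimensional: $\mathcal{K} h(x) = h(g(x))$
› Linearizes the dynamics
§ Eigenfunctions/eigenvalues: $\mathcal{K} \varphi(x) = \lambda \varphi(x)$, i.e. $\varphi(g(x)) = \lambda \varphi(x)$
§ Observations in the eigenfunction basis: $y_t = \sum_{k=1}^{\infty} c_k \lambda_k \varphi_k(x_{t-1})$
Goal: $\varphi$ interacts nicely with $g$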
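The eigenfunction expansion on this slide follows in three steps once the observation map is written in the eigenfunction basis. The derivation below is a sketch under assumed notation: $f$ is the (noise-free) observation map, $g$ the dynamics, $\varphi_k$ the Koopman eigenfunctions with eigenvalues $\lambda_k$, and $c_k$ hypothetical expansion coefficients of $f$.

```latex
\begin{aligned}
y_t = f(x_t) &= \sum_{k=1}^{\infty} c_k \, \varphi_k(x_t) \\
             &= \sum_{k=1}^{\infty} c_k \, \varphi_k\bigl(g(x_{t-1})\bigr) \\
             &= \sum_{k=1}^{\infty} c_k \, \lambda_k \, \varphi_k(x_{t-1})
\end{aligned}
```

The last step uses only the eigenfunction property $\varphi_k(g(x)) = \lambda_k \varphi_k(x)$, which is why the nonlinear dynamics act linearly on the coefficients.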

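Polynomials are Eigenfunctions of Linear Dynamics!
Linear dynamical system: $x_t = A x_{t-1}$
§ Eigenpairs $(\lambda_i, v_i)$ of $A$ (left eigenvectors, $v_i^\top A = \lambda_i v_i^\top$): eigenfunction $v_i^\top x$ with eigenvalue $\lambda_i$
§ Polynomials: eigenfunction $(v_{i_1}^\top x)(v_{i_2}^\top x) \cdots (v_{i_k}^\top x)$ with eigenvalue $\lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_k}$
§ Posynomials: eigenfunction $(v_{i_1}^\top x)^{\alpha_1} (v_{i_2}^\top x)^{\alpha_2} \cdots (v_{i_k}^\top x)^{\alpha_k}$ with eigenvalue $\lambda_{i_1}^{\alpha_1} \lambda_{i_2}^{\alpha_2} \cdots \lambda_{i_k}^{\alpha_k}$
§ Complex $\alpha_i$'s for conjugate pairs
§ Periodic combinations, ReLU, etc. of eigenfunctions with the same $\lambda$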

Polynomials are Eigenfunctions of Linear Dynamics!
For $x_t = A x_{t-1}$: $\varphi(x) = (v_{i_1}^\top x)(v_{i_2}^\top x) \cdots (v_{i_k}^\top x)$
Good news: they also form a basis!
Nonlinear dimensionality reduction for dynamical systems ≡ Low-rank harmonic analysis
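The claim that products of linear eigenfunctions are again eigenfunctions can be checked numerically. The sketch below is illustrative only: the matrix `A`, the index tuple `idx`, and the helper names are assumptions, not values from the talk. It uses left eigenvectors of `A` (eigenvectors of `A.T`), for which $v^\top (A x) = \lambda \, v^\top x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stable dynamics matrix (illustrative values only).
A = np.array([[0.9, 0.2, 0.0],
              [0.0, 0.7, 0.1],
              [0.0, 0.0, 0.5]])

# Left eigenvectors of A are the eigenvectors of A transposed:
# v_i^T A = lambda_i v_i^T.
eigvals, V = np.linalg.eig(A.T)


def linear_eigfun(i, x):
    """phi_i(x) = v_i^T x, an eigenfunction with eigenvalue lambda_i."""
    return V[:, i] @ x


def product_eigfun(idx, x):
    """Product of linear eigenfunctions, i.e. a polynomial in x."""
    return np.prod([linear_eigfun(i, x) for i in idx])


x = rng.standard_normal(3)
idx = (0, 1, 1)  # phi_0 * phi_1^2, a cubic polynomial

lhs = product_eigfun(idx, A @ x)                              # phi(A x)
rhs = np.prod(eigvals[list(idx)]) * product_eigfun(idx, x)    # (prod of lambdas) * phi(x)
print(np.isclose(lhs, rhs))
```

The eigenvalue of the product is the product of the eigenvalues, which is the property the slide relies on.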

Polynomial Principal Component Analysis (Poly-PCA)
§ Replace deterministic linear dynamics with an AR model: $x_t = A x_{t-1} + \varepsilon_t$
§ Model observations as polynomials of degree ≤ $d$ in the latents: $y_{it} = \mathcal{C}_i \cdot x_t^{\otimes d} + n_{it}$
› $\mathcal{C}_i$: symmetric tensor of polynomial coefficients
› $x_t$: latents augmented with 1
$\min_{\mathcal{C}_i, x_t, A} \; \sum_{i,t} \bigl( y_{it} - \mathcal{C}_i \cdot x_t^{\otimes d} \bigr)^2 + \gamma \sum_t \| x_t - A x_{t-1} \|^2$
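To make the Poly-PCA objective concrete, here is a minimal sketch that evaluates it for degree 2. Everything in it is an assumption made for illustration: the function name `poly_pca_objective`, the array layout (observations `Y` of shape `(p, T)`, latents `X` of shape `(m, T)`, coefficient tensors `C` of shape `(p, m+1, m+1)`), and the trade-off `weight` on the AR term. It only computes the loss; it is not the optimizer used in the talk.

```python
import numpy as np


def poly_pca_objective(Y, C, X, A, weight=1.0):
    """Degree-2 Poly-PCA loss: polynomial fit error plus AR(1) residual.

    Y: (p, T) observations, C: (p, m+1, m+1) symmetric coefficient tensors,
    X: (m, T) latents, A: (m, m) AR matrix, weight: assumed trade-off weight.
    """
    T = X.shape[1]
    X_aug = np.vstack([X, np.ones((1, T))])          # latents augmented with 1
    # Y_hat[i, t] = x_aug_t^T C_i x_aug_t, i.e. <C_i, x_aug_t (x) x_aug_t>
    Y_hat = np.einsum('iab,at,bt->it', C, X_aug, X_aug)
    fit = np.sum((Y - Y_hat) ** 2)
    dyn = np.sum((X[:, 1:] - A @ X[:, :-1]) ** 2)    # AR(1) residual
    return fit + weight * dyn


# Noise-free data generated from the model itself (hypothetical sizes).
rng = np.random.default_rng(0)
m, p, T = 2, 5, 50
A = np.array([[0.9, -0.2],
              [0.2, 0.9]])
X = np.zeros((m, T))
X[:, 0] = [1.0, 0.0]
for t in range(1, T):
    X[:, t] = A @ X[:, t - 1]

C = rng.standard_normal((p, m + 1, m + 1))
C = (C + C.transpose(0, 2, 1)) / 2                   # symmetrize each C_i
X_aug = np.vstack([X, np.ones((1, T))])
Y = np.einsum('iab,at,bt->it', C, X_aug, X_aug)

loss_true = poly_pca_objective(Y, C, X, A)
loss_perturbed = poly_pca_objective(Y, C, X + 0.1, A)
```

With noise-free data the loss at the generating parameters is zero up to rounding, and perturbing the latents increases it.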

Example: Van der Pol Oscillator with Quadratic Measurements
§ A 2-dimensional nonlinear oscillator: $\dot{x}_1 = x_2$, $\dot{x}_2 = \mu (1 - x_1^2) x_2 - x_1$
§ Quadratic measurements: $y_{it} = x_t^\top C_i x_t + n_{it}$
§ For known rank-1 $C_i$ ≡ phase retrieval
[Figure: measurement traces plotted against time (s), 0 to 10 s]
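The example on this slide can be reproduced in outline as follows. This is a sketch under stated assumptions: the damping parameter `mu = 1.0`, the initial state, the step size, the number of measurement channels `p`, the random symmetric matrices `C`, and the noise level are all hypothetical choices, and a hand-written fixed-step RK4 integrator stands in for whatever solver was actually used.

```python
import numpy as np


def van_der_pol(x, mu=1.0):
    """Right-hand side: x1' = x2, x2' = mu (1 - x1^2) x2 - x1."""
    x1, x2 = x
    return np.array([x2, mu * (1.0 - x1 ** 2) * x2 - x1])


def simulate(x0, dt=0.01, steps=1000, mu=1.0):
    """Integrate the oscillator with classical fixed-step RK4."""
    X = np.empty((steps + 1, 2))
    X[0] = x0
    for k in range(steps):
        x = X[k]
        k1 = van_der_pol(x, mu)
        k2 = van_der_pol(x + 0.5 * dt * k1, mu)
        k3 = van_der_pol(x + 0.5 * dt * k2, mu)
        k4 = van_der_pol(x + dt * k3, mu)
        X[k + 1] = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return X


rng = np.random.default_rng(0)
X = simulate(x0=[2.0, 0.0])                      # 10 s of simulated time

p = 10                                           # assumed number of channels
C = rng.standard_normal((p, 2, 2))
C = (C + C.transpose(0, 2, 1)) / 2               # symmetric quadratic forms

# Quadratic measurements y_it = x_t^T C_i x_t, plus assumed Gaussian noise.
Y_clean = np.einsum('ta,iab,tb->ti', X, C, X)
Y = Y_clean + 0.05 * rng.standard_normal(Y_clean.shape)
```

Each column of `Y` is one measurement channel; plotting a few columns against time gives traces comparable in kind to the ones on the slide.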

Why does linear dimensionality reduction fail?
§ Nonlinearity changes topology
[Figure: latents recovered by PCA, ICA, LLE and t-SNE, plotted against time (s), 0 to 10 s]

Poly-PCA Recovers True Latents
[Figure: recovered vs. ground-truth latents, related by a non-singular linear map; time axis (s), 0 to 10 s]

Axioms of Dimensionality Reduction
§ Nonsingular linear transformations of the latents should also be a solution
§ Nonsingular and stable linear transformations of the measurements should result in the same latents
› Gives stability and robustness to outliers
§ Stable reconstruction is possible if
› $y_t = f(x_t)$ is Lipschitz: far-away latents do not map to close observations
Poly-PCA is compatible with these axioms!

Some Poly-PCA Theory
§ Poly-PCA ≡ constrained PCA
› ALS has no local minima; local minima appear from the polynomial constraints
§ Experimental observation: Poly-PCA has few local minima (compared to the count allowed by Bézout's theorem)
[Figure: least-squares error and Poly-PCA error along the feasible manifold, marking the (unique) LS solution, the Poly-PCA solution, a local minimum and a local maximum]
Also gives a good intuitive initialization

Some Poly-PCA Theory
§ Linear ambiguity can be handled by small penalization:
$\min_{\mathcal{C}_i, x_t} \; \sum_{i,t} \bigl( y_{it} - \mathcal{C}_i \cdot x_t^{\otimes d} \bigr)^2 + \epsilon_1 \sum_t \| x_t \|^2 + \epsilon_2 \sum_i \| \mathcal{C}_i \|^2$
[Figure: loss landscapes with labels: convex, one global minimum; nonconvex, manifold of equivalent local minima (all minima of Poly-PCA); after small regularization, one global minimum (optimal solution), local minimum 1, local minimum 2]

Some Poly-PCA Theory
§ Local minima are unique up to a linear transformation, i.e. for $x_t$ and $\mathcal{C}_i$ in general position and $T >$ […]:
$\mathcal{C}_i \cdot x_t^{\otimes d} = \mathcal{B}_i \cdot z_t^{\otimes d} \Rightarrow x_t = Q z_t$
› Generalization of linear preserver theory
§ Conjecture: the minimum required number of samples is $T >$ […] $+ 1$

Equivalence to a 1-layer Decoder with Polynomial Activation
$\mathcal{C}_i = \sum_{k=1}^{r} a_{ik}^{\otimes d} \;\Rightarrow\; \mathcal{C}_i \cdot x_t^{\otimes d} = \sum_{k=1}^{r} \bigl( a_{ik}^\top x_t \bigr)^d$
§ Not easy to train (Mondelli, Montanari, 2018)
§ Better to train directly on $\mathcal{C}_i$
§ Universal approximation theorem ≡ Taylor approximation theorem
[Figure: Poly-PCA decoder network with summation units feeding neuron 1 through neuron p]
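The equivalence between a one-layer decoder with polynomial activation and a symmetric coefficient tensor rests on the identity $\langle \sum_k a_k^{\otimes d}, x^{\otimes d} \rangle = \sum_k (a_k^\top x)^d$. The sketch below checks it numerically for degree 3. The sizes `m` and `r`, the weight matrix `a`, and the absence of output weights are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 3, 4                           # assumed latent dimension and hidden units
a = rng.standard_normal((r, m))       # hypothetical hidden-unit weights a_k
x = rng.standard_normal(m)            # one latent vector

# One-layer decoder with cubic activation: sum_k (a_k^T x)^3
decoder_out = np.sum((a @ x) ** 3)

# Symmetric coefficient tensor C = sum_k a_k (x) a_k (x) a_k
C = np.einsum('ka,kb,kc->abc', a, a, a)

# Polynomial form: <C, x (x) x (x) x>
poly_out = np.einsum('abc,a,b,c->', C, x, x, x)

print(np.isclose(decoder_out, poly_out))
```

Training directly on `C` treats the polynomial coefficients as the free parameters, whereas training the decoder treats the factors `a` as free and recovers `C` only implicitly.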

Poly-PCA Initialization
§ Strategy 1: Use PCA
› Beats PCA
› Need larger Lipschitz constant
§ Strategy 2: Data Embedding + PCA
› Use Takens' Embedding Theorem
[Figure: ground-truth trajectory vs. embedded trajectory built from the delayed coordinates $y_t(1)$, $y_{t-\tau}(1)$, $y_{t-2\tau}(1)$]
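Strategy 2 can be sketched as a delay embedding of one measurement channel followed by PCA on the embedded points. The code below is a minimal illustration, not the talk's implementation: the helper names `delay_embed` and `pca_init`, the embedding dimension, the lag `tau`, and the sinusoidal test signal are all assumptions.

```python
import numpy as np


def delay_embed(y, dim, tau):
    """Rows are [y_t, y_{t-tau}, ..., y_{t-(dim-1)*tau}] for every valid t."""
    y = np.asarray(y, dtype=float)
    start = (dim - 1) * tau
    cols = [y[start - k * tau: len(y) - k * tau] for k in range(dim)]
    return np.column_stack(cols)


def pca_init(E, n_latents):
    """Project the centered embedding onto its top principal directions."""
    Ec = E - E.mean(axis=0)
    _, _, Vt = np.linalg.svd(Ec, full_matrices=False)
    return Ec @ Vt[:n_latents].T


# Hypothetical single measurement channel: a 0.5 Hz sinusoid over 10 s.
t = np.linspace(0.0, 10.0, 1000, endpoint=False)
y = np.sin(2 * np.pi * 0.5 * t)

E = delay_embed(y, dim=3, tau=25)     # columns: y_t, y_{t-tau}, y_{t-2 tau}
X0 = pca_init(E, n_latents=2)         # candidate initial latents
```

The rows of `X0` could then serve as a starting point for the latents before running the Poly-PCA optimization; the choice of `dim` and `tau` would need tuning on real data.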
