Dimensionality Reduction
Lecture 9
David Sontag, New York University
Slides adapted from Carlos Guestrin and Luke Zettlemoyer

Class notes
• PS5 will be released by Friday, due Monday 4/14
• Feedback on project proposals will be sent to you between now and Monday
• PS6 will be released the week of 4/14
• Make sure you are making steady progress on your projects; keep to your timeline!

Dimensionality reduction
• Input data may have thousands or millions of dimensions!
  – e.g., text data has ???, images have ???
• Dimensionality reduction: represent data with fewer dimensions
  – easier learning: fewer parameters
  – visualization: show high dimensional data in 2D
  – discover "intrinsic dimensionality" of data
    • high dimensional data that is truly lower dimensional
    • noise reduction

Dimension Reduction
• Assumption: data (approximately) lies on a lower dimensional space
• Examples:
  [Figure panels: n = 2, k = 1; n = 3, k = 2]
Slide from Yi Zhang

Example (from Bishop)
• Suppose we have a dataset of digits ("3") perturbed in various ways:
• What operations did I perform? What is the data's intrinsic dimensionality?
• Here the underlying manifold is nonlinear

Lower dimensional projections
• Obtain new feature vector by transforming the original features $x_1, \ldots, x_n$:
  $$z_1 = w_0^{(1)} + \sum_i w_i^{(1)} x_i, \quad \ldots, \quad z_k = w_0^{(k)} + \sum_i w_i^{(k)} x_i$$
  In general this will not be invertible: cannot go from z back to x
• New features are linear combinations of old ones
• Reduces dimension when k < n
• This is typically done in an unsupervised setting
  – just X, but no Y
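As a minimal sketch of the linear map on this slide, assuming NumPy: the weights `W`, the offsets `w0`, and the sizes are made up for illustration and are not from the lecture. Two different inputs map to the same z, which is one way to see why the map cannot be inverted when k < n.

```python
import numpy as np

def project(x, W, w0):
    """Return z with z_j = w0[j] + sum_i W[j, i] * x[i]."""
    return w0 + W @ x

n, k = 3, 2  # illustrative sizes with k < n
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])   # hypothetical weights: keep x_1 and x_2
w0 = np.zeros(k)                  # hypothetical offsets

x_a = np.array([1.0, 2.0, 3.0])
x_b = np.array([1.0, 2.0, -7.0])  # differs from x_a only in the dropped direction

z_a = project(x_a, W, w0)
z_b = project(x_b, W, w0)         # same z as x_a, so x cannot be recovered from z
```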

Which projection is better?
From notes by Andrew Ng

Reminder: Vector Projections
• Basic definitions:
  – A·B = |A||B| cos θ
  – cos θ = |adj|/|hyp|
• Assume |B| = 1 (unit vector)
  – A·B = |A| cos θ
  – So, dot product is length of projection!!!
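A small numeric check of the reminder, assuming NumPy and two made-up vectors. It computes the projection length as a dot product with a unit vector and confirms that what is left over is perpendicular to that vector.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])            # unit vector, |B| = 1

proj_length = a @ b                 # A.B = |A| cos(theta) when |B| = 1
proj_vector = proj_length * b       # the projection of A onto the direction of B
residual = a - proj_vector          # the part of A that the projection discards

cos_theta = proj_length / np.linalg.norm(a)   # |adj| / |hyp|
```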

Using a new basis for the data
• Project a point into a (lower dimensional) space:
  – point: $x = (x_1, \ldots, x_n)$
  – select a basis: set of unit (length 1) basis vectors $(u_1, \ldots, u_k)$
    • we consider orthonormal basis: $u_j \cdot u_j = 1$, and $u_j \cdot u_l = 0$ for $j \neq l$
  – select a center $\bar{x}$, which defines the offset of the space
  – best coordinates in lower dimensional space defined by dot products: $(z_1, \ldots, z_k)$, $z_j^i = (x^i - \bar{x}) \cdot u_j$
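The coordinate formula can be sketched as follows, assuming NumPy. The data and the basis `U` are made up for illustration; the basis here is simply the first two coordinate axes, not one chosen by PCA.

```python
import numpy as np

X = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [4.0, 4.0, 1.0]])    # made-up data, one row per point
x_bar = X.mean(axis=0)             # the center, here taken to be the mean

# Hypothetical orthonormal basis (k = 2), stored as the rows of U.
U = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
gram = U @ U.T                     # identity matrix when the basis is orthonormal

Z = (X - x_bar) @ U.T              # Z[i, j] = (x^i - x_bar) . u_j
X_hat = x_bar + Z @ U              # map the coordinates back into the original space
```

In this toy data the third feature is constant, so the two basis vectors capture everything and `X_hat` equals `X`; in general the reconstruction is only approximate.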

Maximize variance of projection
Let $x^{(i)}$ be the $i$-th data point minus the mean. Choose unit-length $u$ to maximize:
$$\frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)\top} u\right)^2 = \frac{1}{m}\sum_{i=1}^{m} u^\top x^{(i)} x^{(i)\top} u = u^\top \left(\frac{1}{m}\sum_{i=1}^{m} x^{(i)} x^{(i)\top}\right) u,$$
where the term in parentheses is the covariance matrix $\Sigma$.
Let $\|u\| = 1$ and maximize. Using the method of Lagrange multipliers, one can show that the solution is given by the principal eigenvector of the covariance matrix! (shown on board)
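The slide leaves the Lagrange multiplier step to the board. One standard way to fill it in, writing $\lambda$ for the multiplier on the constraint $u^\top u = 1$ (the symbol $\mathcal{L}$ is introduced here and is not from the slides), is:

```latex
\begin{align*}
\mathcal{L}(u, \lambda) &= u^\top \Sigma u - \lambda\,(u^\top u - 1) \\
\nabla_u \mathcal{L} &= 2\Sigma u - 2\lambda u = 0
  \;\Longrightarrow\; \Sigma u = \lambda u \\
u^\top \Sigma u &= \lambda\, u^\top u = \lambda
\end{align*}
```

So every stationary point is an eigenvector of $\Sigma$, and the projected variance at that point equals its eigenvalue. The maximum is therefore attained at the eigenvector with the largest eigenvalue.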

Basic PCA algorithm [Pearson 1901, Hotelling 1933]
• Start from m by n data matrix X
• Recenter: subtract mean from each row of X
  – $X_c \leftarrow X - \bar{X}$
• Compute covariance matrix:
  – $\Sigma \leftarrow \frac{1}{m} X_c^\top X_c$
• Find eigenvectors and eigenvalues of $\Sigma$
• Principal components: k eigenvectors with highest eigenvalues
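The steps of the basic algorithm can be sketched in NumPy as below. The function name `pca_eig` and the synthetic data are assumptions made for illustration; `np.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np

def pca_eig(X, k):
    """Return (mean, top-k eigenvalues, top-k eigenvectors as rows)."""
    m = X.shape[0]
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                             # recenter
    Sigma = (Xc.T @ Xc) / m                    # n-by-n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)   # ascending eigenvalues, eigenvectors in columns
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    return x_bar, eigvals[order], eigvecs[:, order].T

rng = np.random.default_rng(0)                 # synthetic data, for illustration only
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.01 * rng.normal(size=200)])

x_bar, lam, U = pca_eig(X, k=1)                # U[0] should point along (1, 2), up to sign
```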

PCA example
[Figure panels: Data, Projection, Reconstruction]

Dimensionality reduction with PCA
In high-dimensional problems, data usually lies near a linear subspace, as noise introduces small variability.
Only keep data projections onto principal components with large eigenvalues.
Can ignore the components of lesser significance.
$$\mathrm{var}(z_j) = \frac{1}{m}\sum_{i=1}^{m} (z_j^i)^2 = \frac{1}{m}\sum_{i=1}^{m} (x^i \cdot u_j)^2 = \lambda_j$$
Percentage of total variance captured by dimension $z_j$ for j = 1 to 10: $\lambda_j / \sum_{l=1}^{n} \lambda_l$
[Figure: bar chart of Variance (%) for PC1 through PC10]
You might lose some information, but if the eigenvalues are small, you don't lose much.
Slide from Aarti Singh
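A short sketch of the variance percentages, assuming NumPy. The eigenvalues and both helper names are made up for illustration; `components_needed` is one possible rule for choosing k and is not taken from the slides.

```python
import numpy as np

def explained_variance_percent(eigvals):
    """Percentage of total variance captured by each component."""
    eigvals = np.asarray(eigvals, dtype=float)
    return 100.0 * eigvals / eigvals.sum()

def components_needed(eigvals, target_percent):
    """Smallest k whose top-k components reach target_percent of the variance."""
    pct = np.sort(explained_variance_percent(eigvals))[::-1]
    k = int(np.searchsorted(np.cumsum(pct), target_percent)) + 1
    return min(k, len(pct))

lam = np.array([5.0, 3.0, 1.5, 0.5])     # made-up eigenvalues
pct = explained_variance_percent(lam)    # per-component share of the variance
k = components_needed(lam, 90.0)         # components needed to reach 90%
```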

Eigenfaces [Turk, Pentland '91]
[Figure panels: Input images, Principal components]

Eigenfaces reconstruction
• Each image corresponds to adding together (weighted versions of) the principal components:
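The weighted-sum view of reconstruction can be sketched as follows, assuming NumPy. Random vectors stand in for flattened images, and the mean is added back because the components are computed from centered data; both choices are assumptions of this sketch rather than details from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))             # stand-in for 20 flattened "images" of 6 pixels
x_bar = X.mean(axis=0)
Xc = X - x_bar

# Rows of `components` are principal directions, sorted by decreasing variance.
_, _, components = np.linalg.svd(Xc, full_matrices=False)

def reconstruct(x, x_bar, components, k):
    """Mean plus the weighted sum of the first k principal components."""
    weights = components[:k] @ (x - x_bar)   # one weight per component
    return x_bar + weights @ components[:k]

# Reconstruction error of the first image as more components are added.
errors = [np.linalg.norm(X[0] - reconstruct(X[0], x_bar, components, k))
          for k in range(7)]
```

The error can only shrink as k grows, and it vanishes once all components are used.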

Scaling up
• Covariance matrix can be really big!
  – $\Sigma$ is n by n
  – 10000 features can be common!
  – finding eigenvectors is very slow…
• Use singular value decomposition (SVD)
  – Finds k eigenvectors
  – great implementations available, e.g., Matlab svd

SVD
• Write $X = Z S U^\top$
  – X ← data matrix, one row per datapoint
  – S ← singular value matrix, diagonal matrix with entries $\sigma_i$
    • Relationship between singular values of X and eigenvalues of $\Sigma$ given by $\lambda_i = \sigma_i^2 / m$
  – Z ← weight matrix, one row per datapoint
    • Z times S gives coordinate of $x_i$ in eigenspace
  – $U^\top$ ← singular vector matrix
    • In our setting, each row is eigenvector $u_j$
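A numeric check of the relationships on this slide, assuming NumPy and synthetic data. The sketch applies the SVD to the centered matrix, which is what makes the singular values line up with the eigenvalues of the covariance matrix; the variable names follow the slide's Z, S, and U where possible.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))   # synthetic correlated data
m = X.shape[0]
Xc = X - X.mean(axis=0)

Z, s, Ut = np.linalg.svd(Xc, full_matrices=False)    # Xc = Z @ diag(s) @ Ut
Sigma = (Xc.T @ Xc) / m
eigvals = np.sort(np.linalg.eigvalsh(Sigma))[::-1]   # descending, like s

from_singular_values = s ** 2 / m                    # lambda_i = sigma_i^2 / m
coords = Z * s                                       # Z times S: coordinates in eigenspace
```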

PCA using SVD algorithm
• Start from m by n data matrix X
• Recenter: subtract mean from each row of X
  – $X_c \leftarrow X - \bar{X}$
• Call SVD algorithm on $X_c$: ask for k singular vectors
• Principal components: k singular vectors with highest singular values (rows of $U^\top$)
  – Coefficients: project each point onto the new vectors
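The SVD-based procedure can be sketched as below, assuming NumPy. The function name `pca_svd` is made up for illustration. This version computes the full thin SVD and then keeps the first k rows; a truncated solver such as `scipy.sparse.linalg.svds` could be asked for k vectors directly, which is likely the better choice when n is large.

```python
import numpy as np

def pca_svd(X, k):
    """Return (mean, k principal components as rows, coefficients, top-k singular values)."""
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                                       # recenter
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)    # singular values in descending order
    components = Vt[:k]                                  # k right singular vectors
    coeffs = Xc @ components.T                           # project each point onto them
    return x_bar, components, coeffs, s[:k]

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))                            # synthetic data, for illustration only
x_bar, components, coeffs, s = pca_svd(X, k=2)
```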

Non-linear methods
• Linear:
  – Principal Component Analysis (PCA)
  – Factor Analysis
  – Independent Component Analysis (ICA)
• Nonlinear:
  – Laplacian Eigenmaps
  – ISOMAP
  – Local Linear Embedding (LLE)
Slide from Aarti Singh

Isomap
• Goal: use geodesic distance between points (with respect to manifold)
• Estimate manifold using graph. Distance between points given by distance of shortest path
• Embed onto 2D plane so that Euclidean distance approximates graph distance
[Tenenbaum, Silva, Langford. Science 2000]
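The three stages can be sketched in plain NumPy as follows. This is a minimal illustration, not the reference implementation from the paper: it assumes a k-nearest-neighbor graph, Floyd-Warshall for shortest paths, and classical multidimensional scaling for the embedding, and the function name and parameters are made up here.

```python
import numpy as np

def isomap(X, n_neighbors, n_components=2):
    """Return (embedding, graph distances) for the rows of X."""
    m = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # Euclidean distances

    # 1. Neighborhood graph: keep edges to each point's nearest neighbors.
    G = np.full((m, m), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]       # column 0 is the point itself
    for i in range(m):
        G[i, nearest[i]] = D[i, nearest[i]]
    G = np.minimum(G, G.T)                                      # make the graph undirected

    # 2. Geodesic estimate: all-pairs shortest paths (Floyd-Warshall).
    for via in range(m):
        G = np.minimum(G, G[:, [via]] + G[[via], :])

    # 3. Classical MDS on the graph distances.
    #    Assumes the graph is connected, so every entry of G is finite.
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_components]
    Y = eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))
    return Y, G

# Points along a half circle: the endpoints are close in a straight line
# but far apart when measured along the curve.
theta = np.linspace(0.0, np.pi, 30)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y, G = isomap(X, n_neighbors=2)

straight_line = np.linalg.norm(X[0] - X[-1])
along_curve = G[0, -1]
embedded = np.linalg.norm(Y[0] - Y[-1])
```

On this toy curve the graph distance between the endpoints is close to the arc length, and the embedding preserves it, whereas the straight-line distance is much shorter.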
