The owning house data Can we separate the points with a line? - PowerPoint PPT Presentation

Linear Discriminant Analysis
Debapriyo Majumdar
Data Mining – Fall 2014
Indian Statistical Institute Kolkata
August 28, 2014

The owning house data
Can we separate the points with a line? Equivalently, project the points onto another line so that the projections of the points in the two classes are separated.
[Figure: scatter plot of Income (thousand rupees, 0 to 200) against Age (years, 30 to 80).]

Linear Discriminant Analysis (LDA)
Not the same as Latent Dirichlet Allocation (also LDA).
§ Reduce dimensionality, preserve as much class discriminatory information as possible
[Figure: two panels, a projection with non-ideal separation and a projection with ideal separation.]
The figures are from Ricardo Gutierrez-Osuna's slides.

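Projection onto a line – basics
Projection onto the x axis:
\[ \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.5 & 1.1 \\ 0.7 & 0.8 \end{bmatrix} = \begin{bmatrix} 0.5 & 1.1 \end{bmatrix} \]
The 1 × 2 vector has norm 1 and represents the x axis. The columns of the 2 × 2 matrix are the two data points (0.5, 0.7) and (1.1, 0.8). The result holds the distances of the projections from the origin.
Projection onto the y axis:
\[ \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 0.5 & 1.1 \\ 0.7 & 0.8 \end{bmatrix} = \begin{bmatrix} 0.7 & 0.8 \end{bmatrix} \]
Again, the result holds the distances from the origin.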

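Projection onto a line – basics
Projection onto the x = y line:
\[ \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 0.5 & 1.1 \\ 0.7 & 0.8 \end{bmatrix} = \begin{bmatrix} 0.85 & 1.34 \end{bmatrix} \]
The 1 × 2 vector has norm 1 and represents the x = y line. The result holds the distances of the projections from the origin.
In general, with $x$ any point and $w$ some unit vector, $w^T x$ is a scalar: the distance from the origin of the projection of $x$ onto the line along $w$.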
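The three projections above can be reproduced with a few lines of NumPy. This is a minimal sketch; the names `X`, `w_x`, `w_y`, `w_diag` and `project` are illustrative and do not come from the slides.

```python
import numpy as np

# The two data points from the slides, stored as columns: (0.5, 0.7) and (1.1, 0.8).
X = np.array([[0.5, 1.1],
              [0.7, 0.8]])

# Unit vectors along the x axis, the y axis, and the line x = y.
w_x = np.array([1.0, 0.0])
w_y = np.array([0.0, 1.0])
w_diag = np.array([1.0, 1.0]) / np.sqrt(2.0)

def project(w, points):
    """Return w^T x for every column x of `points` (distance from the origin along w)."""
    return w @ points

proj_x = project(w_x, X)
proj_y = project(w_y, X)
proj_diag = project(w_diag, X)

print(proj_x, proj_y, np.round(proj_diag, 2))
```

Rounding the last result to two decimals gives the values 0.85 and 1.34 quoted on the slide.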

Projection vector for LDA
§ Define a measure of separation (discrimination)
§ Mean vectors $\mu_1$ and $\mu_2$ for the two classes $c_1$ and $c_2$, with $N_1$ and $N_2$ points:
\[ \mu_i = \frac{1}{N_i} \sum_{x \in c_i} x \]
§ The mean vector projected onto a unit vector $w$:
\[ \tilde{\mu}_i = \frac{1}{N_i} \sum_{x \in c_i} w^T x = w^T \mu_i \]
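A quick numerical check that the mean of the projections equals the projection of the mean. The data below is a hypothetical toy set (age, income), not the data from the slides, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical toy data: each row is one point (age, income); one array per class.
c1 = np.array([[35.0, 60.0], [40.0, 80.0], [45.0, 70.0]])
c2 = np.array([[55.0, 150.0], [60.0, 170.0], [65.0, 160.0]])

# Some unit vector w (here along the x = y line).
w = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Class means in the feature space.
mu1 = c1.mean(axis=0)
mu2 = c2.mean(axis=0)

# Mean of the projected points, computed point by point.
proj_mean1 = (c1 @ w).mean()
proj_mean2 = (c2 @ w).mean()

# Both numbers in each pair agree: mean of projections = projection of mean.
print(proj_mean1, w @ mu1)
print(proj_mean2, w @ mu2)
```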

Towards maximizing separation
§ One approach: find a line such that the distance between the projected means is maximized
§ Objective function:
\[ J(w) = |\tilde{\mu}_1 - \tilde{\mu}_2| = |w^T (\mu_1 - \mu_2)| \]
Example: if $w$ is the unit vector along the x or the y axis.
[Figure: two classes with means $\mu_1$ and $\mu_2$; one axis gives better separation of the means, the other gives better separation of the classes.]
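To make the x axis versus y axis example concrete, here is a small sketch of the means-only objective. The toy points are hypothetical, chosen so that the classes are stretched along x and tight along y; `mean_separation` is an illustrative name.

```python
import numpy as np

# Hypothetical toy data (rows are points); not taken from the slides.
c1 = np.array([[-3.0, 0.1], [0.0, -0.1], [3.0, 0.0]])
c2 = np.array([[1.0, 2.1], [4.0, 1.9], [7.0, 2.0]])

mu1, mu2 = c1.mean(axis=0), c2.mean(axis=0)

def mean_separation(w):
    """Means-only objective J(w) = |w^T (mu1 - mu2)| for a unit vector w."""
    return abs(w @ (mu1 - mu2))

J_x = mean_separation(np.array([1.0, 0.0]))  # w along the x axis
J_y = mean_separation(np.array([0.0, 1.0]))  # w along the y axis

print(J_x, J_y)
```

On this data the x axis scores higher, although the projections onto x overlap heavily. This is the weakness of looking only at the means.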

How much are the points scattered?
§ Scatter: within each class, the (unnormalized) variance of the projected points:
\[ \tilde{s}_i^2 = \sum_{x \in c_i} (w^T x - \tilde{\mu}_i)^2 \]
§ Within-class scatter of the projected samples:
\[ \tilde{s}_1^2 + \tilde{s}_2^2 \]
[Figure: two classes with means $\mu_1$ and $\mu_2$.]

Fisher's discriminant
§ Maximize the difference between the projected means, normalized by the within-class scatter:
\[ J(w) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^2}{\tilde{s}_1^2 + \tilde{s}_2^2} \]
[Figure: two classes with means $\mu_1$ and $\mu_2$; this projection separates the means and the points as well.]
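A minimal sketch of Fisher's criterion computed directly from the projected points. The toy data is hypothetical (stretched along x, tight along y), and `fisher_ratio` is an illustrative name.

```python
import numpy as np

# Hypothetical toy data (rows are points); not taken from the slides.
c1 = np.array([[-3.0, 0.1], [0.0, -0.1], [3.0, 0.0]])
c2 = np.array([[1.0, 2.1], [4.0, 1.9], [7.0, 2.0]])

def fisher_ratio(w):
    """Squared distance between projected means over within-class scatter of projections."""
    p1, p2 = c1 @ w, c2 @ w                 # projected points of each class
    m1, m2 = p1.mean(), p2.mean()           # projected means
    s1 = np.sum((p1 - m1) ** 2)             # scatter of class 1 projections
    s2 = np.sum((p2 - m2) ** 2)             # scatter of class 2 projections
    return (m1 - m2) ** 2 / (s1 + s2)

J_x = fisher_ratio(np.array([1.0, 0.0]))    # w along the x axis
J_y = fisher_ratio(np.array([0.0, 1.0]))    # w along the y axis

print(J_x, J_y)
```

Here the y axis wins by a wide margin even though the means are farther apart along x, because the scatter along y is much smaller.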

Formulation of the objective function
§ Measure of scatter in the feature space ($x$):
\[ S_i = \sum_{x \in c_i} (x - \mu_i)(x - \mu_i)^T \]
§ The within-class scatter matrix is:
\[ S_W = S_1 + S_2 \]
§ The scatter of projections, in terms of $S_W$:
\[ \tilde{s}_i^2 = \sum_{x \in c_i} (w^T x - \tilde{\mu}_i)^2 = \sum_{x \in c_i} (w^T x - w^T \mu_i)^2 = \sum_{x \in c_i} w^T (x - \mu_i)(x - \mu_i)^T w = w^T S_i w \]
Hence:
\[ \tilde{s}_1^2 + \tilde{s}_2^2 = w^T S_W w \]
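The identity between the scatter of the projections and the quadratic form in the scatter matrices can be checked numerically. This is a sketch on hypothetical toy data; `scatter_matrix` is an illustrative helper name.

```python
import numpy as np

# Hypothetical toy data (rows are points); not taken from the slides.
c1 = np.array([[-3.0, 0.1], [0.0, -0.1], [3.0, 0.0]])
c2 = np.array([[1.0, 2.1], [4.0, 1.9], [7.0, 2.0]])

def scatter_matrix(points):
    """S_i = sum over x of (x - mu_i)(x - mu_i)^T, with points stored as rows."""
    centered = points - points.mean(axis=0)
    return centered.T @ centered

def projected_scatter(points, w):
    """Sum of squared deviations of the projections w^T x from their mean."""
    p = points @ w
    return np.sum((p - p.mean()) ** 2)

S1, S2 = scatter_matrix(c1), scatter_matrix(c2)
S_W = S1 + S2

w = np.array([1.0, 1.0]) / np.sqrt(2.0)     # some unit vector

lhs = projected_scatter(c1, w) + projected_scatter(c2, w)
rhs = w @ S_W @ w
print(lhs, rhs)                              # the two values agree
```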

Formulation of the objective function
§ Similarly, the difference $(\tilde{\mu}_1 - \tilde{\mu}_2)^2$ in terms of the $\mu_i$'s in the feature space:
\[ (\tilde{\mu}_1 - \tilde{\mu}_2)^2 = (w^T \mu_1 - w^T \mu_2)^2 = w^T (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w = w^T S_B w \]
where $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$ is the between-class scatter matrix.
§ Fisher's objective function in terms of $S_B$ and $S_W$:
\[ J(w) = \frac{w^T S_B w}{w^T S_W w} \]
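As a sanity check, the matrix form of the objective can be compared against the value computed from the projected points. The sketch below uses hypothetical toy data, and the function names are illustrative.

```python
import numpy as np

# Hypothetical toy data (rows are points); not taken from the slides.
c1 = np.array([[-3.0, 0.1], [0.0, -0.1], [3.0, 0.0]])
c2 = np.array([[1.0, 2.1], [4.0, 1.9], [7.0, 2.0]])

mu1, mu2 = c1.mean(axis=0), c2.mean(axis=0)

# Between-class scatter matrix S_B = (mu1 - mu2)(mu1 - mu2)^T.
d = mu1 - mu2
S_B = np.outer(d, d)

# Within-class scatter matrix S_W = S_1 + S_2.
S_W = (c1 - mu1).T @ (c1 - mu1) + (c2 - mu2).T @ (c2 - mu2)

def fisher_matrix_form(w):
    """J(w) = (w^T S_B w) / (w^T S_W w)."""
    return (w @ S_B @ w) / (w @ S_W @ w)

def fisher_from_projections(w):
    """The same quantity computed from the projected points."""
    p1, p2 = c1 @ w, c2 @ w
    scatter = np.sum((p1 - p1.mean()) ** 2) + np.sum((p2 - p2.mean()) ** 2)
    return (p1.mean() - p2.mean()) ** 2 / scatter

w = np.array([1.0, 1.0]) / np.sqrt(2.0)
print(fisher_matrix_form(w), fisher_from_projections(w))
```

Because $w$ appears quadratically in both numerator and denominator, rescaling $w$ leaves $J(w)$ unchanged.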

Maximizing the objective function
§ Take the derivative and solve for it being zero:
\[ \frac{d}{dw} J(w) = \frac{d}{dw} \left[ \frac{w^T S_B w}{w^T S_W w} \right] = 0 \]
\[ \Rightarrow \; (w^T S_W w) \frac{d (w^T S_B w)}{dw} - (w^T S_B w) \frac{d (w^T S_W w)}{dw} = 0 \]
\[ \Rightarrow \; (w^T S_W w) \, 2 S_B w - (w^T S_B w) \, 2 S_W w = 0 \]
Dividing by the same denominator $w^T S_W w$ (and cancelling the factor 2):
\[ \Rightarrow \; \left( \frac{w^T S_W w}{w^T S_W w} \right) S_B w - \left( \frac{w^T S_B w}{w^T S_W w} \right) S_W w = 0 \]
\[ \Rightarrow \; S_B w - J(w) \, S_W w = 0 \]
\[ \Rightarrow \; J(w) \, w = S_W^{-1} S_B w \]
This is the generalized eigenvalue problem.
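For two classes, $S_B w$ always points along $\mu_1 - \mu_2$, so a solution of the eigenvalue problem is $w \propto S_W^{-1} (\mu_1 - \mu_2)$. This closed form is a standard result, not stated on the slide, and it assumes $S_W$ is invertible. A minimal sketch on hypothetical toy data:

```python
import numpy as np

# Hypothetical toy data (rows are points); not taken from the slides.
c1 = np.array([[-3.0, 0.1], [0.0, -0.1], [3.0, 0.0]])
c2 = np.array([[1.0, 2.1], [4.0, 1.9], [7.0, 2.0]])

mu1, mu2 = c1.mean(axis=0), c2.mean(axis=0)
d = mu1 - mu2
S_B = np.outer(d, d)
S_W = (c1 - mu1).T @ (c1 - mu1) + (c2 - mu2).T @ (c2 - mu2)

def J(w):
    """Fisher's objective in matrix form."""
    return (w @ S_B @ w) / (w @ S_W @ w)

# Candidate maximizer: solve S_W w = (mu1 - mu2), then normalize to unit length.
w_star = np.linalg.solve(S_W, d)
w_star = w_star / np.linalg.norm(w_star)

# Check the eigenvalue equation S_W^{-1} S_B w = J(w) w.
lhs = np.linalg.solve(S_W, S_B @ w_star)
rhs = J(w_star) * w_star
print(np.allclose(lhs, rhs))

# Compare against unit vectors at many angles: none should beat w_star.
angles = np.linspace(0.0, np.pi, 181)
J_grid = np.array([J(np.array([np.cos(a), np.sin(a)])) for a in angles])
print(J(w_star), J_grid.max())
```

Using `np.linalg.solve` avoids forming the inverse of $S_W$ explicitly, which is generally the more stable choice.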

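Limitations of LDA
§ LDA is a parametric method
– Assumes Gaussian (normal) distribution of data
– What if the data is very much non-Gaussian?
[Figure: examples of non-Gaussian class distributions with means $\mu_1$ and $\mu_2$, including a case where $\mu_1 = \mu_2$.]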

Limitations of LDA
§ LDA depends on the mean for the discriminatory information
– What if it is mainly in the variance?
[Figure: two classes with $\mu_1 = \mu_2$.]
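The equal-means case can be illustrated numerically: when $\mu_1 = \mu_2$, the between-class scatter matrix is zero and Fisher's objective is zero for every direction, so no projection is preferred. The sketch below uses hypothetical toy data in which the classes differ only in spread.

```python
import numpy as np

# Hypothetical toy data: same mean (the origin), very different variance.
c1 = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])   # tight class
c2 = np.array([[-5.0, 0.0], [5.0, 0.0], [0.0, -5.0], [0.0, 5.0]])   # wide class

mu1, mu2 = c1.mean(axis=0), c2.mean(axis=0)
d = mu1 - mu2
S_B = np.outer(d, d)                                   # all zeros, since mu1 == mu2
S_W = (c1 - mu1).T @ (c1 - mu1) + (c2 - mu2).T @ (c2 - mu2)

def J(w):
    """Fisher's objective in matrix form."""
    return (w @ S_B @ w) / (w @ S_W @ w)

# Evaluate J on unit vectors at many angles.
angles = np.linspace(0.0, np.pi, 19)
J_values = np.array([J(np.array([np.cos(a), np.sin(a)])) for a in angles])

print(S_B)
print(J_values.max())    # zero: every direction scores the same
```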
