Independent Component Analysis: Class 20

11-755 Machine Learning for Signal Processing
Independent Component Analysis
Class 20. 8 Nov 2012
Instructor: Bhiksha Raj
8 Nov 2012 11755/18797 1

A brief review of basic probability
- Uncorrelated: Two random variables X and Y are uncorrelated iff the average value of the product of the variables equals the product of their individual averages
- Setup: Each draw produces one instance of X and one instance of Y, i.e. one instance of (X,Y)
- E[XY] = E[X]E[Y]
- Loosely: the average value of X is the same regardless of the value of Y (strictly, E[X|Y] = E[X] implies uncorrelatedness but is not required by it)

Uncorrelatedness
- Which of the above represent uncorrelated RVs?

A brief review of basic probability
- Independence: Two random variables X and Y are independent iff their joint probability equals the product of their individual probabilities
- P(X,Y) = P(X)P(Y)
- As a consequence, the average value of X is the same regardless of the value of Y
- E[X|Y] = E[X]

A brief review of basic probability
- Independence: Two random variables X and Y are independent iff the average value of any function of X is the same regardless of the value of Y
- E[f(X)g(Y)] = E[f(X)] E[g(Y)] for all f(), g()
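The difference between the two definitions can be checked numerically. The sketch below is only an illustration under assumed choices (X uniform on [-1, 1], Y = X^2, and the test functions f(X) = X^2, g(Y) = Y are not from the slides): the pair passes the uncorrelatedness test but fails the independence test.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: X is symmetric around 0 and Y is a deterministic function of X.
x = rng.uniform(-1.0, 1.0, size=200_000)
y = x ** 2

# Uncorrelated: E[XY] - E[X]E[Y] is (approximately) zero.
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

# Not independent: with f(X) = X^2 and g(Y) = Y the factorization fails.
f_x, g_y = x ** 2, y
gap = np.mean(f_x * g_y) - np.mean(f_x) * np.mean(g_y)

print(f"E[XY] - E[X]E[Y]             = {cov_xy:+.4f}")
print(f"E[f(X)g(Y)] - E[f(X)]E[g(Y)] = {gap:+.4f}")
```

For this choice the second gap approaches E[X^4] - E[X^2]^2 = 1/5 - 1/9 = 4/45, so a single pair (f, g) is enough to rule out independence.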

Independence
- Which of the above represent independent RVs?
- Which represent uncorrelated RVs?

A brief review of basic probability
[Figure: p(x) and f(x)]
- The expected value of an odd function of an RV is 0 if
  - The RV is 0 mean
  - The PDF of the RV is symmetric around 0
- E[f(X)] = 0 if f(X) is odd symmetric
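A quick numerical check of the odd-function property, as a sketch under assumed distributions (a Laplace RV for the symmetric case and a shifted exponential for a zero-mean but asymmetric case; neither appears in the slides). The odd function f(x) = x^3 is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400_000

def f(x):
    # An odd function: f(-x) = -f(x)
    return x ** 3

# Zero mean AND symmetric around 0: E[f(X)] should be close to 0.
x_sym = rng.laplace(0.0, 1.0, size=n)
e_sym = np.mean(f(x_sym))

# Zero mean but NOT symmetric: Exp(1) shifted by its mean.
x_skew = rng.exponential(1.0, size=n) - 1.0
e_skew = np.mean(f(x_skew))

print(f"symmetric PDF: E[f(X)] ~ {e_sym:+.3f}")
print(f"skewed PDF:    E[f(X)] ~ {e_skew:+.3f}")
```

The skewed case converges to the third central moment of Exp(1), which is 2, so zero mean alone does not make the expectation vanish.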

A brief review of basic info. theory
Example symbols: X: T(all), M(ed), S(hort)…; Y: M, F, F, M..
$$H(X) = \sum_X P(X)\,[-\log P(X)]$$
- Entropy: The minimum average number of bits to transmit to convey a symbol
$$H(X,Y) = \sum_{X,Y} P(X,Y)\,[-\log P(X,Y)]$$
- Joint entropy: The minimum average number of bits to convey sets (pairs here) of symbols

A brief review of basic info. theory
Example symbols: X: T, M, S…; Y: M, F, F, M..
$$H(X|Y) = \sum_Y P(Y) \sum_X P(X|Y)\,[-\log P(X|Y)] = \sum_{X,Y} P(X,Y)\,[-\log P(X|Y)]$$
- Conditional entropy: The minimum average number of bits to transmit to convey a symbol X, after symbol Y has already been conveyed
  - Averaged over all values of X and Y

A brief review of basic info. theory
$$H(X|Y) = \sum_Y P(Y) \sum_X P(X|Y)\,[-\log P(X|Y)] = \sum_Y P(Y) \sum_X P(X)\,[-\log P(X)] = H(X)$$
- Conditional entropy of X = H(X) if X is independent of Y
$$H(X,Y) = \sum_{X,Y} P(X,Y)\,[-\log P(X,Y)] = \sum_{X,Y} P(X,Y)\,[-\log P(X)P(Y)]$$
$$= -\sum_{X,Y} P(X,Y)\log P(X) - \sum_{X,Y} P(X,Y)\log P(Y) = H(X) + H(Y)$$
- Joint entropy of X and Y is the sum of the entropies of X and Y if they are independent
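These identities are easy to verify on a small table. The following is a minimal sketch; the probability values for X (T, M, S) and Y (M, F) are made up for illustration, and every P(Y) is assumed to be nonzero.

```python
import numpy as np

def entropy(p):
    """H = sum p * (-log2 p), in bits; zero-probability cells are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_entropy(joint):
    """H(X|Y) = sum_{X,Y} P(X,Y) * (-log2 P(X|Y)); rows index X, columns index Y."""
    joint = np.asarray(joint, dtype=float)
    p_y = joint.sum(axis=0, keepdims=True)   # assumes every P(Y) > 0
    cond = joint / p_y                       # P(X|Y)
    mask = joint > 0
    return float(-np.sum(joint[mask] * np.log2(cond[mask])))

# Hypothetical marginals (not from the slides).
p_x = np.array([0.5, 0.3, 0.2])   # T, M, S
p_y = np.array([0.6, 0.4])        # M, F

joint_indep = np.outer(p_x, p_y)             # independent: P(X,Y) = P(X)P(Y)
joint_dep = np.array([[0.4, 0.1],            # same marginals, but dependent
                      [0.1, 0.2],
                      [0.1, 0.1]])

h_x, h_y = entropy(p_x), entropy(p_y)
print("H(X) + H(Y)          :", h_x + h_y)
print("H(X,Y), independent  :", entropy(joint_indep))
print("H(X|Y), independent  :", conditional_entropy(joint_indep), "vs H(X) =", h_x)
print("H(X|Y), dependent    :", conditional_entropy(joint_dep))
```

In the dependent table, knowing Y reduces the number of bits needed for X, so H(X|Y) falls below H(X).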

Onward..

Projection: multiple notes
M = [figure], W = [figure]
- $P = W (W^T W)^{-1} W^T$
- Projected Spectrogram = P * M
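A minimal sketch of the projection, using random stand-ins for the spectrogram M and the note spectra W (the sizes and the data are assumptions, not the lecture's audio):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 frequency bins, 3 notes, 20 time frames.
W = rng.random((8, 3))    # columns play the role of note spectra
M = rng.random((8, 20))   # stand-in for the spectrogram

# P = W (W^T W)^{-1} W^T projects onto the column space of W.
P = W @ np.linalg.inv(W.T @ W) @ W.T
projected = P @ M

# The part of M that the notes cannot explain is orthogonal to every note.
residual = M - projected
print("max |W^T residual| =", np.abs(W.T @ residual).max())
```

P is symmetric and idempotent (P P = P), so projecting a second time changes nothing.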

We’re actually computing a score
M = [figure], H = ?, W = [figure]
- M ~ WH
- H = pinv(W) M

How about the other way?
M = [figure], H = [figure], W = ?, U = ?
- M ~ WH
- W = M pinv(H)
- U = WH

So what are we doing here?
H = ?
- M ~ WH is an approximation
- Given W, estimate H to minimize error
$$\hat{H} = \arg\min_H \|M - WH\|_F^2 = \arg\min_H \sum_i \sum_j \big(M_{ij} - (WH)_{ij}\big)^2$$
- Must ideally find transcription of given notes
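The pseudo-inverse estimate is the least-squares minimizer when W has full column rank. A small sketch on synthetic data (the matrices, sizes, and noise level are assumptions) compares it with NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 8 frequency bins, 3 known notes, 20 frames, small noise.
W = rng.random((8, 3))
H_true = rng.random((3, 20))
M = W @ H_true + 0.01 * rng.standard_normal((8, 20))

# H = pinv(W) M
H_est = np.linalg.pinv(W) @ M

# Same answer from a generic least-squares solver.
H_lstsq, *_ = np.linalg.lstsq(W, M, rcond=None)

def frob_err(H):
    return np.linalg.norm(M - W @ H, "fro") ** 2

err = frob_err(H_est)
err_perturbed = frob_err(H_est + 0.01)   # any other H does worse
print(err, "<", err_perturbed)
```

Nothing in this estimate keeps the entries of H non-negative, which is one reason the result only ideally reads as a transcription.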

Going the other way..
W = ?
- M ~ WH is an approximation
- Given H, estimate W to minimize error
$$\hat{W} = \arg\min_W \|M - WH\|_F^2 = \arg\min_W \sum_i \sum_j \big(M_{ij} - (WH)_{ij}\big)^2$$
- Must ideally find the notes corresponding to the transcription
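The mirror-image estimate can be sketched the same way. Here the data are synthetic and noiseless (an assumption made so that the recovered W can be compared with the one used to build M):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: a known transcription H and hidden notes W_true.
W_true = rng.random((8, 3))
H = rng.random((3, 20))          # full row rank with probability 1
M = W_true @ H                   # noiseless spectrogram stand-in

# W = M pinv(H), then resynthesize U = W H.
W_est = M @ np.linalg.pinv(H)
U = W_est @ H

print("max |W_est - W_true| =", np.abs(W_est - W_true).max())
print("max |U - M|          =", np.abs(U - M).max())
```

With noise or a rank-deficient H the recovery is no longer exact; W_est is then only the least-squares fit.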

When both parameters are unknown
H = ?, W = ?, approx(M) = ?
- Must estimate both H and W to best approximate M
- Ideally, must learn both the notes and their transcription!

A least squares solution
$$\hat{W}, \hat{H} = \arg\min_{W,H} \|M - WH\|_F^2$$
- Unconstrained
  - For any W, H that minimizes the error, $W' = WA$, $H' = A^{-1}H$ also minimizes the error for any invertible A
- For our problem, let's consider the “truth”..
- When one note occurs, the other does not
  - $h_i^T h_j = 0$ for all i != j, where $h_i$ is the i-th row of H
  - The rows of H are uncorrelated
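The ambiguity of the unconstrained problem can be seen directly: any invertible A produces a different factor pair with the same error. A minimal sketch with random matrices (all values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

W = rng.random((8, 3))
H = rng.random((3, 20))
M = rng.random((8, 20))

# A strictly diagonally dominant matrix is guaranteed to be invertible.
A = rng.random((3, 3)) + 3.0 * np.eye(3)

W2 = W @ A
H2 = np.linalg.inv(A) @ H

err1 = np.linalg.norm(M - W @ H, "fro") ** 2
err2 = np.linalg.norm(M - W2 @ H2, "fro") ** 2
print(err1, err2)   # equal up to rounding, although W2 != W and H2 != H
```

This is why an extra constraint on H (or W) is needed before the factors can be interpreted as notes and a transcription.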

A least squares solution
- Assume: $HH^T = I$
  - Normalizing all rows of H to length 1
- $\mathrm{pinv}(H) = H^T$
- Projecting M onto H
  - $W = M\,\mathrm{pinv}(H) = MH^T$
  - $WH = MH^TH$
$$\hat{W}, \hat{H} = \arg\min_{W,H} \|M - WH\|_F^2 \;\Rightarrow\; \hat{H} = \arg\min_H \|M - MH^TH\|_F^2$$
- Constraint: Rank(H) = 4

Finding the notes
$$\hat{H} = \arg\min_H \|M - MH^TH\|_F^2$$
- Note $H^TH \neq I$
  - Only $HH^T = I$
- Could also be rewritten as
$$\hat{H} = \arg\min_H \mathrm{trace}\big(M (I - H^TH) M^T\big)$$
$$= \arg\min_H \mathrm{trace}\big(M^T M (I - H^TH)\big)$$
$$= \arg\min_H \mathrm{trace}\big(\mathrm{Correlation}(M^T)(I - H^TH)\big)$$
$$= \arg\max_H \mathrm{trace}\big(\mathrm{Correlation}(M^T)\,H^TH\big)$$

Finding the notes
- Constraint: every row of H has length 1
$$\hat{H} = \arg\max_H \; \mathrm{trace}\big(\mathrm{Correlation}(M^T)\,H^TH\big) - \mathrm{trace}\big(\Lambda HH^T\big)$$
- Differentiating and equating to 0
$$\mathrm{Correlation}(M^T)\,H^T = H^T \Lambda$$
- Simply requiring the rows of H to be orthonormal gives us that H is the set of eigenvectors of the data in $M^T$

Equivalences
$$\hat{H} = \arg\max_H \; \mathrm{trace}\big(\mathrm{Correlation}(M^T)\,H^TH\big) - \mathrm{trace}\big(\Lambda HH^T\big)$$
- is identical to
$$\hat{W}, \hat{H} = \arg\min_{W,H} \|M - WH\|_F^2 + \sum_i \lambda_i \|h_i\|^2 + \sum_{i \neq j} \lambda_{ij}\, h_i^T h_j$$
- Minimize least squares error with the constraint that the rows of H are length 1 and orthogonal to one another
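The equivalence can be illustrated numerically: taking the rows of H to be the leading eigenvectors of $M^TM$ gives a reconstruction error no larger than any other set of orthonormal rows. The sketch below uses synthetic data, and the sizes D, T, K are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
D, T, K = 6, 40, 3   # hypothetical: frequency bins, frames, number of "notes"

# Roughly rank-K data plus a little noise.
M = rng.standard_normal((D, K)) @ rng.standard_normal((K, T))
M = M + 0.1 * rng.standard_normal((D, T))

# Correlation(M^T) = M^T M is T x T; eigh returns eigenvalues in ascending order.
C = M.T @ M
eigvals, eigvecs = np.linalg.eigh(C)
H = eigvecs[:, -K:].T            # rows = top-K eigenvectors, so H H^T = I

def recon_error(H_rows):
    # || M - M H^T H ||_F^2
    return np.linalg.norm(M - M @ H_rows.T @ H_rows, "fro") ** 2

# Any other orthonormal set of K rows, for comparison.
Q, _ = np.linalg.qr(rng.standard_normal((T, K)))
H_rand = Q.T

print("eigenvector rows :", recon_error(H))
print("random orthonormal rows:", recon_error(H_rand))
```

The error for the eigenvector solution equals the sum of the discarded eigenvalues, trace(C) minus the K largest.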

So how does that work?
- There are 12 notes in the segment, hence we try to estimate 12 notes..

So how does that work?
- The first three “notes” and their contributions
- The spectrograms of the notes are statistically uncorrelated

Finding the notes
- Can find W instead of H
$$\hat{W} = \arg\min_W \|M - WW^TM\|_F^2$$
- Solving the above, with the constraints that the columns of W are orthonormal, gives you the eigenvectors of the data in M
$$\hat{W} = \arg\max_W \; \mathrm{trace}\big(W^T\,\mathrm{Correlation}(M)\,W\big) - \mathrm{trace}\big(\Lambda W^TW\big)$$
$$\mathrm{Correlation}(M)\,W = W\Lambda$$
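A sketch of the W-side solution, again on random stand-in data with hypothetical sizes: the columns of W are taken as the leading eigenvectors of $MM^T$, and the stationarity condition Correlation(M) W = W Λ is checked directly.

```python
import numpy as np

rng = np.random.default_rng(5)
D, T, K = 6, 40, 3   # hypothetical sizes

M = rng.standard_normal((D, T))

# Correlation(M) = M M^T is D x D.
C = M @ M.T
eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues
W = eigvecs[:, -K:]                    # columns = top-K eigenvectors, W^T W = I
Lam = np.diag(eigvals[-K:])

# || M - W W^T M ||_F^2
err = np.linalg.norm(M - W @ W.T @ M, "fro") ** 2

print("max |C W - W Lam| =", np.abs(C @ W - W @ Lam).max())
print("error:", err, " sum of discarded eigenvalues:", eigvals[:-K].sum())
```

Working with the D x D matrix is usually cheaper than the T x T one when there are many more frames than frequency bins.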

So how does that work?
- There are 12 notes in the segment, hence we try to estimate 12 notes..

Our notes are not orthogonal
- Overlapping frequencies
- Notes occur concurrently
  - Harmonica continues to resonate to previous note
- More generally, simple orthogonality will not give us the desired solution

What else can we look for?
- Assume: The “transcription” of one note does not depend on what else is playing
  - Or, in a multi-instrument piece, instruments are playing independently of one another
- Not strictly true, but still..

Formulating it with Independence
$$\hat{W}, \hat{H} = \arg\min_{W,H} \|M - WH\|_F^2 \quad (\text{s.t. rows of } H \text{ are independent})$$
- Impose statistical independence constraints on decomposition
