Machine Learning for Signal Processing Non-negative Matrix Factorization Class 10. 7 Oct 2014 Instructor: Bhiksha Raj With examples from Paris Smaragdis 7 Oct 2014 11755/18797 1
The Engineer and the Musician Once upon a time a rich potentate discovered a previously unknown recording of a beautiful piece of music. Unfortunately it was badly damaged. He greatly wanted to find out what it would sound like if it were not. So he hired an engineer and a musician to solve the problem.
The Engineer and the Musician The engineer worked for many years. He spent much money and published many papers. Finally he had a somewhat scratchy restoration of the music. The musician listened to the music carefully for a day, transcribed it, broke out his trusty keyboard and replicated the music.
The Prize Who do you think won the princess?
The search for building blocks What composes an audio signal? E.g. notes compose music.
The properties of building blocks Constructive composition: a second note does not diminish a first note. Linearity of composition: notes do not distort one another.
Looking for building blocks in sound Can we compute the building blocks from the sound itself?
A property of spectrograms [Figure: the spectrograms of two signals add up to the spectrogram of their sum] The spectrogram of the sum of two signals is the sum of their spectrograms. This is a property of the Fourier transform that is used to compute the columns of the spectrogram. The individual spectral vectors of the spectrograms add up: each column of the first spectrogram is added to the same column of the second. Building blocks can be learned by using this property: learn the building blocks of the “composed” signal by finding what vectors were added to produce it.
Another property of spectrograms [Figure: the power spectrograms of two signals add up to the power spectrogram of their sum] We deal with the power in the signal. The power in the sum of two signals is the sum of the powers in the individual signals (on average, for uncorrelated signals). The power of any frequency component in the sum at any time is the sum of the powers in the individual signals at that frequency and time. The power is strictly non-negative (real).
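A short worked equation may help show why the powers add. The symbols X and Y (the complex Fourier coefficients of the two signals at one frequency and time) and the expectation E[.] are introduced here for illustration and do not appear on the slide. The assumption is that the two signals are uncorrelated, so that the cross term averages to zero.

```latex
|X + Y|^2 = |X|^2 + |Y|^2 + 2\,\mathrm{Re}\!\left(X Y^{*}\right)
\qquad\Longrightarrow\qquad
E\!\left[|X + Y|^2\right] = E\!\left[|X|^2\right] + E\!\left[|Y|^2\right]
\quad \text{when } E\!\left[X Y^{*}\right] = 0
```

For any single frame the cross term is generally not zero, so under this reading the additivity of power is an approximation that holds on average.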
Building Blocks of Sound The building blocks of sound are (power) spectral structures, e.g. notes build music. The spectra are entirely non-negative. The complete sound is composed by constructive combination of the building blocks scaled to different non-negative gains, e.g. notes are played with varying energies through the music. The sound from the individual notes combines to form the final spectrogram. The final spectrogram is also non-negative.
Building Blocks of Sound / Composing the Sound [Figure, built up over five slides: the frames of the spectrogram are composed one at a time from the building blocks, with weights labeled w 11 ... w 14, w 21 ... w 24, w 31 ... w 34 and w 41 ... w 44 on successive slides] Each frame of sound is composed by activating each spectral building block by a frame-specific amount. Individual frames are composed by activating the building blocks to different degrees, e.g. notes are strummed with different energies to compose the frame.
The Problem of Learning Given only the final sound, determine its building blocks. From only listening to music, learn all about musical notes!
In Math ... $V_1 = w_{11} B_1 + w_{21} B_2 + w_{31} B_3$ Each frame is a non-negative power spectral vector. Each note is a non-negative power spectral vector. Each frame is a non-negative combination of the notes.
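Expressing a vector in terms of other vectors [Figure: the vectors $B_1 = [2, 3]^T$, $B_2 = [5, -3]^T$ and $V = [4, 2]^T$ drawn in the plane]
Expressing a vector in terms of other vectors $V = a B_1 + b B_2$ [Figure: V drawn as the sum of the scaled vectors $a B_1$ and $b B_2$]
Expressing a vector in terms of other vectors $2a + 5b = 4$ and $3a - 3b = 2$. In matrix form, $\begin{bmatrix} 2 & 5 \\ 3 & -3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$, so $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 2 & 5 \\ 3 & -3 \end{bmatrix}^{-1} \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 1.04761905 \\ 0.38095238 \end{bmatrix}$. Hence $V \approx 1.048\, B_1 + 0.381\, B_2$.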
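The weights on this slide can be checked numerically. The sketch below solves the same 2-by-2 system, 2a + 5b = 4 and 3a - 3b = 2, with Cramer's rule in plain Python. The function name `solve_2x2` is hypothetical and chosen only for this illustration.

```python
def solve_2x2(b1, b2, v):
    """Solve a*b1 + b*b2 = v for the weights (a, b) using Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("b1 and b2 are collinear; the weights are not unique")
    a = (v[0] * b2[1] - b2[0] * v[1]) / det
    b = (b1[0] * v[1] - b1[1] * v[0]) / det
    return a, b


# Values taken from the slide: B1 = [2, 3], B2 = [5, -3], V = [4, 2].
B1 = (2.0, 3.0)
B2 = (5.0, -3.0)
V = (4.0, 2.0)

a, b = solve_2x2(B1, B2, V)
print(f"a = {a:.8f}, b = {b:.8f}")  # a = 1.04761905, b = 0.38095238
```

Both weights happen to be positive here, so V lies between B1 and B2 even though B2 itself has a negative component.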
Power spectral vectors: Requirements $V = a B_1 + b B_2$ V has only non-negative components: it is a power spectrum. $B_1$ and $B_2$ have only non-negative components: they are power spectra of building blocks of audio, e.g. power spectra of notes. a and b are strictly non-negative: building blocks don’t subtract from one another.
Learning building blocks: Restating the problem Given a collection of spectral vectors (from the composed sound), find a set of “basic” sound spectral vectors such that all of the spectral vectors can be composed through constructive addition of the bases. We never have to flip the direction of any basis.
Learning building blocks: Restating the problem $V = BW$ Each column of V is one “composed” spectral vector. Each column of B is one building block (one spectral basis). Each column of W has the scaling factors for the building blocks to compose the corresponding column of V. All columns of V are non-negative. All entries of B and W must also be non-negative.
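To make the shapes concrete, here is a small sketch of the composition step only: given non-negative B and W, it builds V as their product. The toy numbers (3 frequency bins, 2 building blocks, 4 frames) and the helper name `matmul` are made up for illustration and do not come from the slides.

```python
def matmul(B, W):
    """Multiply a D x K matrix B by a K x T matrix W (both lists of rows)."""
    return [[sum(b * w for b, w in zip(row, col)) for col in zip(*W)] for row in B]


# Hypothetical toy values: each column of B is one building block (a spectrum),
# each column of W holds the gains that compose one frame.
B = [[1.0, 0.0],
     [0.5, 0.2],
     [0.0, 1.0]]
W = [[2.0, 1.0, 0.0, 0.5],
     [0.0, 1.0, 3.0, 0.5]]

V = matmul(B, W)  # D x T: one composed spectral vector per column
for row in V:
    print(row)

# Non-negative B and W can only produce a non-negative V.
assert all(x >= 0 for row in V for x in row)
```

The learning problem runs in the opposite direction: only V is observed, and both B and W have to be recovered.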
Non-negative matrix factorization: Basics NMF is used in a compositional model. Data are assumed to be non-negative, e.g. power spectra. Every data vector is explained as a purely constructive linear composition of a set of bases: $V = \sum_i w_i B_i$. The bases $B_i$ are in the same domain as the data, i.e. they are power spectra. Constructive composition: no subtraction allowed. Weights $w_i$ must all be non-negative. All components of bases $B_i$ must also be non-negative.
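The slides up to this point state what the factorization must satisfy, not how to compute it. One common way, assumed here rather than taken from the slides, is the multiplicative-update rule of Lee and Seung for squared error. The sketch below is written in plain Python to stay dependency-free. The names `nmf`, `matmul`, `transpose`, `scale` and `sq_error`, the toy data, and the constant `eps` are all hypothetical choices for illustration.

```python
import random


def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def scale(M, num, den, eps):
    """Element-wise M * num / (den + eps); keeps non-negative entries non-negative."""
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]


def sq_error(V, B, W):
    """Sum of squared differences between V and the product B W."""
    R = matmul(B, W)
    return sum((v - r) ** 2 for vr, rr in zip(V, R) for v, r in zip(vr, rr))


def nmf(V, K, iters=200, seed=0, eps=1e-9):
    """Approximate V (D x T) by B (D x K) times W (K x T), all entries >= 0."""
    rng = random.Random(seed)
    D, T = len(V), len(V[0])
    B = [[rng.random() + eps for _ in range(K)] for _ in range(D)]
    W = [[rng.random() + eps for _ in range(T)] for _ in range(K)]
    for _ in range(iters):
        Bt = transpose(B)
        # W <- W * (B^T V) / (B^T B W)
        W = scale(W, matmul(Bt, V), matmul(matmul(Bt, B), W), eps)
        Wt = transpose(W)
        # B <- B * (V W^T) / (B W W^T)
        B = scale(B, matmul(V, Wt), matmul(B, matmul(W, Wt)), eps)
    return B, W


# Toy data composed from two made-up non-negative building blocks.
V = matmul([[1.0, 0.0], [0.5, 0.2], [0.0, 1.0]],
           [[2.0, 1.0, 0.0, 0.5], [0.0, 1.0, 3.0, 0.5]])

B, W = nmf(V, K=2)
print(f"squared error after 200 iterations: {sq_error(V, B, W):.6f}")
```

Because every update only multiplies by a ratio of non-negative quantities, B and W stay non-negative throughout. The recovered bases need not match the ones used to build V, since scaling and ordering of the bases are not determined by the data.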
Interpreting non-negative factorization [Figure: bases $B_1$ and $B_2$ drawn in the positive quadrant, with the region between them highlighted] Bases are non-negative and lie in the positive quadrant. Blue lines represent bases, blue dots represent vectors. Any vector that lies between the bases (highlighted region) can be expressed as a non-negative combination of bases, e.g. the black dot.
Interpreting non-negative factorization [Figure: a red dot outside the shaded region, expressed as $\alpha B_1 + \beta B_2$] Vectors outside the shaded enclosed area can only be expressed as a linear combination of the bases by reversing a basis, i.e. assigning a negative weight to the basis, e.g. the red dot. Alpha and beta are scaling factors for the bases. The beta weighting is negative.
Interpreting non-negative factorization [Figure: the red dot and its approximation inside the shaded region] If we approximate the red dot as a non-negative combination of the bases, the approximation will lie in the shaded region, on or close to the boundary. The approximation has error.
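The geometric picture can be tried out on a small two-dimensional case. The sketch below fits a vector with non-negative weights on two bases: if the exact weights are both non-negative it returns them, otherwise it falls back to the better of the two boundary rays. The bases, the test vector, and the function name `nonneg_fit_2d` are made-up values for illustration, and the method is a direct two-variable least-squares search rather than anything prescribed by the slides.

```python
def nonneg_fit_2d(b1, b2, v):
    """Least-squares fit v ~ a*b1 + b*b2 with a, b >= 0 (b1, b2 not collinear)."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    a = (v[0] * b2[1] - b2[0] * v[1]) / det
    b = (b1[0] * v[1] - b1[1] * v[0]) / det
    if a >= 0 and b >= 0:
        return a, b  # v lies between the bases: exact representation

    def dot(x, y):
        return x[0] * y[0] + x[1] * y[1]

    def error(weights):
        wa, wb = weights
        r = (v[0] - wa * b1[0] - wb * b2[0], v[1] - wa * b1[1] - wb * b2[1])
        return dot(r, r)

    # Otherwise the best non-negative fit lies on one of the two boundary rays.
    candidates = [
        (max(0.0, dot(v, b1) / dot(b1, b1)), 0.0),
        (0.0, max(0.0, dot(v, b2) / dot(b2, b2))),
    ]
    return min(candidates, key=error)


B1 = (3.0, 1.0)      # hypothetical non-negative bases
B2 = (1.0, 3.0)
inside = (4.0, 4.0)  # lies between the bases
outside = (4.0, 0.5) # lies outside the region, like the red dot

print(nonneg_fit_2d(B1, B2, inside))   # (1.0, 1.0)
print(nonneg_fit_2d(B1, B2, outside))  # (1.25, 0.0)
```

For the outside point the exact weights would be 1.4375 and -0.3125. With the negative weight disallowed, the fit lands on the ray through B1 and leaves a residual, which is the error the slide refers to.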
The NMF representation The representation characterizes all data as lying within a compact convex region. “Compact”: enclosing only a small fraction of the entire space. The more compact the enclosed region, the more it localizes the data within it. It represents the boundaries of the distribution of the data better, whereas conventional statistical models represent the mode of the distribution. The bases must be chosen to enclose the data as compactly as possible, and also to enclose as much of the data as possible. Data that are not enclosed are not represented correctly.