Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2


  1. Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2. Class 3, 8 Sep 2016. Instructor: Bhiksha Raj (11-755/18-797).

  2. Overview
     • Vectors and matrices
     • Vector spaces
     • Basic vector/matrix operations
     • Various matrix types
     • Projections
     • More on matrix types
     • Matrix determinants
     • Matrix inversion
     • Eigenanalysis
     • Singular value decomposition
     • Matrix calculus

  3. The importance of Bases
     [Figure: a point (x, y, z) in 3D, with unit vectors u1, u2, u3 along the x, y, z axes]
     • Conventional 3D representation
       – Each point (vector) is just a triplet of coordinates
       – In reality, the coordinates are weights: X = x·u1 + y·u2 + z·u3
       – u1 = [1 0 0], u2 = [0 1 0], u3 = [0 0 1]
         • Unit vectors in each of the three directions

  4. The importance of Bases
     [Figure: the point (x, y, z) decomposed along the three coordinate axes]
     Y = [x y z]^T = x·[1 0 0]^T + y·[0 1 0]^T + z·[0 0 1]^T
     u1 = [1 0 0]^T, u2 = [0 1 0]^T, u3 = [0 0 1]^T
     • Specialty of u1, u2, u3
       – Every point in the space can be expressed as some x·u1 + y·u2 + z·u3
       – All three "bases" u1, u2, u3 are required

  5. The importance of Bases
     [Figure: a point (a, b, c) reached as a·v1 + b·v2 + c·v3 along three non-axis-aligned vectors v1, v2, v3]
     Y = a·v1 + b·v2 + c·v3
     • Is there any other set v1, v2, ..., vn which shares this property?
       – Any point can be expressed as a·v1 + b·v2 + c·v3 ...
       – How many "v"s will we require?

  6. Basis-based representation
     [Figure: the same data cloud shown against the (u1, u2, u3) basis and the (v1, v2, v3) basis]
     • A "good" basis captures data structure
     • Here u1, u2 and u3 all take large values for data in the set
     • But in the (v1, v2, v3) set, coordinate values along v3 are always small for data on the blue sheet
       – v3 likely represents a "noise subspace" for these data

  7. Basis-based representation
     [Figure: the same data cloud and the two candidate bases]
     • The most important challenge in ML: find the best set of bases for a given data set (a numpy sketch follows)
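As a rough sketch of what "finding a good basis" can look like in practice, the snippet below builds an invented, nearly planar 3D data cloud and uses numpy's SVD to recover a basis in which the third coordinate is always small, mirroring the noise-subspace picture on the previous slide. SVD is just one stand-in here for the basis-finding methods developed later in the course.

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented data: 500 points lying near a 2D sheet embedded in 3D
sheet = rng.normal(size=(500, 2)) @ np.array([[1.0, 0.5, 0.2],
                                              [0.3, 1.0, 0.4]])
data = sheet + 0.01 * rng.normal(size=(500, 3))   # small off-sheet noise

# Rows of Vt form an orthonormal basis, ordered by captured variance
_, s, Vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
print(s)      # the third singular value is tiny...
print(Vt[2])  # ...so this basis vector spans the "noise subspace"
```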

  8. Matrix as a Basis transform
     Y = a·v1 + b·v2 + c·v3,  Y = x·u1 + y·u2 + z·u3
     [a b c]^T = T [x y z]^T
     • A matrix transforms a representation in terms of a standard basis u1, u2, u3 into a representation in terms of a different basis v1, v2, v3
     • Finding best bases: find the matrix that transforms the standard representation to these bases (a numpy sketch follows)
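A minimal numpy sketch of this transform. The basis matrix V (whose columns are v1, v2, v3) and the point Y are invented for the example; when V is invertible, the transform matrix is T = V^-1.

```python
import numpy as np

# Invented basis: columns of V are v1, v2, v3
V = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
Y = np.array([2.0, 3.0, 1.0])   # a point in standard coordinates (x, y, z)

T = np.linalg.inv(V)            # basis transform: [a b c]^T = T [x y z]^T
a, b, c = T @ Y
print(a, b, c)
# Same point, re-expressed in the new basis:
print(np.allclose(a * V[:, 0] + b * V[:, 1] + c * V[:, 2], Y))  # True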

  9. Going on to more mundane stuff...

  10. Orthogonal/Orthonormal vectors
      A = [x y z]^T,  B = [u v w]^T,  A·B = 0, i.e. xu + yv + zw = 0
      • Two vectors are orthogonal if they are perpendicular to one another
        – A·B = 0
        – A vector that is perpendicular to a plane is orthogonal to every vector on the plane
      • Two vectors are orthonormal if
        – They are orthogonal
        – The length of each vector is 1.0
        – Orthogonal vectors can be made orthonormal by scaling their lengths to 1.0 (checked in the snippet below)
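A quick numpy check of both definitions, with invented example vectors:

```python
import numpy as np

A = np.array([1.0, 2.0, 2.0])   # invented example vectors
B = np.array([2.0, 1.0, -2.0])
print(A @ B)                    # 0.0: A and B are orthogonal

A_n = A / np.linalg.norm(A)     # scale each length to 1.0
B_n = B / np.linalg.norm(B)
print(A_n @ B_n)                # still 0: orthogonality is preserved
print(np.linalg.norm(A_n), np.linalg.norm(B_n))  # 1.0, 1.0: now orthonormal
```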

  11. Orthogonal matrices
      A = [ √0.5   √0.125   √0.375 ]
          [ √0.5  -√0.125  -√0.375 ]   (all 3 rows at 90° to one another)
          [ 0      √0.75   -0.5    ]
      • Orthogonal matrix: A·A^T = A^T·A = I
        – The matrix is square
        – All row vectors are orthonormal to one another
          • Every vector is perpendicular to the hyperplane formed by all other vectors
        – All column vectors are also orthonormal to one another
        – Observation: in an orthogonal matrix, if the length of the row vectors is 1.0, the length of the column vectors is also 1.0
        – Observation: in an orthogonal matrix, no more than one row can have all entries with the same polarity (+ve or -ve)
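A numpy verification of these conditions for the example matrix above (the sign pattern is one valid reconstruction; any orthogonal matrix behaves the same way):

```python
import numpy as np

A = np.array([[np.sqrt(0.5),  np.sqrt(0.125),  np.sqrt(0.375)],
              [np.sqrt(0.5), -np.sqrt(0.125), -np.sqrt(0.375)],
              [0.0,           np.sqrt(0.75),  -0.5]])
print(np.allclose(A @ A.T, np.eye(3)))  # True: rows are orthonormal
print(np.allclose(A.T @ A, np.eye(3)))  # True: columns are orthonormal too
```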

  12. Orthogonal Matrices
      [Figure: a vector x and its transform Ax, separated by angle q]
      • Orthogonal matrices retain the lengths of, and relative angles between, the vectors they transform (checked in the snippet below)
        – Essentially, they are combinations of rotations, reflections and permutations
        – Rotation matrices and permutation matrices are all orthogonal
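A short numpy check using a 2D rotation, one familiar example of an orthogonal matrix (the vectors and angle are invented):

```python
import numpy as np

t = 0.7                                   # an arbitrary rotation angle
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])   # rotation matrices are orthogonal
x = np.array([3.0, 4.0])
y = np.array([-1.0, 2.0])
print(np.linalg.norm(R @ x), np.linalg.norm(x))  # equal: length is retained
print((R @ x) @ (R @ y), x @ y)                  # equal: angles are retained
```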

  13. Orthogonal and Orthonormal Matrices
      A = [ 1    0.0675  0.1875 ]
          [ 0.5  0.125   0.375  ]
          [ 0    0.75    0.5    ]
      • If the vectors in the matrix are not unit length, it cannot be orthogonal
        – A·A^T != I, A^T·A != I
        – A·A^T = diagonal or A^T·A = diagonal, but not both
        – If all the entries are the same length, we can get A·A^T = A^T·A = diagonal, though
      • A non-square matrix cannot be orthogonal
        – A·A^T = I or A^T·A = I, but not both
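A numpy sketch of the "diagonal, but not both" case. The construction is invented for illustration: it stretches one row of the orthogonal matrix from the previous slide, so the rows stay mutually perpendicular but are no longer unit length.

```python
import numpy as np

Q = np.array([[np.sqrt(0.5),  np.sqrt(0.125),  np.sqrt(0.375)],
              [np.sqrt(0.5), -np.sqrt(0.125), -np.sqrt(0.375)],
              [0.0,           np.sqrt(0.75),  -0.5]])
A = np.diag([2.0, 1.0, 1.0]) @ Q   # first row stretched: no longer unit length

AAT = A @ A.T
ATA = A.T @ A
print(np.allclose(AAT, np.diag(np.diag(AAT))))  # True: diagonal (but not I)
print(np.allclose(ATA, np.diag(np.diag(ATA))))  # False: not both can be diagonal
```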

  14. Matrix Rank and Rank-Deficient Matrices
      [Figure: P * (a 3D cone) = a flattened, 2D version of the cone]
      • Some matrices will eliminate one or more dimensions during transformation
        – These are rank-deficient matrices
        – The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object

  15. Matrix Rank and Rank-Deficient Matrices
      [Figure: two flattened objects, one of rank 2 and one of rank 1]
      • Some matrices will eliminate one or more dimensions during transformation
        – These are rank-deficient matrices
        – The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object (illustrated in the snippet below)
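A minimal numpy illustration with an invented rank-2 matrix that flattens 3D points onto the x-y plane:

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])   # invented example: squashes the z dimension
pts = np.random.default_rng(1).normal(size=(3, 1000))  # a full-dimensional cloud
print(np.linalg.matrix_rank(P))        # 2: the matrix is rank deficient
print(np.linalg.matrix_rank(P @ pts))  # 2: the transformed object is flat
```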

  16. Projections are often examples of rank-deficient transforms
      [Figure: a spectrogram M and a basis W of 4 note spectra]
      P = W (W^T W)^-1 W^T;  Projected Spectrogram = P*M
      • The original spectrogram can never be recovered
        – P is rank deficient
      • P explains all vectors in the new spectrogram as a mixture of only the 4 vectors in W
        – There are only a maximum of 4 linearly independent bases
        – The rank of P is 4 (a numpy sketch follows)
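A numpy sketch of the projection formula. The sizes are invented, and random matrices stand in for the note basis W and the spectrogram M:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(100, 4))    # stand-in for 4 note spectra (invented)
M = rng.normal(size=(100, 50))   # stand-in for a spectrogram (invented)

P = W @ np.linalg.inv(W.T @ W) @ W.T   # P = W (W^T W)^-1 W^T
projected = P @ M
print(np.linalg.matrix_rank(P))               # 4: only 4 independent bases
print(np.allclose(P @ projected, projected))  # True: projecting again changes nothing
```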

  17. Non-square Matrices
      [ x̂1 x̂2 ... x̂N ]   [ 0.8  0.9 ]
      [ ŷ1 ŷ2 ... ŷN ] = [ 0.1  0.9 ] [ x1 x2 ... xN ]
      [ ẑ1 ẑ2 ... ẑN ]   [ 0.6  0   ] [ y1 y2 ... yN ]
      X = 2D data, P = transform, PX = 3D data of rank 2
      • Non-square matrices add or subtract axes
        – More rows than columns → add axes
          • But does not increase the dimensionality of the data
        – Fewer rows than columns → reduce axes
          • May reduce the dimensionality of the data

  18. Non-square Matrices
      [ x̂1 x̂2 ... x̂N ]   [ 0.3  1  0.2 ] [ x1 x2 ... xN ]
      [ ŷ1 ŷ2 ... ŷN ] = [ 0.5  1  1   ] [ y1 y2 ... yN ]
                                         [ z1 z2 ... zN ]
      X = 3D data of rank 3, P = transform, PX = 2D data of rank 2
      • Non-square matrices add or subtract axes
        – More rows than columns → add axes
          • But does not increase the dimensionality of the data
        – Fewer rows than columns → reduce axes
          • May reduce the dimensionality of the data (checked in the snippet below)
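A numpy check of both cases, using the two transform matrices from these slides on invented random data:

```python
import numpy as np

rng = np.random.default_rng(3)
X2 = rng.normal(size=(2, 100))                          # 2D data
P_up = np.array([[0.8, 0.9], [0.1, 0.9], [0.6, 0.0]])   # 3x2: adds an axis
print(np.linalg.matrix_rank(P_up @ X2))                 # 2: 3D coordinates, still 2D data

X3 = rng.normal(size=(3, 100))                          # 3D data, rank 3
P_dn = np.array([[0.3, 1.0, 0.2], [0.5, 1.0, 1.0]])     # 2x3: removes an axis
print(np.linalg.matrix_rank(P_dn @ X3))                 # 2: dimensionality reduced
```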

  19. The Rank of a Matrix
      [ 0.8  0.9 ]      [ 0.3  1  0.2 ]
      [ 0.1  0.9 ]      [ 0.5  1  1   ]
      [ 0.6  0   ]
      • The matrix rank is the dimensionality of the transformation of a full-dimensional object in the original space
      • The matrix can never increase dimensions
        – Cannot convert a circle to a sphere or a line to a circle
      • The rank of a matrix can never be greater than the lower of its two dimensions

  20. The Rank of a Matrix
      [Figure: a spectrogram M and its projection onto 4 note bases]
      • Projected Spectrogram = P * M
        – Every vector in it is a combination of only 4 bases
      • The rank of the matrix is the smallest number of bases required to describe the output
        – E.g. if note no. 4 in P could be expressed as a combination of notes 1, 2 and 3, it would provide no additional information
        – Eliminating note no. 4 would give us the same projection
        – The rank of P would be 3!

  21. Matrix rank is unchanged by transposition
      [ 0.9   0.5   0.8  ]      [ 0.9   0.1   0.42 ]
      [ 0.1   0.4   0.9  ]      [ 0.5   0.4   0.44 ]
      [ 0.42  0.44  0.86 ]      [ 0.8   0.9   0.86 ]
      • If an N-dimensional object is compressed to a K-dimensional object by a matrix, it will also be compressed to a K-dimensional object by the transpose of the matrix (checked in the snippet below)
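A numpy check with the matrix above; its third row equals 0.4 times row 1 plus 0.6 times row 2, so the rank is 2 either way:

```python
import numpy as np

M = np.array([[0.9,  0.5,  0.8],
              [0.1,  0.4,  0.9],
              [0.42, 0.44, 0.86]])   # row 3 = 0.4*row 1 + 0.6*row 2
print(np.linalg.matrix_rank(M))      # 2
print(np.linalg.matrix_rank(M.T))    # 2: transposition leaves rank unchanged
```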

  22. Matrix Determinant
      [Figure: the parallelogram spanned by rows r1 and r2, and the sheared parallelogram spanned by (r1 + r2) and r2]
      • The determinant is the "volume" of a matrix
      • Actually the volume of a parallelepiped formed from its row vectors
        – Also the volume of the parallelepiped formed from its column vectors
      • Standard formula for the determinant: in the textbook

  23. Matrix Determinant: Another Perspective
      [Figure: a sphere of volume V1 mapped to an ellipsoid of volume V2]
      A = [ 0.8  0    0.7 ]
          [ 1.0  0.8  0.8 ]
          [ 0.7  0.9  0.7 ]
      • The determinant is the ratio of N-volumes
        – If V1 is the volume of an N-dimensional sphere "O" in N-dimensional space
          • O is the complete set of points or vertices that specify the object
        – If V2 is the volume of the N-dimensional ellipsoid specified by A*O, where A is a matrix that transforms the space
        – |A| = V2 / V1 (a numpy sketch follows)
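A small numpy check of this ratio property, using the example matrix above. Rather than measuring volumes directly, it uses the fact that uniformly scaling 3D space by 2 must scale every volume, and hence the determinant, by 2^3:

```python
import numpy as np

A = np.array([[0.8, 0.0, 0.7],
              [1.0, 0.8, 0.8],
              [0.7, 0.9, 0.7]])
# |det A| is the factor by which A scales 3D volumes: a volume-1 sphere "O"
# maps to an ellipsoid A*O of volume |det A|.
print(np.linalg.det(A))
# Consistency check: scaling the space by 2 scales volumes by 2^3 = 8
print(np.linalg.det(2 * A) / np.linalg.det(A))   # 8.0
```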

  24. Matrix Determinants
      • Matrix determinants are only defined for square matrices
        – They characterize volumes in linearly transformed space of the same dimensionality as the vectors
      • Rank-deficient matrices have determinant 0
        – Since they compress full-volumed N-dimensional objects into zero-volume N-dimensional objects
          • E.g. a 3-D sphere into a 2-D ellipse: the ellipse has 0 volume (although it does have area)
      • Conversely, all matrices of determinant 0 are rank deficient
        – Since they compress full-volumed N-dimensional objects into zero-volume objects (confirmed in the snippet below)
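A one-line numpy confirmation using the rank-2 matrix from slide 21:

```python
import numpy as np

M = np.array([[0.9,  0.5,  0.8],
              [0.1,  0.4,  0.9],
              [0.42, 0.44, 0.86]])   # rank 2 (see slide 21)
print(np.linalg.det(M))              # ~0: rank deficient means zero volume
print(np.linalg.matrix_rank(M))      # 2
```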
