Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2. Class 3, 8 Sep 2016. Instructor: Bhiksha Raj. 11-755/18-797
Overview
• Vectors and matrices
• Vector spaces
• Basic vector/matrix operations
• Various matrix types
• Projections
• More on matrix types
• Matrix determinants
• Matrix inversion
• Eigenanalysis
• Singular value decomposition
• Matrix calculus
The importance of Bases
[Figure: 3D axes x, y, z with unit vectors u1, u2, u3 and a point (x, y, z)]
• Conventional 3D representation
  – Each point (vector) is just a triplet of coordinates
  – In reality, the coordinates are weights!
  – X = x·u1 + y·u2 + z·u3
  – u1 = [1 0 0], u2 = [0 1 0], u3 = [0 0 1]
    • Unit vectors in each of the three directions
The importance of Bases
[Figure: u1 = [1 0 0]^T, u2 = [0 1 0]^T, u3 = [0 0 1]^T;  X = [x y z]^T = x·u1 + y·u2 + z·u3]
• Specialty of u1, u2, u3
  – Every point in the space can be expressed as some x·u1 + y·u2 + z·u3
  – All three "bases" u1, u2, u3 are required
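The weighted-sum view of coordinates can be checked numerically. A minimal sketch in NumPy, with made-up coordinate values:

```python
import numpy as np

# Standard basis vectors in 3D
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
u3 = np.array([0.0, 0.0, 1.0])

# Arbitrary (made-up) coordinates: the point IS its weights
x, y, z = 2.0, -1.0, 0.5
X = x * u1 + y * u2 + z * u3

print(X)  # the weighted sum recovers the coordinate triplet (x, y, z)
```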
The importance of Bases
[Figure: a point (a, b, c) expressed as Y = a·v1 + b·v2 + c·v3 along axes v1, v2, v3]
• Is there any other set v1, v2, ..., vn which shares this property?
  – Any point can be expressed as a·v1 + b·v2 + c·v3 + ...
  – How many "v"s will we require?
Basis based representation
[Figure: data on a tilted sheet, shown against the bases (u1, u2, u3) and (v1, v2, v3)]
• A "good" basis captures data structure
• Here u1, u2 and u3 all take large values for data in the set
• But in the (v1, v2, v3) set, coordinate values along v3 are always small for data on the blue sheet
  – v3 likely represents a "noise subspace" for these data
Basis based representation
• The most important challenge in ML: find the best set of bases for a given data set
Matrix as a Basis transform
  X = a·v1 + b·v2 + c·v3,   X = x·u1 + y·u2 + z·u3
  [a b c]^T = T [x y z]^T
• A matrix transforms a representation in terms of a standard basis u1, u2, u3 into a representation in terms of a different basis v1, v2, v3
• Finding the best bases: find the matrix that transforms the standard representation into these bases
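As a sketch of this idea in NumPy (the basis vectors v1, v2, v3 and the point X below are made-up numbers): the transform T that converts standard-basis coordinates into new-basis coordinates is the inverse of the matrix whose columns are the new basis vectors.

```python
import numpy as np

# Made-up new basis: columns of V are v1, v2, v3
V = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
X = np.array([2.0, 3.0, 1.0])  # coordinates in the standard basis

# Solve V @ [a, b, c] = X for the coordinates in the new basis
abc = np.linalg.solve(V, X)

# Equivalently, the basis-transform matrix is T = V^{-1}
T = np.linalg.inv(V)
print(abc)                       # new-basis coordinates (a, b, c)
print(np.allclose(T @ X, abc))   # T maps standard coords to new coords
```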
• Going on to more mundane stuff..
Orthogonal/Orthonormal vectors
  A = [x y z]^T,  B = [u v w]^T,  A·B = xu + yv + zw = 0
• Two vectors are orthogonal if they are perpendicular to one another
  – A·B = 0
  – A vector that is perpendicular to a plane is orthogonal to every vector on the plane
• Two vectors are orthonormal if
  – They are orthogonal
  – The length of each vector is 1.0
  – Orthogonal vectors can be made orthonormal by scaling their lengths to 1.0
Orthogonal matrices
[Example: a 3×3 orthogonal matrix; all three rows at 90° to one another]
• Orthogonal Matrix: A·A^T = A^T·A = I
  – The matrix is square
  – All row vectors are orthonormal to one another
    • Every vector is perpendicular to the hyperplane formed by all the other vectors
  – All column vectors are also orthonormal to one another
  – Observation: in an orthogonal matrix, if the length of the row vectors is 1.0, the length of the column vectors is also 1.0
  – Observation: in an orthogonal matrix, no more than one row can have all entries with the same polarity (+ve or -ve)
Orthogonal Matrices
[Figure: a vector x and its transform Ax, separated by angle θ]
• Orthogonal matrices retain the lengths of, and the relative angles between, transformed vectors
  – Essentially, they are combinations of rotations, reflections and permutations
  – Rotation matrices and permutation matrices are all orthogonal
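Both properties are easy to verify numerically. A minimal sketch using a 2D rotation matrix (a standard example of an orthogonal matrix; the angle and test vectors are arbitrary):

```python
import numpy as np

theta = 0.3  # arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Orthogonality: Q Q^T = Q^T Q = I
print(np.allclose(Q @ Q.T, np.eye(2)))
print(np.allclose(Q.T @ Q, np.eye(2)))

# Lengths and relative angles (dot products) are preserved
x = np.array([3.0, 4.0])
y = np.array([-1.0, 2.0])
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))
print(np.isclose((Q @ x) @ (Q @ y), x @ y))
```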
Orthogonal and Orthonormal Matrices
[Example: a square matrix with mutually perpendicular but non-unit-length rows]
• If the vectors in the matrix are not of unit length, it cannot be orthogonal
  – A·A^T ≠ I and A^T·A ≠ I
  – A·A^T = Diagonal or A^T·A = Diagonal, but not both
  – If all the vectors are of the same length, we can get A·A^T = A^T·A = Diagonal, though
• A non-square matrix cannot be orthogonal
  – A·A^T = I or A^T·A = I, but not both
Matrix Rank and Rank-Deficient Matrices
[Figure: P × (3D cone) = flattened, 2D version of the cone]
• Some matrices will eliminate one or more dimensions during transformation
  – These are rank-deficient matrices
  – The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object
Matrix Rank and Rank-Deficient Matrices
[Figure: two rank-deficient transforms of a 3D object: one of rank 2, one of rank 1]
• Some matrices will eliminate one or more dimensions during transformation
  – These are rank-deficient matrices
  – The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object
Projections are often examples of rank-deficient transforms
[Figure: spectrogram M; 4 note bases W;  P = W (W^T W)^{-1} W^T;  Projected Spectrogram = P × M]
• The original spectrogram can never be recovered
  – P is rank deficient
• P explains all vectors in the new spectrogram as a mixture of only the 4 vectors in W
  – There is a maximum of only 4 linearly independent bases
  – The rank of P is 4
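The rank deficiency of such a projection can be demonstrated directly. A hedged sketch in NumPy: the matrices W and M below are random stand-ins for the note bases and the spectrogram, not real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: W has 4 columns ("note" bases) in a 6-dim space;
# M is a 6 x 10 "spectrogram" of random numbers
W = rng.random((6, 4))
M = rng.random((6, 10))

# Projection onto the column space of W: P = W (W^T W)^{-1} W^T
P = W @ np.linalg.inv(W.T @ W) @ W.T

# P is rank deficient: its rank is 4 (the number of bases), not 6
print(np.linalg.matrix_rank(P))

# P is idempotent: projecting again changes nothing further,
# which is why the original M cannot be recovered from P @ M
print(np.allclose(P @ (P @ M), P @ M))
```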
Non-square Matrices
  [x̂1 … x̂N; ŷ1 … ŷN; ẑ1 … ẑN] = [0.8 0.9; 0.1 0.9; 0.6 0] × [x1 … xN; y1 … yN]
  X = 2D data,  P = transform,  P·X = 3D, rank 2
• Non-square matrices add or subtract axes
  – More rows than columns add axes
    • But do not increase the dimensionality of the data
Non-square Matrices
  [x̂1 … x̂N; ŷ1 … ŷN] = [0.3 1 0.2; 0.5 1 1] × [x1 … xN; y1 … yN; z1 … zN]
  X = 3D data, rank 3,  P = transform,  P·X = 2D, rank 2
• Non-square matrices add or subtract axes
  – More rows than columns add axes
    • But do not increase the dimensionality of the data
  – Fewer rows than columns reduce axes
    • May reduce the dimensionality of the data
The Rank of a Matrix
  [0.8 0.9; 0.1 0.9; 0.6 0]    [0.3 1 0.2; 0.5 1 1]
• The matrix rank is the dimensionality of the transformation of a full-dimensioned object in the original space
• The matrix can never increase dimensions
  – It cannot convert a circle to a sphere or a line to a circle
• The rank of a matrix can never be greater than the lower of its two dimensions
The Rank of Matrix
[Figure: spectrogram M]
• Projected Spectrogram = P × M
  – Every vector in it is a combination of only 4 bases
• The rank of the matrix is the smallest number of bases required to describe the output
  – E.g. if note no. 4 in P could be expressed as a combination of notes 1, 2 and 3, it provides no additional information
  – Eliminating note no. 4 would give us the same projection
    • The rank of P would be 3!
Matrix rank is unchanged by transposition
[Example: a 3×3 matrix and its transpose, both of rank 2]
• If an N-dimensional object is compressed to a K-dimensional object by a matrix, it will also be compressed to a K-dimensional object by the transpose of the matrix
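A quick numerical check of this property (a minimal sketch; the matrix below is made up, with its third row chosen as the sum of the first two so that the rank is 2):

```python
import numpy as np

# Rank-deficient 3x3 matrix: row 3 = row 1 + row 2, so rank is 2
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 1.0]])

print(np.linalg.matrix_rank(A))    # rank of A
print(np.linalg.matrix_rank(A.T))  # transpose has the same rank
```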
Matrix Determinant
[Figure: parallelogram spanned by row vectors (r1), (r2), with vertex (r1+r2)]
• The determinant is the "volume" of a matrix
  – Actually the volume of the parallelepiped formed from its row vectors
  – Also the volume of the parallelepiped formed from its column vectors
• Standard formula for determinant: in text book
Matrix Determinant: Another Perspective
[Figure: an object of volume V1 transformed by A into an object of volume V2]
• The determinant is the ratio of N-volumes
  – If V1 is the volume of an N-dimensional sphere "O" in N-dimensional space
    • O is the complete set of points or vertices that specify the object
  – If V2 is the volume of the N-dimensional ellipsoid specified by A·O, where A is a matrix that transforms the space
  – |A| = V2 / V1
Matrix Determinants
• Matrix determinants are only defined for square matrices
  – They characterize volumes in linearly transformed space of the same dimensionality as the vectors
• Rank-deficient matrices have determinant 0
  – Since they compress full-volumed N-dimensional objects into zero-volume N-dimensional objects
    • E.g. a 3-D sphere into a 2-D ellipse: the ellipse has 0 volume (although it does have area)
• Conversely, all matrices of determinant 0 are rank deficient
  – Since they compress full-volumed N-dimensional objects into zero-volume objects
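The volume-ratio view and the rank-deficiency result can both be sketched numerically (the 2×2 matrices below are made-up examples):

```python
import numpy as np

# |det(A)| is the factor by which A scales areas (2D "volumes").
# The unit square (area 1) maps to the parallelogram spanned by
# the columns of A, whose area is |det(A)|.
A = np.array([[0.8, 0.0],
              [0.7, 1.0]])
print(abs(np.linalg.det(A)))  # area of the transformed unit square

# A rank-deficient matrix squashes the square flat: determinant 0
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # second row = 2 x first row
print(np.linalg.det(B))
print(np.linalg.matrix_rank(B))  # rank 1 < 2: rank deficient
```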