Statistical Machine Learning Lecture 02: Linear Algebra Refresher Kristian Kersting TU Darmstadt Summer Term 2020 K. Kersting based on Slides from J. Peters · Statistical Machine Learning · Summer Term 2020 1 / 37
Today's Objectives
Make you remember Linear Algebra! I know this is mostly easy, but some of you may have forgotten all of it...
Covered Topics:
Vectors, Matrices
Linear Transformations
Outline
1. Vectors
2. Matrices
3. Operations and Linear Transformations
4. Wrap-Up
1. Vectors
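Vectors
\[
\text{Joe} = \begin{bmatrix} 37 \\ 72 \\ 175 \end{bmatrix}, \quad
\text{Mary} = \begin{bmatrix} 10 \\ 30 \\ 61 \end{bmatrix}, \quad
\text{Carol} = \begin{bmatrix} 25 \\ 65 \\ 121 \end{bmatrix}, \quad
\text{Brad} = \begin{bmatrix} 66 \\ 67 \\ 155 \end{bmatrix}, \quad
\text{Joe} = \begin{bmatrix} 37 \\ 72 \\ 175 \\ 8 \\ 1946 \end{bmatrix}
\]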
What can you do with vectors?
Multiplication by a scalar: $c\,v$
\[
2 \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix},
\qquad
5 \begin{bmatrix} -3 \\ 4 \\ 1 \end{bmatrix} = \begin{bmatrix} -15 \\ 20 \\ 5 \end{bmatrix}
\]
\[
c\,v = c \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} c\,v_1 \\ \vdots \\ c\,v_n \end{bmatrix}
\]
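What can you do with vectors?
Addition of vectors: $v_1 + v_2$
\[
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 4 \end{bmatrix}
\]
\[
\begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 3 \\ -3 \end{bmatrix} = \begin{bmatrix} 5 \\ -1 \end{bmatrix}
\]
\[
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} + \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ \vdots \\ a_n + b_n \end{bmatrix}
\]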
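Both operations act component by component, which a few lines of plain Python can illustrate. This is only a minimal sketch: the helper names `scale` and `add` are hypothetical and vectors are represented as ordinary lists.

```python
def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return [c * x for x in v]


def add(a, b):
    """Add two vectors of equal length component by component."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


doubled = scale(2, [2, 1])           # scalar multiplication
summed = add([1, 2, 1], [2, 1, 3])   # vector addition
print(doubled, summed)
```

The length check in `add` mirrors the fact that addition is only defined for vectors of the same dimension.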
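Linear Combination of Vectors
By linear combination we can obtain:
\[
u = c_1 v_1 + c_2 v_2 + \dots + c_n v_n
\]
Examples:
\[
\begin{bmatrix} 1 \\ 1 \end{bmatrix} \text{ and } \begin{bmatrix} 2 \\ 2 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 \\ 1 \end{bmatrix} \text{ and } \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \end{bmatrix} \text{ and } \begin{bmatrix} -1 \\ 3 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} \text{ and } \begin{bmatrix} 9 \\ 10 \\ 0 \end{bmatrix}
\]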
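As a small illustration of the formula for u, the sketch below forms a linear combination of the three-dimensional example vectors (1, 2, 0) and (3, 2, 0). The coefficients 3 and 2 are chosen here for illustration and do not appear on the slide, and `linear_combination` is a hypothetical helper name.

```python
def linear_combination(coeffs, vectors):
    """Return c_1 v_1 + ... + c_n v_n for equally long vectors."""
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(dim)]


# Assumed coefficients c_1 = 3, c_2 = 2 (not given on the slide).
u = linear_combination([3, 2], [[1, 2, 0], [3, 2, 0]])
print(u)
```

With these coefficients the result is the third vector of that example, so it adds no new direction beyond the first two.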
Inner Product and Length of a Vector
Inner Product
\[
v = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}, \quad w = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\]
\[
v \cdot w = v^\top w = (3 \cdot 1) + (-1 \cdot 2) + (2 \cdot 1) = 3
\]
Length of a vector (Euclidean norm)
\[
\|v\| = (v \cdot v)^{1/2}
\]
\[
\|c\,v\| = |c|\,\|v\|
\]
\[
\|v_1 + v_2\| \le \|v_1\| + \|v_2\| \quad \text{(triangle inequality)}
\]
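The inner product and the length can be checked numerically with the slide's vectors v = (3, -1, 2) and w = (1, 2, 1). The following is a minimal sketch using only the standard library; `dot` and `norm` are hypothetical helper names.

```python
import math


def dot(v, w):
    """Inner product: sum of component-wise products."""
    return sum(x * y for x, y in zip(v, w))


def norm(v):
    """Length of v, defined as (v . v)^(1/2)."""
    return math.sqrt(dot(v, v))


v = [3, -1, 2]
w = [1, 2, 1]
inner = dot(v, w)
length_v = norm(v)

# Triangle inequality for this particular pair of vectors.
v_plus_w = [x + y for x, y in zip(v, w)]
triangle_holds = norm(v_plus_w) <= norm(v) + norm(w)
print(inner, length_v, triangle_holds)
```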
Angles between Vectors
The angle between vectors is defined by
\[
\cos\theta = \frac{v \cdot w}{\|v\|\,\|w\|}
= \frac{\sum_{i=1}^{n} v_i w_i}{\left(\sum_{i=1}^{n} v_i^2\right)^{1/2} \left(\sum_{i=1}^{n} w_i^2\right)^{1/2}}
\]
Example: Find the angle between the vectors
\[
v_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \text{ and } v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
\[
v_1 \cdot v_2 = 1, \quad \|v_1\| = 1, \quad \|v_2\| = \sqrt{2}
\]
\[
\cos\theta = \frac{1}{1 \cdot \sqrt{2}} \approx 0.707, \quad \theta = \pi/4
\]
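The worked example can be reproduced in a few lines. This sketch assumes nonzero input vectors and clamps the cosine to [-1, 1] to guard against floating-point rounding; `angle` is a hypothetical helper name.

```python
import math


def angle(v, w):
    """Angle in radians between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(v, w))
    norm_v = math.sqrt(sum(x * x for x in v))
    norm_w = math.sqrt(sum(y * y for y in w))
    cos_theta = dot / (norm_v * norm_w)
    # Clamp so that rounding errors cannot push the value outside acos's domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


theta = angle([0, 1], [1, 1])
print(theta, math.pi / 4)
```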
Projections of Vectors: Basic Idea
What is the projection of v onto w?
Formally
\[
x = \|v\| \cos\theta
= \|v\| \, \frac{v \cdot w}{\|v\|\,\|w\|}
= \frac{v \cdot w}{\|w\|}
\]
Note that x is a scalar, not a vector!
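A short sketch of the final formula, x = (v · w) / ‖w‖, follows. The input vectors (3, 4) and (2, 0) are assumed example values, not taken from the slides, and `scalar_projection` is a hypothetical helper name.

```python
import math


def scalar_projection(v, w):
    """Signed length x of the projection of v onto w."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_w = math.sqrt(sum(b * b for b in w))
    if norm_w == 0:
        raise ValueError("cannot project onto the zero vector")
    return dot / norm_w


# Projecting onto the first axis picks out the first component of v,
# regardless of how long w is.
x = scalar_projection([3, 4], [2, 0])
print(x)
```

The result is a single number, which matches the remark that x is not a vector.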
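Vector Transpose, Inner and Outer Products
Vector Transpose
\[
v = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}, \quad v^\top = \begin{bmatrix} 3 & 1 & 2 \end{bmatrix}
\]
Inner Product
\[
v^\top u = \begin{bmatrix} 3 & 1 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 4 \\ 1 \end{bmatrix} = 6
\]
Outer Product
\[
w v^\top = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix} \begin{bmatrix} 3 & 1 & 2 \end{bmatrix}
= \begin{bmatrix} 3 & 1 & 2 \\ 12 & 4 & 8 \\ 0 & 0 & 0 \end{bmatrix}
\]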
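The contrast between the two products (a scalar versus a matrix) is easy to see in code. A minimal sketch with the slide's vectors, using hypothetical helper names `inner` and `outer`:

```python
def inner(v, u):
    """v^T u: a single number."""
    return sum(a * b for a, b in zip(v, u))


def outer(w, v):
    """w v^T: a len(w) x len(v) matrix whose entry (i, j) is w_i * v_j."""
    return [[w_i * v_j for v_j in v] for w_i in w]


v = [3, 1, 2]
u = [0, 4, 1]
w = [1, 4, 0]
inner_vu = inner(v, u)
outer_wv = outer(w, v)
print(inner_vu)
print(outer_wv)
```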
2. Matrices
Matrices
Examples
\[
M = \begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 1 \end{bmatrix}, \quad 2 \times 3 \text{ matrix}
\]
\[
N = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad 3 \times 3 \text{ matrix}
\]
\[
P = \begin{bmatrix} 10 & -1 \\ -1 & 27 \end{bmatrix}, \quad 2 \times 2 \text{ matrix}
\]
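What can you do with Matrices?
Multiplication by Scalars
\[
3 \cdot M = 3 \begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 9 & 12 & 15 \\ 3 & 0 & 3 \end{bmatrix}
\]
Addition of Matrices (here N denotes a 2 × 3 matrix, not the 3 × 3 example from the previous slide)
\[
M + N = \begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 1 \end{bmatrix} + \begin{bmatrix} -1 & 0 & 2 \\ 4 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 7 \\ 5 & 1 & 0 \end{bmatrix}
\]
Addition is only defined for matrices with the same dimensions.
Transpose of a Matrix
\[
M^\top = \begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 1 \end{bmatrix}^\top = \begin{bmatrix} 3 & 1 \\ 4 & 0 \\ 5 & 1 \end{bmatrix}
\]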
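The three matrix operations can be sketched with nested lists (one inner list per row). The helper names `mat_scale`, `mat_add` and `transpose` are hypothetical; the dimension check in `mat_add` enforces the rule that both matrices must have the same shape.

```python
def mat_scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]


def mat_add(A, B):
    """Entry-wise sum; only defined for matrices of identical shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]


M = [[3, 4, 5], [1, 0, 1]]
N = [[-1, 0, 2], [4, 1, -1]]   # the 2 x 3 matrix used in the addition example
scaled = mat_scale(3, M)
total = mat_add(M, N)
M_t = transpose(M)
print(scaled, total, M_t)
```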
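Matrix-Vector multiplication
Multiplication of a Vector by a Matrix
\[
u = Wv = \begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
= \begin{bmatrix} 3 \cdot 1 + 4 \cdot 0 + 5 \cdot 2 \\ 1 \cdot 1 + 0 \cdot 0 + 1 \cdot 2 \end{bmatrix}
= \begin{bmatrix} 13 \\ 3 \end{bmatrix}
\]
Think of it as
\[
\begin{bmatrix} | & & | \\ w_1 & \cdots & w_N \\ | & & | \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_N \end{bmatrix} = v_1 w_1 + \dots + v_N w_N
\]
Dimensions: $W \in \mathbb{R}^{M \times N}$, $v \in \mathbb{R}^{N \times 1}$, $u \in \mathbb{R}^{M \times 1}$
Hence
\[
u = v_1 w_1 + v_2 w_2 + v_3 w_3
= 1 \begin{bmatrix} 3 \\ 1 \end{bmatrix} + 0 \begin{bmatrix} 4 \\ 0 \end{bmatrix} + 2 \begin{bmatrix} 5 \\ 1 \end{bmatrix}
= \begin{bmatrix} 13 \\ 3 \end{bmatrix}
\]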
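The two ways of reading a matrix-vector product (one inner product per row, or a weighted sum of columns) can be compared directly. This is a sketch with hypothetical helper names `matvec_rows` and `matvec_columns`, using the slide's W and v.

```python
def matvec_rows(W, v):
    """Row view: entry i of the result is the inner product of row i with v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def matvec_columns(W, v):
    """Column view: the result is v_1 * w_1 + ... + v_N * w_N over the columns w_j."""
    result = [0] * len(W)
    for v_j, column in zip(v, zip(*W)):
        for i, entry in enumerate(column):
            result[i] += v_j * entry
    return result


W = [[3, 4, 5], [1, 0, 1]]
v = [1, 0, 2]
u_rows = matvec_rows(W, v)
u_columns = matvec_columns(W, v)
print(u_rows, u_columns)
```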
Matrix-Matrix multiplication
Multiplication of a Matrix by a Matrix
\[
C = AB = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 1 + 2 \cdot 3 + 3 \cdot 5 & 1 \cdot 2 + 2 \cdot 4 + 3 \cdot 6 \\ 4 \cdot 1 + 5 \cdot 3 + 6 \cdot 5 & 4 \cdot 2 + 5 \cdot 4 + 6 \cdot 6 \end{bmatrix}
= \begin{bmatrix} 22 & 28 \\ 49 & 64 \end{bmatrix}
\]
Dimensions: $A \in \mathbb{R}^{M \times N}$, $B \in \mathbb{R}^{N \times K}$, $C \in \mathbb{R}^{M \times K}$
Verifying that the dimensions match is an important sanity check when working with matrices.
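The dimension rule can be built into the multiplication itself. The sketch below uses a hypothetical helper `matmul` that raises an error when the inner dimensions disagree, and reproduces the example product.

```python
def matmul(A, B):
    """Product of an M x N matrix A and an N x K matrix B (nested lists)."""
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply a {rows_a}x{cols_a} matrix by a {rows_b}x{cols_b} matrix"
        )
    return [
        [sum(A[i][k] * B[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 2], [3, 4], [5, 6]]
C = matmul(A, B)
print(C)

# A is 2 x 3, so A times A is not defined and the check should trigger.
try:
    matmul(A, A)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```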
Matrix Inverse
Definition for square matrices $W \in \mathbb{R}^{n \times n}$
\[
W^{-1} W = W W^{-1} = I
\]
\[
W^{-1} = \frac{1}{\det W} \, C^\top
\]
where C is the cofactor matrix of W.
If $W^{-1}$ exists, we say W is nonsingular.
Matrix Inverse
A square matrix is invertible if and only if its determinant is nonzero.
For an intuition, consider the following linear transformation matrix
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \det A = 0
\]
Applying this transformation to a vector gives
\[
v' = Av = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
= v_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + v_2 \begin{bmatrix} 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} v_1 \\ 0 \end{bmatrix}
= \begin{bmatrix} v'_1 \\ v'_2 \end{bmatrix}
\]
This transformation removes one dimension from v: it projects v onto a point on the first axis.
Matrix Inverse
Can we recover the initial vector v from A and $v' = \begin{bmatrix} v'_1 & v'_2 \end{bmatrix}^\top$?
We have the following linear system of equations
\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
= \begin{bmatrix} v_1 \\ 0 \end{bmatrix}
= \begin{bmatrix} v'_1 \\ v'_2 \end{bmatrix}
\]
While there is only one solution for $v_1$ (namely $v_1 = v'_1$), there are infinitely many solutions for $v_2$. This means we cannot recover the initial value of $v_2$.
In contrast, for a nonsingular matrix, such as the identity matrix, the system has exactly one solution.
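The loss of information can be made concrete: two different inputs are mapped to the same output, so no inverse can tell them apart. In this sketch the input vectors (2, 5) and (2, -7) are assumed example values and `matvec` is a hypothetical helper name.

```python
def matvec(A, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


A = [[1, 0], [0, 0]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Same first component, different second component.
image_1 = matvec(A, [2, 5])
image_2 = matvec(A, [2, -7])
print(det_A, image_1, image_2)
```

Since both inputs produce the same image, the second component cannot be recovered from the output.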
Matrix Inverse
Example
\[
W = \begin{bmatrix} 1 & 1/2 \\ -1 & 1 \end{bmatrix}, \quad
W^{-1} = \begin{bmatrix} 2/3 & -1/3 \\ 2/3 & 2/3 \end{bmatrix}
\]
Verify it!
\[
W W^{-1} = \begin{bmatrix} 1 & 1/2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2/3 & -1/3 \\ 2/3 & 2/3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
\[
W^{-1} W = \begin{bmatrix} 2/3 & -1/3 \\ 2/3 & 2/3 \end{bmatrix} \begin{bmatrix} 1 & 1/2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
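The verification can also be done mechanically. The sketch below applies the cofactor formula to the 2 × 2 case only and uses exact fractions from the standard library to avoid rounding; `inverse_2x2` and `matmul` are hypothetical helper names.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two nested-list matrices with matching inner dimension."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(W):
    """Inverse via (1 / det W) * C^T, where C is the cofactor matrix."""
    (a, b), (c, d) = W
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    cofactors = [[d, -c], [-b, a]]
    adjugate = [list(col) for col in zip(*cofactors)]  # transpose of C
    return [[entry / det for entry in row] for row in adjugate]


W = [[Fraction(1), Fraction(1, 2)], [Fraction(-1), Fraction(1)]]
W_inv = inverse_2x2(W)
left = matmul(W_inv, W)
right = matmul(W, W_inv)
print(W_inv, left, right)
```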
Matrix Pseudoinverse
How can we invert a matrix $J \in \mathbb{R}^{n \times m}$ that is not square?
Left Pseudoinverse
\[
J^{\#} J = \underbrace{(J^\top J)^{-1} J^\top}_{\text{left multiplied}} J = I_m
\]
Works if J has full column rank.
Right Pseudoinverse
\[
J J^{\#} = J \underbrace{J^\top (J J^\top)^{-1}}_{\text{right multiplied}} = I_n
\]
Works if J has full row rank.
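A small worked case of the left pseudoinverse follows. The 3 × 2 matrix J below is an assumed example with full column rank (it is not from the slides), so the matrix that has to be inverted is only 2 × 2. Exact fractions keep the result free of rounding, and all helper names are hypothetical.

```python
from fractions import Fraction


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must agree")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(W):
    (a, b), (c, d) = W
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed example: n = 3 rows, m = 2 columns, full column rank.
J = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]

J_t = transpose(J)
J_left = matmul(inverse_2x2(matmul(J_t, J)), J_t)   # (J^T J)^{-1} J^T
identity_m = matmul(J_left, J)                      # should equal I_m with m = 2
print(J_left)
print(identity_m)
```

Multiplying in the other order, J times this left pseudoinverse, gives a 3 × 3 matrix that is in general not the identity, which is why the left and right variants are distinguished.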
3. Operations and Linear Transformations