Review of Linear Algebra


  1. Review of Linear Algebra. Fereshte Khani, April 9, 2020

  2. Outline: 1. Basic Concepts and Notation; 2. Matrix Multiplication; 3. Operations and Properties; 4. Matrix Calculus

  3. Basic Concepts and Notation

  4. Basic Notation
     - By x ∈ R^n, we denote a vector with n entries, x = (x_1, x_2, ..., x_n)^T, written as a column vector.
     - By A ∈ R^{m×n} we denote a matrix with m rows and n columns, where the entries of A are real numbers. We can write A column-wise as A = [a_1 a_2 ... a_n], where a_j ∈ R^m is the j-th column, or row-wise in terms of its rows a_1^T, a_2^T, ..., a_m^T, where a_i^T ∈ R^{1×n} is the i-th row.
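
A quick NumPy sketch (illustrative, not part of the slides) of how such vectors and matrices are commonly represented; the particular values are arbitrary:

```python
import numpy as np

# A vector x in R^3 (NumPy treats 1-D arrays as vectors).
x = np.array([1.0, 2.0, 3.0])

# A matrix A in R^{2x3}: 2 rows, 3 columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(x.shape)   # (3,)
print(A.shape)   # (2, 3)
print(A[:, 0])   # first column a_1
print(A[0, :])   # first row a_1^T
```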

  5. The Identity Matrix
     The identity matrix, denoted I ∈ R^{n×n}, is a square matrix with ones on the diagonal and zeros everywhere else. That is,
     I_{ij} = 1 if i = j, and I_{ij} = 0 if i ≠ j.
     It has the property that for all A ∈ R^{m×n}, AI = A = IA (with the identity of the appropriate dimension in each product).

  6. Diagonal Matrices
     A diagonal matrix is a matrix where all non-diagonal elements are 0. This is typically denoted D = diag(d_1, d_2, ..., d_n), with
     D_{ij} = d_i if i = j, and D_{ij} = 0 if i ≠ j.
     Clearly, I = diag(1, 1, ..., 1).
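
A minimal NumPy sketch checking the identity and diagonal matrices from the last two slides; the matrix A below is an arbitrary example:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # A in R^{2x3}

I3 = np.eye(3)                         # 3x3 identity
I2 = np.eye(2)                         # 2x2 identity
print(np.allclose(A @ I3, A))          # AI = A -> True
print(np.allclose(I2 @ A, A))          # IA = A -> True

D = np.diag([1.0, 2.0, 3.0])           # diag(1, 2, 3)
print(D)
```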

  7. Vector-Vector Product
     - Inner product (or dot product): for x, y ∈ R^n,
       x^T y ∈ R, with x^T y = Σ_{i=1}^n x_i y_i.
     - Outer product: for x ∈ R^m, y ∈ R^n,
       xy^T ∈ R^{m×n}, the matrix whose (i, j) entry is (xy^T)_{ij} = x_i y_j.
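
An illustrative NumPy sketch of the two products (the vectors are arbitrary examples):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

inner = x @ y              # x^T y, a scalar: 1*4 + 2*5 + 3*6 = 32
outer = np.outer(x, y)     # xy^T, a 3x3 matrix with entries x_i * y_j

print(inner)               # 32.0
print(outer.shape)         # (3, 3)
print(outer[1, 2])         # x_2 * y_3 = 2 * 6 = 12
```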

  8. Matrix-Vector Product
     - If we write A by rows, then we can express y = Ax entry-wise: the i-th entry of y is the inner product of the i-th row of A with x, that is, y_i = a_i^T x.
     - If we write A by columns, then we have
       y = Ax = a_1 x_1 + a_2 x_2 + ... + a_n x_n.    (1)
       In other words, y is a linear combination of the columns of A, with coefficients given by the entries of x.
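
A short NumPy check of both views of Ax (the matrix and vector below are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])             # A in R^{3x2}, columns a_1, a_2
x = np.array([10.0, -1.0])

y = A @ x                              # matrix-vector product

# Column view: y = x_1 * a_1 + x_2 * a_2
y_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row view: y_i = a_i^T x
y_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

print(np.allclose(y, y_cols), np.allclose(y, y_rows))   # True True
```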

  9. Matrix-Vector Product
     It is also possible to multiply on the left by a row vector: y^T = x^T A, with x ∈ R^m and A ∈ R^{m×n}.
     - If we write A by columns, then the j-th entry of y^T is the inner product of x with the j-th column:
       y^T = x^T A = [x^T a_1  x^T a_2  ...  x^T a_n].
     - Expressing A in terms of its rows, we have
       y^T = x^T A = x_1 a_1^T + x_2 a_2^T + ... + x_m a_m^T.
       In other words, y^T is a linear combination of the rows of A, with coefficients given by the entries of x.
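
The mirror-image check in NumPy, again with arbitrary example values:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])             # A in R^{3x2}, rows a_1^T, a_2^T, a_3^T
x = np.array([1.0, 0.0, -2.0])         # x in R^3

yT = x @ A                             # y^T = x^T A, a row vector in R^2

# Row view: y^T = x_1 a_1^T + x_2 a_2^T + x_3 a_3^T
yT_rows = x[0] * A[0, :] + x[1] * A[1, :] + x[2] * A[2, :]

print(np.allclose(yT, yT_rows))        # True
```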

  10. Matrix-Matrix Multiplication (different views)
      1. As a set of vector-vector products: for A ∈ R^{m×n} with rows a_1^T, ..., a_m^T and B ∈ R^{n×p} with columns b_1, ..., b_p, the (i, j) entry of C = AB is the inner product
         C_{ij} = a_i^T b_j.

  11. Matrix-Matrix Multiplication (different views)
      2. As a sum of outer products: writing A ∈ R^{m×n} by columns a_1, ..., a_n and B ∈ R^{n×p} by rows b_1^T, ..., b_n^T,
         C = AB = Σ_{i=1}^n a_i b_i^T,
         a sum of n outer products, each of size m×p.

  12. Matrix-Matrix Multiplication (different views)
      3. As a set of matrix-vector products: writing B by columns,
         C = AB = A [b_1 b_2 ... b_p] = [Ab_1 Ab_2 ... Ab_p].    (2)
         Here the i-th column of C is given by the matrix-vector product with the vector on the right, c_i = Ab_i. These matrix-vector products can in turn be interpreted using both viewpoints given previously.

  13. Matrix-Matrix Multiplication (different views)
      4. As a set of vector-matrix products: writing A by rows, the i-th row of C = AB is the vector-matrix product of the i-th row of A with B, that is, c_i^T = a_i^T B.
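
An illustrative NumPy sketch that checks all four views of C = AB from the last four slides against the built-in product; the random matrices are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))            # A in R^{3x4}
B = rng.normal(size=(4, 2))            # B in R^{4x2}
C = A @ B                              # reference product, R^{3x2}

# View 1: C_ij = a_i^T b_j (inner products of rows of A with columns of B).
C1 = np.array([[A[i, :] @ B[:, j] for j in range(B.shape[1])]
               for i in range(A.shape[0])])

# View 2: C = sum_i a_i b_i^T (outer products of columns of A with rows of B).
C2 = sum(np.outer(A[:, i], B[i, :]) for i in range(A.shape[1]))

# View 3: the columns of C are the matrix-vector products A b_j.
C3 = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# View 4: the rows of C are the vector-matrix products a_i^T B.
C4 = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

print(all(np.allclose(C, Ck) for Ck in (C1, C2, C3, C4)))   # True
```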

  14. Matrix-Matrix Multiplication (properties)
      - Associative: (AB)C = A(BC).
      - Distributive: A(B + C) = AB + AC.
      - In general, not commutative; that is, it can be the case that AB ≠ BA. (For example, if A ∈ R^{m×n} and B ∈ R^{n×q}, the matrix product BA does not even exist if m and q are not equal!)
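
A small NumPy check of these properties on arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 3))
C = rng.normal(size=(3, 3))
D = rng.normal(size=(3, 2))

print(np.allclose((A @ B) @ D, A @ (B @ D)))     # associative: True
print(np.allclose(A @ (B + C), A @ B + A @ C))   # distributive: True
print(np.allclose(B @ C, C @ B))                 # generally not commutative: False
```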

  15. Operations and Properties

  16. The Transpose
      The transpose of a matrix results from “flipping” the rows and columns. Given a matrix A ∈ R^{m×n}, its transpose, written A^T ∈ R^{n×m}, is the n×m matrix whose entries are given by (A^T)_{ij} = A_{ji}.
      The following properties of transposes are easily verified:
      - (A^T)^T = A
      - (AB)^T = B^T A^T
      - (A + B)^T = A^T + B^T
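
A quick NumPy verification of the three transpose properties on arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))
C = rng.normal(size=(3, 4))

print(np.allclose(A.T.T, A))                 # (A^T)^T = A
print(np.allclose((A @ B).T, B.T @ A.T))     # (AB)^T = B^T A^T
print(np.allclose((A + C).T, A.T + C.T))     # (A + B)^T = A^T + B^T
```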

  17. Trace
      The trace of a square matrix A ∈ R^{n×n}, denoted tr A, is the sum of the diagonal elements of the matrix:
      tr A = Σ_{i=1}^n A_{ii}.
      The trace has the following properties:
      - For A ∈ R^{n×n}, tr A = tr A^T.
      - For A, B ∈ R^{n×n}, tr(A + B) = tr A + tr B.
      - For A ∈ R^{n×n}, t ∈ R, tr(tA) = t tr A.
      - For A, B such that AB is square, tr AB = tr BA.
      - For A, B, C such that ABC is square, tr ABC = tr BCA = tr CAB, and so on for the product of more matrices.
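
An illustrative NumPy check of a few trace properties, including the cyclic one, on arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 5))
B = rng.normal(size=(5, 3))    # AB is 3x3 and BA is 5x5, both square

# tr(AB) = tr(BA), even though AB and BA have different sizes.
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))       # True

S = rng.normal(size=(3, 3))
print(np.isclose(np.trace(S), np.trace(S.T)))             # tr A = tr A^T
print(np.isclose(np.trace(2.5 * S), 2.5 * np.trace(S)))   # tr(tA) = t tr A
```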

  18. Norms
      A norm of a vector, ‖x‖, is informally a measure of the “length” of the vector. More formally, a norm is any function f : R^n → R that satisfies 4 properties:
      1. For all x ∈ R^n, f(x) ≥ 0 (non-negativity).
      2. f(x) = 0 if and only if x = 0 (definiteness).
      3. For all x ∈ R^n, t ∈ R, f(tx) = |t| f(x) (homogeneity).
      4. For all x, y ∈ R^n, f(x + y) ≤ f(x) + f(y) (triangle inequality).

  19. Examples of Norms
      The commonly used Euclidean or ℓ2 norm,
      ‖x‖_2 = sqrt(Σ_{i=1}^n x_i^2).
      The ℓ1 norm,
      ‖x‖_1 = Σ_{i=1}^n |x_i|.
      The ℓ∞ norm,
      ‖x‖_∞ = max_i |x_i|.
      In fact, all three norms presented so far are examples of the family of ℓp norms, which are parameterized by a real number p ≥ 1 and defined as
      ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p}.
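
A short NumPy sketch computing these norms for an arbitrary example vector:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

l2 = np.linalg.norm(x)                  # Euclidean / l2 norm: sqrt(9 + 16) = 5
l1 = np.linalg.norm(x, ord=1)           # l1 norm: 3 + 4 = 7
linf = np.linalg.norm(x, ord=np.inf)    # l-infinity norm: max |x_i| = 4
l3 = np.linalg.norm(x, ord=3)           # a general lp norm with p = 3

print(l2, l1, linf)                                # 5.0 7.0 4.0
print(np.isclose(l3, (3**3 + 4**3) ** (1 / 3)))    # True
```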

  20. Matrix Norms
      Norms can also be defined for matrices, such as the Frobenius norm,
      ‖A‖_F = sqrt(Σ_{i=1}^m Σ_{j=1}^n A_{ij}^2) = sqrt(tr(A^T A)).
      Many other norms exist, but they are beyond the scope of this review.
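
An illustrative NumPy check that the two expressions for the Frobenius norm agree, using an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(3, 4))

fro = np.linalg.norm(A, ord='fro')        # Frobenius norm
via_entries = np.sqrt(np.sum(A ** 2))     # sqrt of the sum of squared entries
via_trace = np.sqrt(np.trace(A.T @ A))    # sqrt(tr(A^T A))

print(np.isclose(fro, via_entries), np.isclose(fro, via_trace))   # True True
```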

  21. Linear Independence
      A set of vectors {x_1, x_2, ..., x_n} ⊂ R^m is said to be (linearly) dependent if one vector belonging to the set can be represented as a linear combination of the remaining vectors; that is, if
      x_n = Σ_{i=1}^{n-1} α_i x_i
      for some scalar values α_1, ..., α_{n-1} ∈ R; otherwise, the vectors are (linearly) independent.
      Example:
      x_1 = (1, 2, -1)^T,  x_2 = (4, 1, 3)^T,  x_3 = (2, -3, 5)^T
      are linearly dependent because x_3 = -2 x_1 + x_2.
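
A quick NumPy check of the example above; the rank-based test in the second half is one standard way to detect dependence numerically:

```python
import numpy as np

x1 = np.array([1.0, 2.0, -1.0])
x2 = np.array([4.0, 1.0, 3.0])
x3 = np.array([2.0, -3.0, 5.0])

print(np.allclose(x3, -2 * x1 + x2))      # True: x3 = -2 x1 + x2

# Equivalently, stacking the vectors as columns gives a matrix of rank 2 < 3,
# so the set is linearly dependent.
X = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(X))           # 2
```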

  22. Rank of a Matrix
      - The column rank of a matrix A ∈ R^{m×n} is the size of the largest subset of columns of A that constitute a linearly independent set.
      - The row rank is the largest number of rows of A that constitute a linearly independent set.
      - For any matrix A ∈ R^{m×n}, it turns out that the column rank of A is equal to the row rank of A (prove it yourself!), and so both quantities are referred to collectively as the rank of A, denoted rank(A).
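
An illustrative NumPy sketch: a matrix constructed to have rank 2, for which the column rank (rank of A) and the row rank (rank of A^T) agree:

```python
import numpy as np

rng = np.random.default_rng(5)
# Build a 4x6 matrix of rank 2: every row is a linear combination of two basis rows.
basis = rng.normal(size=(2, 6))
coeffs = rng.normal(size=(4, 2))
A = coeffs @ basis

# Column rank equals row rank: rank(A) = rank(A^T).
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))   # 2 2
```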

  23. Properties of the Rank
      - For A ∈ R^{m×n}, rank(A) ≤ min(m, n). If rank(A) = min(m, n), then A is said to be full rank.
      - For A ∈ R^{m×n}, rank(A) = rank(A^T).
      - For A ∈ R^{m×n}, B ∈ R^{n×p}, rank(AB) ≤ min(rank(A), rank(B)).
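
A short NumPy check of these properties on arbitrary random matrices (random Gaussian matrices are almost surely full rank):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 5))    # almost surely rank(A) = min(3, 5) = 3, i.e. full rank
B = rng.normal(size=(5, 2))    # almost surely rank(B) = 2

rA, rB = np.linalg.matrix_rank(A), np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)

print(rA, np.linalg.matrix_rank(A.T))   # rank(A) = rank(A^T): 3 3
print(rA, rB, rAB)                      # 3 2 2
print(rAB <= min(rA, rB))               # rank(AB) <= min(rank(A), rank(B)): True
```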
