Numerical Analysis of the Matrix Logarithm

Nick Higham, School of Mathematics, The University of Manchester. higham@ma.man.ac.uk, http://www.ma.man.ac.uk/~higham/. Computational Methods with Applications, Harrachov 2007.


  1. Numerical Analysis of the Matrix Logarithm. Nick Higham, School of Mathematics, The University of Manchester. higham@ma.man.ac.uk, http://www.ma.man.ac.uk/~higham/. Computational Methods with Applications, Harrachov 2007.

  2. Defining f(A) Applications Theory Methods Outline: 1. Definition of $\log(A)$; 2. Applications; 3. Theory; 4. Numerical methods. MIMS Nick Higham Matrix Logarithm 2 / 42

  3. Matrix Logarithm. A logarithm of $A \in \mathbb{C}^{n \times n}$ is any matrix $X$ such that $e^X = A$. Topics: existence; representation and classification; computation; conditioning. First, approach via theory of matrix functions...

  4. Multiplicity of Definitions. "There have been proposed in the literature since 1880 eight distinct definitions of a matric function, by Weyr, Sylvester and Buchheim, Giorgi, Cartan, Fantappiè, Cipolla, Schwerdtfeger and Richter." (R. F. Rinehart, The Equivalence of Definitions of a Matric Function, Amer. Math. Monthly, 1955)

  5. Jordan Canonical Form.
     $$Z^{-1} A Z = J = \operatorname{diag}(J_1, \dots, J_p), \qquad J_k = \begin{bmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix} \in \mathbb{C}^{m_k \times m_k}.$$
     Definition: $f(A) = Z f(J) Z^{-1} = Z \operatorname{diag}(f(J_k)) Z^{-1}$, where
     $$f(J_k) = \begin{bmatrix} f(\lambda_k) & f'(\lambda_k) & \cdots & \dfrac{f^{(m_k-1)}(\lambda_k)}{(m_k-1)!} \\ & f(\lambda_k) & \ddots & \vdots \\ & & \ddots & f'(\lambda_k) \\ & & & f(\lambda_k) \end{bmatrix}.$$
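As a concrete instance of the Jordan-form definition, the sketch below builds $\log(J_k)$ for a single Jordan block with a positive real eigenvalue, using $f^{(j)}(\lambda)/j! = (-1)^{j-1}/(j\lambda^j)$ for $f = \log$, and then exponentiates the result to recover the block. The helper names `log_jordan_block`, `matmul` and `expm_taylor` are illustrative and not from the slides; the plain Taylor exponential is only adequate for small matrices of modest norm.

```python
import math

def log_jordan_block(lam, m):
    """Principal log of the m-by-m Jordan block with eigenvalue lam > 0."""
    X = [[0.0] * m for _ in range(m)]
    for i in range(m):
        X[i][i] = math.log(lam)
        for j in range(i + 1, m):
            d = j - i  # index of the superdiagonal
            # f^(d)(lam) / d! for f = log
            X[i][j] = (-1) ** (d - 1) / (d * lam ** d)
    return X

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(A, terms=40):
    """Truncated Taylor series for exp(A); illustrative, not robust."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

X = log_jordan_block(2.0, 3)   # log of J_3(2)
E = expm_taylor(X)             # should reproduce J_3(2) up to rounding
```

Here `E` agrees with the block having 2 on the diagonal and 1 on the superdiagonal, which is the corollary $\exp(\log(A)) = A$ in the smallest nontrivial case.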

  6. Interpolation Definition (Sylvester, 1883; Buchheim, 1886). Let $A$ have distinct eigenvalues $\lambda_1, \dots, \lambda_s$, and let $n_i$ be the maximum size of the Jordan blocks for $\lambda_i$. Then $f(A) = p(A)$, where $p$ is the unique Hermite interpolating polynomial of degree less than $\sum_{i=1}^{s} n_i$ satisfying $p^{(j)}(\lambda_i) = f^{(j)}(\lambda_i)$, $j = 0 : n_i - 1$, $i = 1 : s$.
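A small worked case may help connect this definition to the Jordan-form one. Take $A = J_2(\lambda)$, a single $2 \times 2$ Jordan block (an illustrative choice, not from the slides), so $s = 1$, $n_1 = 2$ and $p$ has degree less than 2:

```latex
% Hermite conditions: p(\lambda) = f(\lambda), \; p'(\lambda) = f'(\lambda)
p(t) = f(\lambda) + f'(\lambda)\,(t - \lambda),
\qquad
f(A) = p(A) = f(\lambda) I + f'(\lambda)\,(A - \lambda I)
     = \begin{bmatrix} f(\lambda) & f'(\lambda) \\ 0 & f(\lambda) \end{bmatrix}.
% For f = \log with \lambda \ne 0 not on the negative real axis:
\log(A) = \begin{bmatrix} \log\lambda & 1/\lambda \\ 0 & \log\lambda \end{bmatrix}.
```

This matches the upper triangular Toeplitz form of $f(J_k)$ given by the Jordan canonical form definition.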

  7. Cauchy Integral Theorem Definition.
     $$f(A) = \frac{1}{2\pi i} \int_{\Gamma} f(z)\,(zI - A)^{-1}\,dz,$$
     where $f$ is analytic on and inside a closed contour $\Gamma$ that encloses $\lambda(A)$.
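The contour integral can be approximated directly. The sketch below applies the trapezoidal rule on a circle to a $2 \times 2$ example; the matrix, the circle (centre 2.5, radius 1.5) and the function name `log_via_contour` are assumptions chosen for illustration so that the contour encloses the eigenvalues 2 and 3 while staying away from the branch cut of the logarithm. It is not a method taken from the slides.

```python
import cmath
import math

def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def log_via_contour(A, center, radius, n_points=100):
    """Trapezoidal approximation of (1/(2*pi*i)) * integral of log(z) (zI - A)^{-1} dz.

    With z = center + w, w = radius * e^{i*theta}, dz = i * w * dtheta,
    the rule reduces to the average of log(z) * (zI - A)^{-1} * w.
    """
    F = [[0j, 0j], [0j, 0j]]
    for k in range(n_points):
        w = radius * cmath.exp(2j * math.pi * k / n_points)
        z = center + w
        R = inv2([[z - A[0][0], -A[0][1]], [-A[1][0], z - A[1][1]]])
        fz = cmath.log(z)  # principal branch; the circle avoids the cut
        for i in range(2):
            for j in range(2):
                F[i][j] += fz * R[i][j] * w / n_points
    return F

A = [[2.0, 1.0], [0.0, 3.0]]
L = log_via_contour(A, center=2.5, radius=1.5)
```

For this triangular `A` the exact logarithm has diagonal entries $\log 2$ and $\log 3$ and off-diagonal entry $\log 3 - \log 2$, which the quadrature reproduces to rounding level because the integrand is analytic and periodic in the angle.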

  8. Equivalence of Definitions. Theorem: The three definitions are equivalent, modulo the analyticity assumption for the Cauchy integral definition.

  9. Composite Functions. Theorem: $f(t) = g(h(t)) \Rightarrow f(A) = g(h(A))$, provided the latter matrix is defined. Corollary: $\exp(\log(A)) = A$ when $\log(A)$ is defined.

  11. Application: Markov Models. Time-homogeneous continuous-time Markov process with transition probability matrix $P(t) \in \mathbb{R}^{n \times n}$. The transition intensity matrix $Q$ is related to $P$ by $P(t) = e^{Qt}$. The elements of $Q$ satisfy $q_{ij} \ge 0$ for $i \ne j$ and $\sum_{j=1}^{n} q_{ij} = 0$. Embeddability problem: when does a given stochastic $P$ have a real logarithm $Q$ that is an intensity matrix?
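To make the intensity-matrix conditions concrete, the sketch below computes a logarithm of a small stochastic matrix and tests the two conditions on $Q$. It assumes $P$ is close enough to $I$ that the series $\log(I + G) = \sum_{k \ge 1} (-1)^{k+1} G^k / k$ with $G = P - I$ converges (for example $\|G\| < 1$), in which case it yields the principal logarithm. The function names and the example matrix are illustrative. A negative result for the principal logarithm would not settle embeddability by itself, since another real logarithm might qualify.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def logm_series(P, terms=200):
    """log(P) via the series in G = P - I; assumes the series converges."""
    n = len(P)
    G = [[P[i][j] - (i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in G]          # holds G^k
    Q = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 else -1.0
        Q = [[Q[i][j] + sign * power[i][j] / k for j in range(n)]
             for i in range(n)]
        power = matmul(power, G)
    return Q

def is_intensity_matrix(Q, tol=1e-12):
    """Off-diagonal entries nonnegative and every row sum zero (up to tol)."""
    n = len(Q)
    off_diag_ok = all(Q[i][j] >= -tol
                      for i in range(n) for j in range(n) if i != j)
    rows_ok = all(abs(sum(row)) <= tol for row in Q)
    return off_diag_ok and rows_ok

P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical transition matrix P(1)
Q = logm_series(P)
embeddable_via_principal_log = is_intensity_matrix(Q)
```

For this `P` the check succeeds, so `Q` is a valid intensity matrix with $e^{Q} = P$.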

  13. The Average Eye. The first order character of an optical system is characterized by the transference matrix $T = \begin{bmatrix} S & \delta \\ 0 & 1 \end{bmatrix} \in \mathbb{R}^{5 \times 5}$, where $S \in \mathbb{R}^{4 \times 4}$ is symplectic: $S^T J S = J$, where $J = \begin{bmatrix} 0 & I_2 \\ -I_2 & 0 \end{bmatrix}$. The average $m^{-1} \sum_{i=1}^{m} T_i$ is not a transference matrix. Harris (2005) proposes the average $\exp\bigl(m^{-1} \sum_{i=1}^{m} \log(T_i)\bigr)$. For Hermitian positive definite $A$ and $B$, Arsigny et al. (2007) define the log-Euclidean mean $E(A, B) = \exp\bigl(\tfrac{1}{2}(\log(A) + \log(B))\bigr)$.
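The log-Euclidean mean is easy to evaluate once $\log$ and $\exp$ of a symmetric matrix are available. The sketch below restricts itself to real symmetric positive definite $2 \times 2$ matrices, where the eigendecomposition has a closed form (a rotation by angle $\theta$ with $\tan 2\theta = 2b/(a-d)$); this restriction, the helper names and the example matrices are assumptions made to keep the example dependency-free.

```python
import math

def sym2x2_fun(A, f):
    """Apply scalar f to a real symmetric 2x2 matrix via its eigendecomposition."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s   # eigenvector (c, s)
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c   # eigenvector (-s, c)
    f1, f2 = f(lam1), f(lam2)
    off = (f1 - f2) * c * s
    return [[f1 * c * c + f2 * s * s, off],
            [off, f1 * s * s + f2 * c * c]]

def log_euclidean_mean(A, B):
    """E(A, B) = exp((log(A) + log(B)) / 2) for symmetric positive definite A, B."""
    LA = sym2x2_fun(A, math.log)
    LB = sym2x2_fun(B, math.log)
    M = [[0.5 * (LA[i][j] + LB[i][j]) for j in range(2)] for i in range(2)]
    return sym2x2_fun(M, math.exp)

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, -1.0], [-1.0, 1.0]]
E = log_euclidean_mean(A, B)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
```

Two sanity checks follow from the definition: $E(A, A) = A$, and $\det E(A, B) = \sqrt{\det A \, \det B}$ because $\det e^{M} = e^{\operatorname{trace}(M)}$.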

  15. Logs of $A = I_3$.
     $$B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \qquad D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
     $e^B = e^C = e^D = I_3$. $\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$.

  16. Principal Log and $p$th Root. Let $A \in \mathbb{C}^{n \times n}$ have no eigenvalues on $\mathbb{R}^-$. Principal log: $X = \log(A)$ denotes the unique $X$ such that $e^X = A$ and $-\pi < \operatorname{Im} \lambda(X) < \pi$. For the next 2 slides only, allow $\operatorname{Im} \lambda(X) = \pi$. Principal $p$th root: for integer $p > 0$, $X = A^{1/p}$ is the unique $X$ such that $X^p = A$ and $-\pi/p < \arg(\lambda(X)) < \pi/p$.

  17. All Solutions of $e^X = A$. Theorem (Gantmacher): Let $A \in \mathbb{C}^{n \times n}$ be nonsingular with Jordan canonical form $Z^{-1} A Z = J = \operatorname{diag}(J_1, J_2, \dots, J_p)$. All solutions to $e^X = A$ are given by
     $$X = Z\,U \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr)\,U^{-1} Z^{-1},$$
     where $L_k^{(j_k)} = \log(J_k(\lambda_k)) + 2 j_k \pi i\, I_{m_k}$, $j_k \in \mathbb{Z}$ is arbitrary, and $U$ is an arbitrary nonsingular matrix that commutes with $J$.

  18. All Solutions of $e^X = A$: Classified. Theorem: Let $A \in \mathbb{C}^{n \times n}$ be nonsingular with $p$ Jordan blocks and $s$ distinct eigenvalues. Then $e^X = A$ has a countable infinity of solutions that are primary functions of $A$:
     $$X_j = Z \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr) Z^{-1},$$
     where $\lambda_i = \lambda_k$ implies $j_i = j_k$. If $s < p$ then $e^X = A$ has non-primary solutions
     $$X_j(U) = Z\,U \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr)\,U^{-1} Z^{-1},$$
     where $j_k \in \mathbb{Z}$ is arbitrary, $U$ is an arbitrary nonsingular matrix with $UJ = JU$, and for each $j$ there exist $i$ and $k$ such that $\lambda_i = \lambda_k$ while $j_i \ne j_k$.

  19. Logs of $A = I_3$.
     $$C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \qquad D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
     $e^0 = e^C = e^D = I_3$. $\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$. With
     $$U = \begin{bmatrix} 1 & \alpha & 0 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{bmatrix}, \quad \alpha \in \mathbb{C},$$
     $$X = U \operatorname{diag}(2\pi i, -2\pi i, 0)\, U^{-1} = 2\pi i \begin{bmatrix} 1 & -2\alpha & 2\alpha^2 \\ 0 & -1 & \alpha \\ 0 & 0 & 0 \end{bmatrix}.$$
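These nonzero logarithms of $I_3$ can be checked numerically. The sketch below exponentiates $C$, $D$ and $X = U \operatorname{diag}(2\pi i, -2\pi i, 0) U^{-1}$ using a basic scaling-and-squaring scheme with a truncated Taylor series. The `expm` helper, the choice $\alpha = 0.5$ and the explicit inverse of $U$ are assumptions made for illustration; this is not a production-quality exponential.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, taylor_terms=20):
    """exp(A) by scaling and squaring with a Taylor series (illustrative only)."""
    n = len(A)
    norm = max(sum(abs(x) for x in row) for row in A)
    s = 0
    while norm / 2 ** s > 0.5:      # scale so the Taylor series converges fast
        s += 1
    scaled = [[x / 2 ** s for x in row] for row in A]
    E = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, taylor_terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        E = [[E[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(s):              # undo the scaling by repeated squaring
        E = matmul(E, E)
    return E

def dist_to_identity(M):
    n = len(M)
    return max(abs(M[i][j] - (i == j)) for i in range(n) for j in range(n))

tp = 2.0 * math.pi
C = [[0, tp - 1, 1], [-tp, 0, 0], [-tp, 0, 0]]
D = [[0, tp, 1], [-tp, 0, 0], [0, 0, 0]]

alpha = 0.5                          # any complex alpha may be used here
U = [[1, alpha, 0], [0, 1, alpha], [0, 0, 1]]
U_inv = [[1, -alpha, alpha ** 2], [0, 1, -alpha], [0, 0, 1]]
Lam = [[tp * 1j, 0, 0], [0, -tp * 1j, 0], [0, 0, 0]]
X = matmul(matmul(U, Lam), U_inv)    # non-primary logarithm of I_3

errors = [dist_to_identity(expm(M)) for M in (C, D, X)]
```

All three entries of `errors` are at rounding level, consistent with $e^C = e^D = e^X = I_3$ even though none of the three matrices is zero.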

  21. Two Facts on Commuting Matrices. Theorem: If $A, B \in \mathbb{C}^{n \times n}$ commute then there exists a unitary $U \in \mathbb{C}^{n \times n}$ such that $U^* A U$ and $U^* B U$ are both upper triangular. Theorem: For $A, B \in \mathbb{C}^{n \times n}$, $e^{(A+B)t} = e^{At} e^{Bt}$ for all $t$ if and only if $AB = BA$.
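The second theorem is easy to see fail for a non-commuting pair. The sketch below uses two nilpotent $2 \times 2$ matrices (an illustrative choice, not from the slides) at $t = 1$ and compares $e^{A} e^{B}$ with $e^{A+B}$ using a truncated Taylor series.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(A, terms=30):
    """Truncated Taylor series for exp(A); fine for these tiny matrices."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]          # AB != BA

product = matmul(expm_taylor(A), expm_taylor(B))          # e^A e^B
joint = expm_taylor([[A[i][j] + B[i][j] for j in range(2)]
                     for i in range(2)])                   # e^(A+B)
gap = max(abs(product[i][j] - joint[i][j])
          for i in range(2) for j in range(2))
```

Here $e^{A} e^{B}$ has entries 2, 1, 1, 1, while $e^{A+B}$ has $\cosh 1$ on the diagonal and $\sinh 1$ off it, so `gap` is far from zero.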

  23. When Does $\log(BC) = \log(B) + \log(C)$? Theorem: Let $B, C \in \mathbb{C}^{n \times n}$ commute and have no eigenvalues on $\mathbb{R}^-$. If for every eigenvalue $\lambda_j$ of $B$ and the corresponding eigenvalue $\mu_j$ of $C$, $|\arg \lambda_j + \arg \mu_j| < \pi$, then $\log(BC) = \log(B) + \log(C)$. Proof: $\log(B)$ and $\log(C)$ commute, since $B$ and $C$ do. Therefore $e^{\log(B) + \log(C)} = e^{\log(B)} e^{\log(C)} = BC$. Thus $\log(B) + \log(C)$ is some logarithm of $BC$. Since $\log(B)$ and $\log(C)$ commute, they can be simultaneously triangularized, so the eigenvalues of $\log(B) + \log(C)$ are $\log \lambda_j + \log \mu_j$. Then $\operatorname{Im}(\log \lambda_j + \log \mu_j) = \arg \lambda_j + \arg \mu_j \in (-\pi, \pi)$, so $\log(B) + \log(C)$ is the principal logarithm of $BC$.
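The role of the argument condition already shows up for $1 \times 1$ matrices, which always commute. The sketch below compares $\log(\lambda\mu)$ with $\log\lambda + \log\mu$ for two pairs of unit-modulus scalars (illustrative values), one satisfying $|\arg\lambda + \arg\mu| < \pi$ and one violating it.

```python
import cmath
import math

def log_sum_gap(lam, mu):
    """log(lam * mu) - (log(lam) + log(mu)), with principal logarithms."""
    return cmath.log(lam * mu) - (cmath.log(lam) + cmath.log(mu))

# arg sum = 0.7*pi < pi: the identity holds
ok = log_sum_gap(cmath.exp(0.3j * math.pi), cmath.exp(0.4j * math.pi))

# arg sum = 1.25*pi > pi: the sum is a logarithm of the product,
# but not the principal one; it is off by 2*pi*i
bad = log_sum_gap(cmath.exp(0.75j * math.pi), cmath.exp(0.5j * math.pi))
```

In the second case $\log\lambda + \log\mu$ has imaginary part $1.25\pi$, outside $(-\pi, \pi)$, so it differs from the principal logarithm of the product by $2\pi i$.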

  26. Henry Briggs (1561–1630). Arithmetica Logarithmica (1624): logarithms to base 10 of 1–20,000 and 90,000–100,000 to 14 decimal places. "Briggs must be viewed as one of the great figures in numerical analysis." (Herman H. Goldstine, A History of Numerical Analysis, 1977)
