matrix computations and the secular equation
play

Matrix Computations and the Secular Equation Gene H. Golub - PowerPoint PPT Presentation

Matrix Computations and the Secular Equation Gene H. Golub Stanford University Introduction 1 / 35 What is the secular equation? The term secular (continuing through long ages OED2) recalls that one of the origins of spectral


  1. Matrix Computations and the Secular Equation Gene H. Golub Stanford University Introduction 1 / 35

  2. What is the secular equation? “The term secular (‘continuing through long ages’ OED2) recalls that one of the origins of spectral theory was in the problem of the long-run behavior of the solar system investigated by Laplace and Lagrange. [...] The 1829 paper in which Cauchy established that the roots of a symmetric determinant are real has the title, ‘Sur l’´ equation ` a l’aide de laquelle on d´ etermine les in´ egalit´ es s´ eculaires des mouvements des plan´ etes’; this signified only that Cauchy recognized that his problem, of choosing x to maximize x T Ax subject to x T x = 1 (to use modern notation), led to an equation like that studied in celestial mechanics. Sylvester’s title ‘On the Equation to the Secular Inequalities in the Planetary Theory’ [...] was even more misleading as to content. In this tradition the ‘S¨ akul¨ argleichung’ of Courant and Hilbert’s Methoden der Mathematischen Physik (1924) and the ‘secular equation’ of E. T. Browne’s ‘On the Separation Property of the Roots of the Secular Equation’ American Journal of Mathematics, 52, (1930), 843-850 refer to the characteristic equation of a symmetric matrix.” From http://members.aol.com/jeff570/e.html Introduction 1 / 35

  3. Outline 1 Introduction 2 Applications 3 Approximations 4 An example 5 Numerical Comparison 6 Conclusion Introduction 2 / 35

  4. Applications Applications 3 / 35

  5. Constrained Eigenvalue Problem A = A T x T A x max x � = 0 x T x = 1 s.t. c T x = 0 φ ( x ; λ, µ ) = x T A x − λ ( x T x − 1) + 2 µ x T c grad φ = 0 = ⇒ A x − λ x + µ c = 0 x = − µ ( A − λI ) − 1 c ⇒ c T ( A − λI ) − 1 c = 0 c T x = 0 = Constrained Eigenvalue Secular Equation A = Q Λ Q T , d = Q T c n d 2 � i ( λ i − λ ) = 0 i =1 Applications 4 / 35

  6. Rank One Change A x = λ x ( A + cc T ) y = µ y Rank One Change Secular Equation 1 + c T ( A − µI ) − 1 c = 0 Rank k -change ( A + CC T ) y = µ y I + C T ( A − µI ) − 1 C � � det = 0 Applications 5 / 35

  7. Another secular equation Consider � A � � x � � x � b = λ . b T c y y Then ( A − λI ) x = − y b and hence, ( c − λ − b ( A T − λI ) − 1 b ) y = 0 . Hence we must solve another secular equation when the matrix is expanded. Applications 6 / 35

  8. Quadratic Constraint A = A T , positive definite x T A x − 2 c T x min x x T x = α 2 s.t. φ ( x ; λ ) = x T A x − 2 c T x − λ ( x T x − α 2 ) grad φ = 0 = ⇒ ( A − λI ) x − c = 0 Quadratic Constraint Secular Equation c T ( A − λI ) − 2 c = α 2 Least Squares with a Quadratic Constraint b T A ( A T A − λI ) − 2 A T b = α 2 Applications 7 / 35

  9. Total Least Squares (TLS) ( A + E ) x = b + r , A : m × n � � x � � x � � � � A b + E r = 0 − 1 − 1 ( C + F ) z = 0 ; C : m × n + 1 Determine F and z so that rank ( C + F ) ≤ n and || F || F = min. Equivalently, find || C z || 2 min || z || 2 ≡ σ min ( C ) z Applications 8 / 35

  10. Total Least Squares (cont.) C T C z = ˆ σ 2 z Total Least Squares Secular Equation b T A ( A T A − ˆ σ 2 = 0 σ 2 I ) − 1 A T b − b T b − ˆ ˆ σ < σ min ( A ) x TLS = ( A T A − ˆ σ 2 I ) − 1 A T b Data Least Squares (DLS) ( A + E ) x = b b T A ( A T A − ˆ τ 2 I ) − 1 A T b − b T b = 0 ˆ τ < σ min ( A ) Applications 9 / 35

  11. Regularized total least squares (Fischer/G.) Note, that the TLS solution is equivalent to min � b − A x � 2 = min � C z � 2 2 2 = σ min ( C ) , 1 + � x � 2 � z � 2 2 2 where C = ( A, b ) and z n +1 = − 1 . For the regularized TLS we consider min � b − A x � 2 1 + x T V x , subject to x T V x = α 2 , 2 where V is a given symmetric positive definite matrix. Now, let � V � 0 = F T F W = 0 1 and observe that min � b − A x � 2 1 + x T V x = min � C z � 2 2 2 z T W z with � z � 2 2 = 1 + α 2 , z n +1 = − 1. Applications 10 / 35

  12. Least squares with linear and quadratic constraints With y = F z , B = F − T C T CF − 1 , c = e T n +1 F − 1 , γ 2 = 1 + α 2 , and β = − 1 we may rewrite our regularized TLS problem in terms of a least squares problem with linear and quadratic constraints min y T B y 2 = γ 2 , c T y = β. s. t. � y � 2 y T y , where γ and β are non-zero. Lagrange multipliers ψ ( y ; λ, µ ) = y T B y − λ ( y T y − γ 2 ) − 2 µ ( c T y − β ) . grad ψ = 0 when B y − λ y − µ c = 0 . Applications 11 / 35

  13. Introducing the projection matrix P = I − cc T c T c and d = β c c T c we arrive at ( PB − λI ) y = − λ d y T y γ 2 , = which leads to the secular equation λ 2 d T ( PB − λI ) − T ( PB − λI ) − 1 d = γ 2 . Instead, consider � ( PB − λI )( PB − λI ) T � � u � λ d = 0 . λ d T γ 2 ξ Note, ( PB − λI )( PB − λI ) T − λ 2 � � γ 2 dd T u = 0 . Thus, λ can be found as an eigenvalue of a quadratic eigenvalue problem with ˆ y = u /ξ . Applications 12 / 35

  14. Approximating the secular equation Approximations 13 / 35

  15. How do we approximate the secular equation for large n ? The problems we have described are closely associated with estimating a quadratic form u T F ( A ) u where u is a given vector and A is a symmetric matrix. Approximations 14 / 35

  16. Matrix Function to Integral A = Q Λ Q T u T F ( A ) u = u T F ( Q Λ Q T ) u = u T QF (Λ) Q T u = w T F (Λ) w w = Q T u � b n u T F ( A ) u = � F ( λ i ) w 2 i = F ( λ ) dw ( λ ) a i =1 Approximations 15 / 35

  17. Gauss-Radau Quadrature Rules � b L ≤ F ( λ ) dw ( λ ) ≤ U a � λ r dw ( λ ) µ r = ( r = 0 , 1 , . . . , 2 k + m − 1) � b F ( λ ) dw ( λ ) = I [ F ] + R [ F ] a k m � � I [ F ] = A i F ( t i ) + B j F ( z j ) i =1 j =1 { A i , t i } k unknown weights and nodes i =1 { z j } m prescribed nodes j =1 { B j } m calculated weights j =1 Approximations 16 / 35

  18. Gauss-Radau Quadrature Rules (cont.) I ( λ r ) = µ r k m � � A i t r B j z r µ r = i + j i =1 j =1 System of non-linear equations. � k � 2 � b m R [ F ] = F (2 k + m ) ( η ) � � ( λ − z j ) ( λ − t i ) dw ( λ ) (2 k + m )! a j =1 i =1 a < η < b m = 1 F (2 k +1) ( η ) ≤ 0 and z 1 = a R [ F ] ≤ 0 I [ F ] = U F (2 k +1) ( η ) ≤ 0 and z 1 = b R [ F ] ≥ 0 I [ F ] = L Approximations 17 / 35

  19. Gauss Quadrature � p r ( λ ) p s ( λ ) dα ( λ ) = 0 , r � = s, ( r, s = 0 , 1 , . . . , k ) p j +1 ( λ ) = ( λ − ξ j +1 ) p j ( λ ) − η 2 j p j − 1 ( λ ) p k ( t i ) = 0 , i = 1 , 2 , . . . , k   ξ 1 η 1 η 1 ξ 2 η 2     ... ...   J k = η 2     ... ...   η k − 1   η k − 1 ξ k µ 0 = 1 J k v j = t j v j , j = 1 , 2 , . . . , k A j = v 2 1 j , j = 1 , 2 , . . . , k Approximations 18 / 35

  20. Gauss-Radau (Inverse Eigenvalue Problem)   0 . .   J k . ¯ J k +1 =     η k   ¯ 0 · · · η k ξ k +1 0 = p k +1 ( t 0 ) = ( t 0 − ¯ ξ k +1 ) p k ( t 0 ) − η 2 k p k − 1 ( t 0 ) p k − 1 ( t 0 ) ¯ ξ k +1 = t 0 − η 2 k p k ( t 0 ) or ( J k − t 0 I ) δ = η 2 k e k ¯ ξ k +1 = t 0 + δ k Approximations 19 / 35

  21. Evaluate I [ F ] k � v 2 I [ F ] = 1 i F ( t i ) i =0 J k +1 = V TV T ¯ V T e 1 = � first component of V � 1 V F ( T ) V T e 1 I [ F ] = e T 1 F ( V TV T ) e 1 = e T 1 F ( ¯ = e T J k +1 ) e 1 Approximations 20 / 35

  22. Orthonormal polynomials w.r.t the measure w ( λ ) How do we build these polynomials? p j +1 ( λ ) = ( λ − ξ j +1 ) p j ( λ ) − η 2 j p j − 1 ( λ ) p j +1 ( A ) = ( A − ξ j +1 I ) p j ( A ) − η 2 j p j − 1 ( A ) p j +1 ( A ) u = ( A − ξ j +1 I ) p j ( A ) u − η 2 j p j − 1 ( A ) u Set w j = p j ( A ) u . We define ξ j +1 and η 2 j so that w T j +1 w j = 0 w T j +1 w j − 1 = 0 , and then w T j +1 w r = 0 for r < j − 1 ξ j +1 = ( w j , A w j ) ( w j , w j ) and η 2 j = ( w j , w j ) ( w j − 1 , w j − 1 ) Approximations 21 / 35

  23. Orthonormal polynomials w.r.t the measure w ( λ ) w T j +1 w r = 0 for r < j − 1 ξ j +1 = ( w j , A w j ) ( w j , w j ) and η 2 j = ( w j , w j ) ( w j − 1 , w j − 1 ) The Lanczos Process! To construct J k , begin the Lanczos process with u , then ( w j , w k ) = 0 = ( p j ( A ) u , p k ( A ) u ) = u T Qp j (Λ) Q T Qp k (Λ) Q T u = w T p j (Λ) p k (Λ) w � = p j ( λ ) p k ( λ ) dw ( λ ) Approximations 22 / 35

  24. Examples Approximations 23 / 35

  25. An example We need to solve b T ( A + µI ) − 2 b = α 2 Algorithm 1 Begin Lanczos process with u = b 2 Construct ¯ J k +1 1 ( ¯ 3 Solve e T J k +1 + µI ) − 2 e 1 = α 2 . An example 24 / 35

  26. Numerical Comparison Numerical Comparison 25 / 35

  27. Numerical Comparison with Total Least Squares � � min || E r || F E, r s.t. ( A + E ) x = b + r σ 2 ) = b T A ( A T A − ˆ σ 2 I ) − 1 A T b − b T b − ˆ σ 2 = 0 ψ (ˆ Algorithms Approximate b T A ( A T A − ˆ σ 2 I ) − 1 A T b using moment theory and Lanczos on A T A Approximate b T A ( A T A − ˆ σ 2 I ) − 1 A T b using moment theory and Lanczos bidiagonalization on A Solve a set of non-linear equations derived from the normal equations. (Bj¨ orck’s algorithm) Numerical Comparison 26 / 35

Recommend


More recommend