Lesson 21: COMPUTATION OF EIGENVALUES
• We consider the problem of computing eigenvalues
• To make life (much!) simpler, let's assume we have a symmetric matrix $A \in \mathbb{R}^{n \times n}$, which has the decomposition $A = Q \Lambda Q^\top$
• Unlike previous algorithms, which were deterministic, eigenvalue algorithms must be iterative
• In this lecture, we overview classical iterative algorithms for calculating eigenvalues
    – First we look at algorithms that calculate a single eigenvalue, then we introduce the QR algorithm for calculating all eigenvalues
BAD IDEA: SOLVE CHARACTERISTIC EQUATION
• Probably in second(?) year you were asked to calculate eigenvalues of small matrices by solving the characteristic equation: $\det(A - \lambda I) = 0$
• However, there is no exact formula for the roots of polynomials of degree five or greater: we cannot solve this equation exactly
• What about numerically?
    – Newton iteration requires good initial guesses for convergence, and it is difficult to find all roots
    – Indeed, recall we reduced the polynomial root finding problem to calculating eigenvalues of a companion matrix
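As a reminder of that reverse reduction, here is a minimal sketch (assuming NumPy; the function name and test polynomial are illustrative choices) that builds the companion matrix of a monic polynomial and recovers its roots as eigenvalues:

```python
import numpy as np

def companion_roots(c):
    """Roots of the monic polynomial x^n + c[n-1] x^(n-1) + ... + c[0],
    computed as the eigenvalues of its companion matrix."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)   # subdiagonal of ones
    C[:, -1] = -np.asarray(c)    # last column holds the (negated) coefficients
    return np.linalg.eigvals(C)

# x^2 - 3x + 2 = (x - 1)(x - 2): roots 1 and 2
print(companion_roots([2.0, -3.0]))
```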
ONE EIGENVALUE ALGORITHM: POWER ITERATION
• We introduce the Rayleigh quotient
$$r(x) = \frac{x^\top A x}{x^\top x}$$
• Note that for a normalized eigenvector $q_k = Q e_k$, we have $r(q_k) = q_k^\top A q_k = \lambda_k$
    – Thus, if we can find an eigenvector, we get the eigenvalue as well
• For general $x$, $r(x)$ is the constant that "acts closest" to the eigenvalue, i.e., it is the $\lambda$ that minimizes $\|A x - \lambda x\|$ (Why?)
• Expanding $x = \sum_{k=1}^n x_k q_k$ we get
$$r(x) = \frac{\sum_{k=1}^n \lambda_k x_k \, x^\top q_k}{\|x\|^2} = \frac{\sum_{k=1}^n \lambda_k x_k^2}{\|x\|^2}$$
• Thus if we are near an eigenvector, $x = q_j + h$ with $\|h\| = \varepsilon$ and $h \perp q_j$, we have quadratically small error:
$$r(x) = \lambda_j \, \frac{1 + \sum_{k \neq j} h_k^2 \lambda_k / \lambda_j}{\|x\|^2} = \lambda_j + O(\varepsilon^2)$$
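A minimal numerical check of this quadratic accuracy (assuming NumPy; the test matrix and perturbation sizes are arbitrary choices):

```python
import numpy as np

def rayleigh_quotient(A, x):
    """r(x) = x^T A x / x^T x."""
    return (x @ A @ x) / (x @ x)

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                      # symmetric test matrix
lam, Q = np.linalg.eigh(A)             # A = Q diag(lam) Q^T, ascending eigenvalues
q = Q[:, -1]                           # eigenvector of the largest eigenvalue

for eps in [1e-1, 1e-2, 1e-3]:
    h = rng.standard_normal(5)
    h -= (h @ q) * q                   # make h orthogonal to q
    h *= eps / np.linalg.norm(h)       # ||h|| = eps
    err = abs(rayleigh_quotient(A, q + h) - lam[-1])
    print(f"eps = {eps:.0e}  |r(x) - lambda| = {err:.2e}")  # scales like eps^2
```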
• Another simple approach is to use power iteration: for some initial guess for the eigenvector $v_0$, we continually apply $A$ and normalize:
$$v_k = \frac{A^k v_0}{\|A^k v_0\|} = \frac{A v_{k-1}}{\|A v_{k-1}\|}$$
• Then we hope that $v_k \approx q_1 = Q e_1$ for large $k$, so that $\lambda_1 \approx r(v_k) = v_k^\top A v_k$ (here $\lambda_1$ denotes the largest eigenvalue)
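A minimal sketch of power iteration (assuming NumPy; the stopping rule and iteration cap are arbitrary choices):

```python
import numpy as np

def power_iteration(A, v0, maxiter=1000, tol=1e-12):
    """Approximate the dominant eigenpair of a symmetric matrix A."""
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v                    # Rayleigh quotient estimate
    for _ in range(maxiter):
        v = A @ v                      # apply A ...
        v /= np.linalg.norm(v)         # ... and normalize
        lam_new = v @ A @ v
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, v
```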
CONVERGENCE
• Consider the Hankel operator
$$H = \begin{pmatrix} 1 & \frac{1}{4} & \frac{1}{9} & \frac{1}{16} & \cdots \\ \frac{1}{4} & \frac{1}{9} & \frac{1}{16} & \frac{1}{25} \\ \frac{1}{9} & \frac{1}{16} & \frac{1}{25} & \frac{1}{36} \\ \frac{1}{16} & \frac{1}{25} & \frac{1}{36} & \frac{1}{49} \\ \vdots & & & & \ddots \end{pmatrix}, \qquad H_{ij} = \frac{1}{(i+j-1)^2}$$
• We apply power iteration to the finite section $H_n = P_n H P_n$
[Plots: eigenvector and eigenvalue errors vs. iteration for $n = 20$ and $n = 40$; scatter of the eigenvalues of $H_n$]
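A sketch of this experiment, reusing the `power_iteration` function above (assuming NumPy; $H_n$ is built from the entry formula $H_{ij} = 1/(i+j-1)^2$ with 1-based indices):

```python
import numpy as np

n = 20
i, j = np.indices((n, n))              # 0-based index grids
Hn = 1.0 / (i + j + 1.0) ** 2          # H[i,j] = 1/(i+j-1)^2 in 1-based indexing

lam, v = power_iteration(Hn, np.ones(n))
print(lam, np.linalg.eigvalsh(Hn)[-1])  # compare with the true largest eigenvalue
```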
CONVERGENCE
• Consider a random symmetric Gaussian matrix
$$G_n = \frac{1}{\sqrt{n}} \begin{pmatrix} G_{11} & \cdots & G_{1n} \\ \vdots & \ddots & \vdots \\ G_{1n} & \cdots & G_{nn} \end{pmatrix}$$
where the $G_{ij}$ are random Gaussian entries
    – mean zero, and variance twice as large on the diagonal as on the off-diagonal
[Plots: eigenvector and eigenvalue errors vs. iteration for $n = 50$ and $n = 100$; scatter of the eigenvalues of $G_n$]
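A sketch of constructing such a matrix (assuming NumPy; this realizes the normalization described above, with off-diagonal variance 1 and diagonal variance 2 before the $1/\sqrt{n}$ scaling):

```python
import numpy as np

def gaussian_symmetric(n, rng):
    """Random symmetric matrix, diagonal variance twice the off-diagonal."""
    B = rng.standard_normal((n, n))
    G = (B + B.T) / np.sqrt(2)   # off-diagonal variance 1, diagonal variance 2
    return G / np.sqrt(n)

Gn = gaussian_symmetric(100, np.random.default_rng(1))
lam, v = power_iteration(Gn, np.ones(100))
```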
THEOREM: Power iteration converges provided $|\lambda_1| > |\lambda_2|$ and $q_1^\top v_0 \neq 0$, at the rate
$$\|v_k - (\pm q_1)\| = O\!\left(\left|\frac{\lambda_2}{\lambda_1}\right|^{k}\right) \qquad \text{and} \qquad |r(v_k) - \lambda_1| = O\!\left(\left|\frac{\lambda_2}{\lambda_1}\right|^{2k}\right)$$
• Note: for a random initial guess, the probability of $q_1^\top v_0 = 0$ is zero
• We can expand $v_0 = a_1 q_1 + \cdots + a_n q_n$ so that
$$v_k = c_k A^k v_0 = c_k\left(a_1 \lambda_1^k q_1 + a_2 \lambda_2^k q_2 + \cdots + a_n \lambda_n^k q_n\right) = c_k \lambda_1^k \left(a_1 q_1 + a_2 (\lambda_2/\lambda_1)^k q_2 + \cdots + a_n (\lambda_n/\lambda_1)^k q_n\right)$$
• Since $a_j (\lambda_j/\lambda_1)^k \to 0$ for $j \geq 2$, we must have (by normalization) that $|c_k \lambda_1^k a_1| \to 1$, which shows the first bound
• The second bound follows from the first, since near an eigenvector the Rayleigh quotient has quadratically small error
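A quick numerical check of these rates (assuming NumPy; a small symmetric matrix with known eigenvalues $1, 0.5, 0.25$, so $|\lambda_2/\lambda_1| = 0.5$):

```python
import numpy as np

lam = np.array([1.0, 0.5, 0.25])
Q, _ = np.linalg.qr(np.random.default_rng(2).standard_normal((3, 3)))
A = Q @ np.diag(lam) @ Q.T             # symmetric, eigenpairs (lam[i], Q[:, i])

v = np.ones(3) / np.sqrt(3)
for k in range(1, 11):
    v = A @ v
    v /= np.linalg.norm(v)
    q1 = np.sign(v @ Q[:, 0]) * Q[:, 0]      # fix the sign ambiguity of q_1
    print(k, np.linalg.norm(v - q1),         # shrinks like 0.5^k
          abs(v @ A @ v - lam[0]))           # shrinks like 0.25^k
```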
INVERSE ITERATION
• Suppose we have a good initial guess of an eigenvalue $\mu \approx \lambda_k$
• Then $A - \mu I$ has an eigenvalue that is very small
• Then $(A - \mu I)^{-1}$ has largest eigenvalue $(\lambda_k - \mu)^{-1}$, and power iteration will calculate this:
$$v_{k+1} = \frac{(A - \mu I)^{-1} v_k}{\|(A - \mu I)^{-1} v_k\|}$$
• The eigenvectors of $(A - \mu I)^{-1}$ are the same as those of $A$, so we still have $\lambda \approx v_k^\top A v_k$, where $\lambda$ is the eigenvalue closest to $\mu$
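A minimal sketch of inverse iteration (assuming NumPy; in practice one would factorize $A - \mu I$ once rather than calling `solve` at every step):

```python
import numpy as np

def inverse_iteration(A, mu, v0, maxiter=100):
    """Approximate the eigenpair of A whose eigenvalue is closest to mu."""
    M = A - mu * np.eye(A.shape[0])
    v = v0 / np.linalg.norm(v0)
    for _ in range(maxiter):
        v = np.linalg.solve(M, v)      # apply (A - mu I)^{-1} without inverting
        v /= np.linalg.norm(v)
    return v @ A @ v, v
```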
CONVERGENCE
• Consider again the random symmetric Gaussian matrix $G_n$ from before ($G_{ij}$ Gaussian with mean zero, diagonal variance twice the off-diagonal variance), with $n = 100$
• We take $\mu = 0$ and find the smallest eigenvalue
[Plots: eigenvector and eigenvalue errors vs. iteration; scatter of the eigenvalues of $G_n$]
RAYLEIGH QUOTIENT ITERATION
• Use the current eigenvalue guess as the shift in inverse iteration:
$$\lambda^{(k)} = v_{k-1}^\top A v_{k-1}, \qquad v_k = \frac{(A - \lambda^{(k)} I)^{-1} v_{k-1}}{\|(A - \lambda^{(k)} I)^{-1} v_{k-1}\|}$$
• Convergence is cubic in the limit:
$$\|v_{k+1} - (\pm q_j)\| = O\big(\|v_k - (\pm q_j)\|^3\big) \qquad \text{and} \qquad |\lambda^{(k+1)} - \lambda_j| = O\big(|\lambda^{(k)} - \lambda_j|^3\big)$$
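A minimal sketch of Rayleigh quotient iteration (assuming NumPy; note that $A - \lambda^{(k)} I$ becomes nearly singular as the iteration converges, which production code would handle more carefully):

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, maxiter=20):
    """Rayleigh quotient iteration for a symmetric matrix A."""
    n = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for _ in range(maxiter):
        try:
            v = np.linalg.solve(A - lam * np.eye(n), v)
        except np.linalg.LinAlgError:  # shift hit an eigenvalue exactly
            break
        v /= np.linalg.norm(v)
        lam = v @ A @ v                # updated shift: current Rayleigh quotient
    return lam, v
```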