1 ROOTS OF POLYNOMIALS AND QUADRATIC FORMS
Andrew Ranicki
General Audience Maths Edinburgh Seminar
School of Mathematics, University of Edinburgh
5th February, 2016
2 Introduction
◮ In 1829 Sturm proved a theorem calculating the number of real roots of a non-zero real polynomial P(X) ∈ R[X] in an interval [a, b] ⊂ R, using the Euclidean algorithm in R[X] and counting sign changes.
◮ In 1853 Sylvester interpreted Sturm's theorem using continued fractions and the signature of a tridiagonal quadratic form.
◮ The survey paper of Étienne Ghys and A.R., Signatures in algebra, topology and dynamics (http://arxiv.org/abs/1512.09258), includes a modern interpretation of the results of Sturm and Sylvester in terms of the "Witt group" of stable isomorphism classes of invertible symmetric matrices.
3 Jacques Charles François Sturm (1803-1855)
4 Sturm's problem
◮ Problem How many real roots of P(X) ∈ R[X] are there in an interval [a, b] ⊂ R? At the time, this was a major problem in analysis, algebra and numerical mathematics.
◮ Sturm's formula The Euclidean algorithm in R[X] for finding the greatest common divisor of P_0(X) = P(X) and P_1(X) = P'(X) gives the Sturm sequences of polynomials
(P_*(X), Q_*(X)) = ((P_0(X), ..., P_n(X)), (Q_1(X), ..., Q_n(X)))
with remainders P_j(X) and quotients Q_j(X), such that
deg(P_{j+1}(X)) < deg(P_j(X)) ≤ n − j (0 ≤ j ≤ n),
P_{j−1}(X) + P_{j+1}(X) = P_j(X) Q_j(X) (1 ≤ j ≤ n).
◮ Sturm's formula expressed the number of real roots of P(X) in [a, b] in terms of the variation (= number of sign changes) in P_*(a) and P_*(b), assuming regularity.
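As a rough illustration of the recursion on this slide (a sketch, not part of the original slides), one step of the Sturm sequence can be computed with NumPy's coefficient-list polynomials: P_2 = P_1 Q_1 − P_0 is minus the remainder when P_0 is divided by P_1. The cubic X^3 − 3X + 1 is an assumed toy example.

```python
# One step of the Sturm recursion for the toy cubic P(X) = X^3 - 3X + 1:
# P_0 = P, P_1 = P', and P_{j-1} + P_{j+1} = P_j Q_j, so P_2 = -(P_0 mod P_1).
import numpy as np

P0 = np.array([1.0, 0.0, -3.0, 1.0])   # X^3 - 3X + 1 (coefficients, highest degree first)
P1 = np.polyder(P0)                    # P' = 3X^2 - 3
Q1, rem = np.polydiv(P0, P1)           # P_0 = P_1*Q_1 + rem
P2 = -rem                              # P_2 = 2X - 1

# check the slide's identity P_0 + P_2 = P_1*Q_1
assert np.allclose(np.polyadd(P0, P2), np.polymul(P1, Q1))
```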
5 The Euclidean algorithm
◮ The Euclidean algorithm for the greatest common divisor of integers π_0 ≥ π_1 ≥ 1 is the pair of sequences
π_0 ≥ π_1 > · · · > π_n > π_{n+1} = 0, ρ_1, ρ_2, ..., ρ_n > 0
with
π_{j−1} = π_j ρ_j + π_{j+1} (1 ≤ j ≤ n),
ρ_j = ⌊π_{j−1}/π_j⌋ = quotient when dividing π_{j−1} by π_j, π_{j+1} = remainder,
π_n = g.c.d.(π_0, π_1).
◮ The sequences (π_0/π_1, π_1/π_2, ..., π_{n−1}/π_n) and (ρ_1, ρ_2, ..., ρ_n) determine each other by
π_{j−1}/π_j = ρ_j + 1/(ρ_{j+1} + 1/(ρ_{j+2} + ... + 1/ρ_n)),
ρ_j = (π_{j−1} − π_{j+1})/π_j.
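The integer Euclidean algorithm and the continued fraction above are easy to make concrete; here is a minimal sketch (toy input 71/31 chosen arbitrarily, not from the slides) recording the remainders π_j and quotients ρ_j and rebuilding π_0/π_1 from the ρ's.

```python
# Euclidean algorithm: remainders pi_j and quotients rho_j for pi_0 = 71, pi_1 = 31,
# plus a check that pi_0/pi_1 equals the continued fraction rho_1 + 1/(rho_2 + ... + 1/rho_n).
from fractions import Fraction

def euclid(pi0, pi1):
    pis, rhos = [pi0, pi1], []
    while pis[-1] != 0:
        rhos.append(pis[-2] // pis[-1])     # rho_j = floor(pi_{j-1} / pi_j)
        pis.append(pis[-2] % pis[-1])       # pi_{j+1} = remainder
    return pis, rhos                        # pis = (pi_0, ..., pi_n, 0)

def continued_fraction(rhos):
    value = Fraction(rhos[-1])
    for rho in reversed(rhos[:-1]):
        value = rho + 1 / value
    return value

pis, rhos = euclid(71, 31)
assert pis[-2] == 1                          # g.c.d.(71, 31)
assert continued_fraction(rhos) == Fraction(71, 31)
```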
6 Euclidean pairs
◮ Definition A sequence p_* = (p_0, p_1, ..., p_n) of p_j ∈ R is regular if p_j ≠ 0 ∈ R for 0 ≤ j ≤ n.
◮ Definition A Euclidean pair (p_*, q_*) consists of two regular sequences p_* = (p_0, p_1, ..., p_n), q_* = (q_1, q_2, ..., q_n) in R satisfying the identities
p_{j−1} + p_{j+1} = p_j q_j ∈ R (1 ≤ j ≤ n, p_{n+1} = 0).
◮ Example For integers π_0 ≥ π_1 ≥ 1 the Euclidean algorithm sequences (π_0, π_1, ..., π_n), (ρ_1, ρ_2, ..., ρ_n) determine a Euclidean pair (p_*, q_*) by
p_j = (−1)^{j(j−1)/2} π_j, q_j = (−1)^{j−1} ρ_j,
the alternating sign on q_j being forced by the identity p_{j−1} + p_{j+1} = p_j q_j (see the numerical check below).
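A quick numerical check of the example (using the toy input 71/31 again; not from the slides) confirms the Euclidean-pair identity for the signed remainders and signed quotients:

```python
# Signed remainders p_j = (-1)^{j(j-1)/2} pi_j and signed quotients q_j = (-1)^{j-1} rho_j
# from the Euclidean algorithm for (71, 31) satisfy p_{j-1} + p_{j+1} = p_j q_j.
pis, rhos = [71, 31], []
while pis[-1] != 0:
    rhos.append(pis[-2] // pis[-1])
    pis.append(pis[-2] % pis[-1])
n = len(rhos)                                                           # pis = (pi_0, ..., pi_n, 0)
p = [(-1) ** (j * (j - 1) // 2) * pis[j] for j in range(n + 1)] + [0]   # p_{n+1} = 0
q = [None] + [(-1) ** (j - 1) * rhos[j - 1] for j in range(1, n + 1)]   # 1-indexed quotients
assert all(p[j - 1] + p[j + 1] == p[j] * q[j] for j in range(1, n + 1))
```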
7 Variation and regularity
◮ Definition The variation of a regular sequence p_* = (p_0, p_1, ..., p_n) in R is
var(p_*) = number of changes of sign in p_*
= (n − Σ_{j=1}^{n} sign(p_{j−1}/p_j))/2 ∈ {0, 1, ..., n}.
◮ Definition A polynomial P(X) ∈ R[X] is regular if it has no repeated roots.
◮ Definition A point t ∈ R is regular for P(X) ∈ R[X] if P_*(t) = (P_0(t), P_1(t), ..., P_n(t)) is a regular sequence in R.
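The variation is straightforward to compute; this sketch (not from the slides) counts sign changes directly and checks the closed formula on this slide.

```python
# Variation of a regular sequence: number of sign changes, which should equal
# (n - sum_{j=1}^{n} sign(p_{j-1}/p_j)) / 2.
import numpy as np

def variation(p):
    p = np.asarray(p, dtype=float)
    assert np.all(p != 0), "the sequence must be regular (no zero terms)"
    signs = np.sign(p)
    changes = int(np.sum(signs[:-1] != signs[1:]))
    n = len(p) - 1
    assert changes == (n - int(np.sum(signs[:-1] * signs[1:]))) // 2    # closed formula
    return changes

assert variation([1.0, -2.0, -3.0, 4.0]) == 2
```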
8 Sturm's Theorem (1829)
◮ Theorem The number of real roots of a regular P(X) ∈ R[X] in [a, b] ⊂ R for regular a < b is
|{x ∈ [a, b] | P(x) = 0 ∈ R}| = var(P_*(a)) − var(P_*(b)).
◮ Idea of proof Let a = t_0 < t_1 < t_2 < · · · < t_{N−1} < t_N = b be the partition of [a, b] at the points t_1 < t_2 < · · · < t_{N−1} which are not regular. For each i ∈ {1, 2, ..., N−1} there is a unique j_i ∈ {0, 1, ..., n−1} such that
P_{j_i}(t_i) = 0, P_k(t_i) ≠ 0 for k ≠ j_i.
The function [a, b] → {0, 1, ..., n}; t ↦ var(P_*(a)) − var(P_*(t)) is constant for t ∈ (t_i, t_{i+1}). The jump is 1 at t_i with j_i = 0, i.e. at the real roots of P(X). The jump is 0 at t_i with j_i ≥ 1, since
P_{j_i−1}(t_i) + P_{j_i+1}(t_i) = P_{j_i}(t_i) Q_{j_i}(t_i) = 0
with the first two terms ≠ 0.
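Putting the Sturm sequence and the variation together gives a naive floating-point root counter. This is only a sketch under the slide's regularity assumptions (no repeated roots, regular endpoints); the cubic and interval are assumed toy data, not from the slides.

```python
# Sturm's formula for P(X) = X^3 - 3X + 1 on [-2, 2]: the polynomial has three
# real roots, all inside the interval, so var(P_*(-2)) - var(P_*(2)) should be 3.
import numpy as np

def sturm_sequence(coeffs):
    """P_0 = P, P_1 = P', P_{j+1} = -(P_{j-1} mod P_j), until the remainder vanishes."""
    P0 = np.asarray(coeffs, dtype=float)
    P = [P0, np.polyder(P0)]
    while True:
        rem = -np.trim_zeros(np.polydiv(P[-2], P[-1])[1], 'f')
        if rem.size == 0:
            break
        P.append(rem)
    return P

def variation(values):
    signs = np.sign(values)
    return int(np.sum(signs[:-1] != signs[1:]))

def count_real_roots(coeffs, a, b):
    seq = sturm_sequence(coeffs)
    return variation([np.polyval(p, a) for p in seq]) - variation([np.polyval(p, b) for p in seq])

assert count_real_roots([1, 0, -3, 1], -2.0, 2.0) == 3
```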
9 James Joseph Sylvester (1814-1897)
10 Sylvester's papers related to Sturm's theorem
◮ On the relation of Sturm's auxiliary functions to the roots of an algebraic equation. (1841)
◮ A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares. (1852)
◮ On a remarkable modification of Sturm's Theorem (1853)
◮ On a theory of the syzygetic relations of two rational integral functions, comprising an application to the theory of Sturm's functions, and that of the greatest algebraical common measure. (1853)
◮ Sylvester used continued fractions to express Sturm's formula in terms of the signatures of tridiagonal symmetric forms. In fact, the signature was developed for just this purpose!
11 Cauchy's Spectral Theorem (1829)
◮ Definition The transpose of an n × n matrix A = (a_ij) is A* = (a_ji).
◮ Definition The symmetric n × n matrices S, T in R are orthogonally congruent if T = A*SA for an n × n matrix A which is orthogonal, A*A = I_n.
◮ Spectral Theorem (i) The eigenvalues of symmetric S are real. (ii) Symmetric S, T are orthogonally congruent if and only if they have the same eigenvalues.
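A small numerical illustration (not from the slides, using numpy.linalg.eigh on an arbitrary symmetric matrix): the eigenvalues come out real and the eigenvector matrix gives an orthogonal congruence to a diagonal matrix.

```python
# Spectral theorem in action: S = A diag(lambda) A^* with A orthogonal.
import numpy as np

S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
eigenvalues, A = np.linalg.eigh(S)              # real eigenvalues, orthonormal eigenvectors
assert np.allclose(A.T @ A, np.eye(3))          # A is orthogonal: A^* A = I_3
assert np.allclose(A.T @ S @ A, np.diag(eigenvalues))
```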
12 Sylvester's Law of Inertia
◮ Definition Let S be a symmetric n × n matrix in R.
(i) The positive index τ_+(S) ≥ 0 of S is the dimension of a maximal subspace V_+ ⊆ R^n such that S(x, x) > 0 for all x ∈ V_+ \ {0}.
(ii) The negative index τ_−(S) ≥ 0 of S is the dimension of a maximal subspace V_− ⊆ R^n such that S(x, x) < 0 for all x ∈ V_− \ {0}.
◮ Definition Symmetric n × n matrices S, T are linearly congruent if T = A*SA for an invertible n × n matrix A.
◮ Law of Inertia (1852) S, T are linearly congruent if and only if (τ_+(S), τ_−(S)) = (τ_+(T), τ_−(T)).
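A numerical sanity check of the Law of Inertia (toy matrices, not from the slides), using the standard fact that the indices τ_+(S), τ_−(S) can be read off the signs of the eigenvalues:

```python
# The inertia (tau_+, tau_-) of S and of A^* S A agree for an invertible,
# non-orthogonal A, as the Law of Inertia predicts.
import numpy as np

def inertia(S):
    eigenvalues = np.linalg.eigvalsh(S)
    return int(np.sum(eigenvalues > 0)), int(np.sum(eigenvalues < 0))

S = np.diag([2.0, -1.0, -3.0, 5.0])
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0, 0.0],
              [0.0, 0.0, 1.0, 2.0],
              [1.0, 0.0, 0.0, 1.0]])
assert abs(np.linalg.det(A)) > 1e-9             # A is invertible (but not orthogonal)
assert inertia(A.T @ S @ A) == inertia(S) == (2, 2)
```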
13 The signature
◮ Definition The signature of a symmetric n × n matrix S in R is
τ(S) = τ_+(S) − τ_−(S) ∈ {−n, −n+1, ..., n−1, n}.
◮ The following conditions on S are equivalent:
◮ S is invertible,
◮ τ_+(S) + τ_−(S) = n,
◮ the eigenvalues constitute a regular sequence λ_* = (λ_1, λ_2, ..., λ_n), i.e. each λ_j ≠ 0.
◮ Proposition For invertible S
τ(S) = Σ_{j=1}^{n} sign(λ_j) = n − 2 var(μ_*)
with μ_j = λ_1 λ_2 ... λ_j (1 ≤ j ≤ n) and μ_0 = 1.
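The proposition can be checked numerically; this sketch (an arbitrary invertible symmetric matrix, not from the slides) compares the eigenvalue-sign sum with n − 2 var(μ_*) for the partial products of the eigenvalues.

```python
# tau(S) = sum of sign(lambda_j) = n - 2 var(mu_*), where mu_0 = 1 and
# mu_j = lambda_1 ... lambda_j.
import numpy as np

S = np.array([[0.0, 2.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, -3.0]])
lam = np.linalg.eigvalsh(S)                     # regular sequence: S is invertible
tau = int(np.sum(np.sign(lam)))

mu = np.concatenate(([1.0], np.cumprod(lam)))   # partial products mu_0, ..., mu_n
var = int(np.sum(np.sign(mu[:-1]) != np.sign(mu[1:])))
assert tau == len(lam) - 2 * var
```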
14 The principal minors and the Sylvester-Jacobi-Gundelfinger-Frobenius Theorem
◮ Definition The principal minors of an n × n matrix S = (s_ij)_{1 ≤ i,j ≤ n} in R are the determinants of the principal submatrices S(k) = (s_ij)_{1 ≤ i,j ≤ k}
μ_k(S) = det(S(k)) ∈ R (1 ≤ k ≤ n).
For k = 0 set μ_0(S) = 1.
◮ Theorem (Sylvester (1853), Jacobi (1857), Gundelfinger (1881), Frobenius (1895)) The signature of a symmetric n × n matrix S in R with the principal minors μ_k = μ_k(S) constituting a regular sequence μ_* = (μ_0, μ_1, ..., μ_n) is
τ(S) = Σ_{k=1}^{n} sign(μ_k/μ_{k−1}) = n − 2 var(μ_*).
◮ There is a proof in the survey, using "plumbing" of matrices.
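Here is a sketch (toy matrix, not from the slides) of the theorem: the signature computed from the signs of successive principal-minor ratios agrees with the signature computed from the eigenvalue signs.

```python
# Sylvester-Jacobi-Gundelfinger-Frobenius: tau(S) = sum_k sign(mu_k / mu_{k-1})
# when the principal minors mu_0 = 1, mu_1, ..., mu_n are all non-zero.
import numpy as np

def signature_from_minors(S):
    n = S.shape[0]
    mu = [1.0] + [np.linalg.det(S[:k, :k]) for k in range(1, n + 1)]
    assert all(m != 0 for m in mu), "the principal minors must form a regular sequence"
    return int(sum(np.sign(mu[k] / mu[k - 1]) for k in range(1, n + 1)))

S = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 1.0],
              [0.0, 1.0, -2.0]])
tau = int(np.sum(np.sign(np.linalg.eigvalsh(S))))       # signature via eigenvalues
assert signature_from_minors(S) == tau == -1
```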
15 The tridiagonal symmetric matrix
◮ Definition The tridiagonal symmetric matrix of a sequence q_* = (q_1, q_2, ..., q_n) in R is
Tri(q_*) =
[ q_1  1    0    ...  0   ]
[ 1    q_2  1    ...  0   ]
[ 0    1    q_3  ...  0   ]
[ ...  ...  ...  ...  ... ]
[ 0    0    0    ...  q_n ]
◮ Sylvester observed that every continued fraction is the ratio of successive principal minors μ_k = μ_k(Tri(q_*)):
μ_k/μ_{k−1} = q_k − 1/(q_{k−1} − 1/(... − 1/q_1))
and τ(Tri(q_*)) = Σ_{k=1}^{n} sign(μ_k/μ_{k−1}) = n − 2 var(μ_*).
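Sylvester's observation is easy to test numerically; this sketch (arbitrary values q = (2, −3, 1.5, 4), not from the slides) compares the principal-minor ratios of Tri(q_*) with the continued fractions on this slide.

```python
# For Tri(q), mu_k / mu_{k-1} = q_k - 1/(q_{k-1} - 1/( ... - 1/q_1)).
import numpy as np

def tri(q):
    n = len(q)
    return np.diag(np.asarray(q, dtype=float)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

def minor_ratio(q, k):                          # mu_k / mu_{k-1} of Tri(q)
    T = tri(q)
    den = np.linalg.det(T[:k - 1, :k - 1]) if k > 1 else 1.0
    return np.linalg.det(T[:k, :k]) / den

def continued_fraction(q, k):                   # q_k - 1/(q_{k-1} - ... - 1/q_1)
    value = q[0]
    for qj in q[1:k]:
        value = qj - 1.0 / value
    return value

q = [2.0, -3.0, 1.5, 4.0]
assert all(np.isclose(minor_ratio(q, k), continued_fraction(q, k)) for k in range(1, len(q) + 1))
```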
16 Sylvester's mathematical inspiration
◮ For a Euclidean pair (p_*, q_*) the regular sequences (p_0/p_1, p_1/p_2, ..., p_{n−1}/p_n) and q_* determine each other by
p_{j−1}/p_j = q_j − 1/(q_{j+1} − 1/(q_{j+2} − ... − 1/q_n)),
q_j = (p_{j−1} + p_{j+1})/p_j.
◮ For his modification of Sturm's theorem Sylvester needed an expression for τ(Tri(q_*)) in terms of p_*. He could not obtain it directly, so he reversed q_* = (q_1, q_2, ..., q_n) to define q'_* = (q_n, q_{n−1}, ..., q_1) with
p_{j−1}/p_j = μ_{n−j+1}(Tri(q'_*))/μ_{n−j}(Tri(q'_*)) and τ(Tri(q'_*)) = n − 2 var(p_*).
He then observed that Tri(q_*), Tri(q'_*) are linearly congruent, so that τ(Tri(q_*)) = τ(Tri(q'_*)) = n − 2 var(p_*).
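A final numerical sanity check (using the toy Euclidean pair built from (71, 31) earlier; not from the slides) of Sylvester's trick: the reversed quotients reproduce the ratios p_{j−1}/p_j as minor ratios of Tri(q'_*), and both tridiagonal matrices have signature n − 2 var(p_*).

```python
# For p = (71, 31, -9, -4, 1), q = (2, -3, 2, -4) and q' = reversed q:
# p_{j-1}/p_j = mu_{n-j+1}(Tri(q'))/mu_{n-j}(Tri(q')) and
# tau(Tri(q)) = tau(Tri(q')) = n - 2 var(p).
import numpy as np

def tri(q):
    n = len(q)
    return np.diag(np.asarray(q, dtype=float)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

def tau(S):                                     # signature via eigenvalue signs
    return int(np.sum(np.sign(np.linalg.eigvalsh(S))))

def var(p):                                     # number of sign changes
    s = np.sign(p)
    return int(np.sum(s[:-1] != s[1:]))

p = [71, 31, -9, -4, 1]                         # signed remainders for (71, 31)
q = [2, -3, 2, -4]                              # signed quotients
n = len(q)
q_rev = q[::-1]

mu = [1.0] + [np.linalg.det(tri(q_rev)[:k, :k]) for k in range(1, n + 1)]
assert all(np.isclose(p[j - 1] / p[j], mu[n - j + 1] / mu[n - j]) for j in range(1, n + 1))
assert tau(tri(q)) == tau(tri(q_rev)) == n - 2 * var(p)
```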