Numerical linear algebra and some problems in computational statistics

Zdeněk Strakoš
Academy of Sciences and Charles University, Prague
http://www.cs.cas.cz/~strakos

IASC2008, Yokohama, Japan, December 2008
This lecture is devoted to the memory of Tomáš Havránek.
Motivation – Computational mathematics

E. Study (1862-1930; Leipzig, Marburg, Göttingen, Greifswald, Bonn, successor of Lipschitz):

Mathematics is neither the art of calculation nor the art of avoiding calculations. Mathematics, however, entails the art of avoiding superfluous calculations and conducting the necessary ones skilfully. In this respect one could have learned from the older authors.
Motivation – Computational mathematics

B.J.C. Baxter, A. Iserles, On the foundations of computational mathematics, in Handbook of Numerical Analysis XI (P.G. Ciarlet and F. Cucker, eds), North-Holland, Amsterdam (2003), 3-34:

The purpose of computation is not to produce a solution with least error but to produce reliably, robustly and affordably a solution which is within a user-specified tolerance.

It should be emphasized that there is really no boundary between computational mathematics and statistics; see Section 2.5 of the paper above.
Examples

● Singular value decomposition, numerically stable algorithms for computing orthogonal decompositions and projections, various direct and iterative methods and algorithms applicable to problems in computational statistics.
● Linear regression and ordinary least squares, collinearity (Stewart, Marquardt, Belsley, Thisted, Hadi and Velleman (1987)).
● Errors-in-variables modeling, orthogonal regression and total least squares.
● Stochastic partial differential equations and their numerical solution.
● Statistical tools in solving ill-posed problems, information retrieval and data mining, signal processing.
● . . .
Outline

1. Common roots: Moments
2. Linear approximation problems
3. Singular value decomposition and model reduction
4. Matching moments model reduction and Krylov subspace methods
5. Bidiagonalization and linear regression
6. Bidiagonalization and orthogonal regression
7. Back to the roots (matching moments and model reduction)
8. Conclusions
Chebyshev (1855), Heine (1861), Markov (1884)

Let p(λ) be a nonnegative function in (−∞, ∞). Given

\[ \int_{-\infty}^{\infty} p(\lambda)\, \lambda^k \, d\lambda \;=\; \int_{-\infty}^{\infty} e^{-\lambda^2} \lambda^k \, d\lambda , \qquad k = 0, 1, \ldots , \]

can we conclude that p(λ) = e^{−λ²}, or, as we say now, that the distribution characterized by the function

\[ \int_{-\infty}^{x} p(\lambda)\, d\lambda \]

is a normal one? See Shohat and Tamarkin (1943), Akhiezer (1965).
Stieltjes (1883-4)

Given a sequence of numbers ξ_k, k = 0, 1, ..., a non-decreasing distribution function ω(λ) is sought such that

\[ \int_{0}^{\infty} \lambda^k \, d\omega(\lambda) \;=\; \xi_k , \qquad k = 0, 1, \ldots , \]

where \( \int_{0}^{\infty} \lambda^k \, d\omega(\lambda) \) represents the k-th (generalized) mechanical moment of the distribution of positive mass on the half line λ ≥ 0.

Stieltjes based his investigation on continued fractions; cf. Gantmacher and Krein (1950).
Another formulation

Consider a non-decreasing distribution function ω(λ), λ ≥ 0, with the moments given by the Riemann-Stieltjes integral

\[ \xi_k \;=\; \int_{0}^{\infty} \lambda^k \, d\omega(\lambda) , \qquad k = 0, 1, \ldots . \]

Find the distribution function ω^{(n)}(λ) with n points of increase λ_i^{(n)}, i = 1, ..., n, which matches the first 2n moments of the distribution function ω(λ),

\[ \int_{0}^{\infty} \lambda^k \, d\omega^{(n)}(\lambda) \;\equiv\; \sum_{i=1}^{n} \omega_i^{(n)} \bigl(\lambda_i^{(n)}\bigr)^k \;=\; \xi_k , \qquad k = 0, 1, \ldots , 2n-1 . \]

This moment problem plays a fundamental role in modern NLA.
1: Gauss-Christoffel quadrature

Clearly,

\[ \int_{0}^{\infty} \lambda^k \, d\omega(\lambda) \;=\; \sum_{i=1}^{n} \omega_i^{(n)} \bigl(\lambda_i^{(n)}\bigr)^k , \qquad k = 0, 1, \ldots , 2n-1 , \]

represents the n-point Gauss-Christoffel quadrature; see C. F. Gauss, Methodus nova integralium valores per approximationem inveniendi (1814), C. G. J. Jacobi, Über Gauss' neue Methode, die Werthe der Integrale näherungsweise zu finden (1826), and the description given in H. H. Goldstine, A History of Numerical Analysis from the 16th through the 19th Century (1977).

With no loss of generality we consider ξ_0 = 1.
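A minimal Python sketch of this moment-matching property, assuming the Laguerre weight dω(λ) = e^{−λ} dλ on [0, ∞), whose moments are ξ_k = k!; the n-point Gauss quadrature reproduces the first 2n of them up to rounding.

```python
# Sketch only: Gauss-Laguerre quadrature matches the first 2n moments of the
# assumed weight dω(λ) = exp(-λ) dλ on [0, ∞), i.e. ξ_k = k!.
import numpy as np
from numpy.polynomial.laguerre import laggauss
from math import factorial

n = 5
nodes, weights = laggauss(n)              # λ_i^(n) and ω_i^(n)

for k in range(2 * n):
    quad = np.dot(weights, nodes**k)      # Σ_i ω_i^(n) (λ_i^(n))^k
    print(k, quad, factorial(k))          # agrees with ξ_k = k! for k < 2n
```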
Outline

1. Common roots: Moments
2. Linear approximation problems
3. Singular value decomposition and model reduction
4. Matching moments model reduction and Krylov subspace methods
5. Bidiagonalization and linear regression
6. Bidiagonalization and orthogonal regression
7. Back to the roots
8. Conclusions
Forms of A x ≈ b

● LE: A is square and numerically nonsingular; then A x = b.
● OLS (linear regression): A is a generally error-free rectangular N by M matrix and the right-hand side b (the observation vector) is significantly affected by errors. Then A x = b + r, min ‖r‖.
● TLS (orthogonal regression): significant errors are contained both in the generally rectangular N by M matrix A and in the right-hand side b. Then (A + E) x = b + r, min ‖[r, E]‖_F, where ‖·‖_F means the Frobenius norm of the given matrix; see Rao and Toutenburg (1999), Section 3.12.
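A minimal NumPy sketch of the three problem forms, with assumed random test data: a direct solve for LE, a least squares solver for OLS, and the classical SVD of the augmented matrix [A, b] for the single right-hand side TLS problem.

```python
# Sketch only; the data are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# LE: square and nonsingular
A_sq = rng.standard_normal((4, 4))
b_sq = rng.standard_normal(4)
x_le = np.linalg.solve(A_sq, b_sq)

# OLS: errors only in the right-hand side b
A = rng.standard_normal((20, 4))
b = A @ np.ones(4) + 1e-2 * rng.standard_normal(20)
x_ols, *_ = np.linalg.lstsq(A, b, rcond=None)

# TLS: errors in both A and b; use the right singular vector of [A, b]
# belonging to the smallest singular value
_, _, Vh = np.linalg.svd(np.column_stack([A, b]))
v = Vh[-1]
x_tls = -v[:-1] / v[-1]        # assumes the generic case v[-1] != 0
```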
Minimum norm OLS solution

Let b be orthogonally decomposed into the part b|_{R(A)} in the range of A and the part b|_{N(A^T)} in the nullspace of A^T,

b = b|_{R(A)} + b|_{N(A^T)}.

Then the minimum norm solution x is given by

x ∈ R(A^T),  A x = b|_{R(A)},  r = − b|_{N(A^T)}.

The errors in b are assumed to be orthogonal to the subspace generated by the columns of A. If A has full column rank, A^T A x = A^T b. In computational statistics: x = (A^T A)^{−1} A^T b.
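A small sketch, with assumed data, showing that the pseudoinverse delivers the minimum norm OLS solution and that the residual is orthogonal to the columns of A.

```python
# Sketch only; the rank deficiency is introduced on purpose.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 6))
A[:, 5] = A[:, 0] + A[:, 1]          # make A rank deficient
b = rng.standard_normal(10)

x = np.linalg.pinv(A) @ b            # minimum norm OLS solution, x ∈ R(A^T)
r = b - A @ x                        # r = -b|_{N(A^T)}

print(np.linalg.norm(A.T @ r))       # ≈ 0: residual orthogonal to R(A)
```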
Damped OLS solution

Let A represent a discrete ill-posed problem and let the right-hand side be significantly affected by errors (noise). Then the OLS solution is useless. Instead,

A x = b + r,  min { ‖r‖² + α² ‖x‖² },

which is nothing but the Tikhonov regularization (1963). Equivalently,

\[ x \;=\; \operatorname{argmin} \left\| \begin{pmatrix} A \\ \alpha I \end{pmatrix} x \;-\; \begin{pmatrix} b \\ 0 \end{pmatrix} \right\| . \]

Example: discretized Fredholm integral equations of the first kind.
Ridge regression

Using the normal equations,

(A^T A + α² I) x = A^T b.

In computational statistics this is known as ridge regression (RR); see Rao and Toutenburg (1999), Section 3.10.2, and Čížek (2004), Section 8.1.6:

x = (A^T A + α² I)^{−1} A^T b.

Caution. 'Ill-posed problems' does not mean the same as 'ill-conditioned problems'.
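A sketch, with assumed data and an arbitrary α, checking that the ridge normal equations and the augmented least squares formulation of the previous slide give the same solution.

```python
# Sketch only: ridge / Tikhonov solution computed two equivalent ways.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 5))
b = rng.standard_normal(30)
alpha = 0.1
M = A.shape[1]

x_ridge = np.linalg.solve(A.T @ A + alpha**2 * np.eye(M), A.T @ b)

A_aug = np.vstack([A, alpha * np.eye(M)])        # stacked matrix [A; αI]
b_aug = np.concatenate([b, np.zeros(M)])         # stacked vector [b; 0]
x_aug, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

print(np.linalg.norm(x_ridge - x_aug))           # ≈ 0
```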
Outline

1. Common roots: Moments
2. Linear approximation problems
3. Singular value decomposition and model reduction
4. Matching moments model reduction and Krylov subspace methods
5. Bidiagonalization and linear regression
6. Bidiagonalization and orthogonal regression
7. Back to the roots
8. Conclusions
Spectral decomposition of a symmetric matrix

If A is a symmetric N by N matrix, then

A = U Λ U^T,  Λ = diag(λ_j),  U U^T = U^T U = I,  U = [u_1, ..., u_N],

so that A maps u_j to λ_j u_j, j = 1, ..., N:

A u_1 = λ_1 u_1,  A u_2 = λ_2 u_2,  ...,  A u_N = λ_N u_N.

One-dimensional mutually orthogonal invariant subspaces.
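A sketch, with an assumed random symmetric matrix, of the spectral decomposition computed by numpy.linalg.eigh, verifying A = U Λ U^T and A u_j = λ_j u_j.

```python
# Sketch only; B is random and symmetrized to obtain a test matrix A.
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                                    # symmetric N by N matrix

lam, U = np.linalg.eigh(A)                           # Λ = diag(lam), U orthogonal

print(np.linalg.norm(A - U @ np.diag(lam) @ U.T))    # ≈ 0: A = U Λ U^T
print(np.linalg.norm(A @ U - U * lam))               # ≈ 0: A u_j = λ_j u_j
```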
Singular value decomposition (SVD)

Consider an N by M matrix A, with no loss of generality N ≥ M. Then

\[ A \;=\; S \Sigma W^T \;=\; S_r \Sigma_r W_r^T \;=\; \sum_{\ell=1}^{r} s_\ell \, \sigma_\ell \, w_\ell^T , \]

S S^T = S^T S = I,  W^T W = W W^T = I,  Σ = diag(σ_1, ..., σ_r, 0),  σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0,

S = [S_r, ...],  W = [W_r, ...],  Σ_r = diag(σ_1, ..., σ_r).
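A sketch, with assumed data, of the economy SVD in NumPy and the rank-r dyadic expansion written on this slide.

```python
# Sketch only; the numerical rank is estimated with a simple threshold.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 5))                          # N >= M

S, sigma, Wt = np.linalg.svd(A, full_matrices=False)     # A = S diag(sigma) W^T
r = int(np.sum(sigma > 1e-12 * sigma[0]))                # numerical rank

A_rebuilt = sum(sigma[l] * np.outer(S[:, l], Wt[l, :]) for l in range(r))
print(np.linalg.norm(A - A_rebuilt))                     # ≈ 0
```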
Singular value decomposition

A maps w_ℓ in R(A^T) to σ_ℓ s_ℓ, and A^T maps s_ℓ in R(A) to σ_ℓ w_ℓ, for ℓ = 1, ..., r. The remaining vectors w_{r+1}, ..., w_M span N(A) and are mapped by A to zero, while s_{r+1}, ..., s_N span N(A^T) and are mapped by A^T to zero.
Singular value decomposition

Distance of the full rank matrix A to a nearest singular (rank-deficient) matrix A_{M−1}:

‖A − A_{M−1}‖ = σ_M,  ‖A − A_{M−1}‖ / ‖A‖ = σ_M / σ_1 = 1/κ(A).

Ill-conditioning means κ(A) is large; ill-posedness means much more.
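A sketch, with assumed data, of σ_M as the distance to the nearest rank-deficient matrix and of σ_1/σ_M as the condition number.

```python
# Sketch only.
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((10, 4))                   # full column rank, N >= M
sigma = np.linalg.svd(A, compute_uv=False)

dist_to_singular = sigma[-1]                       # ‖A − A_{M-1}‖ = σ_M
kappa = sigma[0] / sigma[-1]                       # κ(A) = σ_1 / σ_M
print(dist_to_singular, kappa, np.linalg.cond(A))  # cond(A) agrees with σ_1/σ_M
```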
SVD model reduction and PCR

When σ_1 ≥ σ_2 ≥ ··· ≥ σ_k ≫ σ_{k+1} ≥ ··· ≥ σ_r, we have a good rank-k approximation

\[ A \;\approx\; A_k \;=\; \sum_{i=1}^{k} s_i \, \sigma_i \, w_i^T . \]

Please recall Principal Components Regression (PCA, PCR), where the solution x of the OLS problem is approximated by the so-called truncated SVD (TSVD) approximation

\[ x \;=\; \sum_{\ell=1}^{r} \frac{s_\ell^T b}{\sigma_\ell} \, w_\ell \;\approx\; x_k^{PCR} \;=\; \sum_{\ell=1}^{k} \frac{s_\ell^T b}{\sigma_\ell} \, w_\ell , \qquad k \ll r . \]
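A sketch, with assumed data, of the truncated SVD / PCR approximation of the OLS solution; with k = r it recovers the full least squares solution, while k ≪ r gives the regularized approximation above.

```python
# Sketch only.
import numpy as np

def tsvd_solution(A, b, k):
    """Rank-k truncated SVD (PCR) approximation of the minimum norm OLS solution."""
    S, sigma, Wt = np.linalg.svd(A, full_matrices=False)
    coef = (S[:, :k].T @ b) / sigma[:k]      # s_ℓ^T b / σ_ℓ, ℓ = 1, ..., k
    return Wt[:k, :].T @ coef                # Σ_ℓ (s_ℓ^T b / σ_ℓ) w_ℓ

rng = np.random.default_rng(6)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(tsvd_solution(A, b, 10) - x_full))   # k = r: ≈ 0
print(np.linalg.norm(tsvd_solution(A, b, 3) - x_full))    # k ≪ r: regularized
```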
What is wrong with x = (A^T A)^{−1} A^T b ??

In theory almost nothing. If the computation is done that way, then, apart from some very special cases, almost everything. See the analysis of the formula using the SVD of A:

x = (A^T A)^{−1} A^T b = (W Σ² W^T)^{−1} W Σ S^T b = W Σ^{−1} S^T b.

If the normal matrix is formed and then inverted, things will not cancel out so nicely. Results computed by inverting the explicitly formed normal matrix are generally expensive and inaccurate; in the worst case they can be total garbage. The requirements of Baxter and Iserles (2003), namely reliably, robustly and affordably, are violated.
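A sketch of the numerical point on an assumed ill-conditioned Vandermonde test matrix: solving the explicitly formed normal equations loses far more digits than a backward stable least squares solver based on orthogonal factorization.

```python
# Sketch only; the right-hand side is constructed to be consistent, so the
# exact solution is known and the computed errors are purely numerical.
import numpy as np

t = np.linspace(0, 1, 40)
A = np.vander(t, 12, increasing=True)       # ill-conditioned design matrix
x_true = np.ones(12)
b = A @ x_true

x_normal = np.linalg.solve(A.T @ A, A.T @ b)       # explicit normal equations
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)    # SVD-based solver

print(np.linalg.norm(x_normal - x_true) / np.linalg.norm(x_true))   # large loss
print(np.linalg.norm(x_lstsq - x_true) / np.linalg.norm(x_true))    # near ε level
```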