ILAS 2013 Using ILU to estimate the diagonal of the inverse of a - PowerPoint PPT Presentation

ILAS 2013 Using ILU to estimate the diagonal of the inverse of a matrix Andreas Stathopoulos, Lingfei Wu, Jesse Laeuchli College of William and Mary Vasilis Kalantzis, Stratis Gallopoulos University of Patras, Greece Acks: NSF, DOE SciDAC

The problem Given a large, N × N matrix A and a function f find trace of f ( A ) : Tr ( f ( A )) Common functions: f ( A ) = A − 1 f ( A ) = log ( A ) i A − 1 R j f ( A ) = R T Applications: UQ, Data Mining, Quantum Monte Carlo, Lattice QCD Our focus: f ( A ) = A − 1 [ 2 ]

The methods Currently all methods are based on Monte Carlo (Hutchinson 1989) If x is a vector of random Z 2 variables � 1 with probability 1 / 2 x i = − 1 with probability 1 / 2 then E ( x T A − 1 x ) = Tr ( A − 1 ) Monte Carlo Trace for i=1: n x = randZ2( N ,1) sum = sum + x T A − 1 x trace = sum / n [ 3 ]

The methods Currently all methods are based on Monte Carlo (Hutchinson 1989) If x is a vector of random Z 2 variables � 1 with probability 1 / 2 x i = − 1 with probability 1 / 2 then E ( x T A − 1 x ) = Tr ( A − 1 ) Monte Carlo Trace 2 problems for i=1: n Large number of samples x = randZ2( N ,1) How to compute x T A − 1 x sum = sum + x T A − 1 x trace = sum / n [ 4 ]

The methods Currently all methods are based on Monte Carlo (Hutchinson 1989) If x is a vector of random Z 2 variables � 1 with probability 1 / 2 x i = − 1 with probability 1 / 2 then E ( x T A − 1 x ) = Tr ( A − 1 ) Monte Carlo Trace for i=1: n x = randZ2( N ,1) sum = sum + x T A − 1 x Find quadrature x T A − 1 x Solve Ay = x with CG compute y T x with Lanczos trace = sum / n (Golub’69, Bai’95, Meurant’06,’09, Strakos’11, ...) [ 5 ]

The methods Currently all methods are based on Monte Carlo (Hutchinson 1989) If x is a vector of random Z 2 variables � 1 with probability 1 / 2 x i = − 1 with probability 1 / 2 then E ( x T A − 1 x ) = Tr ( A − 1 ) Monte Carlo Trace for i=1: n x = randZ2( N ,1) sum = sum + x T A − 1 x O ( 100 − 1000 s ) statistically independent RHS trace = sum / n Recycling (de Sturler), Deflation (Morgan, AS’07, ...) speed up Krylov methods [ 6 ]

Selecting the vectors in x T A − 1 x Random x ∈ Z N best variance for real matrices (Hutchinson 1989) 2 x = randn ( N , 1 ) worse variance than Z 2 variance depends only on diag ( A − 1 ) x = e i single large element? x = F T e i mixing of diagonal elements: (Toledo et al. 2010) F = DFT or F = Hadamard Deterministic x = H T e i , i = 1 ,..., 2 k Hadamard in natural order (Bekas et al. 2007) � 1 i ∈ C m x m i = Probing. Assumes multicolored graph (Tang et al. 2011) 0 else Random-deterministic Hierarchical Probing for lattices (A.S, J.L. 2013) [ 7 ]

Variance of the estimators Rademacher vectors x i ∈ Z N 2 s ∑ i � = j ( A − 1 Tr = 1 s ∑ s i A − 1 x i Var ( Tr ) = 2 s � ˜ A − 1 � 2 F = 2 i = 1 x T i j ) 2 Diagonal x = e j ( i ) Var ( Tr ) = N 2 i = 1 A − 1 Tr = N s ∑ s s Var ( diag ( A − 1 )) j ( i ) , j ( i ) magnitude −1 A variance [ 8 ]

Approximating diag( A − 1 ) from ILU Consider an incomplete LU of A : [ L , U ] = ILU ( A ) If U − 1 L − 1 good approximation to A − 1 then compute trace from: M = diag ( U − 1 L − 1 ) Computing M needs only one pass over L , U (Erisman, Tienny, ’75) E = U − 1 L − 1 − A − 1 In some cases, Tr ( E ) can be sufficiently close to zero However, what if | Tr ( E ) | is not small? [ 9 ]

ILU gives more info Observation: even if Tr ( E ) large, M may approximate the pattern of diag( A − 1 ) and/or E may have smaller variance Ex. small Laplacian and DW2048 Comparing traceA, traceILU and fitILU with sorted order 10 Inverse of A 9 Inverse of ILU Difference 8 7 6 Diagonal element 5 4 3 2 1 0 −1 1400 1500 1600 1700 1800 1900 2000 Sorted index of diagonal element [ 10 ]

Capture pattern better by fitting p ( M ) to diag( A − 1 ) Find p () : min � p ( M ) − diag ( A − 1 ) � on a set of m indices • Induce smoothness on M by sorting • Use m equispaced indices to capture the range of values • Compute A − 1 j j of these indices • Fit M j to A − 1 j j using MATLAB’s LinearModel.stepwise When ILU is a good preconditioner, Tr ( p ( M )) can be accurate to O(1E-3) ! [ 11 ]

Examples of fitting TOLS4000, DW2048, af23560, conf6 Comparing diag(A^−1), diag((LU)^−1) and diag((LU)^−1)_fit Comparing diag(A^−1), diag((LU)^−1) and diag((LU)^−1)_fit 0.1 Inverse of A Inverse of ILU 50 0 Fitting of ILU Inverse of A Inverse of ILU −0.1 0 Fitting of ILU Diagonal element Diagonal element −0.2 −50 −0.3 −0.4 −100 −0.5 −150 −0.6 0 500 1000 1500 2000 2500 3000 3500 4000 0 500 1000 1500 2000 Sorted index of diagonal element Sorted index of diagonal element Comparing diag(A^−1), diag((LU)^−1) and diag((LU)^−1)_fit Comparing diag(A^−1), diag((LU)^−1) and diag((LU)^−1)_fit −5 2 x 10 0.02 Inverse of A Inverse of A Inverse of ILU 1.5 Fitting of ILU Inverse of ILU 0 Imagery part of diagonal element Fitting of ILU 1 Diagonal element −0.02 0.5 −0.04 0 −0.5 −0.06 −1 −0.08 −1.5 −0.1 −2 0 0.5 1 1.5 2 2.5 −0.0549 −0.0549 −0.0549 −0.0548 −0.0548 −0.0548 −0.0548 −0.0548 −0.0548 Real part of diagonal element Sorted index of diagonal element 4 x 10 [ 12 ]

Improving on the Tr ( M ) and Tr ( p ( M )) • MC on E = M − diag ( A − 1 ) potentially smaller variance on the diagonal • MC on E 2 = p ( M ) − diag ( A − 1 ) m inversions for fitting, s − m inversions for MC further variance improvement • MC with importance sampling based on M or p ( M ) Or is traditional Hutchinson better? • MC with Z N 2 on A − 1 • MC with Z N 2 on E Depends on approximation properties of ILU [ 13 ]

AF23560, conf6 ( k c − 10 − 8 ) Experiments Monte carlo, Matrix: Bai/af23560 Monte carlo, Matrix: Bai/af23560 −680 −720 MC on diag(A −1 ) MC on diag(E)_fit MC on A^−1 −700 MC on diag(E) MC on E MC on diag(E2) −740 Trace of A^−1 −720 −1 Trace of A −740 Trace estimate Trace estimate −760 −760 −780 −780 −800 −800 −820 −840 −820 0 200 400 600 800 1000 0 200 400 600 800 1000 Number of samples Number of samples Monte carlo, Matrix: conf6.0−QCD Monte carlo, Matrix: conf6.0−QCD MC on diag(E2) −126.32 −126.375 MC on diag(E) MC on E −126.34 −1 Trace of A −126.376 −126.36 Trace estimate Trace estimate −126.38 −126.377 −126.4 MC on diag(A −1 ) −126.42 MC on diag(E) −126.378 −126.44 MC on diag(E2) MC on A −1 −126.46 −126.379 MC on E −126.48 −1 Trace of A 0 200 400 600 800 1000 0 200 400 600 800 1000 Number of samples Number of samples [ 14 ]

TOLS4000 ILU( A + 10 I ), DW2048 ILU ( A + 0 . 01 I ) Experiments Monte carlo, Matrix: Bai/tols4000 Monte carlo, Matrix: Bai/tols4000 −60 −116 −80 −118 −100 −120 −120 Trace estimate −122 Trace estimate −140 −124 −160 −126 MC on diag(A^−1) −180 MC on diag(E) −128 −200 MC on diag(E2) MC on diag(E2) −130 MC on A^−1 MC on A^−1 −220 MC on E MC on E −132 Trace of A^−1 −240 Trace of A^−1 −134 200 400 600 800 1000 0 200 400 600 800 1000 1200 Number of samples Number of samples Monte carlo, Matrix: Bai/dw2048 Monte carlo, Matrix: Bai/dw2048 7500 MC on diag(A −1 ) MC on diag(E2) 15000 MC on A −1 MC on diag(E) 7000 MC on diag(E2) MC on E 6500 −1 MC on A −1 Trace of A 10000 MC on E 6000 Trace estimate Trace estimate −1 Trace of A 5500 5000 5000 4500 0 4000 3500 −5000 3000 200 400 600 800 1000 200 400 600 800 1000 Number of samples Number of samples [ 15 ]

Fitting allows ILU( A + σ I ) Experiments Often ILU on A is not possible, ill-conditioned, or too expensive Better results if we use a better conditioned ILU( A + σ I ) and allow the fitting to fix the diagonal Matrix tols4000: Statistical error varying shifts 2 10 0 10 Standard derivation −2 10 −4 10 Std(diag(E2)) Std(diag(A −1 )) Std(diag(E)) −6 Std(A −1 ) 10 Std(E) −10 −5 0 10 10 10 Shifts [ 16 ]

Sometimes Z N Experiments 2 better QCD matrix (49K) close to k c EPB3 Monte carlo, Matrix: matb5 Monte carlo, Matrix: epb3 6 x 10 MC on diag(E2) MC on diag(E2) 2.3 MC on A −1 MC on A −1 −9015 2.28 MC on E MC on E 2.26 −9020 Trace estimate Trace estimate 2.24 2.22 −9025 2.2 2.18 −9030 2.16 −9035 2.14 2.12 −9040 0 200 400 600 800 1000 0 200 400 600 800 1000 Number of samples Number of samples Here E had smaller off diagonal variance — Not easily predictable by theory [ 17 ]

Dynamically identifying smallest variance For every fitting point i = 1 ,..., m Compute a i = A − 1 e i • Based on a ii = A − 1 ii update estimates for var(diag(A)) var(diag(E)) ( a ii − M i ) var(diag(E2)) ( a ii − p ( M i ) ) • Use � a i � 2 − a 2 ii to update estimate for var(MC on A ) = � A � 2 F • Compute µ i = U − 1 L − 1 e i and update estimate for var(MC on E ) ( � a i − µ i � 2 − a 2 ii − µ 2 ii ) Large differences in various methods would show after a few points [ 18 ]

ILAS 2013 Using ILU to estimate the diagonal of the inverse of a - PowerPoint PPT Presentation

ILAS 2013 Using ILU to estimate the diagonal of the inverse of a matrix Andreas Stathopoulos, Lingfei Wu, Jesse Laeuchli College of William and Mary Vasilis Kalantzis, Stratis Gallopoulos University of Patras, Greece Acks: NSF, DOE SciDAC The

Diagonal asymptotics for products of combinatorial classes Or: the diagonal method is still not

0.00 xx xy xz 0.0 0.1 0.2 0.3 0.4 0.5 External

Welcome to the Vil ilas Avenue and Campbell Street Reconstruction Public In Involvement Meeting

1 Diagonal Complete Latin Squares (Author: Jenny Zhang) Definition : Right-diagonal complete

Use of Best Estimate Methods in licensing of VVER reactors Jiri Macek UJV Rez, a. s. VVER

Orthogonal similarity reduction of any symmetric matrix into a diagonal-plus-semiseparable one

Diagonal-Free Proofs of the Diagonal Lemma Saeed Salehi University of Tabriz & IPM WORMSHOP

Overview Yesterday we studied how real 2 2 matrices act on C . Just as the action of a diagonal

Various Single Feed CP MSAs L 2 (a) (b) (c) L 1 L L (d) (e) (f) L L L (a) Diagonal fed

The Frobenius and the Tate diagonal eCHT Reading Seminar on THH Yuqing Shi University of Utrecht

Ministry of Finance, the Economy and Investment The world economy Real GDP Source: Eurostat

twenty six diagonal cracking shear f c http://www.bam.de concrete construction:

Estimate/Bid Workshop M.B. Kahn Construction Co., Inc. 101 Flintlake Road, Columbia, SC 29202

Quasi-Cartan companions of cluster-tilted quivers Ahmet Seven Middle East Technical University,

Compact spaces with a P -diagonal T a sc eil n agam K. P. Hart Faculty EEMCS TU

Integrated System Transmission and Ancillary Services Rates 2015 Estimate & 2013 True-up

Methods for estimating the diagonal of matrix functions Jesse Laeuchli Andreas Stathopoulos CSC

5.3 Diagonalization The goal here is to develop a useful factorization A = PDP 1 , when A is n

How many strips of bacon are there? 17 Or 20 What do you think is the better estimate:

CASUALTY ACTUARIAL SOCIETY Using Life Expectancy to Inform the 2013 ANNUAL MEETING Estimate of

The Hilbert Scheme of the Diagonal in a Product of Projective Spaces Dustin Cartwright UC

Estimating CIs on proportions How confident can I be in my estimate? (e.g., 0 of 10 vs. 0

TCP Adaptive Retransmission Algorithm - Original TCP Theory Estimate RTT Multiply by

Towards Model Fusion in Traditional Methods . . . Geophysics: How to How to Estimate . . . How

ILAS 2013 Using ILU to estimate the diagonal of the inverse of a - PowerPoint PPT Presentation

ILAS 2013 Using ILU to estimate the diagonal of the inverse of a matrix Andreas Stathopoulos, Lingfei Wu, Jesse Laeuchli College of William and Mary Vasilis Kalantzis, Stratis Gallopoulos University of Patras, Greece Acks: NSF, DOE SciDAC The

Diagonal asymptotics for products of combinatorial classes Or: the diagonal method is still not

0.00 xx xy xz 0.0 0.1 0.2 0.3 0.4 0.5 External

Welcome to the Vil ilas Avenue and Campbell Street Reconstruction Public In Involvement Meeting

1 Diagonal Complete Latin Squares (Author: Jenny Zhang) Definition : Right-diagonal complete

Use of Best Estimate Methods in licensing of VVER reactors Jiri Macek UJV Rez, a. s. VVER

Orthogonal similarity reduction of any symmetric matrix into a diagonal-plus-semiseparable one

Diagonal-Free Proofs of the Diagonal Lemma Saeed Salehi University of Tabriz &amp; IPM WORMSHOP

Overview Yesterday we studied how real 2 2 matrices act on C . Just as the action of a diagonal

Various Single Feed CP MSAs L 2 (a) (b) (c) L 1 L L (d) (e) (f) L L L (a) Diagonal fed

The Frobenius and the Tate diagonal eCHT Reading Seminar on THH Yuqing Shi University of Utrecht

Ministry of Finance, the Economy and Investment The world economy Real GDP Source: Eurostat

twenty six diagonal cracking shear f c http://www.bam.de concrete construction:

Estimate/Bid Workshop M.B. Kahn Construction Co., Inc. 101 Flintlake Road, Columbia, SC 29202

Quasi-Cartan companions of cluster-tilted quivers Ahmet Seven Middle East Technical University,

Compact spaces with a P -diagonal T a sc eil n agam K. P. Hart Faculty EEMCS TU

Integrated System Transmission and Ancillary Services Rates 2015 Estimate &amp; 2013 True-up

Methods for estimating the diagonal of matrix functions Jesse Laeuchli Andreas Stathopoulos CSC

5.3 Diagonalization The goal here is to develop a useful factorization A = PDP 1 , when A is n

How many strips of bacon are there? 17 Or 20 What do you think is the better estimate:

CASUALTY ACTUARIAL SOCIETY Using Life Expectancy to Inform the 2013 ANNUAL MEETING Estimate of

The Hilbert Scheme of the Diagonal in a Product of Projective Spaces Dustin Cartwright UC

Estimating CIs on proportions How confident can I be in my estimate? (e.g., 0 of 10 vs. 0

TCP Adaptive Retransmission Algorithm - Original TCP Theory Estimate RTT Multiply by

Towards Model Fusion in Traditional Methods . . . Geophysics: How to How to Estimate . . . How

Diagonal-Free Proofs of the Diagonal Lemma Saeed Salehi University of Tabriz & IPM WORMSHOP

Integrated System Transmission and Ancillary Services Rates 2015 Estimate & 2013 True-up