Revisit of Monte Carlo Methods on Solving Large- Scale Linear Systems Y A O H A N G L I , P H . D . D E P A R T M E N T O F C O M P U T E R S C I E N C E O L D D O M I N I O N U N I V E R S I T Y yaohang@cs.odu.edu Nov. 4, 2014 @ NIST
Agenda Numerical Linear Algebra in Big Data Era Revisit of Ulam-von Neumann Scheme Breakdown-Free BCG Randomized SVD BCG with Adaptive Deflation Results and Applications Summary
Numerical Linear Algebra for Numerical Linear Algebra in Big Data Era Characterizing by Large Matrices Million x Million to Billion x Billion Most likely sparse Operations Approximating Matrix-Matrix/Matrix-Vector Multiplications Solving Linear Systems with Large Coefficient Matrices Finding Extreme Eigenvalues/Eigenvectors Top- k Eigenvalues/Eigenvectors Low- k Eigenvalues/Eigenvectors Estimating the Determinant
Characteristics of Numerical Linear Algebra in Big Data Era (1) Matrices may not fit in the Matrices may not exist main memory explicitly Core Memory CPU = ● Large Matrix Large Matrix Storage Accessing Matrix Elements Becomes the New Bottleneck
Characteristics of Numerical Linear Algebra in Big Data Era (2) Matrices may be incomplete Some elements may be missing Chance of memory errors increases Large Matrix
Characteristics of Numerical Linear Algebra in Big Data Era (3) Solution precision requirement is usually not very high 10 -2 even 10 -1 is enough, sometimes Vice versa, computational speed (response time) is usually more important, instead
Characteristics of Numerical Linear Algebra in Big Data Era (4) Modern Parallel/Distributed Computing Paradigms/Architectures Intel Xeon Phi Map/Reduce Paradigm Nvidia GPGPU
A Little Bit of History Conjugate Gradient (CG) Methods Developed in 1950’s Not considered as efficient methods compared to direct methods Start to gain attention in 1970’s Bigger matrices with sparsity Efficient matrix-vector multiplication Powerful iterative method for sparse linear systems Catalyst for subsequent work Krylov subspace methods Preconditioning Parallel computing etc. [Golub & O’Leary, SIAM Review, 1989]
Can History Repeat Itself? Monte Carlo Methods for Solving Linear Systems Developed in 1950s by Ulam and von Neumann Statistical sampling Not considered as efficient methods compared to deterministic methods Can Monte Carlo Methods be Resurrected in the Big Data Era? Sampling matrices (reduced visits to matrix elements) Fast computation with low relative accuracy of 10 -1 or 10 -2 residual error Natural parallel
Ulam-von Neumann Scheme Ulam-von Neumann Scheme Construct Markov chains to sample the Neumann Series Consider a Linear System x = Hx + b Neumann Series I + H + H 2 + H 3 + … If ρ ( H ) < 1 I + H + H 2 + H 3 + … = (I – H) -1 [Hammersley & Handscomb, 1964]
Ulam-von Neumann Scheme (cont.) Transition Matrix P 0 ; P ij 1 ; P ij j 0 0 H P ij ij Termination Probability T i 1 T P i ij j Random walk γ : ... r r r r 0 1 2 k Then ... H H H is an unbiased estimator of x r r r r r r ( ) / X 0 1 1 2 k 1 k b T r 0 r r ... P P P k k r r r r r r 0 1 1 2 1 k k
An Improved Monte Carlo Algorithm Monte Carlo Almost Optimal (MAO) An alternative transition matrix P | | H ij P ij | | H ik k Adaptive termination 1 ; W 0 H r r ...; W W 1 k k k k 1 P r r 1 k k An alternative estimator of u r 0 ( ) X W b k kr k k Better accuracy (smaller variance) [Dimov et al., 1998]
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; ℎ𝑡𝑣𝑛 = 0.55 Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 Generate an uniformly distributed random 0.28125 0.5625 0.15625 number 𝜊 ∈ ) 0,1 ; Set 𝑘 = 1 ; 𝜊 index W X j 𝑘 While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 1 0 𝑘 = 𝑘 + 1 ; End 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; End return 𝑌 ;
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; ℎ𝑡𝑣𝑛 = 0.55 Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 Generate an uniformly distributed random 0.28125 0.5625 0.15625 number 𝜊 ∈ ) 0,1 ; Set 𝑘 = 1 ; 𝜾 index W X j 𝑘 While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 1 0.225 𝑘 = 𝑘 + 1 ; End 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; End return 𝑌 ;
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 • Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; • ℎ𝑡𝑣𝑛 = 0.55 • Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 • While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 • Generate an uniformly distributed 0.28125 0.5625 0.15625 random number 𝜊 ∈ ) 0,1 ; • Set 𝑘 = 1 ; 𝜊 index W X j 𝑘 • While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 1 0.225 𝑘 = 𝑘 + 1 ; • • End • 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; • 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; • 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; • End • return 𝑌 ;
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; ℎ𝑡𝑣𝑛 = 0.55 Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 Generate an uniformly distributed random 0.28125 0.5625 0.15625 number 𝜊 ∈ ) 0,1 ; Set 𝑘 = 1 ; 𝜊 index W X j 𝑘 While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 0.3937 1 0.225 𝑘 = 𝑘 + 1 ; End 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; End return 𝑌 ;
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; ℎ𝑡𝑣𝑛 = 0.55 Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 Generate an uniformly distributed random 0.28125 0.5625 0.15625 number 𝜊 ∈ ) 0,1 ; Set 𝑘 = 1 ; 𝜊 index W X j 𝑘 While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 0.3937 1 0.225 1 𝑘 = 𝑘 + 1 ; End 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; End return 𝑌 ;
For 𝑦 1 , 𝜁 = 0.002 Perform one trajectory: 0.1 0.45 0.225 0.225 H = 𝑐 = −0.15 0.1 −0.3 1.35 Set initial values −0.18 0.36 0.1 0.72 0.775 𝑌 = 0, 𝑋 = 1, 𝑗𝑜𝑒𝑓𝑦 = 𝑗; ℎ𝑡𝑣𝑛 = 0.55 Calculate 𝑌 = 𝑌 + 𝑋𝑐 𝑗𝑜𝑒𝑓𝑦 ; 0.64 0.129032 0.580645 0.290323 While ( 𝑋 > 𝜁) 𝑄 = 0.272727 0.181818 0.545455 Generate an uniformly distributed random 0.28125 0.5625 0.15625 number 𝜊 ∈ ) 0,1 ; Set 𝑘 = 1 ; 𝜊 index W X j 𝑘 While (𝜊 > 𝑡=1 𝑞 𝑗𝑜𝑒𝑓𝑦,𝑡 ) 1 0.3937 1 0.225 2 𝑘 = 𝑘 + 1 ; End 𝑋 = 𝑋 ∗ 𝑡𝑗𝑜 ℎ 𝑗𝑜𝑒𝑓𝑦,𝑘 ∗ ℎ𝑡𝑣𝑛(𝑗𝑜𝑒𝑓𝑦) ; 𝑌 = 𝑌 + 𝑋 ∗ 𝑐 𝑘 ; 𝑗𝑜𝑒𝑓𝑦 = 𝑘 ; End return 𝑌 ;
Recommend
More recommend