Revisit of Monte Carlo Methods on Solving Large-Scale Linear Systems - PowerPoint PPT Presentation


1. Revisit of Monte Carlo Methods on Solving Large-Scale Linear Systems. Yaohang Li, Ph.D., Department of Computer Science, Old Dominion University. yaohang@cs.odu.edu. Nov. 4, 2014 @ NIST

2. Agenda
- Numerical Linear Algebra in Big Data Era
- Revisit of Ulam-von Neumann Scheme
- Breakdown-Free BCG
- Randomized SVD
- BCG with Adaptive Deflation
- Results and Applications
- Summary

3. Numerical Linear Algebra in Big Data Era
- Characterized by large matrices
  - Million x million to billion x billion
  - Most likely sparse
- Operations
  - Approximating matrix-matrix / matrix-vector multiplications
  - Solving linear systems with large coefficient matrices
  - Finding extreme eigenvalues/eigenvectors: top-k and low-k
  - Estimating the determinant

4. Characteristics of Numerical Linear Algebra in Big Data Era (1)
- Matrices may not fit in the main memory
- Matrices may not exist explicitly (e.g., a large matrix given only as a product of large matrices)
[Figure: CPU and core memory on one side; a large matrix in storage on the other]
- Accessing matrix elements becomes the new bottleneck

5. Characteristics of Numerical Linear Algebra in Big Data Era (2)
- Matrices may be incomplete
  - Some elements may be missing
  - Chance of memory errors increases
[Figure: large matrix with missing elements]

6. Characteristics of Numerical Linear Algebra in Big Data Era (3)
- Solution precision requirement is usually not very high
  - A residual of 10^-2, or sometimes even 10^-1, is enough
- Conversely, computational speed (response time) is usually more important

7. Characteristics of Numerical Linear Algebra in Big Data Era (4)
- Modern parallel/distributed computing paradigms/architectures
[Figure: Intel Xeon Phi, Map/Reduce paradigm, Nvidia GPGPU]

8. A Little Bit of History
- Conjugate Gradient (CG) methods
  - Developed in the 1950s
  - Not considered efficient compared to direct methods
- Started to gain attention in the 1970s
  - Bigger matrices with sparsity
  - Efficient matrix-vector multiplication
- Powerful iterative method for sparse linear systems
- Catalyst for subsequent work
  - Krylov subspace methods
  - Preconditioning
  - Parallel computing
  - etc.
[Golub & O'Leary, SIAM Review, 1989]

9. Can History Repeat Itself?
- Monte Carlo methods for solving linear systems
  - Developed in the 1950s by Ulam and von Neumann
  - Based on statistical sampling
  - Not considered efficient compared to deterministic methods
- Can Monte Carlo methods be resurrected in the Big Data era?
  - Sampling matrices (reduced visits to matrix elements)
  - Fast computation at low relative accuracy (10^-1 or 10^-2 residual error)
  - Naturally parallel

10. Ulam-von Neumann Scheme
- Construct Markov chains to sample the Neumann series
- Consider a linear system x = Hx + b
- Neumann series: I + H + H^2 + H^3 + ...
- If ρ(H) < 1, then I + H + H^2 + H^3 + ... = (I - H)^{-1}, so x = (I - H)^{-1} b (a numerical check follows below)
[Hammersley & Handscomb, 1964]
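A minimal numerical sketch of the convergence claim above (an illustration added here, not from the slides): for the 3x3 matrix H and vector b used in the worked example on slides 13-18, ρ(H) < 1, so the partial sums of the Neumann series applied to b approach the exact solution of x = Hx + b.

```python
import numpy as np

# H and b are taken from the worked example on slides 13-18.
H = np.array([[ 0.10, 0.45,  0.225],
              [-0.15, 0.10, -0.30 ],
              [-0.18, 0.36,  0.10 ]])
b = np.array([0.225, 1.35, 0.72])

assert max(abs(np.linalg.eigvals(H))) < 1    # spectral radius rho(H) < 1

x_exact = np.linalg.solve(np.eye(3) - H, b)  # x = (I - H)^{-1} b
x_k = np.zeros(3)
term = b.copy()
for k in range(60):
    x_k += term        # accumulate H^k b
    term = H @ term    # next term H^{k+1} b
print(np.max(np.abs(x_k - x_exact)))         # truncation error, near machine precision
```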

11. Ulam-von Neumann Scheme (cont.)
- Transition matrix $P$: $P_{ij} \ge 0$; $\sum_j P_{ij} \le 1$; $P_{ij} \ne 0$ wherever $H_{ij} \ne 0$
- Termination probability: $T_i = 1 - \sum_j P_{ij}$
- Random walk $\gamma: r_0 \to r_1 \to r_2 \to \cdots \to r_k$
- Then $X(\gamma) = \frac{H_{r_0 r_1} H_{r_1 r_2} \cdots H_{r_{k-1} r_k}}{P_{r_0 r_1} P_{r_1 r_2} \cdots P_{r_{k-1} r_k}} \cdot \frac{b_{r_k}}{T_{r_k}}$ is an unbiased estimator of $x_{r_0}$ (a code sketch follows below)
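A sketch of how the estimator above is sampled (my addition; the slides give no implementation): each walk moves from state $i$ to state $j$ with probability $P_{ij}$ or terminates with probability $T_i$, scoring $b_i / T_i$ weighted by the accumulated $H/P$ ratio. Averaging many walks estimates the single component $x_{r_0}$.

```python
import numpy as np

def uvn_estimate(H, b, P, r0, n_walks=100_000, seed=0):
    """Ulam-von Neumann estimate of x_{r0} for x = Hx + b.

    P must satisfy P_ij >= 0, row sums <= 1, and P_ij != 0 wherever
    H_ij != 0; assumes T_i = 1 - sum_j P_ij > 0 so every walk stops.
    """
    rng = np.random.default_rng(seed)
    n = len(b)
    row_sum = P.sum(axis=1)
    T = 1.0 - row_sum                    # termination probabilities
    total = 0.0
    for _ in range(n_walks):
        i, w = r0, 1.0                   # current state, accumulated H/P ratio
        while rng.random() >= T[i]:      # continue the walk with probability 1 - T_i
            j = rng.choice(n, p=P[i] / row_sum[i])  # next state ~ row i of P
            w *= H[i, j] / P[i, j]
            i = j
        total += w * b[i] / T[i]         # score the terminal state
    return total / n_walks
```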

12. An Improved Monte Carlo Algorithm
- Monte Carlo Almost Optimal (MAO)
- An alternative transition matrix: $P_{ij} = \frac{|H_{ij}|}{\sum_k |H_{ik}|}$
- Adaptive termination: $W_0 = 1$; $W_k = W_{k-1} \frac{H_{r_{k-1} r_k}}{P_{r_{k-1} r_k}}$; stop when $|W_k|$ falls below a tolerance
- An alternative estimator of $x_{r_0}$: $X(\gamma) = \sum_k W_k b_{r_k}$
- Better accuracy (smaller variance); a code sketch follows below
[Dimov et al., 1998]
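A sketch of one MAO trajectory under these formulas (my addition, following the pseudocode of the worked example on the next slides): because $P_{ij} = |H_{ij}| / \sum_k |H_{ik}|$, the weight update simplifies to $W_k = W_{k-1} \cdot \mathrm{sign}(H_{ij}) \cdot \mathrm{hsum}_i$, where $\mathrm{hsum}_i$ is the $i$-th row sum of $|H|$.

```python
import numpy as np

def mao_trajectory(H, b, r0, eps=0.002, seed=None):
    """One MAO random-walk estimate of x_{r0} for x = Hx + b.

    Assumes every row sum of |H| is < 1 so |W| decays and the walk stops.
    """
    rng = np.random.default_rng(seed)
    hsum = np.abs(H).sum(axis=1)         # row sums of |H|
    P = np.abs(H) / hsum[:, None]        # MAO transition matrix
    i, W = r0, 1.0
    X = W * b[i]                         # k = 0 term of sum_k W_k b_{r_k}
    while abs(W) > eps:                  # adaptive termination
        j = rng.choice(len(b), p=P[i])   # next state ~ row i of P
        W *= np.sign(H[i, j]) * hsum[i]  # = W * H_ij / P_ij
        X += W * b[j]
        i = j
    return X
```

Averaging many such trajectories started at $r_0$ yields the estimate of $x_{r_0}$; the smaller per-walk variance relative to the plain Ulam-von Neumann estimator is what "almost optimal" refers to.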

13. Worked Example: for x_1, ε = 0.002, perform one trajectory

H = [  0.10   0.45   0.225 ]     b = [ 0.225 ]     hsum = [ 0.775 ]
    [ -0.15   0.10  -0.30  ]         [ 1.35  ]            [ 0.55  ]
    [ -0.18   0.36   0.10  ]         [ 0.72  ]            [ 0.64  ]

P = [ 0.129032  0.580645  0.290323 ]
    [ 0.272727  0.181818  0.545455 ]
    [ 0.281250  0.562500  0.156250 ]

Algorithm (one trajectory):
- Set initial values X = 0, W = 1, index = i
- Calculate X = X + W * b_index
- While (|W| > ε)
  - Generate a uniformly distributed random number ξ ∈ [0, 1)
  - Set j = 1; while (ξ > Σ_{s=1..j} p_{index,s}) j = j + 1
  - W = W * sign(h_{index,j}) * hsum(index)
  - X = X + W * b_j
  - index = j
- Return X

Trace:  ξ = -   index = 1   W = 1   X = 0   j = -

14. Worked Example (cont.; setup and algorithm as on slide 13)
Initial scoring step: X = X + W * b_index = 0 + 1 * 0.225 = 0.225

Trace:  ξ = -   index = 1   W = 1   X = 0.225   j = -

15. Worked Example (cont.)
Since |W| = 1 > ε, the trajectory enters the while loop.

Trace:  ξ = -   index = 1   W = 1   X = 0.225   j = -

16. Worked Example (cont.)
Generate a uniformly distributed random number: ξ = 0.3937

Trace:  ξ = 0.3937   index = 1   W = 1   X = 0.225   j = -

17. Worked Example (cont.)
Set j = 1 and test the inner loop condition.

Trace:  ξ = 0.3937   index = 1   W = 1   X = 0.225   j = 1

18. Worked Example (cont.)
ξ = 0.3937 > p_{1,1} = 0.129032, so j = j + 1 = 2; the cumulative sum 0.129032 + 0.580645 = 0.709677 now exceeds ξ, so the inner loop stops and state 2 is selected.

Trace:  ξ = 0.3937   index = 1   W = 1   X = 0.225   j = 2
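Pulling the example together, here is a runnable version (my reconstruction, not code from the deck) of the slides' pseudocode. With ξ = 0.3937 as the first draw it reproduces the j = 2 transition traced above (0-based j = 1 in the code); with a real random stream, averaging many trajectories estimates x_1, which can be checked against a direct solve of x = Hx + b:

```python
import numpy as np

H = np.array([[ 0.10, 0.45,  0.225],
              [-0.15, 0.10, -0.30 ],
              [-0.18, 0.36,  0.10 ]])
b = np.array([0.225, 1.35, 0.72])
hsum = np.abs(H).sum(axis=1)             # [0.775, 0.55, 0.64], as on slide 13
P = np.abs(H) / hsum[:, None]            # transition matrix P from slide 13

def trajectory(index, eps=0.002, rng=np.random.default_rng()):
    """One trajectory of the slides' pseudocode, with 0-based states."""
    X, W = 0.0, 1.0
    X += W * b[index]                    # initial scoring: X = 0.225 for index 0
    while abs(W) > eps:
        xi = rng.random()                # the trace above uses xi = 0.3937
        # inner loop: smallest j whose cumulative row probability reaches xi
        j = int(np.searchsorted(np.cumsum(P[index]), xi))
        W *= np.sign(H[index, j]) * hsum[index]
        X += W * b[j]
        index = j
    return X

# Monte Carlo estimate of x_1 versus the exact solution:
est = np.mean([trajectory(0) for _ in range(20_000)])
exact = np.linalg.solve(np.eye(3) - H, b)
print(f"MC estimate of x_1: {est:.4f}   exact: {exact[0]:.4f}")
```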
