
Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required) - PowerPoint PPT Presentation

Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required). Ryan Williams, Carnegie Mellon University. Introduction: Matrix-Vector Multiplication is a fundamental operation in scientific computing.


  1. Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required). Ryan Williams, Carnegie Mellon University.

  2. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing.

  3. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be?

  4. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be? Θ(n^2) steps just to read the matrix!

  5. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be? Θ(n^2) steps just to read the matrix! Main Result: If we allow O(n^{2+ε}) preprocessing, then matrix-vector multiplication over any finite semiring can be done in O(n^2 / (ε log n)^2) time.
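
A rough back-of-the-envelope reading of this trade-off (my own arithmetic, not from the slides): if the same matrix A is multiplied against k different vectors, the total cost is O(n^{2+ε}) + k · O(n^2 / (ε log n)^2), versus k · Θ(n^2) for the direct method. The preprocessing therefore pays for itself once k is on the order of n^ε, and for k = n the total is O(n^3 / (ε log n)^2), matching the combinatorial matrix-multiplication bounds discussed next.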

  6. Better Algorithms for Matrix Multiplication. Three of the major developments:

  7. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives.

  8. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives. • Strassen (1969): n^{log 7 / log 2} = O(n^{2.81}) operations. Asymptotically fast, but overhead in the big-O. Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006).

  9. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives. • Strassen (1969): n^{log 7 / log 2} = O(n^{2.81}) operations. Asymptotically fast, but overhead in the big-O. Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006). • Coppersmith and Winograd (1990): O(n^{2.376}) operations. Not yet practical.
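
The table-lookup idea behind Four Russians is easy to sketch in code. The following is a minimal Python illustration of the standard trick (not from the talk; function and variable names are mine): group the rows of B into blocks of about log n rows, precompute the OR of every subset of each block's rows, and assemble each row of the product from one table lookup per block.

    from math import log2

    def four_russians_bool_mult(A, B):
        """Boolean matrix product C = A*B (OR of ANDs) via table lookups.
        A, B are n x n lists of 0/1 entries; roughly O(n^3 / log n) word ops."""
        n = len(A)
        g = max(1, int(log2(n)))                        # group size ~ log n
        groups = [range(s, min(s + g, n)) for s in range(0, n, g)]

        # For each group of rows of B, precompute the OR of every subset of
        # those rows, with each row packed into an int bitmask (one bit per column).
        tables = []
        for grp in groups:
            rows = [sum(B[r][c] << c for c in range(n)) for r in grp]
            table = [0] * (1 << len(rows))
            for mask in range(1, 1 << len(rows)):
                low = mask & -mask                      # lowest set bit of the subset
                table[mask] = table[mask ^ low] | rows[low.bit_length() - 1]
            tables.append(table)

        C = []
        for i in range(n):
            acc = 0
            for grp, table in zip(groups, tables):
                # which rows of this group does row i of A select?
                mask = sum(A[i][r] << k for k, r in enumerate(grp))
                acc |= table[mask]
            C.append([(acc >> c) & 1 for c in range(n)])
        return C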

  10. Focus: Combinatorial Matrix Multiplication Algorithms.

  11. Focus: Combinatorial Matrix Multiplication Algorithms. • Also called non-algebraic; let's call them non-subtractive. E.g., Four Russians is combinatorial, Strassen isn't.

  12. Focus: Combinatorial Matrix Multiplication Algorithms. • Also called non-algebraic; let's call them non-subtractive. E.g., Four Russians is combinatorial, Strassen isn't. More Non-Subtractive Boolean Matrix Mult. Algorithms: • Atkinson and Santoro: O(n^3 / log^{3/2} n) on a (log n)-word RAM. • Rytter and Basch-Khanna-Motwani: O(n^3 / log^2 n) on a RAM. • Chan: Four Russians can be implemented in O(n^3 / log^2 n) on a pointer machine.

  13. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”.

  14. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time.

  15. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line.

  16. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line. Can be implemented on a pointer machine.

  17. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line. Can be implemented on a pointer machine. This Talk: The Boolean case.

  18. Preprocessing Phase: The Boolean Case. Partition the input matrix A into blocks A_{i,j} of size ⌈ε log n⌉ × ⌈ε log n⌉, for i, j = 1, ..., n/(ε log n). [Figure: A drawn as an (n/(ε log n)) × (n/(ε log n)) grid of blocks A_{1,1}, A_{1,2}, ..., A_{n/(ε log n), n/(ε log n)}, each block ε log n on a side.]

  19. Preprocessing Phase: The Boolean Case. Build a graph G with parts P_1, ..., P_{n/(ε log n)} and Q_1, ..., Q_{n/(ε log n)}. Each part has 2^{ε log n} vertices, one for each possible (ε log n)-length Boolean vector. [Figure: the parts P_1, P_2, ..., P_{n/(ε log n)} on one side and Q_1, Q_2, ..., Q_{n/(ε log n)} on the other, each of size 2^{ε log n}.]

  20. Preprocessing Phase: The Boolean Case. Edges of G: each vertex v in each P_i has exactly one edge into each Q_j, namely to the vertex A_{j,i} · v. [Figure: a vertex v in P_i connected to the vertex A_{j,i} · v in Q_j.]

  21. Preprocessing Phase: The Boolean Case. Edges of G: each vertex v in each P_i has exactly one edge into each Q_j, namely to the vertex A_{j,i} · v. Time to build the graph: n/(ε log n) [number of Q_j] · n/(ε log n) [number of P_i] · 2^{ε log n} [number of nodes in P_i] · (ε log n)^2 [matrix-vector mult of A_{j,i} and v] = O(n^{2+ε}).
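
In code, the graph amounts to a lookup table: for every block A_{j,i} and every possible chunk value u, precompute the product A_{j,i} · u. Below is a minimal Python sketch of this preprocessing for the Boolean case (my own realization of the slides' construction; the block size b stands in for ε log n, and the function name is mine).

    def preprocess(A, b):
        """For each b x b block A_{j,i} of the n x n Boolean matrix A,
        precompute A_{j,i} * u for every b-bit chunk u.
        Assumes b divides n; roughly (n/b)^2 * 2^b * b word operations."""
        n = len(A)
        m = n // b                      # number of block rows / block columns
        # table[j][i][u] = the b-bit result (packed as an int) of block (j, i) times chunk u
        table = [[[0] * (1 << b) for _ in range(m)] for _ in range(m)]
        for j in range(m):
            for i in range(m):
                # rows of block A_{j,i}, each packed into a b-bit integer
                block_rows = [sum(A[j * b + r][i * b + c] << c for c in range(b))
                              for r in range(b)]
                for u in range(1 << b):
                    out = 0
                    for r in range(b):
                        if block_rows[r] & u:      # Boolean inner product = any overlap
                            out |= 1 << r
                    table[j][i][u] = out
        return table

Here table[j][i][u] plays the role of the edge from vertex u of P_i into Q_j.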

  22. How to Do Fast Vector Multiplications. Let v be a column vector. Want: A · v.

  23. How to Do Fast Vector Multiplications. Let v be a column vector. Want: A · v. (1) Break up v into (ε log n)-sized chunks: v = (v_1, v_2, ..., v_{n/(ε log n)})^T.
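
Step (1) is just a reshape. A tiny illustration (my own helper, with b standing in for ε log n):

    def split_into_chunks(v, b):
        """Split a 0/1 vector v into b-bit chunks, each packed into an integer
        so it can index directly into the precomputed table. Assumes b divides len(v)."""
        return [sum(v[s + c] << c for c in range(b)) for s in range(0, len(v), b)]

    # e.g. split_into_chunks([1, 0, 0, 1, 1, 0], 3) == [0b001, 0b011] == [1, 3]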

  24. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i.

  25. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i. [Figure: each chunk v_i located as a vertex of P_i.] Takes Õ(n) time.

  26. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i. [Figure repeated: each chunk v_i located as a vertex of P_i.] Takes Õ(n) time.

  27. How to Do Fast Vector Multiplications. (3) Look up the neighbors of v_i, mark each neighbor found.

  28. How to Do Fast Vector Multiplications. (3) Look up the neighbors of v_i, mark each neighbor found. [Figure: the neighbors of v_1 are A_{1,1} · v_1 in Q_1, A_{2,1} · v_1 in Q_2, ..., A_{n/(ε log n),1} · v_1 in Q_{n/(ε log n)}.]

  29. How to Do Fast Vector Multiplications. (3) Look up the neighbors of v_i, mark each neighbor found. [Figure: the neighbors of v_2 are A_{1,2} · v_2 in Q_1, A_{2,2} · v_2 in Q_2, ..., A_{n/(ε log n),2} · v_2 in Q_{n/(ε log n)}.]

  30. How to Do Fast Vector Multiplications. (3) Look up the neighbors of v_i, mark each neighbor found. [Figure: the neighbors of v_{n/(ε log n)} are A_{1, n/(ε log n)} · v_{n/(ε log n)} in Q_1, ..., A_{n/(ε log n), n/(ε log n)} · v_{n/(ε log n)} in Q_{n/(ε log n)}.] Takes O((n/(ε log n))^2) time.

  31. How to Do Fast Vector Multiplications. (4) For each Q_j, define v'_j as the OR of all marked vectors in Q_j. [Figure: in each Q_j, the marked vertices are OR'd together to give v'_j.] Takes Õ(n^{1+ε}) time.

  32. How to Do Fast Vector Multiplications. (4) For each Q_j, define v'_j as the OR of all marked vectors in Q_j. [Figure repeated: in each Q_j, the marked vertices are OR'd together to give v'_j.] Takes Õ(n^{1+ε}) time.
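
With the chunks packed as integers, the OR in step (4) is one word operation per marked block. A tiny illustration (my own example values):

    # OR together the marked b-bit results that land in the same part Q_j
    marked = [0b001, 0b100, 0b000]       # e.g. A_{j,1}*v_1, A_{j,2}*v_2, A_{j,3}*v_3
    v_prime_j = 0
    for r in marked:
        v_prime_j |= r                   # v'_j = 0b101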

  33. How to Do Fast Vector Multiplications. (5) Output v' := (v'_1, v'_2, ..., v'_{n/(ε log n)})^T. Claim: v' = A · v.

  34. How to Do Fast Vector Multiplications. (5) Output v' := (v'_1, v'_2, ..., v'_{n/(ε log n)})^T. Claim: v' = A · v. Proof: By definition, v'_j = Σ_{i=1}^{n/(ε log n)} A_{j,i} · v_i (in the Boolean case the sum is exactly the OR taken in step (4)).
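
Putting steps (1) through (5) together, a query against the precomputed table looks like this (a sketch under the same assumptions as the preprocessing code above; the chunking from step (1) is inlined, and the commented usage refers to the hypothetical preprocess helper sketched earlier):

    def multiply(table, v, b):
        """Compute A * v over the Boolean semiring from the precomputed table
        (as built by the preprocess sketch above). v is a 0/1 list; returns a 0/1 list."""
        m = len(table)                                   # number of block rows = n / b
        # step (1): split v into b-bit chunks, packed as integers
        chunks = [sum(v[s + c] << c for c in range(b)) for s in range(0, len(v), b)]
        out_blocks = [0] * m
        for j in range(m):                               # steps (2)-(4): (n/b)^2 lookups and ORs
            for i, u in enumerate(chunks):
                out_blocks[j] |= table[j][i][u]
        # step (5): unpack the output blocks back into a single 0/1 vector
        return [(out_blocks[j] >> r) & 1 for j in range(m) for r in range(b)]

    # Hypothetical usage, checking against the naive O(n^2) product:
    # import random
    # n, b = 16, 4
    # A = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
    # v = [random.randint(0, 1) for _ in range(n)]
    # table = preprocess(A, b)
    # naive = [int(any(A[r][c] and v[c] for c in range(n))) for r in range(n)]
    # assert multiply(table, v, b) == naive

Each of the (n/(ε log n))^2 lookup-and-OR steps is a constant number of word operations when a chunk fits in a machine word, which is where the O(n^2 / (ε log n)^2) query bound comes from.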
