On fast multiplication of a matrix by its transpose
Jean-Guillaume Dumas, Clément Pernet, Alexandre Sedoglavic
Luminy, 3 March 2020
Centre de Recherche en Informatique, Signal et Automatique de Lille
Outline
1. Strassen-Winograd fast multiplication algorithm
2. Fast matrix product by its transpose
3. Skew orthogonal matrices
4. Complexity bounds for block algorithms
5. Space and time efficient implementation
6. Minimality

Dumas-Pernet-Sedoglavic, On fast multiplication of a matrix by its transpose, JNCF 2020
Strassen-Winograd fast multiplication algorithm

2 × 2 matrix multiplication

\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times
\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
=
\begin{bmatrix}
A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22}
\end{bmatrix}
=
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
\]

Classical algorithm: 8 multiplications, 4 additions
[Strassen 1969]: 7 multiplications, 18 additions
[Winograd 1973? 1977]: 7 multiplications, 15 additions
[Hopcroft-Kerr 1969]: 7 multiplications minimum
[Bshouty 1995]: 15 additions minimum (for a bilinear algorithm with 7 multiplications)
Matrix multiplication by its transpose

\[
A \cdot A^\intercal =
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times
\begin{bmatrix} A_{11}^\intercal & A_{21}^\intercal \\ A_{12}^\intercal & A_{22}^\intercal \end{bmatrix}
=
\begin{bmatrix}
A_{11}A_{11}^\intercal + A_{12}A_{12}^\intercal & C_{21}^\intercal \\
A_{21}A_{11}^\intercal + A_{22}A_{12}^\intercal & A_{21}A_{21}^\intercal + A_{22}A_{22}^\intercal
\end{bmatrix}
=
\begin{bmatrix} C_{11} & C_{21}^\intercal \\ C_{21} & C_{22} \end{bmatrix}
\]

Divide & conquer algorithm: 6 multiplications, 3 additions
Here (over $\mathbb{C}$, over any finite field): 5 multiplications, 7.5 additions
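The divide & conquer count above (6 block multiplications, 3 block additions) can be checked with a minimal one-level sketch. The helper names (`mat_mul`, `mat_add`, `transpose`, `split`, `join`, `aat_divide_and_conquer`) are illustrative and not from the slides, and block products use the classical algorithm instead of recursing.

```python
def mat_mul(X, Z):
    """Classical product of two matrices stored as lists of rows."""
    return [[sum(x * z for x, z in zip(row, col)) for col in zip(*Z)] for row in X]

def mat_add(X, Z):
    return [[x + z for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def split(M):
    """Cut an even-sized square matrix into its four blocks."""
    h = len(M) // 2
    top, bottom = M[:h], M[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])

def join(c11, c12, c21, c22):
    """Assemble four blocks into one matrix."""
    upper = [left + right for left, right in zip(c11, c12)]
    lower = [left + right for left, right in zip(c21, c22)]
    return upper + lower

def aat_divide_and_conquer(A):
    """One level of the 6-multiplication, 3-addition scheme for A * A^T."""
    a11, a12, a21, a22 = split(A)
    # C11 and C22 are themselves products of a matrix by its transpose.
    c11 = mat_add(mat_mul(a11, transpose(a11)), mat_mul(a12, transpose(a12)))
    c22 = mat_add(mat_mul(a21, transpose(a21)), mat_mul(a22, transpose(a22)))
    # C21 needs two general products; C12 is its transpose, so it is free.
    c21 = mat_add(mat_mul(a21, transpose(a11)), mat_mul(a22, transpose(a12)))
    return join(c11, transpose(c21), c21, c22)
```

Only the lower-left block is computed; the upper-right block is filled in by symmetry, which is where the saving over the 8 multiplications of the general product comes from.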
Fast matrix product by its transpose

From Strassen-Winograd fast multiplication algorithm

Require: $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}$;
Ensure: $C = A \cdot B$
1. 8 additions:
   $s_1 \leftarrow a_{11} - a_{21}$, $s_2 \leftarrow a_{21} + a_{22}$, $s_3 \leftarrow s_2 - a_{11}$, $s_4 \leftarrow a_{12} - s_3$,
   $t_1 \leftarrow b_{22} - b_{12}$, $t_2 \leftarrow b_{12} - b_{11}$, $t_3 \leftarrow b_{11} + t_1$, $t_4 \leftarrow b_{21} - t_3$.
2. 7 recursive multiplications:
   $p_1 \leftarrow a_{11} \cdot b_{11}$, $p_2 \leftarrow a_{12} \cdot b_{21}$, $p_3 \leftarrow a_{22} \cdot t_4$, $p_4 \leftarrow s_1 \cdot t_1$,
   $p_5 \leftarrow s_3 \cdot t_3$, $p_6 \leftarrow s_4 \cdot b_{22}$, $p_7 \leftarrow s_2 \cdot t_2$.
3. 7 final additions:
   $c_1 \leftarrow p_1 + p_5$, $c_2 \leftarrow c_1 + p_4$, $c_3 \leftarrow p_1 + p_2$, $c_4 \leftarrow c_2 + p_3$,
   $c_5 \leftarrow c_2 + p_7$, $c_6 \leftarrow c_1 + p_7$, $c_7 \leftarrow c_6 + p_6$.
4. return $C = \begin{bmatrix} c_3 & c_7 \\ c_4 & c_5 \end{bmatrix}$.
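The schedule above can be transcribed almost line for line. The following is a minimal sketch of a single level, assuming even-sized square matrices stored as lists of rows. The helper names are hypothetical, and the seven block products call the classical algorithm where a full implementation would recurse.

```python
def mat_mul(X, Z):
    """Classical product, standing in for the recursive call."""
    return [[sum(x * z for x, z in zip(row, col)) for col in zip(*Z)] for row in X]

def mat_add(X, Z):
    return [[x + z for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def mat_sub(X, Z):
    return [[x - z for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def split(M):
    """Cut an even-sized square matrix into its four blocks."""
    h = len(M) // 2
    top, bottom = M[:h], M[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])

def join(c11, c12, c21, c22):
    upper = [left + right for left, right in zip(c11, c12)]
    lower = [left + right for left, right in zip(c21, c22)]
    return upper + lower

def strassen_winograd_one_level(A, B):
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # 8 additions on the inputs.
    s1 = mat_sub(a11, a21)
    s2 = mat_add(a21, a22)
    s3 = mat_sub(s2, a11)
    s4 = mat_sub(a12, s3)
    t1 = mat_sub(b22, b12)
    t2 = mat_sub(b12, b11)
    t3 = mat_add(b11, t1)
    t4 = mat_sub(b21, t3)
    # 7 block multiplications.
    p1 = mat_mul(a11, b11)
    p2 = mat_mul(a12, b21)
    p3 = mat_mul(a22, t4)
    p4 = mat_mul(s1, t1)
    p5 = mat_mul(s3, t3)
    p6 = mat_mul(s4, b22)
    p7 = mat_mul(s2, t2)
    # 7 additions on the products.
    c1 = mat_add(p1, p5)
    c2 = mat_add(c1, p4)
    c3 = mat_add(p1, p2)
    c4 = mat_add(c2, p3)
    c5 = mat_add(c2, p7)
    c6 = mat_add(c1, p7)
    c7 = mat_add(c6, p6)
    return join(c3, c7, c4, c5)
```

Comparing the result with the classical product on a few integer matrices is a quick way to catch a transcription slip in the signs.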
Matrix product by its transpose

Require: $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, with $A^\intercal = \begin{bmatrix} a_{11}^\intercal & a_{21}^\intercal \\ a_{12}^\intercal & a_{22}^\intercal \end{bmatrix}$;
Ensure: $C = A \cdot A^\intercal$
1. 6 additions:
   $s_1 \leftarrow a_{11} - a_{21}$, $s_2 \leftarrow a_{21} + a_{22}$, $s_3 \leftarrow s_2 - a_{11}$,
   $t_1 \leftarrow a_{22}^\intercal - a_{21}^\intercal$, $t_3 \leftarrow a_{11}^\intercal + t_1$, $t_4 \leftarrow a_{12}^\intercal - t_3$.
   (struck out: $s_4 \leftarrow a_{12} - s_3$ and $t_2 \leftarrow a_{21}^\intercal - a_{11}^\intercal$)
2. 6 multiplications (2 recursive, 4 general):
   $p_1 \leftarrow a_{11} \cdot a_{11}^\intercal$, $p_2 \leftarrow a_{12} \cdot a_{12}^\intercal$, $p_3 \leftarrow a_{22} \cdot t_4$, $p_4 \leftarrow s_1 \cdot t_1$,
   $p_5 \leftarrow s_3 \cdot t_3$, $p_7 \leftarrow s_2 \cdot s_1^\intercal$.
   (struck out: $p_6 \leftarrow s_4 \cdot a_{22}^\intercal$)
3. 5 final additions:
   $c_1 \leftarrow p_1 + p_5$, $c_2 \leftarrow c_1 + p_4$, $c_3 \leftarrow p_1 + p_2$, $c_4 \leftarrow c_2 + p_3$, $c_5 \leftarrow c_2 - p_7$.
   (struck out: $c_6 \leftarrow c_1 - p_7$ and $c_7 \leftarrow c_6 + p_6$)
4. return $C = \begin{bmatrix} c_3 & \\ c_4 & c_5 \end{bmatrix}$ (the upper-right block $c_7$ is struck out).

All variants have sign discrepancies.
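The reduced schedule above can be sketched as follows. This is a one-level illustration under the same assumptions as before (even-sized square matrices as lists of rows, hypothetical helper names, classical block products); it fills the upper-right block with the transpose of $c_4$, which the slide leaves out.

```python
def mat_mul(X, Z):
    return [[sum(x * z for x, z in zip(row, col)) for col in zip(*Z)] for row in X]

def mat_add(X, Z):
    return [[x + z for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def mat_sub(X, Z):
    return [[x - z for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def split(M):
    h = len(M) // 2
    top, bottom = M[:h], M[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])

def join(c11, c12, c21, c22):
    upper = [left + right for left, right in zip(c11, c12)]
    lower = [left + right for left, right in zip(c21, c22)]
    return upper + lower

def aat_six_products(A):
    a11, a12, a21, a22 = split(A)
    # 6 additions: s4 and t2 are no longer needed.
    s1 = mat_sub(a11, a21)
    s2 = mat_add(a21, a22)
    s3 = mat_sub(s2, a11)
    t1 = mat_sub(transpose(a22), transpose(a21))
    t3 = mat_add(transpose(a11), t1)
    t4 = mat_sub(transpose(a12), t3)
    # 6 multiplications: p1 and p2 have the form X * X^T.
    p1 = mat_mul(a11, transpose(a11))
    p2 = mat_mul(a12, transpose(a12))
    p3 = mat_mul(a22, t4)
    p4 = mat_mul(s1, t1)
    p5 = mat_mul(s3, t3)
    p7 = mat_mul(s2, transpose(s1))  # t2 = -s1^T, hence the sign change in c5
    # 5 final additions.
    c1 = mat_add(p1, p5)
    c2 = mat_add(c1, p4)
    c3 = mat_add(p1, p2)
    c4 = mat_add(c2, p3)
    c5 = mat_sub(c2, p7)
    return join(c3, transpose(c4), c4, c5)
```

The $t_i$ still have to be formed here because they differ from the transposed $s_i$ by signs, which is the discrepancy the slide points out.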
Parameterized matrix product by its transpose!

Require: $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $Y$ s.t. $Y Y^\intercal = -I_n$;
Ensure: $C = A \cdot A^\intercal$
1. 4 additions and 2 multiplications by $Y$:
   $s_1 \leftarrow (a_{21} - a_{11})\,Y$, $s_2 \leftarrow a_{22} - a_{21}\,Y$, $s_3 \leftarrow -a_{11}\,Y - s_2$, $s_4 \leftarrow s_3 + a_{12}$.
   (the $t_i$ are struck out: the products below use transposes of the $s_i$ instead)
2. 5 multiplications (3 recursive, 2 general):
   $p_1 \leftarrow a_{11} \cdot a_{11}^\intercal$, $p_2 \leftarrow a_{12} \cdot a_{12}^\intercal$, $p_3 \leftarrow a_{22} \cdot s_4^\intercal$, $p_4 \leftarrow s_1 \cdot s_2^\intercal$, $p_5 \leftarrow s_3 \cdot s_3^\intercal$.
   (struck out: $p_7 \leftarrow s_2 \cdot s_1^\intercal$, which equals $p_4^\intercal$)
3. 5 final additions:
   $c_1 \leftarrow p_1 + p_5$, $c_2 \leftarrow c_1 + p_4$, $c_3 \leftarrow p_1 + p_2$, $c_4 \leftarrow c_2 + p_3$, $c_5 \leftarrow c_2 + p_4^\intercal$.
4. return $C = \begin{bmatrix} c_3 & \\ c_4 & c_5 \end{bmatrix}$.
Fast matrix product by its transpose, using symmetries

Require: $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $Y$ s.t. $Y Y^\intercal = -I_n$;
Ensure: $C = A \cdot A^\intercal$
1. 4 additions and 2 multiplications by $Y$:
   $s_1 \leftarrow (a_{21} - a_{11})\,Y$, $s_2 \leftarrow a_{22} - a_{21}\,Y$, $s_3 \leftarrow -a_{11}\,Y - s_2$, $s_4 \leftarrow s_3 + a_{12}$.
2. 5 multiplications (3 recursive, 2 general):
   $p_1 \leftarrow a_{11} \cdot a_{11}^\intercal$, $p_2 \leftarrow a_{12} \cdot a_{12}^\intercal$, $p_3 \leftarrow a_{22} \cdot s_4^\intercal$, $p_4 \leftarrow s_1 \cdot s_2^\intercal$, $p_5 \leftarrow s_3 \cdot s_3^\intercal$.
3. 2 complete and 3 symmetric additions:
   $\mathrm{Low}(c_1) \leftarrow \mathrm{Low}(p_1) + \mathrm{Low}(p_5)$, $c_2 \leftarrow c_1 + p_4$, $\mathrm{Low}(c_3) \leftarrow \mathrm{Low}(p_1) + \mathrm{Low}(p_2)$,
   $\mathrm{Low}(c_5) \leftarrow \mathrm{Low}(c_2) + \mathrm{Low}(p_4^\intercal)$, $c_4 \leftarrow c_2 + p_3$.
4. return $C = \begin{bmatrix} c_3 & \\ c_4 & c_5 \end{bmatrix}$.
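A minimal one-level sketch of the five-product schedule, run over the finite field Z/5Z. The prime 5 and the choice Y = 2·I are assumptions made for illustration only (2·2 = 4 = −1 mod 5, so Y·Yᵀ = −I); the slides only require some Y with Y·Yᵀ = −I. Helper names are hypothetical, block products are classical, and full additions are used throughout where the slide restricts symmetric ones to the lower triangle (Low).

```python
P = 5  # illustrative prime, chosen so that -1 is a square: 2 * 2 = 4 = -1 (mod 5)

def mul(X, Z):
    return [[sum(x * z for x, z in zip(row, col)) % P for col in zip(*Z)] for row in X]

def add(X, Z):
    return [[(x + z) % P for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def sub(X, Z):
    return [[(x - z) % P for x, z in zip(rx, rz)] for rx, rz in zip(X, Z)]

def tr(X):
    return [list(col) for col in zip(*X)]

def split(M):
    h = len(M) // 2
    top, bottom = M[:h], M[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])

def join(c11, c12, c21, c22):
    upper = [left + right for left, right in zip(c11, c12)]
    lower = [left + right for left, right in zip(c21, c22)]
    return upper + lower

def aat_five_products(A, Y):
    """A * A^T mod P with 5 block products, assuming Y * Y^T = -I mod P."""
    a11, a12, a21, a22 = split(A)
    # 4 additions and 2 multiplications by Y.
    s1 = mul(sub(a21, a11), Y)
    s2 = sub(a22, mul(a21, Y))
    s3 = sub(s1, a22)  # same value as -a11*Y - s2, without a third product by Y
    s4 = add(s3, a12)
    # 5 multiplications: p1, p2, p5 have the form X * X^T.
    p1 = mul(a11, tr(a11))
    p2 = mul(a12, tr(a12))
    p3 = mul(a22, tr(s4))
    p4 = mul(s1, tr(s2))
    p5 = mul(s3, tr(s3))
    # Final additions (not restricted to lower triangles in this sketch).
    c1 = add(p1, p5)
    c2 = add(c1, p4)
    c3 = add(p1, p2)
    c4 = add(c2, p3)
    c5 = add(c2, tr(p4))
    return join(c3, tr(c4), c4, c5)
```

With a 4 × 4 input, `Y` is the 2 × 2 matrix `[[2, 0], [0, 2]]`, so that it matches the block size. A scalar multiple of the identity only works when −1 is a square in the field, which is why the prime was picked this way here.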
Skew orthogonal matrices