Improved Computation-Communication Trade-Off for Coded Distributed Computing using Linear Dependence of Intermediate Values
Shunsuke Horii | Waseda University | June 5, 2020

Background
The amount of data processed by machine learning and statistical analysis has been increasing.
The data are too large to process with a single computer or processor.
Distributed computing is becoming increasingly important.

A bottleneck in distributed computing
[Figure: Map-Reduce. Labels: Input, Node 1, Node 2, ..., Node K (shown twice), Intermediate file, Output.]

The more nodes, the more traffic.

Contribution
There is a trade-off between the computational load of each node and the amount of communication among the nodes.
Coded Map-Reduce: a framework that improves the trade-off curve by using coding techniques [S. Li et al. 2016].
Contribution of our work: we use the linear dependence of intermediate files to further improve the trade-off curve.

Map-Reduce
Input files: $w_1, \dots, w_N \in \mathbb{F}_{2^F}$
Output functions: $\phi_q : (\mathbb{F}_{2^F})^N \to \mathbb{F}_{2^B}$, $q \in \{1, \dots, Q\}$
Number of nodes: $K$
[Figure: the $K$ computing nodes compute the output functions from the input files.]

Map-Reduce
Map functions: $g_{q,n} : \mathbb{F}_{2^F} \to \mathbb{F}_{2^T}$, $q \in \{1, \dots, Q\}$, $n \in \{1, \dots, N\}$
Intermediate values: $v_{q,n} = g_{q,n}(w_n) \in \mathbb{F}_{2^T}$
Reduce functions: $h_q : (\mathbb{F}_{2^T})^N \to \mathbb{F}_{2^B}$
Output values: $u_q = h_q(v_{q,1}, \dots, v_{q,N})$
[Figure: the Map functions turn each input file $w_n$ into the intermediate values $v_{1,n}, \dots, v_{Q,n}$; each Reduce function $h_q$ combines $v_{q,1}, \dots, v_{q,N}$ into $u_q$.]

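The Map and Reduce definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `map_reduce` and the toy files are hypothetical, and plain Python integers stand in for elements of $\mathbb{F}_{2^T}$ and $\mathbb{F}_{2^B}$.

```python
from typing import Any, Callable, List, Sequence


def map_reduce(
    files: Sequence[Any],
    map_fns: Sequence[Sequence[Callable[[Any], Any]]],
    reduce_fns: Sequence[Callable[[List[Any]], Any]],
) -> List[Any]:
    """Return [u_1, ..., u_Q] with u_q = h_q(v_{q,1}, ..., v_{q,N}).

    map_fns[q][n] plays the role of g_{q,n}; reduce_fns[q] plays the role of h_q.
    """
    outputs = []
    for g_q, h_q in zip(map_fns, reduce_fns):
        # Intermediate values v_{q,n} = g_{q,n}(w_n) for every file n.
        v_q = [g_qn(w_n) for g_qn, w_n in zip(g_q, files)]
        outputs.append(h_q(v_q))
    return outputs


# Hypothetical toy instance: N = 3 files, Q = 2 output functions
# (the total sum and the overall maximum of all numbers).
files = [[3, 1], [4, 1, 5], [9, 2]]
map_fns = [[sum] * len(files), [max] * len(files)]
reduce_fns = [sum, max]

outputs = map_reduce(files, map_fns, reduce_fns)
print(outputs)  # [25, 9]
```

In this single-process sketch nothing is communicated; in the distributed setting the intermediate values `v_q` are exactly what the nodes have to exchange.
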
Coded Map-Reduce
Coding: not used. Communication: not used.
Each of Nodes 1 to 4 stores and maps all six files (1 to 6), so every node already has the intermediate values it needs.
Computation load: $\frac{\sum_{k=1}^{K} |\mathcal{M}_k|}{N} = \frac{24}{6}$
Communication load: $\frac{\sum_{k=1}^{K} b_k}{QNT} = 0$

Coded Map-Reduce
Coding: not used. Communication: used.
Node 1: files 1, 2, 3 (needs 4, 5, 6). Node 2: files 1, 4, 5 (needs 2, 3, 6). Node 3: files 2, 4, 6 (needs 1, 3, 5). Node 4: files 3, 5, 6 (needs 1, 2, 4).
Each node sends uncoded intermediate values to the nodes that need them.
Computation load: $\frac{\sum_{k=1}^{K} |\mathcal{M}_k|}{N} = \frac{12}{6}$
Communication load: $\frac{\sum_{k=1}^{K} b_k}{QNT} = \frac{12}{24}$

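The two loads on this slide can be recomputed from the file placement. The sketch below is an illustration, not the author's code: the function names are hypothetical, and it assumes $Q = 4$ output functions (inferred from $QN = 24$ with $N = 6$), split evenly so that each node reduces $Q/K$ of them. Loads are measured in units of $T$, so $T$ cancels.

```python
from typing import Dict, Set


def computation_load(placement: Dict[int, Set[int]], n_files: int) -> float:
    """Sum of |M_k| over all nodes, divided by N."""
    return sum(len(files) for files in placement.values()) / n_files


def uncoded_communication_load(
    placement: Dict[int, Set[int]], n_files: int, n_functions: int
) -> float:
    """Sum of b_k divided by Q*N*T when missing values are sent uncoded.

    Assumption: each node reduces n_functions / K output functions and
    must receive one intermediate value per function per missing file.
    """
    functions_per_node = n_functions // len(placement)
    received = sum(
        functions_per_node * (n_files - len(files))
        for files in placement.values()
    )
    return received / (n_functions * n_files)


# Placement from the slide: each file is stored on two of the four nodes.
placement = {1: {1, 2, 3}, 2: {1, 4, 5}, 3: {2, 4, 6}, 4: {3, 5, 6}}
# Placement from the previous slide: every node stores every file.
replicate_all = {k: {1, 2, 3, 4, 5, 6} for k in range(1, 5)}

print(computation_load(placement, 6))                  # 2.0
print(uncoded_communication_load(placement, 6, 4))     # 0.5
print(computation_load(replicate_all, 6))              # 4.0
print(uncoded_communication_load(replicate_all, 6, 4)) # 0.0
```
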
Coded Map-Reduce
Coding: used. Communication: used. [S. Li, et al. 2017]
Node 1: files 1, 2, 3 (needs 4, 5, 6). Node 2: files 1, 4, 5 (needs 2, 3, 6). Node 3: files 2, 4, 6 (needs 1, 3, 5). Node 4: files 3, 5, 6 (needs 1, 2, 4).
Each node sends coded combinations of its intermediate values.
Computation load: $\frac{\sum_{k=1}^{K} |\mathcal{M}_k|}{N} = \frac{12}{6}$
Communication load: $\frac{\sum_{k=1}^{K} b_k}{QNT} = \frac{6}{24}$

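The saving from coding comes from multicasting one combination that is useful to several nodes at once. The sketch below shows the basic idea with a bitwise XOR, under assumptions that are not in the slides: node $k$ reduces function $k$, the byte values are made up, and whole values are combined (the scheme of Li et al. operates on segments of the intermediate values, which is omitted here).

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical intermediate values v_{q,n} (function q, file n).
# Node 1 stores files 1 and 2, so it can compute both values.
v_3_1 = bytes([0x12, 0x34])  # needed by Node 3, which lacks file 1
v_2_2 = bytes([0xAB, 0xCD])  # needed by Node 2, which lacks file 2

# Node 1 multicasts a single coded message instead of two uncoded ones.
coded = xor_bytes(v_3_1, v_2_2)

# Node 2 stores file 1, computes v_{3,1} locally, and cancels it.
recovered_by_node_2 = xor_bytes(coded, v_3_1)
# Node 3 stores file 2, computes v_{2,2} locally, and cancels it.
recovered_by_node_3 = xor_bytes(coded, v_2_2)

print(recovered_by_node_2 == v_2_2, recovered_by_node_3 == v_3_1)
```

One transmission of length $T$ here serves two receivers, which is the effect behind the drop from 12/24 to 6/24 on the slide.
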
Coded Map-Reduce
$L^*_{\mathrm{CDC}}(r)$: the communication load of coded Map-Reduce for computation load $r$.
Theorem [Li et al. 2017]
If $r$ is an integer, $L^*_{\mathrm{CDC}}(r)$ is given by
$$L^*_{\mathrm{CDC}}(r) = \frac{1}{r}\left(1 - \frac{r}{K}\right). \quad (1)$$
When $r$ is not an integer, it is given by the lower convex envelope of
$$\left\{ \left( r, \frac{1}{r}\left(1 - \frac{r}{K}\right) \right) : r \in \{1, \dots, K\} \right\}.$$

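The theorem can be evaluated numerically with a short sketch. The function names are hypothetical. Because $\frac{1}{r}\left(1 - \frac{r}{K}\right) = \frac{1}{r} - \frac{1}{K}$ is convex in $r$, the lower convex envelope of the integer points is the linear interpolation between neighbouring integers, which is what the code uses. The uncoded baseline $1 - r/K$ is not stated on the slides; it is included as an assumption because it matches the 12/24 example for $K = 4$, $r = 2$.

```python
import math


def cdc_load(r: float, k: int) -> float:
    """L*_CDC(r) for 1 <= r <= K, interpolated linearly between integers."""
    if not 1 <= r <= k:
        raise ValueError("r must satisfy 1 <= r <= K")

    def at_integer(s: int) -> float:
        return (1.0 / s) * (1.0 - s / k)

    lo, hi = math.floor(r), math.ceil(r)
    if lo == hi:
        return at_integer(lo)
    t = r - lo  # position of r between the two neighbouring integers
    return (1.0 - t) * at_integer(lo) + t * at_integer(hi)


def uncoded_load(r: float, k: int) -> float:
    """Assumed uncoded baseline 1 - r/K (not given on the slides)."""
    return 1.0 - r / k


print(cdc_load(2, 4))      # 0.25, the 6/24 of the coded example
print(uncoded_load(2, 4))  # 0.5, the 12/24 of the uncoded example
print(cdc_load(1.5, 4))    # 0.5, midpoint of 0.75 and 0.25
```
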
Coded Map-Reduce
The trade-off curves between computation load and communication load.
[Figure: communication load (vertical axis) versus computation load (horizontal axis, 2 to 10) for the uncoded and coded schemes.]

Motivation
Previous studies have not made any assumptions about the intermediate values (the outputs of the Map functions).
In some cases the intermediate values have structure, and the trade-off curve may be further improved by exploiting it.
We focus on the linear dependence of the intermediate values.
[Figure: a node maps files 1, 2, 3; the resulting intermediate values are linearly dependent.]

Illustrative example
A problem of counting the number of occurrences of each number in a sequence.
$w_1 = 1212231$, $w_2 = 2111121$, $w_3 = 2312131$, $w_4 = 3112132$, $w_5 = 1131414$, $w_6 = 1141231$
[Figure: Node 1 maps files $w_1, w_2, w_3$ and needs files 4, 5, 6; the figure marks what the node has, needs, and sends.]

Illustrative example (continued)
[Figure: Node 1 holds files $w_1, w_2, w_3$ and needs files 4, 5, 6. Annotation: expressed as two basis vectors and linear combination coefficients.]

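The Map step of the counting example can be reproduced directly from the six sequences on the slide. The sketch below makes some assumptions: the symbol set is taken to be {1, 2, 3, 4} (the digits that appear in the data), plain integers are used instead of elements of $\mathbb{F}_{2^T}$, and the names are hypothetical. It also checks one linear relation that is easy to verify, namely that the counts of each file sum to its length; the slides do not say which dependence their scheme exploits.

```python
from collections import Counter
from typing import List

# The six input files from the slide.
sequences = ["1212231", "2111121", "2312131", "3112132", "1131414", "1141231"]
symbols = "1234"  # assumed symbol set


def map_counts(w: str) -> List[int]:
    """Intermediate values (v_{1,n}, ..., v_{4,n}): count of each symbol in w."""
    counts = Counter(w)
    return [counts[s] for s in symbols]


intermediate = [map_counts(w) for w in sequences]

# Reduce step: total number of occurrences of each symbol over all files.
outputs = [sum(column) for column in zip(*intermediate)]

for n, v in enumerate(intermediate, start=1):
    print(f"w_{n}: {v}")
print("outputs:", outputs)  # [22, 10, 7, 3]

# A linear relation among the intermediate values of each file:
# the four counts always add up to the sequence length.
print(all(sum(v) == len(w) for v, w in zip(intermediate, sequences)))  # True
```

Because of this relation, any one of the four counts of a file is determined by the other three, so the intermediate values of a file carry less information than four unconstrained values would.
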
Illustrative example
A problem of computing $Ax_1, \dots, Ax_N$ by Map-Reduce.
