Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics. Shuhao Liu, Li Chen, Baochun Li. University of Toronto. July 12, 2018
What is a Coflow? One stage in a data analytic job: map tasks (Map 1 to Map 4) send data to reduce tasks (Reduce 1 to Reduce 4) in an all-to-all shuffle. Coflow: considered done only when all flows finish.
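The completion rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the per-flow finish times are made-up values for the 4 x 4 shuffle, and `coflow_completion_time` is a hypothetical helper name.

```python
def coflow_completion_time(flow_finish_times):
    """A coflow is done only when its slowest flow has finished."""
    return max(flow_finish_times)

# Hypothetical finish times (seconds) for the 16 flows of a 4 x 4 shuffle,
# keyed by (map task, reduce task). The values are illustrative only.
finish = {
    (m, r): 1.0 + 0.5 * ((m + r) % 4)
    for m in range(1, 5)
    for r in range(1, 5)
}

cct = coflow_completion_time(finish.values())
print(f"{len(finish)} flows, coflow completion time = {cct} s")
```

Speeding up any flow other than the slowest one leaves the coflow completion time unchanged, which is why scheduling at the coflow level differs from scheduling individual flows.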
Coflow Scheduling. Objective: minimizing average coflow completion time. Network model: datacenter networking, big switch abstraction (the network core is congestion-free). [Figure labels: Job 1, Job 2]
[Figure, three builds: flows of Coflow 1 (Job 1) and Coflow 2 (Job 2) queued at ports 1 to 3 of a non-blocking switch.]
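To make the objective concrete, here is a small sketch of how the average coflow completion time depends on the order in which coflows are served on a non-blocking switch. It rests on assumptions that are not in the slides: coflows run strictly one after another, every port has the same rate, and the flow sizes are invented. `bottleneck_time` and `average_cct` are hypothetical helper names.

```python
from collections import defaultdict
from itertools import permutations

def bottleneck_time(flows, port_rate=1.0):
    """Shortest time to finish one coflow on a non-blocking switch:
    the most heavily loaded ingress or egress port sets the pace."""
    load_in, load_out = defaultdict(float), defaultdict(float)
    for (src, dst), size in flows.items():
        load_in[src] += size
        load_out[dst] += size
    return max(max(load_in.values()), max(load_out.values())) / port_rate

def average_cct(order, coflows):
    """Average completion time when coflows run one after another
    (a simplification; real schedulers may interleave flows)."""
    clock = total = 0.0
    for name in order:
        clock += bottleneck_time(coflows[name])
        total += clock
    return total / len(order)

# Hypothetical flow sizes, keyed by (ingress port, egress port).
coflows = {
    "coflow1": {(1, 1): 3.0, (2, 2): 3.0, (3, 3): 1.0},
    "coflow2": {(1, 2): 2.0, (2, 3): 2.0},
}

for order in permutations(coflows):
    print(order, average_cct(order, coflows))

best = min(permutations(coflows), key=lambda o: average_cct(o, coflows))
```

With these numbers, serving the coflow with the smaller bottleneck first gives the lower average, because the short coflow no longer waits behind the long one.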
Wide-Area Data Analytics
[Figure, two builds: Data 1 to Data 4 reside in Datacenter 1 to Datacenter 4, connected by a wide area network. First build: Map 1 to Map 4, Reduce 1, and Reduce 2 are drawn with Datacenter 1. Second build: Map 1 to Map 4 are drawn next to Data 1 to Data 4 in their respective datacenters, together with Reduce 1 and Reduce 2.]
With tasks placed in different datacenters, what about the inter-datacenter coflows they generate?
Challenges. Dumbbell network model: inter-datacenter links are the only bottleneck. [Figure: Datacenter A and Datacenter B connected by an inter-datacenter link.]
Challenges. Constantly changing available bandwidth. [Figure: measured bandwidth (Mbps) in a 100 s interval on the CA-EU and US-EU links; axis ticks from 55 to 110 Mbps.]
Can existing heuristics work? [Figure: estimated flow completion time on Link 1 and Link 2; axis ticks from 0 to 8.]
Coflow scheduling should consider the distribution of available bandwidth.
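One reason a single bandwidth estimate can mislead, shown as a small sketch: the transfer time is size divided by bandwidth, so the time computed from the average bandwidth is not the average of the times. The flow size and the two bandwidth samples below are hypothetical, chosen inside the 55 to 110 Mbps range of the measurement slide.

```python
from statistics import mean

flow_size_mb = 1100.0            # hypothetical flow size, in megabits
bandwidth_samples = [55.0, 110.0]  # hypothetical bandwidth samples, in Mbps

# Estimate from a single point value: plug in the average bandwidth.
time_at_mean = flow_size_mb / mean(bandwidth_samples)

# Estimate from the distribution: average the time over all samples.
mean_time = mean(flow_size_mb / bw for bw in bandwidth_samples)

print(f"time at mean bandwidth: {time_at_mean:.2f} s")
print(f"mean time over samples: {mean_time:.2f} s")
```

The point estimate is optimistic here (about 13.3 s against 15 s), and the gap grows with the spread of the samples, so a scheduler that ranks flows by a point estimate can pick a different order than one that uses the whole distribution.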
Monte Carlo Simulation
Scheduling decision tree: the first level chooses which coflow (A, B, or C) to schedule first, and the second level chooses among the remaining two, giving six branches (A-B, A-C, B-A, B-C, C-A, C-B).
Each first-level node carries a counter, initially A [0/0], B [0/0], C [0/0].
One simulation yields the leaf values 9.3, 15.5, 16.2, 12.5, 20.2, 13.1; the counters become A [1/1], B [0/1], C [0/1].
After 100 simulations: A [29/100], B [68/100], C [3/100].
Complexity? 100 * O(n!)
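The counting scheme on this slide can be sketched as follows. This is an illustration under stated assumptions, not Siphon's implementation: the coflow volumes are invented, bandwidth is drawn uniformly from 55 to 110 Mbps per link, coflows run one after another, and each simulation exhaustively scores all n! orders (matching the 100 * O(n!) cost noted above). All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import permutations

# Hypothetical coflows: megabits each must push over each inter-datacenter link.
COFLOWS = {
    "A": {"link1": 400.0, "link2": 900.0},
    "B": {"link1": 700.0, "link2": 300.0},
    "C": {"link1": 1200.0, "link2": 1000.0},
}

def sample_bandwidth(rng):
    """Draw one bandwidth (Mbps) per link. The uniform range is an assumption."""
    return {link: rng.uniform(55.0, 110.0) for link in ("link1", "link2")}

def average_cct(order, bandwidth):
    """Average coflow completion time when coflows run one after another."""
    clock = total = 0.0
    for name in order:
        # A coflow finishes when its slowest link transfer finishes.
        clock += max(vol / bandwidth[link] for link, vol in COFLOWS[name].items())
        total += clock
    return total / len(order)

def monte_carlo_first_choice(simulations=100, seed=0):
    """Count how often each coflow heads the best order across simulations."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(simulations):
        bandwidth = sample_bandwidth(rng)
        best = min(permutations(COFLOWS), key=lambda o: average_cct(o, bandwidth))
        wins[best[0]] += 1
    return wins

wins = monte_carlo_first_choice()
first = wins.most_common(1)[0][0]
print(dict(wins), "-> schedule", first, "first")
```

The coflow with the most wins is scheduled first, which mirrors picking the node with the highest count in the slide's tree.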
Reduced Simulation Complexity
Bounded Search Depth: $\Theta(t \times n^d)$
Reduced Search Breadth (Early termination): $O(t \times n^d)$
Online Incremental Search
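A quick arithmetic sketch of what bounding the depth buys. The slides do not define the symbols, so this assumes t is the number of simulations, n the number of coflows, and d the search depth. Under that reading, a depth-d search explores ordered prefixes of length d, of which there are n!/(n-d)!, which is at most n^d.

```python
from math import factorial, perm

def full_orderings(n):
    """Number of complete scheduling orders of n coflows."""
    return factorial(n)

def bounded_orderings(n, d):
    """Number of ordered prefixes of length d (capped at n): n! / (n - d)!"""
    return perm(n, min(d, n))

t, n, d = 100, 10, 2  # hypothetical values for illustration
print("full search:   ", t * full_orderings(n))
print("bounded search:", t * bounded_orderings(n, d))
print("n^d bound:     ", t * n ** d)
```

With these values the full search scores 362,880,000 orders, while the depth-2 search scores 9,000 prefixes, within the t x n^d bound of 10,000.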