Sol: Fast Distributed Computation Over Slow Networks
Fan Lai, Jie You, Xiangfeng Zhu, Harsha V. Madhyastha, Mosharaf Chowdhury
Distributed Data Processing is Ubiquitous
• Distributed computation in Local-Area Networks (LAN)
  • To accelerate executions within a single cluster
• Computation over Wide-Area Networks (WAN)
  • To reduce data transfers, mitigate privacy risks
[Figure: two groups of systems, "Efforts for Computation in LAN" and "Efforts for Computation over WAN"; named systems: Tetrium, Iridium, CLARINEt, Azure Cosmos DB, Google Spanner]
Execution Engine: Core of Big Data Stack
• Workloads: SQL Queries (Select * FROM …;), Stream Processing (WordCount, TopKCount), AI/ML (K-means, SVM), …
• Execution Planner
  • Typical job execution plans (Job 1, Job 2)
• Execution Engine
  • Coordinator
  • Worker 1, Worker 2, …, Worker N
• Resource Scheduler
• Storage System
[Figure labels beside the stack: Efforts for Computation in LAN; Efforts for Computation over WAN; Iridium, CLARINEt, Tetrium, Azure Cosmos DB, Google Spanner]
While network conditions are diverse in practice, execution engines remain the same
Outline
• Today’s Execution Engines
• Sol Architecture
• Control Plane Design
• Data Plane Design
• Evaluation
Impact of Networks on Latency-sensitive Jobs
[Figure: CDF across queries of Query Completion Time (s), x-axis 0 to 150 s, for four network settings: 10 Gbps, O(1) ms; 1 Gbps, O(1) ms; 10 Gbps, O(100) ms; 1 Gbps, O(100) ms. Queries from 100 GB TPC Benchmarks. Annotation on the plot: 4.9X]
Problem #1: Slow job execution in high-latency networks
Control Plane Inefficiency Due to High Latency
[Figure: Coordinator/Worker message timeline at O(1) ms latency: Launch(■) → worker runs tasks (Busy) → Complete(■) → Launch(■) → Busy → Complete(■)]
[Figure: the same timeline at O(100) ms latency: after Complete(■) the worker is Idle until the next Launch(■) arrives]
Late-binding of tasks postpones scheduling
Problem #1: Slow job execution in high-latency networks
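The idle gap in the timeline can be put into rough numbers. The expression below is a simplified sketch, not a formula from the talk: it assumes a single task slot, a coordinator that decides instantly, and symbols introduced here for illustration (d for task duration, RTT for the coordinator-worker round-trip time, U for the fraction of time the worker is busy). The 100 ms task duration is a made-up value.

```latex
\[
U_{\text{late}} = \frac{d}{d + \mathrm{RTT}}
\]
\[
d = 100\,\text{ms}: \qquad
\mathrm{RTT} = 1\,\text{ms} \;\Rightarrow\; U_{\text{late}} = \tfrac{100}{101} \approx 0.99,
\qquad
\mathrm{RTT} = 100\,\text{ms} \;\Rightarrow\; U_{\text{late}} = \tfrac{100}{200} = 0.5
\]
```

Under these assumptions, a worker that is almost always busy at O(1) ms latency spends about half its time idle at O(100) ms latency when tasks are as short as the round trip.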
Impact of Networks on Bandwidth-intensive Jobs
[Figure: Query25 on 1TB TPC benchmark; three stages (Stage 1, Stage 2, Stage 3) with data transfers over networks]
[Figure: Resource utilization throughout the job; x-axis Time (s), 0 to 250; y-axis Percentage of the Total (%), 0 to 100; series: Occupied CPUs, CPU Util., B/w Util.; stage boundaries marked for Stage 1, Stage 2, Stage 3. Annotation: Low CPU util.]
Data Plane Inefficiency Due to Low Bandwidth
Tasks hog CPUs throughout the lifespan
Problem #2: CPU underutilization in low-bandwidth networks
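The gap between "Occupied CPUs" and "CPU Util." can be illustrated with a small calculation. This is a sketch under stated assumptions: `TaskTrace`, its field names, and the timings are hypothetical and not taken from the talk; the only idea borrowed from the slide is that a task holds its CPU for its whole lifespan, including the time it waits on network transfers.

```python
from dataclasses import dataclass


@dataclass
class TaskTrace:
    """Hypothetical per-task timing; field names are illustrative."""
    fetch_s: float    # time spent waiting on input data over the network
    compute_s: float  # time spent actually computing


def occupied_vs_used(tasks: list[TaskTrace]) -> tuple[float, float]:
    """Return (occupied core-seconds, used core-seconds).

    A task holds its core from launch to finish, so the core counts as
    occupied during the fetch as well as during the computation.
    """
    occupied = sum(t.fetch_s + t.compute_s for t in tasks)
    used = sum(t.compute_s for t in tasks)
    return occupied, used


def cpu_efficiency(tasks: list[TaskTrace]) -> float:
    """Fraction of occupied core time spent on useful computation."""
    occupied, used = occupied_vs_used(tasks)
    return used / occupied if occupied else 0.0


# Made-up numbers: three tasks that each wait 30 s on a slow link, then compute for 10 s.
slow_link = [TaskTrace(fetch_s=30.0, compute_s=10.0) for _ in range(3)]
print(cpu_efficiency(slow_link))  # 0.25
```

With these illustrative timings the cores are occupied for 120 core-seconds but compute for only 30, which is the kind of gap the utilization plot annotates as low CPU utilization.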
Outline
• Today’s Execution Engines
• Sol Architecture
• Control Plane Design
• Data Plane Design
• Evaluation
Problem #1: High latency → Idleness of workers
Problem #2: Low b/w → CPU underutilization
Outline
• Today’s Execution Engines
• Sol Architecture
• Control Plane Design
• Data Plane Design
• Evaluation
Sol: A federated execution engine for diverse network conditions w/
• faster job execution
• higher resource utilization
Sol: A Federated Execution Engine
• Central Coordinator
  • Coordinate inter-site executions
• Site Manager
  • Coordinate local workers
  • Manage queued tasks
• Task Manager
  • Manage worker resource
[Figure: Sol Architecture. Task arrivals reach the Sol Coordinator, which connects over the WAN (O(100) ms links) to Site 1, Site 2, and Site 3; within a site, a Site Manager connects over the LAN to Workers, each with a Task Manager]
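The three roles listed above can be sketched as a small object hierarchy. This is a minimal illustration of the division of responsibilities, not Sol's implementation: the class names mirror the slide, but every method name (`submit`, `enqueue`, `dispatch`, `launch`, `has_free_slot`) and the slot-based resource model are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TaskManager:
    """Runs on a worker and tracks its resources (here: task slots)."""
    slots: int
    running: list[str] = field(default_factory=list)

    def has_free_slot(self) -> bool:
        return len(self.running) < self.slots

    def launch(self, task: str) -> None:
        self.running.append(task)


@dataclass
class SiteManager:
    """Queues tasks for one site and hands them to local workers."""
    workers: list[TaskManager]
    queue: deque = field(default_factory=deque)

    def enqueue(self, tasks: list[str]) -> None:
        self.queue.extend(tasks)

    def dispatch(self) -> None:
        # Fill every free local slot from the site's queue.
        for worker in self.workers:
            while self.queue and worker.has_free_slot():
                worker.launch(self.queue.popleft())


@dataclass
class Coordinator:
    """Central coordinator: decides which site receives which tasks."""
    sites: dict[str, SiteManager]

    def submit(self, site: str, tasks: list[str]) -> None:
        self.sites[site].enqueue(tasks)


site1 = SiteManager(workers=[TaskManager(slots=1), TaskManager(slots=1)])
coordinator = Coordinator(sites={"site1": site1})
coordinator.submit("site1", ["t1", "t2", "t3"])
site1.dispatch()
print([w.running for w in site1.workers], list(site1.queue))
# [['t1'], ['t2']] ['t3']
```

In this sketch the coordinator only talks to site managers, so the third task waits in the site's queue rather than at the remote coordinator.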
Outline
• Today’s Execution Engines
• Sol Architecture
• Control Plane Design
• Data Plane Design
• Evaluation
Problem #1: High latency → Idleness of workers
Push tasks proactively to reduce worker idle time
Task Early-binding in Control Plane
[Figure: Existing designs. Coordinator/Worker timeline over an O(100) ms link: Launch(■) → worker runs tasks (Busy) → Complete(■) → worker Idle → Launch(■)]
Task Early-binding in Control Plane
• Coordinator ⟷ Site Manager
  • Inter-site operations are early-binding
  → Guarantee high utilization
[Figure: Coordinator / Site Manager / Worker timeline. The Coordinator sends Launch(■ ■) to the Site Manager over the O(100) ms link; the Site Manager sends Launch(■) to the Worker over the O(1) ms link; when the Worker reports Complete(■), the Site Manager issues the next Launch(■) from its queued tasks while Complete(■) travels back to the Coordinator. Timeline labels: Busy, Idle]
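The effect of pushing tasks to the site manager ahead of time can be estimated with a simple timing model. This is a sketch under loud assumptions, not Sol's protocol: it considers one worker with one task slot, equal-length tasks, symmetric link delays, and a coordinator that pushes all tasks in a single message. The function names and all numbers are hypothetical.

```python
def late_binding_makespan(n_tasks: int, task_ms: float, wan_rtt_ms: float) -> float:
    """Worker gets every task directly from the remote coordinator.

    The first launch costs a one-way WAN trip; each later task waits a full
    WAN round trip (Complete out, Launch back) while the worker sits idle.
    """
    if n_tasks == 0:
        return 0.0
    return wan_rtt_ms / 2 + n_tasks * task_ms + (n_tasks - 1) * wan_rtt_ms


def early_binding_makespan(n_tasks: int, task_ms: float,
                           wan_rtt_ms: float, lan_rtt_ms: float) -> float:
    """Coordinator pushes all tasks to the site manager up front.

    Only the initial push crosses the WAN; between tasks the worker waits
    one LAN round trip to the site manager's queue.
    """
    if n_tasks == 0:
        return 0.0
    return (wan_rtt_ms / 2 + lan_rtt_ms / 2
            + n_tasks * task_ms + (n_tasks - 1) * lan_rtt_ms)


# Illustrative values: ten 100 ms tasks, 100 ms WAN RTT, 1 ms LAN RTT.
late = late_binding_makespan(10, 100, 100)
early = early_binding_makespan(10, 100, 100, 1)
print(late, early)  # 1950.0 1059.5
```

In this toy setting the worker's idle time between tasks shrinks from one WAN round trip to one LAN round trip, so the total time approaches the pure compute time of 1000 ms.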