
Saath: Speeding up CoFlows by Exploiting the Spatial Dimension
Akshay Jajoo, Rohan Gandhi, Y. Charlie Hu, Cheng-Kok Koh


  1. Saath: Speeding up CoFlows by Exploiting the Spatial Dimension
     Akshay Jajoo, Rohan Gandhi, Y. Charlie Hu, Cheng-Kok Koh

  2. Analytics Jobs in Big Data
     • Analytics jobs in data centers
       – Process huge amounts of data
       – Are distributed in nature
       – Have multiple stages that communicate with each other

  3. Example – MapReduce Jobs
     • Two compute stages: Map and Reduce
     • Map communicates with Reduce in the shuffle phase
     [Figure: Map Stage, Shuffle (Communication), Reduce Stage]

  4. Impact of communication on job performance
     • Facebook jobs spend 25% of their time in communication [1]
     [Figure: Map Stage, Shuffle (Communication), Reduce Stage]
     [1] Based on information from the full Facebook trace used in Aalo (Aalo slides).

  5. CoFlow abstraction
     • CoFlow: collection of all flows that share the same goal
     • Implication: a CoFlow finishes when all its flows are over
     • CoFlow Completion Time (CCT): completion time of its last flow
     [Figure: the Shuffle between the Map Stage and the Reduce Stage forms one CoFlow]
     [2] M. Chowdhury and I. Stoica. Coflow: A networking abstraction for cluster applications. In HotNets, 2012.
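
The CCT definition on this slide can be sketched in a few lines of Python. The coflow names and flow finish times below are made-up illustration values, not data from the talk.

```python
def coflow_completion_time(flow_finish_times):
    """A coflow is done only when its last flow is done."""
    return max(flow_finish_times)


def average_cct(coflows):
    """Average CCT over a dict mapping coflow name -> list of flow finish times."""
    ccts = [coflow_completion_time(times) for times in coflows.values()]
    return sum(ccts) / len(ccts)


# Hypothetical finish times (in seconds) for the flows of two coflows.
shuffles = {"C1": [1.0, 3.0, 2.0], "C2": [4.0, 1.5]}
print(average_cct(shuffles))  # 3.5
```

Speeding up the fastest flows of a coflow does not change its CCT; only the slowest flow matters.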

  6. CoFlow Scheduling Problem
     • Green job: 2 mappers and 2 reducers
     • Orange job: 4 mappers and 2 reducers
     • They share the datacenter network
     • Goal: minimize the average CoFlow completion time (CCT) of all CoFlows
     [Figure: CoFlow 1 and CoFlow 2 between the Map Stage and the Reduce Stage]

  7. CoFlow Scheduling Problem
     • Minimize average CoFlow Completion Time (CCT)
     • CoFlows have 2 dimensions
       – Time: length of individual flows
       – Space: many flows or ports
     • The CoFlow scheduling problem is NP-hard [3]
     [3] M. Chowdhury, Y. Zhong, and I. Stoica. Efficient coflow scheduling with Varys. In SIGCOMM, 2014.

  8. Outline
     • Background of Aalo (state-of-the-art CoFlow scheduler)
     • Limitations of Aalo
     • Design of Saath
     • Evaluation

  9. Background of Aalo (state-of-the-art CoFlow scheduler)
     – Shortest job first for sequential jobs
     – Online approximation of SJF
     – Aalo: Online SJF + Spatial dimension (many distributed tasks)

  10. Scheduling 101
      • Shortest-Job-First (SJF): optimal in minimizing average completion time
      [Figure: processes with durations P1 < P2 < P3 are scheduled with P1 first and P3 last]
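
A small brute-force check illustrates the SJF claim for sequential jobs. The durations 1, 2 and 3 are arbitrary illustration values standing in for P1 < P2 < P3.

```python
from itertools import permutations


def average_completion_time(durations):
    """Average completion time when jobs run back to back in the given order."""
    elapsed = 0
    total = 0
    for d in durations:
        elapsed += d      # this job finishes once everything before it has run
        total += elapsed
    return total / len(durations)


durations = [1, 2, 3]  # hypothetical durations for P1, P2, P3
best_order = min(permutations(durations), key=average_completion_time)
print(best_order)  # (1, 2, 3)
```

Running the shortest job first keeps every later job waiting for the least possible time, which is why the sorted order wins.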

  11. Outline
      • Background of Aalo (state-of-the-art CoFlow scheduler)
        – Shortest job first for sequential jobs
        – Online approximation of SJF
        – Aalo: Online SJF + Spatial dimension (many distributed tasks)

  12. Online Approximation to SJF using Priority queues
      • Process durations: unknown
      • Priority queues (higher priority = more CPU time)
      • Shorter processes finish in high-priority queues
      [Figure: processes P1, P2, and P3 in queues Q0 (high priority), Q1, and Q2 (low priority)]
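
The following is a minimal sketch of how priority queues can approximate SJF without knowing durations. It makes several assumptions that are not on the slide: every process starts in Q0, runs one time unit at a time, is served round-robin within a queue, and is demoted once its consumed CPU time reaches a queue threshold. The thresholds `(1, 3)` and the durations are hypothetical.

```python
from collections import deque


def mlfq_finish_order(durations, thresholds=(1, 3)):
    """Return the order in which processes finish under a multi-level queue.

    durations:  dict name -> CPU time needed (hidden from the scheduler,
                only used to detect completion).
    thresholds: cumulative CPU time at which a process leaves queue i.
    """
    queues = [deque() for _ in range(len(thresholds) + 1)]
    queues[0].extend(durations)           # everyone starts at highest priority
    used = {name: 0 for name in durations}
    finished = []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)  # highest non-empty
        name = queues[level].popleft()
        used[name] += 1                   # run for one time unit
        if used[name] == durations[name]:
            finished.append(name)
        elif level < len(thresholds) and used[name] >= thresholds[level]:
            queues[level + 1].append(name)  # demote: it has run "too long"
        else:
            queues[level].append(name)      # stay in the same queue
    return finished


print(mlfq_finish_order({"P1": 5, "P2": 3, "P3": 1}))  # ['P3', 'P2', 'P1']
```

Short processes finish before they are ever demoted, while long ones sink to the low-priority queues, so the finish order matches what SJF would produce here.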

  13. Outline
      • Background of Aalo (state-of-the-art CoFlow scheduler)
        – Shortest job first for sequential jobs
        – Online approximation of SJF
        – Aalo: Online SJF + Spatial dimension (many distributed tasks)

  14. Datacenter Network abstraction: Non-blocking switch
      • The entire datacenter fabric is one non-blocking switch
        – Makes analysis simple
        – Recent works like CONGA [4] and VL2 [5] make the abstraction practical
      • The only sources of contention are the end hosts
      • Implication: the CoFlow scheduling problem boils down to ordering CoFlows at the sending hosts/ports
      [Figure: Sender Node-1 and Node-2 connected through the DC Network to Receiver Node-1 and Node-2]
      [4] M. Alizadeh et al. CONGA. In SIGCOMM, 2014.
      [5] A. Greenberg et al. VL2. In SIGCOMM, 2009.

  15. Aalo: Online CoFlow Scheduler
      A CoFlow has many flows. How to approximate SJF?
      1. Replicates priority queues at each node
      2. A CoFlow is moved across priority queues based on the total bytes sent at all its ports
      3. Different ports send independently
      4. Intra-queue: use FIFO
      [Figure: a Global Co-ordinator and two sender nodes, each with queues Q0 (high priority) and Q1 (low priority) holding CoFlows C1, C2, and C3, connected through the DC Network to two receiver nodes]
      [6] M. Chowdhury et al. Aalo. In SIGCOMM, 2015.
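
Step 2 above can be sketched as a lookup of the coflow's total bytes against a list of queue thresholds. The threshold values and the per-port byte counts are hypothetical; the slide does not give Aalo's actual settings.

```python
import bisect

# Hypothetical queue thresholds in bytes (assumption, not from the slides):
# Q0 holds coflows below 10 MB, Q1 below 100 MB, Q2 below 1 GB, Q3 the rest.
THRESHOLDS = [10_000_000, 100_000_000, 1_000_000_000]


def aalo_queue(bytes_sent_per_port, thresholds=THRESHOLDS):
    """Queue index (0 = highest priority) from total bytes sent at all ports."""
    total = sum(bytes_sent_per_port.values())
    return bisect.bisect_right(thresholds, total)


print(aalo_queue({"node1": 4_000_000, "node2": 3_000_000}))  # 0
print(aalo_queue({"node1": 6_000_000, "node2": 5_000_000}))  # 1
```

Because the sum runs over all ports, the coordinator has to aggregate per-port counters before it can place a coflow in a queue.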

  16. Outline
      • Background of Aalo (state-of-the-art CoFlow scheduler)
      • Limitations of Aalo
      • Design of Saath
      • Evaluation

  17. Aalo Drawback 1: Out-of-Sync
      • Ports schedule independently of each other
      • Flows of a CoFlow may get scheduled at different times at different ports
      [Figure: Q0 holds C3 and C2 at Sender Node-1, and C2 and C1 at Sender Node-2]

  18. Aalo Drawback 2: Contention Oblivion
      • Contention of a CoFlow: number of other CoFlows it blocks
        – C1: 1 (C2)
        – C2: 2 (C1 and C3)
        – C3: 1 (C2)
      [Figure: two orderings of C2 and C3 at Sender Node-1 and of C1 and C2 at Sender Node-2. One gives Average CCT = (1+2+1)/3 = 4/3, the other gives Average CCT = (2+1+2)/3 = 5/3]
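
The contention numbers on this slide can be reproduced by counting, for each coflow, the other coflows that share at least one sender port with it. The port names are placeholders for Sender Node-1 and Sender Node-2 in the figure.

```python
def contention(coflow_ports):
    """Map each coflow to the number of other coflows sharing a port with it."""
    return {
        coflow: sum(
            1
            for other, other_ports in coflow_ports.items()
            if other != coflow and ports & other_ports
        )
        for coflow, ports in coflow_ports.items()
    }


# Port usage read off the slide: C2 and C3 at Node-1, C1 and C2 at Node-2.
coflow_ports = {"C1": {"node2"}, "C2": {"node1", "node2"}, "C3": {"node1"}}
print(contention(coflow_ports))  # {'C1': 1, 'C2': 2, 'C3': 1}
```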

  19. Aalo does not take the arrangement of CoFlows across space into account

  20. Outline
      • State-of-the-art CoFlow scheduler: Aalo
      • Limitations of Aalo
      • Design of Saath
      • Evaluation

  21. Saath: Speeding up CoFlows by Exploiting the Spatial Dimension

  22. Saath
      • Saath is an online scheduler.
      • It takes the spatial dimension into account while scheduling CoFlows.
      • Spatial dimension: arrangement of the flows of CoFlows across ports

  23. Saath Key Ideas
      • All-or-none
      • Least-Contention-First within a queue
      • Faster CoFlow-queue transition

  24. Key idea 1: All-or-none
      • Either schedule all flows of a CoFlow or schedule none.
        – Not scheduling a CoFlow for which only a subset of flows could be scheduled has no effect on its CCT.
        – By freeing up some ports, we potentially improve CCT for others.
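
One scheduling round under all-or-none can be sketched as follows. This is an illustration of the idea only, not Saath's implementation; the coflows, ports, and priority order are hypothetical.

```python
def all_or_none_round(ordered_coflows, coflow_ports, all_ports):
    """Admit coflows in priority order; a coflow runs only if all its ports are free.

    Returns the coflows scheduled this round and the ports left idle.
    """
    busy = set()
    scheduled = []
    for coflow in ordered_coflows:
        ports = coflow_ports[coflow]
        if busy.isdisjoint(ports):   # every port it needs is still free
            busy |= ports
            scheduled.append(coflow)
        # otherwise schedule none of its flows in this round
    idle = set(all_ports) - busy
    return scheduled, idle


# Hypothetical example: C2 needs P2, which C1 already holds.
coflow_ports = {"C1": {"P1", "P2"}, "C2": {"P2", "P3"}}
scheduled, idle = all_or_none_round(["C1", "C2"], coflow_ports, {"P1", "P2", "P3"})
print(scheduled, idle)  # ['C1'] {'P3'}
```

In this example P3 stays idle even though C2 has a flow waiting there, which is the low port utilization that a work-conservation step has to address.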

  25. Challenges in All-or-none
      [Figure: two timelines of CoFlows C1, C2, and C3 on ports P1, P2, and P3, each labeled "CCT: C1 = t, C2 = 2t, C3 = C4 = t", with the time axis marked t, 2t, and 3t]
      • Saath handles low port utilization with carefully designed work conservation

  26. Key idea 2: Least-Contention-First within a queue
      • Contention of a CoFlow: number of other CoFlows it blocks
      • Saath sorts the CoFlows in each queue in increasing order of contention
      • This allows more CoFlows to be scheduled in parallel.

  27. Key idea 2: Least-Contention-First within a queue
      • Contention of a CoFlow: number of other CoFlows it blocks
        – C1: 1 (C2)
        – C2: 2 (C1 and C3)
        – C3: 1 (C2)
      [Figure: two orderings of C2 and C3 at Sender Node-1 and of C1 and C2 at Sender Node-2. One gives Average CCT = (1+2+1)/3 = 4/3, the other gives Average CCT = (1+2+2)/3 = 5/3]
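
The two averages on this slide can be reproduced with a toy simulation. It assumes every coflow takes one time unit, that a coflow runs only when all of its ports are free, and that the contention values are the ones listed on the slide. The function name and port labels are illustrative.

```python
from fractions import Fraction


def average_cct(order, coflow_ports):
    """Average CCT of unit-length coflows scheduled slot by slot in priority order."""
    pending = list(order)
    finish = {}
    slot = 0
    while pending:
        slot += 1
        busy = set()
        for coflow in list(pending):          # scan in priority order
            if not (coflow_ports[coflow] & busy):
                busy |= coflow_ports[coflow]  # claim all of its ports
                finish[coflow] = slot
                pending.remove(coflow)
    return Fraction(sum(finish.values()), len(finish))


coflow_ports = {"C1": {"node2"}, "C2": {"node1", "node2"}, "C3": {"node1"}}
contention = {"C1": 1, "C2": 2, "C3": 1}      # values from the slide

lcof_order = sorted(coflow_ports, key=contention.get)  # least contention first
print(lcof_order, average_cct(lcof_order, coflow_ports))      # ['C1', 'C3', 'C2'] 4/3
print(average_cct(["C2", "C1", "C3"], coflow_ports))          # 5/3
```

Putting C2 first blocks both ports for a whole slot, while putting it last lets C1 and C3 run in parallel.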

  28. Key idea 3: Faster CoFlow-queue transition
      • Both Aalo and Saath use a priority-queue structure to move CoFlows across queues
      • Aalo uses the total bytes sent by all flows
      • Saath uses bytes per flow
      • Saath has fast transition of longer CoFlows to lower-priority queues
      [Figure: setup with sender ports P1 to P4 carrying CoFlows C1 to C4, where C2 has a flow on every port. Assuming the queue transition threshold for Aalo is portBandwidth × 4t, C2 transits at 2t under Aalo and at t under Saath (fast transition)]
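
The difference between the two transition rules can be sketched as two predicates. The per-flow rule below assumes the queue threshold is split evenly across the coflow's flows, which is one reading of "bytes per flow" and is not spelled out on the slide. The byte counts model a coflow with four flows of which only two have been sending; all values are hypothetical.

```python
def aalo_should_transit(bytes_per_flow, threshold):
    """Aalo-style rule: compare total bytes sent by all flows to the threshold."""
    return sum(bytes_per_flow) >= threshold


def saath_should_transit(bytes_per_flow, threshold):
    """Per-flow rule (assumed even split): any flow reaching its share triggers it."""
    per_flow_threshold = threshold / len(bytes_per_flow)
    return max(bytes_per_flow) >= per_flow_threshold


B, t = 1, 1                  # port bandwidth and time unit, arbitrary units
threshold = B * 4 * t        # threshold from the slide: portBandwidth x 4t

sent_at_t = [B * t, B * t, 0, 0]          # two of four flows active for time t
sent_at_2t = [2 * B * t, 2 * B * t, 0, 0]

print(aalo_should_transit(sent_at_t, threshold))    # False
print(saath_should_transit(sent_at_t, threshold))   # True
print(aalo_should_transit(sent_at_2t, threshold))   # True
```

Under these assumptions the per-flow rule moves the coflow at time t, while the total-bytes rule waits until 2t.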

  29. Recap: Saath Scheduling Ideas
      • All-or-none
      • Least-Contention-First within a queue
      • Faster CoFlow-queue transition

  30. Outline
      • State-of-the-art CoFlow scheduler: Aalo
      • Limitations of Aalo
      • Design of Saath
      • Evaluation

  31. Evaluation Methodology
      1. Large-scale trace-driven simulations
      2. Large-scale testbed evaluation: 150 nodes
      3. Implemented Saath in 5.2 KLoC of C++

  32. Trace
      1. FB trace [7]
         1. Collected from Facebook's cluster
         2. 526 CoFlows, 150 ports
      2. OSP
         1. Collected from Microsoft's cluster
         2. O(1000) CoFlows, O(100) ports
      [7] https://github.com/coflow/coflow-benchmark

  33. Overall CCT improvement
      • Saath approaches offline SEBF
      • Median speedup over Aalo: 1.53x for FB and 1.42x for OSP

  34. CCT improvement – Design Components
      • Each design component contributes considerably to the CCT improvement.

  35. Things are In-Sync now
      • Most CoFlows with equal-length flows now show very small deviation in flow completion times (FCTs)

  36. Testbed CCT
      • CCT speedup: 1.88x on average and 1.43x at P50

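  37. Scheduling Overhead

      | Component          | Metric                         | Saath average | Saath P90  | Aalo average | Aalo P90 |
      |--------------------|--------------------------------|---------------|------------|--------------|----------|
      | Global Coordinator | CPU (%)                        | 37.8          | 42.7       | 33.5         | 35.5     |
      | Global Coordinator | Memory (MB)                    | 229           | 284        | 267          | 374      |
      | Global Coordinator | Total time (msec)              | 0.57          | 2.85       | 0.1          | 0.2      |
      | Global Coordinator | LCoF / All-or-none time (msec) | 0.02 / 0.24   | 0.03 / 0.7 |              |          |
      | Local Node         | CPU (%)                        | 5.6           | 5.7        | 5.5          | 5.7      |
      | Local Node         | Memory (MB)                    | 1.68          | 1.7        | 1.75         | 1.78     |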

  38. Conclusion
      • CoFlow scheduling holds promise to optimize communication in Big Data jobs
      • Limitations of prior art (Aalo):
        – Ignores spatial arrangement
        – Has no coordination across ports
          • Flows can be out of sync
          • Oblivious to CoFlow contention
      • Saath:
        – Fuses the spatial dimension into CoFlow scheduling
        – Coordinates across ports
        – Evaluation: CCT improvement of 1.53x (P50) and 4.5x (P90) for the FB trace, and 1.42x (P50) and 37x (P90) for the OSP trace

  39. Thank you!
