Libra and the Art of Task Sizing in Big-Data Analytic Systems
Rui Li, Peizhen Guo, Bo Hu, Wenjun Hu


  1. Libra and the Art of Task Sizing in Big-Data Analytic Systems. Rui Li, Peizhen Guo, Bo Hu, Wenjun Hu (Yale University)

  2. Background

  3. [Figure: job DAG, Stage 0 through Stage 6]

  4. [Figure: the same DAG, with Stage 4 labeled separately]

  5. [Figure: the same DAG, with Stage 4 annotated with its stage input data and stage output data]

  7. How to set task size? [Figure: the same DAG and Stage 4 annotations]

  8. How to set task size? Options: user experience; system default value. [Figure: the same DAG and Stage 4 annotations]

  9. The importance of task sizing

  10. Observation 1: different jobs have different optimal task sizes. [Figure: normalized stage completion time vs. task size]

  11. Observation 2: different stages have different optimal task sizes. [Figure: PageRank stage completion time vs. task size]

  12. Takeaways: (1) proper task sizing is important

  13. Takeaways: (1) proper task sizing is important; (2) U-curve pattern

  14. Analysis of U-curve pattern

  15. Observation 3: tasks have similar scheduling delay and system overhead regardless of task size. [Figure: per-task overhead for PageRank stage 1]

  16. Observation 4: a small task size fails to benefit from batch processing; a large task size causes memory swapping. [Figure: number of I/O operations for different stages of PageRank]

  17. Summary: small task size => high aggregated overhead and no batch processing; large task size => memory swapping
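
Slides 15 to 17 explain the U-curve in words. The sketch below reproduces that shape with a toy cost model that is not taken from the slides: every task pays a fixed overhead, processing time is proportional to data, and data beyond a per-task memory budget is processed more slowly to stand in for swapping. The function name, parameters, and all constants are hypothetical.

```python
def stage_time(task_mb, input_mb=8192.0, slots=32, overhead_s=0.5,
               rate_mb_s=50.0, mem_mb=256.0, swap_penalty=4.0):
    """Toy estimate of stage completion time in seconds (assumed model)."""
    num_tasks = input_mb / task_mb
    in_mem = min(task_mb, mem_mb)            # data processed at full speed
    spilled = max(task_mb - mem_mb, 0.0)     # data that exceeds the memory budget
    per_task = (overhead_s
                + in_mem / rate_mb_s
                + swap_penalty * spilled / rate_mb_s)
    # Work is spread evenly over the available slots (no wave effects modeled).
    return num_tasks * per_task / slots

sizes = [2 ** k for k in range(13)]          # 1 MB .. 4096 MB
times = {s: stage_time(s) for s in sizes}
best = min(times, key=times.get)             # task size with the lowest estimate
```

With these made-up constants the estimate falls as tasks grow (the fixed overhead is paid fewer times) and rises again once a task no longer fits in its memory budget, which is the pattern slide 17 summarizes.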

  18. System design: strawman solution

  19. Refinement 1: ADAM optimization
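
Slide 19 names Adam as the first refinement but the transcript shows neither the objective nor the update rule Libra uses. As a rough illustration only, the sketch below applies the standard Adam update to a single scalar, the base-2 logarithm of the task size, against a hypothetical U-shaped cost. The cost function, the finite-difference gradient, and every hyperparameter value are assumptions made for this example.

```python
import math

def toy_cost(log_size):
    # Hypothetical U-shaped cost with its minimum at 2**8 = 256 MB.
    return (log_size - 8.0) ** 2 + 5.0

def adam_minimize(cost, x0, steps=500, lr=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8, h=1e-3):
    """Minimize a scalar function with the standard Adam update."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        # Central finite difference stands in for a measured gradient.
        grad = (cost(x + h) - cost(x - h)) / (2 * h)
        m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)                # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

log_size = adam_minimize(toy_cost, x0=4.0)   # start from 16 MB
task_mb = 2 ** log_size
```

Searching in log space is a design choice of this sketch: it keeps the step size meaningful across task sizes that span several orders of magnitude.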

  20. Refinement 2: noise filtering. [Figure: task processing rate fluctuation for stage 1 of PageRank]
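
The transcript shows that the measured task processing rate fluctuates but does not say which filter Libra applies. The sketch below uses a sliding-window median, one common way to suppress outliers, purely as an assumed stand-in. The class name, window size, and sample values are hypothetical.

```python
from collections import deque
from statistics import median

class RateFilter:
    """Sliding-window median over recent task processing rates (MB/s)."""

    def __init__(self, window=5):
        # deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, rate_mb_s):
        self.samples.append(rate_mb_s)
        return median(self.samples)

# Hypothetical measurements with two outliers (5.0 and 120.0).
raw = [50.0, 52.0, 5.0, 51.0, 49.0, 120.0, 50.0]
f = RateFilter(window=5)
smoothed = [f.update(r) for r in raw]
```

A median is used here instead of a mean so that a single stalled or unusually fast task does not shift the estimate fed to the optimizer.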

  22. Refinement 3: contention avoidance. [Figure: PageRank over two machines]

  24. Evaluation: 8 m4.xlarge VMs on EC2; workloads generated from HiBench

  25. Effect of initial task size. [Figure: PageRank completion time for different initial task sizes]

  26. Libra performance. [Figure: PageRank completion time with different input data sizes]

  27. Q&A
