Speedup for Multi-Level Parallel Computing



  1. Speedup for Multi-Level Parallel Computing. Shanjiang Tang, Bu-Sung Lee, Bingsheng He. School of Computer Engineering, Nanyang Technological University. 21st May 2012

  2. Outline • Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion

  3. Multi-Level Computing Architecture and Paradigm

  4. Multi-Level Computing Architecture and Paradigm • MPI+OpenMP • MPI+CUDA • MPI+OpenMP+CUDA • …

  5. Multi-Level Parallel Computing Model [Figure: processing elements arranged across levels L1, L2, L3, …, Lm: PE 1,1 at level 1; PE 2,1 and PE 2,2 at level 2; PE 3,1 through PE 3,8 at level 3. Legend: sequential part, parallel part.]

  6. Parallel Speedup
     • Definition: $\text{Speedup} = \dfrac{\text{SequentialExecutionTime}}{\text{ParallelExecutionTime}}$
     • Classification
       Ø Absolute Speedup: $\text{Speedup} = \dfrac{\text{BestSequentialALGExecutionTime}}{\text{ParallelALGExecutionTime}}$
       Ø Relative Speedup: $\text{Speedup} = \dfrac{\text{ParallelALGSequentialExecutionTime}}{\text{ParallelALGExecutionTime}}$

  7. Relative Speedup Model
     • Fixed-size Speedup
       Ø Amdahl’s Law: $\text{Speedup} = \dfrac{\text{sequentialTime}}{\text{parallelTime}} = \dfrac{1}{(1 - \alpha) + \dfrac{\alpha}{p}}$, where $\alpha$ is the parallel fraction of the program's workload and $p$ is the number of processors.
     • Fixed-time Speedup
       Ø Gustafson’s Law: $\text{Speedup} = \dfrac{\text{sequentialTime}}{\text{parallelTime}} = \dfrac{(1 - \alpha) + \alpha p}{(1 - \alpha) + \alpha} = 1 - \alpha + \alpha p$
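
The two single-level laws on this slide can be compared numerically. The following is a minimal sketch, not code from the presentation: the function names and the sample value `alpha = 0.9` are assumptions chosen for illustration.

```python
def amdahl_speedup(alpha: float, p: int) -> float:
    """Fixed-size speedup (Amdahl): 1 / ((1 - alpha) + alpha / p)."""
    return 1.0 / ((1.0 - alpha) + alpha / p)


def gustafson_speedup(alpha: float, p: int) -> float:
    """Fixed-time speedup (Gustafson): (1 - alpha) + alpha * p."""
    return (1.0 - alpha) + alpha * p


if __name__ == "__main__":
    alpha = 0.9  # assumed parallel fraction, for illustration only
    for p in (1, 8, 64):
        print(p, round(amdahl_speedup(alpha, p), 3), round(gustafson_speedup(alpha, p), 3))
```

With a fixed problem size the Amdahl value is bounded by 1 / (1 - alpha), whereas the Gustafson value grows linearly in p because the parallel workload scales with the processor count.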

  8. Motivation Example: NAS Benchmark (MPI+OpenMP)

  9. Motivation Example: NAS Benchmark (MPI+OpenMP) • Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing

  10. Outline • Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion

  11. E-Amdahl’s Law
      • Awareness of Different Grained-Level Parallelism
      $$sp(i) = \begin{cases} \dfrac{1}{1 - f(m) + \dfrac{f(m)}{p(m)}} & (i = m) \\[2ex] \dfrac{1}{1 - f(i) + \dfrac{f(i)}{p(i)\, sp(i+1)}} & (1 \le i < m) \end{cases}$$
      [Figure: processing elements arranged across levels L1, L2, L3, …, Lm: PE 1,1; PE 2,1 and PE 2,2; PE 3,1 through PE 3,8. Legend: sequential part, parallel part.]
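
The recurrence on this slide can be evaluated bottom-up, from the finest level m to the coarsest level 1. The sketch below is an illustration under stated assumptions: the slide does not define f(i) and p(i), so they are read here as the parallel fraction and the number of processing elements at level i (by analogy with α, β, p, t on the two-level slide). The function name and the list-based interface are hypothetical.

```python
from typing import Sequence


def e_amdahl_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) of the E-Amdahl recurrence.

    f[k] and p[k] describe level k + 1, coarsest level first
    (assumed meaning: parallel fraction and processing-element count).
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # Starting from sp = 1 makes the first iteration reproduce the
    # base case sp(m) = 1 / (1 - f(m) + f(m) / p(m)).
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = 1.0 / ((1.0 - fi) + fi / (pi * sp))
    return sp


if __name__ == "__main__":
    # Assumed values: two levels, 4 coarse-grained and 2 fine-grained elements.
    print(e_amdahl_speedup([0.9, 0.8], [4, 2]))
```

With a single level the function reduces to the classic Amdahl formula, which is a quick sanity check on the recurrence.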

  12. E-Amdahl’s Law
      • Two-Level Parallelism Speedup Model (MPI+OpenMP)
      $$sp(\alpha, \beta, p, t) = \dfrac{1}{1 - \alpha + \dfrac{\alpha \left( (1 - \beta) + \dfrac{\beta}{t} \right)}{p}}$$
      where $\alpha$ is the parallel fraction of coarse-grained (MPI-level) parallelism, $\beta$ is the parallel fraction of fine-grained (OpenMP-level) parallelism, $p$ is the number of processes spawned, and $t$ is the number of threads spawned per process.
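
To see why the two-level model distinguishes configurations that a single-level model cannot, the formula can be evaluated for several process/thread splits of the same core budget. This is a hedged sketch: the values `alpha = 0.98`, `beta = 0.80` and the 64-core budget are assumptions for illustration, not measurements from the presentation, and the function names are hypothetical.

```python
def two_level_speedup(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level E-Amdahl speedup for p processes with t threads each."""
    inner = (1.0 - beta) + beta / t  # fine-grained (thread-level) term
    return 1.0 / ((1.0 - alpha) + alpha * inner / p)


def amdahl_speedup(alpha: float, n: int) -> float:
    """Single-level Amdahl speedup on n processors, for comparison."""
    return 1.0 / ((1.0 - alpha) + alpha / n)


if __name__ == "__main__":
    alpha, beta = 0.98, 0.80  # assumed parallel fractions
    total_cores = 64          # assumed core budget
    print("single-level:", round(amdahl_speedup(alpha, total_cores), 2))
    for p in (64, 32, 16, 8):
        t = total_cores // p
        print(f"p={p:2d} t={t}:", round(two_level_speedup(alpha, beta, p, t), 2))
```

Single-level Amdahl yields one number for all four splits because it only sees p × t, while the two-level formula changes with the split whenever β < 1.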

  13. E-Gustafson’s Law
      • Awareness of Different Grained-Level Parallelism
      $$sp(i) = \begin{cases} 1 - f(m) + f(m)\, p(m) & (i = m) \\ 1 - f(i) + f(i)\, p(i)\, sp(i+1) & (1 \le i < m) \end{cases}$$
      [Figure: processing elements arranged across levels L1, L2, L3, …, Lm: PE 1,1; PE 2,1 and PE 2,2; PE 3,1 through PE 3,8. Legend: sequential part, parallel part.]
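
The fixed-time recurrence can be evaluated the same way, from the finest level upward. As before, this is only a sketch: f(i) and p(i) are assumed to be the parallel fraction and the number of processing elements at level i, and the function name and list interface are hypothetical.

```python
from typing import Sequence


def e_gustafson_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) of the E-Gustafson recurrence.

    f[k] and p[k] describe level k + 1, coarsest level first
    (assumed meaning: parallel fraction and processing-element count).
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # Starting from sp = 1 makes the first iteration reproduce the
    # base case sp(m) = 1 - f(m) + f(m) * p(m).
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = (1.0 - fi) + fi * pi * sp
    return sp


if __name__ == "__main__":
    # Assumed values: two levels, 4 coarse-grained and 2 fine-grained elements.
    print(e_gustafson_speedup([0.9, 0.8], [4, 2]))
```

With one level the result equals the classic Gustafson value 1 - α + αp.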

  14. Outline • Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion

  15. Experiment Setup
      • Platform and Configuration
        Ø A Linux cluster of eight computing nodes, each with two quad-core chips
        Ø Configuration: one thread per CPU core
      • Benchmarks: NAS Parallel Benchmark (NPB), Multi-Zone (MZ) version
        Ø BT-MZ (unbalanced workload partitioning)
        Ø SP-MZ (balanced workload partitioning)
        Ø LU-MZ (balanced workload partitioning)

  16. Performance Prediction

  17. Prediction Result Comparison

  18. Outline • Background & Motivation • Multi-Level Parallel Speedup • Evaluation • Conclusion

  19. Conclusion
      • Traditional speedup models are unsuitable for multi-level parallelism
        – They are not aware of the different granularities of parallelism in multi-level parallel computing.
      • Multi-level Parallelism Model
        – A guidance model for multi-level optimization.
        – A prediction model for multi-level parallelism.

  20. Argument Estimation

  21. Speedup Under E-Amdahl’s Law

  22. Speedup Under E-Gustafson’s Law
