PetaBricks and Julia
Kathleen C. Alexander
Massachusetts Institute of Technology
December 11th, 2013
Motivation
Motivation Background Approach Results Recommendations Index
The Programmer’s Dilemma
a personal example: energy landscapes
which algorithm is best?
Goal: determine the best algorithm for the application, which may be machine dependent
Parallel Programming
• many parts of these algorithms can be written in parallel
• often they can be parallelized in many different ways
• optimizing these options is a challenge
Determine the best way to parallelize the program, which will be machine dependent
Background
PetaBricks – Algorithmic Choice
PetaBricks was developed to relieve the programmer of some of the optimization responsibility
[Figures: the transform; the compiling framework. Ansel, et al. ACM SIGPLAN Conference (2009).]
PetaBricks – Autotuning
the autotuner determines the best configuration for the machine under the tuning constraints
[Figures: autotuning results for Sort, Eigen Problem, and Matrix Multiply. Ansel, et al. ACM SIGPLAN Conference (2009).]
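The idea of choosing an algorithm by measuring on the target machine can be illustrated with a toy sketch. This is not the PetaBricks autotuner; the two candidates and the helper `pick_fastest` are hypothetical names introduced here, and the selection rule (lowest of a few timed trials) is an assumption made for illustration.

```julia
# Toy illustration of algorithmic choice; NOT the PetaBricks autotuner.

# Hypothetical candidate 1: simple insertion sort (often fine for short inputs).
function insertion_sort(v::Vector{T}) where {T}
    a = copy(v)
    for i in 2:length(a)
        x = a[i]
        j = i - 1
        while j >= 1 && a[j] > x
            a[j + 1] = a[j]
            j -= 1
        end
        a[j + 1] = x
    end
    return a
end

# Hypothetical candidate 2: the standard library sort.
library_sort(v::Vector) = sort(v)

# Time each candidate on this machine and return the fastest name and its time.
function pick_fastest(candidates::Dict{String,Function}, input; trials::Int = 3)
    best_name, best_time = "", Inf
    for (name, f) in candidates
        f(input)  # warm-up run so JIT compilation is not timed
        t = minimum(@elapsed(f(input)) for _ in 1:trials)
        if t < best_time
            best_name, best_time = name, t
        end
    end
    return best_name, best_time
end

candidates = Dict{String,Function}(
    "insertion" => insertion_sort,
    "library"   => library_sort,
)

# The winner may differ by input size and by machine.
for n in (8, 10_000)
    name, t = pick_fastest(candidates, rand(n))
    println("n = $n: fastest candidate is $name ($(round(t * 1e6, digits = 1)) µs)")
end
```

A real autotuner searches a much larger configuration space (algorithm cutoffs, parallelization strategy, and so on) rather than timing two functions.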
Julia
• Julia was developed to bridge the gap between interpreted and compiled scientific computing
• streamlining parallelization techniques has been a priority
http://forio.com/julia/julia
Question: is there room for overlap between the PetaBricks and Julia approaches?
Approach
Options for Implementation
Julia in PetaBricks
• can utilize PetaBricks autotuner and compiler
• PetaBricks compiler needs to interpret Julia
PetaBricks in Julia
• can run PetaBricks binaries inside Julia
• no PetaBricks shared object files, so function calls require disk i/o
• doesn’t take advantage of JuliaLang
Julia + OpenTuner
• apply PetaBricks framework to Julia
• utilize OpenTuner to optimize Julia
Approach Used Here
PetaBricks in Julia
• can run PetaBricks binaries inside Julia
• no PetaBricks shared object files, so function calls require disk i/o
• doesn’t take advantage of JuliaLang
⇒ most naive approach possible:
→ compile PetaBricks executable, exe
→ julia> run(`$exe $in $out`)
⇒ compare with PetaBricks and Julia alone
→ lower bound of performance improvement
→ is there proof of benefit?
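A minimal sketch of the naive approach, assuming the executable takes an input file and an output file as its two arguments and reads and writes plain delimited text (the real PetaBricks file format may differ). The function name `run_external` is hypothetical. The slide's `$in` is schematic; `in` is a built-in function name in Julia, so the sketch uses `infile` and `outfile` instead.

```julia
using DelimitedFiles  # standard library: writedlm / readdlm

# Run an external executable on `input` by round-tripping through ASCII files.
# `exe` is a hypothetical path to a compiled binary that is called as
# `exe <input file> <output file>`; the file layout is an assumption.
function run_external(exe::AbstractString, input::AbstractMatrix)
    mktempdir() do dir
        infile  = joinpath(dir, "in.txt")
        outfile = joinpath(dir, "out.txt")
        writedlm(infile, input)        # disk write: the i/o cost noted above
        run(`$exe $infile $outfile`)   # same call shape as on the slide
        return readdlm(outfile)        # disk read of the result
    end
end
```

On a Unix system, `cp` can stand in for the binary to check the round trip: `run_external("cp", [1.0 2.0; 3.0 4.0])` returns the same matrix.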
Results
PetaBricks - Tuning Improvements
performance improvement: tuned and untuned PetaBricks
[Figure: Matrix Multiply, tuned vs. untuned. x-axis: Size (0 to 4000); y-axis: Wall-Clock Time [s] (0 to 250).]
Comparing PetaBricks with Julia - Apples to Apples
PetaBricks
→ functions read in ASCII files and output same
→ determines parallelization during autotuning
→ autotuning can take days
Julia
→ JIT for each independent execution
→ can addprocs(n), but may not parallelize
→ can be used interactively
Apples-to-apples setup:
→ make both programs do i/o
→ run both programs from shell
→ try addprocs(n) in Julia, with no other instructions
→ subtract 'hello world' start-up time from Julia wall-clock
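The start-up correction can be sketched as follows. The helper names `time_command` and `scaled_time` are hypothetical, and the 500 by 500 matrix product is only a stand-in workload; the slides do not show the actual timing script.

```julia
# Wall-clock time of an external command, in seconds.
time_command(cmd::Cmd) = @elapsed run(cmd)

# Corrected time: total wall-clock minus the start-up cost of a trivial script.
scaled_time(total, startup) = total - startup

julia   = Base.julia_cmd()  # the julia binary running this script
startup = time_command(`$julia -e 'println("hello world")'`)
total   = time_command(`$julia -e 'A = rand(500, 500); A * A'`)  # stand-in workload

println("start-up: $startup s, total: $total s, scaled: $(scaled_time(total, startup)) s")
```

Both measurements launch a fresh Julia process from the shell, so the difference removes process start-up but still includes JIT compilation of the workload itself.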
Comparing PetaBricks to Julia - EigenSolve
[Figure: EigenSolve. Series: Julia, Julia-Scaled, PetaBricks. x-axis: Size (0 to 1000); y-axis: Wall-Clock Time [s] (0 to 10).]
→ Julia seems to do the best for large matrices
→ however, the results were not comparable
→ this test was not a good apples-to-apples performance test
Comparing PetaBricks with Julia - Sort
[Figure: Sort. Series: Julia, Julia-Scaled, PetaBricks. x-axis: Size [10^4] (0 to 100); y-axis: Wall-Clock Time [s] (0 to 8).]
→ Julia and PetaBricks converge for large vectors
→ PetaBricks is better with shorter vectors
→ effect of i/o on performance not considered
Comparing PetaBricks with Julia - Matrix Multiply
[Figure, two panels, y-axis: Wall-Clock Time [s]. Panel i5-3339 (4 CPU): series Julia, Julia-Scaled, PetaBricks; x-axis: Size (0 to 1000). Panel i7-3770 (8 CPU): series Julia, Julia-Scaled, Julia-Scaled-8p, PetaBricks; x-axis: Size (0 to 4000).]
→ Julia and PetaBricks converge at moderate matrix sizes on fewer cores
→ PetaBricks is better with smaller lists and larger matrices
→ using addprocs(n) with no other instruction does not utilize parallel functionality in Julia
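The last point can be illustrated with a small sketch in current Julia syntax. It assumes the `Distributed` standard library; in 2013-era Julia the `@distributed` macro was called `@parallel`. The sum of squares is a stand-in workload, not one of the benchmarks above.

```julia
using Distributed

addprocs(2)  # start two worker processes; this alone does not parallelize anything

# Serial: runs entirely on the master process even though workers exist.
serial = sum(i^2 for i in 1:1_000_000)

# Explicit: the loop range is split across workers and partial sums are combined with +.
parallel = @distributed (+) for i in 1:1_000_000
    i^2
end

println("workers: ", nworkers(), ", results agree: ", serial == parallel)
```

Work reaches the workers only through explicit constructs such as `@distributed`, `pmap`, or `remotecall`, which is consistent with the flat `addprocs(n)` result reported here.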