Mimer and Schedeval: Tools for Comparing Static Schedulers for Streaming Applications on Manycore Architectures
Nicolas Melot, Johan Janzén, Christoph Kessler
Dept. of Computer and Information Science, Linköping University, Linköping, Sweden
November 27, 2015
Outline
1. Introduction: Streaming; Evaluation
2. Mimer evaluation framework
3. Schedeval streaming framework: Description; Testing overhead; Evaluating computation time; Studying energy consumption
4. Mimer & Schedeval environment: Data structures; Programming; Experiment
5. Conclusion: Conclusion; Future work
Introduction
High-Performance Computing on Streams
- Optimize for time and energy
- Scheduling problems: numerous publications
  - On-chip pipelining (Keller et al. [2012])
  - Scheduling sequential tasks (Pruhs et al. [2008])
  - Crown scheduling (Melot et al. [2015])
[Figure: crown structure over processors P1-P8 with processor groups G1-G15]
Streaming communication
- Straightforward: communication through shared off-chip main memory
  - Easy to implement
  - Main memory bandwidth is the performance bottleneck
- On-chip pipelining: communication through small on-chip memories (Keller et al. [2012])
  - Trade core-to-core communication for reduced off-chip memory accesses (a generic producer/consumer sketch follows below)
[Figure: off-chip memory chips illustrating communication through main memory]
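As a rough illustration of the pipelining idea only (this is plain C++ threads, not SCC or Schedeval code; the buffer size, task bodies and all names are made up), the sketch below streams elements from a producer task to a consumer task through a small fixed-size buffer standing in for an on-chip memory, so the data never takes a round trip through off-chip DRAM:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <thread>

// A tiny single-producer/single-consumer ring buffer: the "on-chip" capacity
// is deliberately small, so the producer blocks when the consumer falls behind
// instead of spilling the stream to main memory.
constexpr std::size_t kSlots = 8;
std::array<int, kSlots> buffer;
std::atomic<std::size_t> head{0}, tail{0};

void produce(int n) {
    for (int i = 0; i < n; ++i) {
        while (head - tail == kSlots) { }   // buffer full: wait for the consumer
        buffer[head % kSlots] = i;
        ++head;
    }
}

void consume(int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        while (tail == head) { }            // buffer empty: wait for the producer
        sum += buffer[tail % kSlots];
        ++tail;
    }
    std::printf("consumed sum = %ld\n", sum);
}

int main() {
    std::thread t1(produce, 1000), t2(consume, 1000);
    t1.join();
    t2.join();
    return 0;
}
```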
Optimize for time and energy
Scheduling for performance
- Compromise between time and energy
- Many features to take into account:
  - Voltage and frequency: P_dyn ≈ V²·f ≈ f³ and E_dyn ≈ P_dyn·p·t ≈ f³·p·t (see the model sketch below)
  - Cores grouped in islands
  - Impact dependent on stalls due to memory accesses
  - Off-chip memory accesses
  - On-chip communications
  - Static/dynamic power
  - Behavior differences between execution platforms
[Figure: SCC die with a 6x4 mesh of tiles connected by routers (R) and four memory controllers (MC) attached to DIMMs]
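To make the simple dynamic-energy model concrete, here is a minimal sketch of our own (all constant factors such as capacitance and switching activity are dropped, so the values are only meaningful for comparing schedules against each other, not as absolute joules):

```cpp
#include <cstdio>

// Slide's model: P_dyn ≈ V^2 * f ≈ f^3 (assuming voltage scales roughly with
// frequency), and E_dyn ≈ P_dyn * p * t for p cores busy during time t.
static double dynamic_power(double f) { return f * f * f; }

static double dynamic_energy(double f, int cores, double seconds) {
    return dynamic_power(f) * cores * seconds;
}

int main() {
    // Example: the same workload at half frequency takes twice as long,
    // yet still costs about 4x less dynamic energy under this model.
    double e_fast = dynamic_energy(2.0, 4, 1.0);
    double e_slow = dynamic_energy(1.0, 4, 2.0);
    std::printf("E(fast)=%.1f  E(slow)=%.1f\n", e_fast, e_slow);
    return 0;
}
```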
Testing scheduling techniques
- Difficulty to test scheduling techniques
  - Simulator: imperfect, slow
  - Real platform: high development effort
- Difficulty to compare between techniques
  - Access to experimental datasets: tailor-made for a paper (Xu et al. [2012]), lack of adaptability (Kasahara [2004])
  - Access to raw results
  - Access to result processing and representation tools
- Mimer: testing and interpreting framework, raw and structured data
- Schedeval: run a streaming application on real architectures
Mimer
- Flexible formats
  - GraphML (XML) task graphs, platforms and schedules; AMPL-based platforms
  - Flexible C++ backend API (a hypothetical usage sketch follows below)
- Open data
  - Keep raw data
  - Structure data in CSV (Comma-Separated Values)
  - Publication of data processing scripts (R)
- 4 steps: Schedule, Evaluate, Analyze, Plot
[Figure: Mimer workflow: task graphs, platforms, schedulers and evaluators as input; 1 - Schedule, 2 - Assess, 3 - Analyze (user-provided executable or script), 4 - Plot (plotting script); schedules, evaluation statistics, structured intermediate data and graphs as output]
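The actual Mimer C++ backend API is not shown on the slide; the sketch below is a hypothetical illustration of the flow it supports: a scheduler maps tasks to cores (step 1), the resulting schedule is turned into structured CSV data (feeding steps 2-3), and the CSV can later be plotted with a published R script (step 4). All type and function names here are ours.

```cpp
#include <fstream>
#include <string>
#include <vector>

struct Task    { std::string name; double work; };          // node of a task graph
struct Mapping { std::string task; int core; double start; };

// Step 1 - Schedule: a toy list scheduler that places each task on the
// currently least-loaded core (placeholder for a real scheduler such as
// crown scheduling).
std::vector<Mapping> schedule(const std::vector<Task>& tasks, int cores) {
    std::vector<double> busy(cores, 0.0);
    std::vector<Mapping> out;
    for (const Task& t : tasks) {
        int best = 0;
        for (int c = 1; c < cores; ++c)
            if (busy[c] < busy[best]) best = c;
        out.push_back({t.name, best, busy[best]});
        busy[best] += t.work;
    }
    return out;
}

// Steps 2-3 - Assess/Analyze: keep the result as structured CSV so that raw
// data and the processing scripts can be published together.
void write_csv(const std::vector<Mapping>& s, const std::string& path) {
    std::ofstream f(path);
    f << "task,core,start\n";
    for (const Mapping& m : s)
        f << m.task << ',' << m.core << ',' << m.start << '\n';
}

int main() {
    auto s = schedule({{"fft0", 3.0}, {"fft1", 2.0}, {"sink", 1.0}}, 2);
    write_csv(s, "schedule.csv");   // step 4 would plot this with an R script
    return 0;
}
```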
Schedeval
- Run a streaming application on a real architecture
- Co-developed with Janzén [2014] as Master thesis work
- Integrates as a schedule evaluator in the Mimer workflow
[Figure: Schedeval, or alternatively an analytic evaluator, takes task graphs, platforms and a schedule as input and produces evaluation statistics within the Mimer workflow]
Testing the Schedeval framework
Framework overhead: ping-pong application
- Vary the number of tasks: a single Ping/Pong pair, several pairs
- Vary the mapping: mapped to a single core, to several cores, to several tiles
- Monitor (a round-trip timing sketch follows below):
  - Average ping round-trip time
  - Proportion of non-data-ready task instances scheduled
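The sketch below only illustrates what the benchmark measures; it is plain C++ threads with a condition variable, not Schedeval or SCC code. A "ping" task hands a token to a "pong" task and waits for it to come back, and the average round-trip time over many iterations approximates the communication overhead between the two mappings.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool at_pong = false;   // token location: false = ping side, true = pong side

void pong(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return at_pong; });   // wait for the ping
        at_pong = false;                       // send the token back
        cv.notify_one();
    }
}

int main() {
    const int rounds = 100000;
    std::thread t(pong, rounds);
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < rounds; ++i) {
        std::unique_lock<std::mutex> lk(m);
        at_pong = true;                        // ping
        cv.notify_one();
        cv.wait(lk, [] { return !at_pong; });  // wait for the pong
    }
    auto t1 = std::chrono::steady_clock::now();
    t.join();
    double us = std::chrono::duration<double, std::micro>(t1 - t0).count();
    std::printf("average round trip: %.2f us\n", us / rounds);
    return 0;
}
```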
Schedeval overhead: ping-pong
[Figure: ping round-trip time (us) and percentage of non-data-ready instances for Local, Tile, Remote and RCCE (tile) mappings]
- Round-trip time penalized by distance and overhead
- Latency hidden with increasing ping-pong pairs
- Decreasing rate of non-data-ready tasks scheduled
Schedeval overhead: ping-pong
[Figure: message round-trip time (us) and data-ready task proportion versus the number of simultaneous ping-pongs, for Local and Remote mappings, with RCCE for comparison]
- Round-trip time penalized by distance and overhead
- Latency hidden with increasing ping-pong pairs
- Decreasing rate of non-data-ready tasks scheduled
Evaluating the Schedeval framework
Computation time: mergesort (the per-task merge step is sketched below)
- Schedules over 6 cores (Melot et al. [2012])
  - Mapped per level (simple)
  - Mapped per block (reduces communication)
- Test schedule with a single core
- Test a variant with no frequency-scaling mechanism
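For readers unfamiliar with pipelined mergesort, the following is a minimal sketch of the work one merge task performs per firing: it merges whatever sorted data its two predecessors have made available and forwards the result downstream. Channels are modeled here with std::deque for simplicity; the actual SCC implementation is more involved and communicates blocks between cores.

```cpp
#include <cstdio>
#include <deque>

using Channel = std::deque<int>;

// Merge as many elements as possible from in1/in2 into out while both inputs
// still have data; remaining elements must wait for the next firing, since a
// smaller value could still arrive on the emptied channel.
void merge_step(Channel& in1, Channel& in2, Channel& out) {
    while (!in1.empty() && !in2.empty()) {
        if (in1.front() <= in2.front()) { out.push_back(in1.front()); in1.pop_front(); }
        else                            { out.push_back(in2.front()); in2.pop_front(); }
    }
}

int main() {
    Channel a{1, 4, 7}, b{2, 3, 9}, out;
    merge_step(a, b, out);                     // one firing of the merge task
    for (int v : out) std::printf("%d ", v);   // prints: 1 2 3 4 7 (9 stays queued)
    std::printf("\n");
    return 0;
}
```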
Schedeval computation time
Results described by Janzén [2014]
- Vary implementations:
  - Block mapping, depth-first
  - Level mapping, extra presort
  - Block mapping, extra presort
  - Simpler variant
  - Depth-first task execution
  - Distribute presort phases
- Observations:
  - Low performance difference
  - Single core (1/6) in streaming phase
  - High performance difference for initial sort
[Figure: running times for block mapping (depth-first), level mapping and the simpler variant]
Studying energy consumption
Usefulness for energy consumption studies
- Use a fixed application: StreamIt implementation of FFT (Thies et al. [2002])
- Run with 11 different schedules
- Vary deadline tightness to constrain frequency (see the sketch below)
- Monitor energy consumption
- Compare with simple energy models: P_dyn ≈ V²·f ≈ f³, E_dyn ≈ P_dyn·p·t ≈ f³·p·t
[Figure: FFT task graph with source & split, FFTReorderSimple and CombineDFT stages (tasks 0-25) and join & sink; Schedeval or an analytic evaluator turns the schedule into evaluation statistics]
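The following sketch only illustrates how deadline tightness constrains frequency under the slide's simple model; the numbers, the 800 MHz figure and the assumption that runtime scales linearly with 1/f are ours, not Schedeval measurements. If a schedule needs t_base seconds at f_max, a deadline of M times t_base (M >= 1) allows running at roughly f_max / M, and the modeled dynamic energy follows E_dyn ≈ f³·p·t.

```cpp
#include <cstdio>

// Modeled dynamic energy for p cores running at frequency f for t seconds,
// with all constant factors dropped (relative comparisons only).
double model_energy(double f, int cores, double seconds) {
    return f * f * f * cores * seconds;
}

int main() {
    const double f_max = 800e6;   // Hz, example value
    const double t_base = 1.0;    // s, runtime at f_max (made up)
    const int cores = 6;
    for (double slack = 1.0; slack <= 2.0; slack += 0.25) {
        double f = f_max / slack;   // slowest frequency still meeting the deadline
        double t = t_base * slack;  // runtime assumed to scale with 1/f
        std::printf("deadline x%.2f: f=%.0f MHz, E_dyn(model)=%.3g\n",
                    slack, f / 1e6, model_energy(f, cores, t));
    }
    return 0;
}
```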