Asymmetry-Aware Work-Stealing Runtimes Christopher Torng, Moyang Wang, and Christopher Batten School of Electrical and Computer Engineering Cornell University 43rd Int’l Symp. on Computer Architecture, June 2016
• Motivation • First-Order Modeling Asymmetry-Aware Work-Stealing Runtimes Evaluation
[Diagram: Work-Stealing Runtimes combined with Static Asymmetry (Single-ISA Heterogeneous Architectures) and Dynamic Asymmetry (Dynamic Voltage and Frequency Scaling).]
How can we use asymmetry awareness to improve the performance and energy efficiency of a work-stealing runtime?
Cornell University Christopher Torng 2 / 21

Work-Stealing Runtimes
[Figure, animated in seven builds: four cores (Core 0 to Core 3), each with a task queue and a slot for work in progress.]
Build 1: Task A is in progress. Spawn Task B; Task B sits in a task queue.
Build 2: Dequeue Task B; Task B is in progress.
Build 3: Spawn Task C; Task C sits in a task queue while Task B is in progress.
Build 4: Spawn Task D; Task C and Task D sit in a task queue while Task B is in progress.
Build 5: Steal Task D and steal Task C; Task B, Task D, and Task C are in progress.
Build 6: Spawn Task E and spawn Task F; Task E and Task F sit in task queues while Task D and Task C are in progress.
Build 7: Steal Task E and steal Task F; Task E, Task D, Task C, and Task F are in progress.
◮ Work stealing has good performance, space requirements, and communication overheads in both theory and practice
◮ Supported in many popular concurrency platforms including: Intel’s Cilk Plus, Intel’s C++ TBB, Microsoft’s .NET Task Parallel Library, Java’s Fork/Join Framework, and OpenMP
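
The spawn, dequeue, and steal steps in the animation can be sketched as a small deterministic simulation. This is a minimal illustration, not the runtime from the talk: the spawn tree `SPAWNS`, the round-based execution, and the fixed victim scan order are all assumptions made here for reproducibility (real runtimes typically run workers concurrently and choose victims at random).

```python
from collections import deque

# Hypothetical spawn tree, loosely following the slide: A spawns B, B spawns C and D.
# The children of C and D are an assumption made for this sketch.
SPAWNS = {"A": ["B"], "B": ["C", "D"], "C": ["F"], "D": ["E"], "E": [], "F": []}


def simulate(num_workers=4, root="A", spawns=SPAWNS):
    """Run a round-based work-stealing schedule and return the execution trace."""
    queues = [deque() for _ in range(num_workers)]  # one task queue per core
    queues[0].append(root)
    trace = []  # entries are (round, worker, task, "own" or "stolen")
    round_no = 0
    while any(queues):
        # Phase 1: each worker acquires at most one task.
        acquired = []
        for w in range(num_workers):
            if queues[w]:
                # The owner dequeues the newest task from the tail of its own queue.
                acquired.append((w, queues[w].pop(), "own"))
                continue
            # An idle worker scans victims in fixed order (assumption: not random)
            # and steals the oldest task from the head of the victim's queue.
            for v in range(num_workers):
                if v != w and queues[v]:
                    acquired.append((w, queues[v].popleft(), "stolen"))
                    break
        # Phase 2: run the tasks; spawned children go to the running worker's queue.
        for w, task, how in acquired:
            trace.append((round_no, w, task, how))
            queues[w].extend(spawns[task])
        round_no += 1
    return trace


if __name__ == "__main__":
    for entry in simulate():
        print(entry)
```

Owners take work from the tail of their own queue while thieves take from the head, so a thief tends to remove the oldest task and rarely contends with the owner for the same entry.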

Static Asymmetry vs. Dynamic Asymmetry
Static asymmetry: Samsung Exynos Octa Mobile Processor. [Figure: die layout with four little ARM cores (A7), four big ARM cores (A15), and two L2 caches (L2$).]
Dynamic asymmetry: Integrated Voltage Regulation. [Figure: test chip with four integrated voltage regulators (labels: Cell, Power Mux, Load, Control), and a plot of voltage (V) against time (ns) annotated with 150 ns and 120 ns.]
From W. Godycki, C. Torng, I. Bukreyev, A. Apsel, C. Batten. “Enabling Realistic Fine-Grain Voltage Scaling with Reconfigurable Power Distribution Networks” MICRO, 2014

How can we use asymmetry awareness to improve the performance and energy efficiency of a work-stealing runtime?
[Diagram: Work-Stealing Runtimes, Static Asymmetry (Single-ISA Heterogeneous Architectures), and Dynamic Asymmetry (Dynamic Voltage and Frequency Scaling), annotated with related work.]
Bender et al. "Online Scheduling of Parallel Programs on Heterogeneous Sys ..." SPAA 2002
Ribic et al. "Energy-Efficient Work-Stealing Language Runtimes" ASPLOS 2014
Azizi et al. "Energy-performance Tradeoffs in Processor Architecture and Circuit Design: A Marginal Cost Analysis" ISCA 2010

Talk Outline (current section: First-Order Modeling)
Motivation
First-Order Modeling
Asymmetry-Aware Work-Stealing Runtimes
Evaluation
[Diagram: Work-Stealing Runtimes combined with Static Asymmetry and Dynamic Asymmetry; techniques shown: Work-Pacing, Work-Sprinting, Work-Mugging.]

Building Intuition by Exploring a 1 Big 1 Little System
[Figure, animated in three builds: normalized power (y-axis) against normalized IPS (x-axis) for a system with one four-way big core (B) and one little core (L). The little core sits at (1.0, 1.0) and the big core at (2.0, 6.0), given as (IPS, power). System total: IPS 3.0, power 7.0.]
Final build: 10% performance increase at the same power.

The Law of Equi-Marginal Utility
British Economist Alfred Marshall (1842 - 1924)
"Other things being equal, a consumer gets maximum satisfaction when he allocates his limited income to the purchase of different goods in such a way that the Marginal Utility derived from the last unit of money spent on each item of expenditure tend to be equal."
Balance the ratio of utility (IPS) to cost (power)
[Figure, animated in two builds: normalized power (y-axis, cost) against normalized IPS (x-axis, utility), with the slope d y,cost / d x,utility marked on each core's curve. Build 1 marks the slopes at 1.0 V on both curves. Build 2 marks the slopes at 0.9 V and 1.3 V.]
Arbitrage: "Buy Low, Sell High"
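
One way to write the balance condition on this slide, assuming each core's performance and power are smooth functions of its own supply voltage. The symbols IPS_B, IPS_L, P_B, P_L, V_B, and V_L are notation introduced here for illustration (big core B, little core L) and are not taken from the slide.

```latex
\frac{\partial\,\mathrm{IPS}_B / \partial V_B}{\partial P_B / \partial V_B}
  \;=\;
\frac{\partial\,\mathrm{IPS}_L / \partial V_L}{\partial P_L / \partial V_L}
```

If the two ratios differ, moving a small amount of power from the core with the lower marginal utility to the core with the higher one raises total IPS at the same total power, which is the arbitrage the slide refers to.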

Systematic Approach for Balancing Marginal Utility
[Figure, animated in three builds: normalized energy efficiency (y-axis) against normalized IPS (x-axis) for the 1 big 1 little system. Each point is an individual (V_B, V_L) pair. The plot marks the Pareto-optimal frontier and the isopower line at nominal voltage.]
Assumptions: perfectly parallel application, ideal load balancing
Build 1 regions: energy efficiency at expense of performance; performance at expense of energy efficiency
Build 2 region: improve both performance and energy efficiency
Build 3: Marginal Utility-Based Optimization Problem
Constraint: isopower line
Objective: maximize performance
Solved numerically
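
The optimization problem on this slide (maximize performance along the isopower line) can be sketched numerically as follows. This is a rough illustration under assumptions that are not from the talk: IPS is taken to scale linearly with voltage, power is taken to scale with the cube of voltage with no leakage term, nominal voltage is normalized to 1.0, and the voltage range `V_MIN` to `V_MAX` is hypothetical. Only the nominal operating points (2.0, 6.0) and (1.0, 1.0) come from the slides. Because the model is this simple, it is not expected to reproduce the 10% figure quoted earlier.

```python
# Nominal operating points from the slides, as (IPS, power), normalized to the little core.
BIG = (2.0, 6.0)
LITTLE = (1.0, 1.0)

# Assumption: allowed supply-voltage range, normalized so that nominal is 1.0.
V_MIN, V_MAX = 0.7, 1.3


def ips(core, v):
    """Assumption: frequency, and therefore IPS, scales linearly with voltage."""
    return core[0] * v


def power(core, v):
    """Assumption: dynamic power only, proportional to V^2 * f, hence V^3."""
    return core[1] * v ** 3


def best_pair(steps=6000):
    """Grid-search V_B, derive V_L from the isopower constraint, keep the best total IPS."""
    budget = power(BIG, 1.0) + power(LITTLE, 1.0)  # isopower line at nominal voltage
    best = (1.0, 1.0, ips(BIG, 1.0) + ips(LITTLE, 1.0))
    for i in range(steps + 1):
        v_b = V_MIN + (V_MAX - V_MIN) * i / steps
        remaining = budget - power(BIG, v_b)
        if remaining <= 0:
            continue  # the big core alone would exceed the power budget
        v_l = (remaining / LITTLE[1]) ** (1.0 / 3.0)  # spend the rest on the little core
        if not (V_MIN <= v_l <= V_MAX):
            continue  # little-core voltage falls outside the assumed range
        total = ips(BIG, v_b) + ips(LITTLE, v_l)
        if total > best[2]:
            best = (v_b, v_l, total)
    return best


if __name__ == "__main__":
    v_b, v_l, total = best_pair()
    print(f"V_B={v_b:.3f} V_L={v_l:.3f} total IPS={total:.3f}")
```

Under these assumptions the search lowers the big core's voltage and raises the little core's voltage until the little core reaches the assumed upper voltage limit, giving a gain on the order of 5% over the nominal total of 3.0 at the same power.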

Talk Outline (current section: Asymmetry-Aware Work-Stealing Runtimes)
Motivation
First-Order Modeling
Asymmetry-Aware Work-Stealing Runtimes
Evaluation
[Diagram: Work-Stealing Runtimes combined with Static Asymmetry and Dynamic Asymmetry; techniques shown: Work-Pacing, Work-Sprinting, Work-Mugging.]