Metrics and Task Scheduling Policies for Energy Saving in Multicore Computers
J. Mair, K. Leung, Z. Huang
October 21, 2010
Overview
- Why save energy?
- Tortoise and Hare
- Energy per Target (EPT)
- Saving energy with parallelism
- Inter/Intra-application Parallelism
- Small Cluster Experiment
- Conclusion
- Future Work
Why save energy?
- Power consumption is increasing at an alarming rate: 5 billion kWh in 2000 vs. an estimated 55 billion kWh in 2010 (Germany) for servers, routers and PCs
- This increases the total cost of running a system over its lifetime
- Environmental impact: 70% of the energy generated in developed countries, e.g. the USA, comes from burning fossil fuels, which releases greenhouse gases and contributes to global warming
- We want to save energy, not power, as they are not the same: energy = power × time
Tortoise and Hare
Tortoise:
- Based on the idea that slow and steady wins the race
- Saves energy by using the fewest resources possible to finish
- Uses DVFS (dynamic voltage and frequency scaling) to run at the lowest possible frequency
- Takes much longer to complete
Hare:
- Tries to complete the race as quickly as possible
- Uses the largest number of resources possible
- Runs at the highest frequency
- Enters an energy-saving mode (sleep) after finishing
Energy per Target (EPT)
- Is it more energy efficient to use less power and take longer or use more power and finish faster?
- Set a deadline for execution to finish by
- Compare energy usage for all possible configurations over this time window

$$\mathrm{EPT} = \mathrm{power}_{\mathrm{busy}} \times \mathrm{time}_{\mathrm{busy}} + \mathrm{power}_{\mathrm{idle}} \times \mathrm{time}_{\mathrm{idle}}$$

where

$$\mathrm{time}_{\mathrm{busy}} + \mathrm{time}_{\mathrm{idle}} = \mathrm{time}_{\mathrm{target}}$$

- Using a time window gives a fair comparison between configurations
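The EPT definition above can be sketched in a few lines of Python. This is only an illustration: the function name `energy_per_target` and all of the power and time values are hypothetical and are not measurements from the talk.

```python
def energy_per_target(power_busy, time_busy, power_idle, time_target):
    """Energy (J) used over a fixed time window: busy energy plus idle energy.

    power_* are in watts, time_* in seconds. The idle time is whatever is
    left of the window once the work has finished.
    """
    if time_busy > time_target:
        raise ValueError("configuration does not finish before the deadline")
    time_idle = time_target - time_busy
    return power_busy * time_busy + power_idle * time_idle


# Hypothetical configurations compared over the same 100 s window.
# "Hare": high power, finishes early, then idles at low power.
hare = energy_per_target(power_busy=300.0, time_busy=20.0,
                         power_idle=50.0, time_target=100.0)
# "Tortoise": lower power, but busy for most of the window.
tortoise = energy_per_target(power_busy=150.0, time_busy=90.0,
                             power_idle=50.0, time_target=100.0)
print(hare, tortoise)
```

Because both configurations are charged for the same window, the idle power after an early finish is included, which is what makes the comparison fair.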
Tortoise vs. Hare: Halt State
[Figure: EPT of Raytrace using halt state. x-axis: number of cores (1 to 16); y-axis: EPT (0 to 160000); one series per frequency: 2.5 GHz, 1.8 GHz, 1.3 GHz, 800 MHz]
Tortoise vs. Hare: Sleep State
[Figure: EPT of Raytrace using sleep state. x-axis: number of cores (1 to 16); y-axis: EPT (0 to 160000); one series per frequency: 2.5 GHz, 1.8 GHz, 1.3 GHz, 800 MHz]
Tortoise vs. Hare: Results
- CPU halt seems to make very little difference (2%)
- More significant savings need to be made during idle time for such a policy to be warranted
- Not using DVFS allows for energy savings
- Parallelism on its own allows for energy savings of up to 44.99%
- The Hare Policy allows for savings of up to 72% over the Tortoise
- In simple systems with low utilization and long-run deadlines, it is more energy efficient not to use complex scheduling methods
Saving Energy with Parallelism: hardware
- A common way to save energy is to use more energy-efficient hardware
- Multicore commodity processors are now almost standard
- Hardware support for DVFS and sleep states
- Parallelism can make use of idle resources on systems with low utilization by parallelising applications which were traditionally sequential
  - Some systems have utilization levels between 10% and 50%, or less than 5% in some cases
Saving Energy with Parallelism: software
- DVFS allows CPU frequency to be controlled by software
- Can parallelism be used to offset the performance loss of scaling down a CPU?
- When putting nodes or components to sleep, the predicted idle time has to be long enough to guarantee that the overhead of sleeping does not increase power consumption
- Parallelism can be incorporated into middleware, giving the benefits of:
  - Lower energy usage
  - Commodity components can be used
  - New techniques can be applied to existing systems
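The point about sleep overhead can be made concrete with a simple break-even model. The sketch below is an assumption, not the authors' method: it treats entering and leaving sleep as a fixed energy cost (`transition_energy`) and compares "stay idle" with "sleep" over the predicted idle period. All names and numbers are hypothetical.

```python
def break_even_idle_time(transition_energy, power_idle, power_sleep):
    """Idle duration (s) above which sleeping uses less energy than idling.

    Assumed model: idling costs power_idle * t, while sleeping costs
    transition_energy + power_sleep * t. The two are equal at
    t = transition_energy / (power_idle - power_sleep).
    """
    if power_sleep >= power_idle:
        raise ValueError("sleep state must draw less power than idle")
    return transition_energy / (power_idle - power_sleep)


def should_sleep(predicted_idle, transition_energy, power_idle, power_sleep):
    """True when the predicted idle time is past the break-even point."""
    threshold = break_even_idle_time(transition_energy, power_idle, power_sleep)
    return predicted_idle > threshold


# Hypothetical node: 500 J to enter and leave sleep, 100 W idle, 20 W asleep.
threshold = break_even_idle_time(500.0, 100.0, 20.0)
print(threshold, should_sleep(10.0, 500.0, 100.0, 20.0))
```

A real system would also need to account for wake-up latency against the deadline, which this model leaves out.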
Two types of Parallelism
A scheduler has two main forms of parallelism it can choose from, depending on how light or heavy the system load is and the characteristics of the applications making up the workload
- Intra-application Parallelism: application-level parallelism through the use of threads
- Inter-application Parallelism: multiple applications are executed in parallel on the same system
Exclusive and Sharing Policies
- Exclusive Policy: gives each application exclusive access to the system's resources, similar to some more traditional approaches
- Sharing Policy: uses inter-application parallelism to try to increase throughput and save energy
- Power per Speedup (PPS) gives the power required for a given level of speedup through parallelism
- Energy is for 8 instances of the Raytrace benchmark

case    energy    speedup    PPS
1-16    85645     12.44      264.19
2-8     72811     7.32       224.49
4-4     69780     3.82       215.08
8-2     68276     1.95       210.67

Table: PPS for Raytrace benchmark at 2.5 GHz
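The energy column of the table can be turned into relative savings with a short calculation. The sketch below copies the energy values from the table and, as an assumption, treats case 1-16 as the Exclusive Policy baseline; the helper name `saving_percent` is hypothetical.

```python
# Energy for 8 Raytrace instances at 2.5 GHz, copied from the PPS table.
energy_by_case = {"1-16": 85645, "2-8": 72811, "4-4": 69780, "8-2": 68276}


def saving_percent(baseline, value):
    """Energy saved relative to the baseline, as a percentage."""
    return 100.0 * (baseline - value) / baseline


# Assumption: case 1-16 stands for the Exclusive Policy.
baseline = energy_by_case["1-16"]
savings = {case: saving_percent(baseline, energy)
           for case, energy in energy_by_case.items()}

for case, saving in savings.items():
    print(f"{case}: {saving:.2f}% less energy than case 1-16")
```

Under that assumption, the largest saving comes from case 8-2, at roughly 20%.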
Speedup per Watt (SPW)
- Similar to the commonly used metric Performance per Watt (PPW)
- PPW measures the performance of the hardware
- SPW measures the performance of parallel applications
- Application metrics are suited to scheduling, as they look at the application's power characteristics

$$\mathrm{SPW} = \frac{\mathrm{speedup}}{\mathrm{power}}$$

where

$$\mathrm{speedup} = \frac{\mathrm{time}_{\text{fastest sequential}}}{\mathrm{time}_{\text{current configuration}}}$$
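As a minimal sketch of the SPW definition, the two ratios can be written as small Python functions. The function names and the example timings and power draw are hypothetical, chosen only to show the arithmetic.

```python
def speedup(time_fastest_sequential, time_current_configuration):
    """Speedup of a configuration relative to the fastest sequential run."""
    return time_fastest_sequential / time_current_configuration


def speedup_per_watt(time_fastest_sequential, time_current_configuration,
                     power_watts):
    """SPW: speedup achieved per watt drawn by the configuration."""
    return (speedup(time_fastest_sequential, time_current_configuration)
            / power_watts)


# Hypothetical run: 120 s sequentially, 15 s in parallel while drawing 200 W.
s = speedup(120.0, 15.0)
spw = speedup_per_watt(120.0, 15.0, 200.0)
print(s, spw)
```

A scheduler could evaluate this for each candidate core count and frequency and prefer the configuration with the highest SPW.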
SPW Graphs
[Figure: Speedup per Watt for Mandelbrot. x-axis: number of cores (0 to 16); y-axis: Speedup per Watt (0 to 1); one series per frequency: 2.5 GHz, 1.8 GHz, 1.3 GHz, 800 MHz]
[Figure: Speedup per Watt for Raytrace. Same axes and series]
Small Cluster: Policies
- Non Power-Aware (N): runs each instance at 2.5 GHz, using all 16 cores. During idle time the CPU is scaled down to 800 MHz to save power
- Hare Policy (H) + Global task migration (G): migrates tasks to the fewest nodes possible and runs at 2.5 GHz on each node. Sleeps while idle
- Sharing Policy (S) + Hare Policy (H): shares the resources between all instances on a node, running at 2.5 GHz and sleeping when idle
- Sharing Policy (S) + Hare Policy (H) + Global task migration (G): the same as SH, except that the fewest nodes possible become fully loaded
Small Cluster: Results
- 10 instances of the Raytrace benchmark are to be scheduled
- The time window for each node is 210 seconds
- No more than 8 instances can be scheduled on a single node

policies   Node 1    Node 2    Node 3    Node 4    energy (J)
           I    C    I    C    I    C    I    C
-          2    -    2    -    2    -    4    -    -
N          2    16   2    16   2    16   4    16   233294.18
HG         8    16   2    16   -    -    -    -    115675.98
SH         2    8    2    8    2    8    4    4    99321.95
SHG        8    2    2    8    -    -    -    -    96509.55

Table: Exploration of multiple scheduling policies. N = Non power-aware, G = Global task migration, S = Sharing Policy, H = Hare Policy; I = instances on the node, C = cores per instance
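The HG and SHG rows place the ten instances on two nodes instead of four. A minimal sketch of such a packing step is given below, assuming a simple greedy fill that loads one node to its limit before using the next. The function name `pack_instances` is hypothetical, and the talk does not say how the migration is actually implemented.

```python
def pack_instances(instances_per_node, max_per_node):
    """Greedily repack instances onto the fewest nodes.

    Returns the new instance count for each node, filling nodes in order
    up to max_per_node. Nodes left with 0 instances can be put to sleep.
    """
    remaining = sum(instances_per_node)
    packed = []
    for _ in instances_per_node:
        placed = min(remaining, max_per_node)
        packed.append(placed)
        remaining -= placed
    if remaining > 0:
        raise ValueError("not enough node capacity for all instances")
    return packed


# Initial placement from the table (2, 2, 2, 4) with at most 8 per node.
packed = pack_instances([2, 2, 2, 4], max_per_node=8)
print(packed)  # [8, 2, 0, 0]
```

The result matches the instance counts in the HG and SHG rows, where Node 3 and Node 4 are left empty.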
Conclusion
- Parallelism can be used to save energy in a system
- Savings can be further increased when combined with DVFS
- Sharing system resources is better than exclusive use
- A heavily loaded system using the Sharing Policy can save up to 20% energy over the more traditional Exclusive Policy
- On a lightly loaded system the Hare Policy can reduce energy usage by up to 72%
Future Work
- Explore the Sharing and Tortoise Policies through the use of additional benchmarks which have different characteristics
- Rerun the small cluster experiment on a cluster of a sufficient size
- Application of the policies in real systems by automatically detecting the features of programs through binary instrumentation
- Change the Sharing Policy so that it handles cases with idle cores and minimizes the overall PPS for the entire system