Star-Cap: Cluster Power Management Using Software-Only Models. John D. Davis, Suzanne Rivoire (rivoire@sonoma.edu), Moisés Goldszmidt (Microsoft Research). ICPP Workshop on Power-aware Algorithms, Systems, and Architectures (PASA), Sept. 10, 2014


  1. Star-Cap: Cluster Power Management Using Software-Only Models
     John D. Davis, Suzanne Rivoire (rivoire@sonoma.edu), Moisés Goldszmidt (Microsoft Research)
     ICPP Workshop on Power-aware Algorithms, Systems, and Architectures (PASA), Sept. 10, 2014

  2. Power capping motivation
     o Reduce waste from overprovisioning
     o Provision for actual maximum power instead of the sum of nameplate power
     o Have a mechanism to throttle power consumption
     o Major server manufacturers offer this feature; Intel offers it at the chip level (RAPL)
     [Femal ICAC '05, Ranganathan ISCA '06, Lefurgy ICAC '07, …]

  3. The problem with vendor solutions
     o Additional management hardware, additional cost, or limited to the chip
     o Compare to the trend of customized bare-bones servers …
     o … and “wimpy nodes” for data-intensive workloads
     Goal: eliminate the cost of hardware instrumentation

  4. Outline
     o Star-Cap overview
     o Software-only power models
     o Power capping schemes
     o Evaluation

  5. Two-level scheme
     [Diagram: Power Management Policies at the top level, connected to Machine 1 … Machine N, each with a Power Model and a Power Control component]
     o Top level: determine node power budgets
     o Node level: enforce and report

  6. Sensors and Actuators
     o Sensors: OS-level, architecture-independent performance counters
     o Actuators:
        - For this work, DVFS states
        - Nothing prevents other mechanisms from being used

  7. Outline
     o Star-Cap overview
     o Software-only power models
     o Power capping schemes
     o Evaluation

  8. OS-level counters
     [Diagram: OS-level counters → f(x) → Node AC power]
     o Full-system, not a specific component
     o OS-level, architecture-independent counters
     o Piecewise quadratic model, fit with MARS [Davis et al., IISWC '12]
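
MARS builds its models from hinge functions of the form max(0, x - knot) and products of them, which is how a degree-2 fit ends up piecewise quadratic. The sketch below only illustrates that shape for f(x). The counter names, knots, and coefficients are hypothetical and are not taken from the slides; the 25 W intercept simply echoes the laptop idle power listed on the hardware slide.

```python
def hinge(x, knot, sign=1):
    """MARS-style hinge basis function: max(0, sign * (x - knot))."""
    return max(0.0, sign * (x - knot))


def predict_power(cpu_util_pct, disk_time_pct):
    """Toy piecewise quadratic power model (all constants are hypothetical)."""
    cpu_term = hinge(cpu_util_pct, 10.0)     # active only above 10% CPU utilization
    disk_term = hinge(disk_time_pct, 20.0)   # active only above 20% disk time
    return (
        25.0                                 # intercept, roughly idle power in watts
        + 0.15 * cpu_term                    # linear piece for CPU
        + 0.05 * disk_term                   # linear piece for disk
        + 0.001 * cpu_term * disk_term       # product of hinges gives the quadratic piece
    )
```

Below both knots every hinge is zero, so the model returns only the intercept; each hinge contributes once its counter crosses the knot.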

  9. Model training process
     1. ETW (Event Tracing for Windows)
        - Architecture counters: ~250
        - Processor, physical and logical disk, network, memory, filesystem
     2. Remove redundant counters: ~45
        - Correlation matrix (|correlation| > 0.95)
        - Performance counter definitions
     3. Select features: ~10
        - R glmpath with L1 regularization
        - Stepwise refinement
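
The redundancy-removal step can be sketched as a greedy filter over pairwise Pearson correlations. This is a minimal illustration, not the authors' pipeline: the greedy keep-first ordering and the counter names in the usage note are assumptions, and the later glmpath feature selection is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def drop_redundant(counters, threshold=0.95):
    """Keep a counter only if |r| <= threshold against every counter already kept."""
    kept = []
    for name, series in counters.items():
        if all(abs(pearson(series, counters[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

For example, given hypothetical traces `{"cpu": [1, 2, 3, 4, 5], "cpu_copy": [2, 4, 6, 8, 10.1], "disk": [5, 1, 4, 2, 3]}`, the filter keeps `cpu` and `disk` and drops `cpu_copy`, which is almost perfectly correlated with `cpu`.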

  10. Outline
      o Star-Cap overview
      o Software-only power models
      o Power capping schemes
      o Evaluation

  11. Star-Cap Overview
      o Inputs to all schemes
         - Target node-level power consumption (set at top level)
         - Current power (modeled or measured)
         - List of available frequency states
      o Outputs
         - List of frequency states available to the OS
         - Let the current OS policy select from the available states

  12. Threshold-based
      o If P_current < P_lo
         - Make the next-highest frequency state available
      o If P_current > P_hi
         - Remove the highest frequency state from the available list
      o Our thresholds:
         - P_hi = 95% of cap
         - P_lo = 90% of cap
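
One control step of the threshold rule might look like the sketch below. The function name and the list representation are illustrative assumptions; the frequency states are the four listed on the hardware slide, and the lowest state is assumed to stay available at all times.

```python
STATES = [100, 94, 82, 70]  # percent of max frequency, highest first


def threshold_step(p_current, cap, available, all_states=STATES,
                   hi_frac=0.95, lo_frac=0.90):
    """Return the frequency states to expose to the OS after one control step.

    `available` is assumed to be a suffix of `all_states` (highest first).
    """
    p_hi = hi_frac * cap
    p_lo = lo_frac * cap
    n = len(available)
    if p_current > p_hi and n > 1:
        n -= 1                      # too close to the cap: hide the highest state
    elif p_current < p_lo and n < len(all_states):
        n += 1                      # comfortable headroom: expose the next-highest state
    return all_states[len(all_states) - n:]
```

With a 42 W cap the thresholds are 39.9 W and 37.8 W, so a 41 W reading hides the 100% state, a 36 W reading restores it, and a 39 W reading leaves the list unchanged.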

  13. Reactive Capping (ReCap)
      o Adjust frequency state based on P_current
      o After making a change, wait for it to settle before making another (reduces oscillations)
      o Three versions:
         - M-ReCap: P_current is measured power
         - L-ReCap: P_current is predicted by a CPU-utilization-based linear model
         - C-ReCap: P_current is predicted by the quadratic power model from the previous section
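
The settle-then-adjust behavior can be sketched as a small controller with a cooldown counter. This is an assumption-laden illustration: the slides do not give the settling time (`settle_steps` is hypothetical), the 95%/90% thresholds are borrowed from the threshold-based slide, and `linear_power` with its 25 W and 20 W constants is only a stand-in for an L-ReCap style estimate. The three variants differ only in where the `p_current` argument comes from.

```python
STATES = [100, 94, 82, 70]  # percent of max frequency, highest first


def linear_power(cpu_util, idle_w=25.0, dyn_w=20.0):
    """Hypothetical linear estimate: idle power plus a term linear in CPU utilization (0..1)."""
    return idle_w + dyn_w * cpu_util


class ReCapController:
    def __init__(self, cap_w, states=STATES, settle_steps=3):
        self.p_hi = 0.95 * cap_w
        self.p_lo = 0.90 * cap_w
        self.states = states
        self.n_available = len(states)   # number of states exposed, counted from the lowest
        self.settle_steps = settle_steps
        self.cooldown = 0

    def available(self):
        return self.states[len(self.states) - self.n_available:]

    def step(self, p_current):
        """Feed one power sample (measured or modeled); return the exposed states."""
        if self.cooldown > 0:            # still settling after the last change
            self.cooldown -= 1
            return self.available()
        before = self.n_available
        if p_current > self.p_hi and self.n_available > 1:
            self.n_available -= 1
        elif p_current < self.p_lo and self.n_available < len(self.states):
            self.n_available += 1
        if self.n_available != before:   # a change was made: start the settle window
            self.cooldown = self.settle_steps
        return self.available()
```

Calling `step` with a measured sample corresponds to M-ReCap, while calling it with `linear_power(util)` or with a quadratic-model prediction corresponds to the L- and C- variants.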

  14. Proactive Capping (ProCap)
      o Use the quadratic power model to predict P_current
      o Before changing available frequencies, predict P_next
         - Using the next allowable frequency state
         - Keeping all other counters constant (oversimplification!)
      o If P_next would violate the threshold, don't bother adjusting the available frequencies
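
The look-ahead check can be sketched as follows. Several details are assumptions rather than facts from the slides: the model is any callable over a counter dictionary, `toy_model` and the `freq_pct` and `cpu_util` keys are hypothetical, the 95%/90% thresholds are reused from the threshold-based scheme, and the prediction is only used to veto step-ups (a node already over P_hi is always stepped down).

```python
STATES = [100, 94, 82, 70]  # percent of max frequency, highest first


def toy_model(counters):
    """Hypothetical stand-in for the quadratic power model (watts)."""
    return 25.0 + 20.0 * counters["cpu_util"] * (counters["freq_pct"] / 100.0) ** 2


def procap_step(model, counters, cap, n_available, all_states=STATES,
                hi_frac=0.95, lo_frac=0.90):
    """Return how many states (counted from the lowest) to expose next."""
    p_hi = hi_frac * cap
    p_lo = lo_frac * cap
    p_current = model(counters)
    top = len(all_states) - n_available       # index of the highest exposed state
    if p_current > p_hi and n_available > 1:
        return n_available - 1                # over the threshold: step down
    if p_current < p_lo and n_available < len(all_states):
        # Predict P_next at the next-highest state, holding other counters constant.
        trial = dict(counters, freq_pct=all_states[top - 1])
        if model(trial) <= p_hi:
            return n_available + 1            # predicted to stay under P_hi: step up
    return n_available                        # otherwise leave the list alone
```

With the toy model at full utilization and only the 70% state exposed, a 42 W cap allows the step up to 82% (predicted about 38.4 W against a 39.9 W threshold), whereas a 40 W cap does not (38.4 W exceeds 38 W), so the change is skipped instead of being made and then undone.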

  15. Outline
      o Star-Cap overview
      o Software-only power models
      o Power capping schemes
      o Evaluation

  16. Workloads
      [Figure panels: Primes, Staticrank, Sort, Wordcount]
      o Primes (CPU)
      o Staticrank (Net)
      o Sort (Disk, Net)
      o Wordcount (Disk)
      o All run across 5 homogeneous nodes

  17. Hardware Systems
      Cluster               Intel Core 2 Duo (laptop)        AMD Opteron (server)
      CPU                   Intel Core 2 Duo X2, 2.26 GHz    AMD Opteron 2X4, 2.0 GHz
      Storage               SSD                              HDD
      Idle Power (W)        25                               135
      Dyn Power range (W)   20                               55
      OS                    Windows Server 2008 R2           Windows Server 2008 R2

  19. Hardware Systems
      Cluster               Intel Core 2 Duo (laptop)        AMD Opteron (server)
      CPU                   Intel Core 2 Duo X2, 2.26 GHz    AMD Opteron 2X4, 2.0 GHz
      Storage               SSD                              HDD
      Idle Power (W)        25                               135
      Dyn Power range (W)   20                               55
      OS                    Windows Server 2008 R2           Windows Server 2008 R2
      4 frequency states: 100%, 94%, 82%, 70%

  20. Power profiles
      [Figure: server power (W) vs. time (hrs) for Node-02 through Node-05 running WordCount, Prime, Sort, and PageRank, with no frequency cap and with 94%, 82%, and 70% frequency caps]
      o If DVFS is the only actuator, some power budgets will be much easier to deal with than others.

  21. Reactive capping: modeled vs. measured power
      [Figure: power traces; legend: M-ReCap, C-ReCap, ProCap]
      o Low power cap (38 W)
      o Graph shows 1 node
      o Blue: ReCap based on measured power
      o Gray: ReCap based on model power

  22. Reactive vs. proactive capping
      [Figure, panels (A) and (B): power traces for WordCount, Prime, Sort, and PageRank]
      o Same power cap
      o Blue: ReCap based on measured power
      o Purple: ProCap

  23. Higher power cap
      [Figure 4, panels (A) to (C): server power (W) vs. time (s) for WordCount and Prime; legend: M-ReCap, L-ReCap, ProCap, linear model window, measured power window, cluster model prediction, power cap threshold, power cap. Caption: "42W power cap examples for WordCount and Primes using the Reactive Power Capping with (A) (M-ReCap) measured power and …"]
      o 42 W cap
      o Left: M-ReCap; Center: L-ReCap; Right: ProCap
      o Model accuracy matters!

  24. Conclusion
      o Demonstrated the potential of high-accuracy, software-only models for server-level power capping
      o Suitable for low-power, low-cost “wimpy nodes”
      o Extensible to other power management hooks and policies

  25. Backup slides

  26. Dynamic Range Error
      o Report error as a percent of the dynamic range; idle power shouldn't count.
      [Figure: cluster power, with the dynamic range marked between idle power and max power]
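
A minimal sketch of the metric follows, assuming the error is aggregated as a mean absolute error before being normalized by the dynamic range (the slide does not say which aggregate is used). The function name and sample values are illustrative only.

```python
def dynamic_range_error(predicted, measured, idle_power, max_power):
    """Mean absolute prediction error as a percentage of (max_power - idle_power)."""
    dynamic_range = max_power - idle_power
    mae = sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)
    return 100.0 * mae / dynamic_range
```

For a node with 25 W idle power and a 20 W dynamic range, a mean error of 1.5 W is a 7.5% dynamic range error, whereas dividing by a total power near 35 W would report roughly 4% and make the model look better than it is.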

  27. Model Accuracy
      [Figure: bar chart of average DRE (0% to 16%) by modeling technique (Linear, Piecewise Linear, Quadratic, Switching Linear); legend, model features: CPU utilization; CPU utilization and MHz; cluster specific; general]

  28. Model Features
      o Automatically selected from over 200 OS counters
      o Processor: utilization, frequency
      o Memory: cache faults/sec; pool nonpaged allocations
      o Disk: total disk time %
      o Filesystem and virtual memory: file system pin read/sec, peak page file bytes
