
Context: more and more devices are powered by battery

Giorgio Buttazzo, g.buttazzo@sssup.it, Scuola Superiore Sant'Anna


  1. Context. More and more devices are powered by battery. Required features: high performance and long lifetime.

  2. Contrasting objectives. The problem is not trivial, because performance and lifetime have opposite energy requirements: high performance requires high speed and therefore high power, whereas long lifetime requires low energy and therefore low power.

     [Figure: progress of components over the years 1990-2010, on a logarithmic scale from 1 to 10^4.]

  3. How to increase lifetime? Considering the limited progress of batteries, the only hope to increase system lifetime is to reduce energy consumption by proper power management.
     - In real life, and also in embedded systems, a lot of energy is wasted due to bad power management.
     - Research work is needed to optimize resource usage and reduce waste.

     The same problem arises in data centers.

  4. Consumption in data centers.

     [Figure: pie chart of data center consumption with shares of 33%, 8%, and 59%; the segment labels are not recoverable.]

     [Figure: total power and server power (W, axis from 4000 to 12000) versus year, 2000-2016.]

  5. How to reduce?
     - Space cooling: outside air economizers, expanded temperature setpoints, efficient cooling equipment.
     - Electrical losses: efficient hardware, optimized distribution, efficient voltage regulators.
     - Servers: virtualization.

     [Figure: consumption in data centers, total power and server power (axis from 4000 to 12000) versus year, 2000-2016, showing the impact of virtualization alone and of virtualization combined with power efficiency.]

  6. Power model. Power dissipation in CMOS integrated circuits is mainly due to two causes:
     - Dynamic power ($P_d$), consumed during operation;
     - Static power ($P_s$), consumed even when the circuit is not switching.

     [Figure: CMOS inverter, with a P-MOS transistor connected to $V_{dd}$, an N-MOS transistor connected to Gnd, input $V_{in}$, output $V_{out}$, and load capacitance $C_L$.]

  7. Dynamic power. Dynamic power has two components:

     1. Switching power $P_{sw}$, consumed during a logic state change (1 → 0) to charge the load capacitance $C_L$. Note that during the opposite transition (0 → 1) the capacitance is discharged through the N-MOS. The switching power can be expressed by

        $$P_{sw} = C_L \cdot f \cdot V_{dd}^2$$

        where $f$ is the clock frequency.
     2. Short circuit power $P_{sc}$, consumed for a very short time during the ramp time of the input signal, when the input is at the threshold voltage and both the P-MOS and the N-MOS are ON:

        $$P_{sc} = V_{dd} \cdot I_{sc}$$

     Hence, the total dynamic power is dominated by the switching power:

     $$P_d = P_{sw} + P_{sc} \approx C_L \cdot f \cdot V_{dd}^2$$
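
As a rough numerical illustration of the switching-power formula above, the following sketch evaluates it at one operating point and then halves the frequency and the voltage in turn. The capacitance, frequency, voltage, and current values are hypothetical, chosen only to make the scaling visible.

```python
def switching_power(c_load: float, freq: float, v_dd: float) -> float:
    """P_sw = C_L * f * V_dd^2, in watts (inputs in farads, hertz, volts)."""
    return c_load * freq * v_dd ** 2


def dynamic_power(c_load: float, freq: float, v_dd: float, i_sc: float = 0.0) -> float:
    """P_d = P_sw + P_sc, with the short circuit term P_sc = V_dd * I_sc."""
    return switching_power(c_load, freq, v_dd) + v_dd * i_sc


# Hypothetical operating point: 1 nF switched capacitance, 100 MHz, 1.2 V.
p_full = switching_power(1e-9, 100e6, 1.2)
# Halving only the frequency halves P_sw; halving only V_dd quarters it.
p_half_f = switching_power(1e-9, 50e6, 1.2)
p_half_v = switching_power(1e-9, 100e6, 0.6)
print(p_full, p_half_f, p_half_v)
```

The quadratic dependence on $V_{dd}$ is the reason voltage reduction pays off more than frequency reduction alone.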

  8. Static power. Static power $P_s$ is due to a quantum phenomenon where mobile charge carriers (electrons or holes) tunnel through an insulating region, creating a leakage current $I_{lk}$:

     $$P_s = V_{dd} \cdot I_{lk}$$

     - Static power consumption is independent of the switching activity and is always present if the circuit is on.
     - As devices scale down in size, gate oxide thickness decreases, resulting in larger leakage current.

     [Figure: CMOS inverter circuit and cross-section, showing the P-MOS and N-MOS transistors (source, gate, drain), input, output, $V_{dd}$, Gnd, the p+ and n+ regions, the n-well, and the p-substrate.]

  9. Dynamic vs. static power.

     [Figure: normalized dynamic power and static (leakage) power on a logarithmic scale from 10^-6 to 10^2, versus year (1990-2020) and gate length (500, 350, 250, 180, 130, 90, 65, 45, 22 nm). Static power becomes significant at 90 nm.]

     In summary:
     - The dynamic power consumption increases with the supply voltage and with the clock frequency:

       $$P_d \approx C_L \cdot f \cdot V_{dd}^2$$

     - Moreover, the supply voltage also affects the circuit delay $D$ (hence the maximum clock frequency):

       $$D \propto \frac{V_{dd}}{(V_{dd} - V_t)^2}$$

       where $V_t$ is the threshold voltage. Note that $D$ decreases for higher $V_{dd}$ and lower $V_t$.
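
The interplay between the two relations in the summary can be sketched numerically. The code below assumes a proportionality constant of 1 for the delay and assumes that the maximum clock frequency scales as the inverse of the delay; the voltages are hypothetical.

```python
def relative_delay(v_dd: float, v_t: float) -> float:
    """Circuit delay up to a constant factor: V_dd / (V_dd - V_t)^2."""
    if v_dd <= v_t:
        raise ValueError("V_dd must exceed the threshold voltage V_t")
    return v_dd / (v_dd - v_t) ** 2


def scaled_power_ratio(v_new: float, v_ref: float, v_t: float) -> float:
    """Dynamic power at v_new relative to v_ref, each run at its own maximum clock."""
    # Assumption: the maximum clock frequency is proportional to 1 / delay.
    freq_ratio = relative_delay(v_ref, v_t) / relative_delay(v_new, v_t)
    # P_d scales with f * V_dd^2 (the load capacitance cancels out).
    return freq_ratio * (v_new / v_ref) ** 2


# Hypothetical values: lower V_dd from 1.2 V to 0.9 V with V_t = 0.3 V.
print(relative_delay(0.9, 0.3) / relative_delay(1.2, 0.3))
print(scaled_power_ratio(0.9, 1.2, 0.3))
```

Under these assumptions, lowering the voltage slows the circuit down but reduces dynamic power by a larger factor, which is the trade-off that voltage and frequency scaling exploits.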

  10. Dynamic Voltage/Frequency scaling. Hence, the dynamic power consumed by a system can be controlled by scaling the clock frequency and the voltage at which the processor operates.

      [Figure: dynamic power as a function of $f_{max}$ and $V_{dd}$, ranging from long lifetime and low performance to short lifetime and high performance.]

      Dynamic Power Management. On the other hand, static power can be controlled by turning the CPU off, or putting it in a sleep state. The overhead to go to sleep is the break-even time ($B_e$): the deeper the sleep state, the longer the overhead.

      $$B_e(a,s) = \delta_{as} + \delta_{sa}$$

      where $\delta_{as}$ is the active-to-sleep overhead and $\delta_{sa}$ is the sleep-to-active overhead.

      [Figure: $V_{dd}$ versus time for the states active, sleep1, sleep2, and OFF, showing the active-to-sleep and sleep-to-active overheads.]
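
A minimal sketch of how the break-even time could drive the choice of a sleep state. The selection policy (take the lowest-power state whose break-even time fits in the idle interval), the `SleepState` type, and all numbers are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SleepState:
    name: str
    power: float     # power drawn while in this state (W), hypothetical
    delta_as: float  # active-to-sleep overhead (s)
    delta_sa: float  # sleep-to-active overhead (s)

    @property
    def break_even(self) -> float:
        """Break-even time: sum of the two transition overheads."""
        return self.delta_as + self.delta_sa


def pick_state(idle: float, states: Sequence[SleepState]) -> Optional[SleepState]:
    """Return the lowest-power state whose break-even time fits in the idle interval."""
    usable = [s for s in states if s.break_even <= idle]
    return min(usable, key=lambda s: s.power) if usable else None


# Hypothetical states: deeper states draw less power but take longer to enter and leave.
STATES = [
    SleepState("sleep1", 0.20, 0.001, 0.001),
    SleepState("sleep2", 0.05, 0.005, 0.005),
    SleepState("off", 0.00, 0.050, 0.050),
]

for idle in (0.001, 0.02, 0.5):
    chosen = pick_state(idle, STATES)
    print(idle, chosen.name if chosen else "stay active")
```

An idle interval shorter than every break-even time leaves the processor active, since the transitions alone would not fit.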

  11. Minimizing energy. In real-time systems, the problem is to minimize energy consumption while still guaranteeing a desired level of performance (schedule feasibility).

      [Figure: power and performance as functions of $f_{max}$ and $V$, with a minimum voltage $V_{min}$.]

      Low-power features. To exploit such a possibility, modern processors are designed to:
      - work under different operating modes, each characterized by a power consumption $P$, voltage $V$, and clock frequency $f$: $(P_1, V_1, f_1), (P_2, V_2, f_2), \ldots, (P_m, V_m, f_m)$;
      - switch between two modes $j$ and $k$ with a power consumption $P_{jk}$ and a time overhead $\delta_{jk}$;
      - have different low-power states, each characterized by a specific power consumption and transition overheads: $S_1(P_1, \delta_{1as}, \delta_{1sa}), \ldots, S_L(P_L, \delta_{Las}, \delta_{Lsa})$.

  12. Energy-saving methods.

      DVFS (Dynamic Voltage and Frequency Scaling): the consumed energy is varied by acting on the supply voltage and clock frequency.

      [Figure: power versus time, comparing execution at full speed, P(100 MHz), with execution at reduced speed, P(50 MHz); P(sleep) is shown as the lowest level.]

      DPM (Dynamic Power Management): the consumed energy is varied by exploiting the inactive low-power states of the processor.

      [Figure: power versus time, alternating full-speed execution at P(100 MHz) with sleep intervals at P(sleep).]

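  13. Energy-saving methods.

      Hybrid (DVFS + DPM): the consumed energy is varied by exploiting both techniques in different time intervals.

      [Figure: power versus time, with intervals at full speed, DPM intervals, and a DVFS interval, against the levels P(100 MHz), P(50 MHz), and P(sleep).]

      Normalized speed. To make the analysis more general, instead of using the absolute clock frequencies $f_1, f_2, \ldots, f_m$, it is better to use a normalized speed $s \in [0,1]$:

      $$s = \frac{f}{f_m}$$

      so that the maximum frequency $f_m$ corresponds to $s_m = 1$.

      [Figure: normalized speeds $s_1, s_2, s_3, \ldots, s_m = 1$ plotted against the frequencies $f_1, f_2, f_3, \ldots, f_m$.]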
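
The normalization can be sketched in a few lines. The frequency values are hypothetical (the 50 MHz and 100 MHz levels echo the figure labels; 25 MHz is added for illustration).

```python
def normalized_speeds(freqs):
    """Map clock frequencies f_1..f_m to s_k = f_k / f_m, where f_m is the maximum."""
    f_max = max(freqs)
    return [f / f_max for f in sorted(freqs)]


print(normalized_speeds([25e6, 50e6, 100e6]))  # prints [0.25, 0.5, 1.0]
```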

  14. Notation. When dealing with processors with variable speed, the schedule is often represented in a bi-dimensional diagram, where time is on the x-axis and normalized speed is on the y-axis. For instance, the following schedule represents 3 jobs of a periodic task $\tau_i$ with period $T_i = 6$ and WCET at the maximum speed $C_i(1) = 1$, executed at three decreasing speeds.

      [Figure: schedule of $\tau_i$ with speed on the y-axis (levels 0.5 and 1) and time on the x-axis (marked at 0, 2, 8, 14, and 20); the three jobs run at $s = 1$, $s = 0.5$, and $s = 0.25$.]

      Power model. To take different components into account, power consumption can be modeled as follows [Martin & Siewiorek, 2001]:

      $$P(s) = K_3 s^3 + K_2 s^2 + K_1 s + K_0$$

      - $K_3$ expresses the weight of the power components that vary with both voltage and frequency.
      - $K_2$ captures the nonlinearity of DC-DC regulators in the range of the output voltage.
      - $K_1$ is related to the hardware components that can only vary the clock frequency (but not the voltage).
      - $K_0$ represents the power consumed by the components that are not affected by the processor speed.
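
To see what the polynomial power model implies for the three speeds used in the schedule example, the sketch below computes the energy of a single job. It assumes that the execution time scales as $C_i(1)/s$, ignores the energy spent while idle, and uses hypothetical coefficients.

```python
def power(s: float, k3: float, k2: float, k1: float, k0: float) -> float:
    """P(s) = K3*s^3 + K2*s^2 + K1*s + K0 for a normalized speed s."""
    return k3 * s ** 3 + k2 * s ** 2 + k1 * s + k0


def job_energy(s: float, c1: float, coeffs) -> float:
    """Energy of one job: P(s) times its execution time, assumed to be c1 / s."""
    return power(s, *coeffs) * (c1 / s)


COEFFS = (1.0, 0.0, 0.0, 0.1)  # hypothetical (K3, K2, K1, K0)
for s in (1.0, 0.5, 0.25):
    print(s, job_energy(s, 1.0, COEFFS))
```

With these particular coefficients the job costs slightly more energy at $s = 0.25$ than at $s = 0.5$: the speed-independent term $K_0$ is paid for the whole, longer execution time, so the slowest speed is not necessarily the cheapest.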

  15. WCET scaling. Let $CC_i$ be the number of clock cycles required by $\tau_i$ and $C_i$ the task computation time. Then

      $$C_i(s) = \frac{CC_i}{s} = \frac{C_i^1}{s}$$

      where $C_i^1 = C_i(s = 1) = CC_i$ is the shortest execution time, achievable at the maximum speed.

      [Figure: $C_i(s)$ versus speed, decreasing through $C_i(s_1)$, $C_i(s_2)$, $C_i(s_3)$ at speeds $s_1 < s_2 < s_3$.]

      In practice, several operations are performed on I/O devices and memory units that do not share the clock with the CPU. For instance, hard disk operations mostly depend on the bus clock frequency, the hard disk read/write speed, and the interference caused by other tasks accessing the bus. Hence, a more realistic model for the task WCET is:

      $$C_i(s) = C_i^{fix} + \frac{C_i^{var}}{s}$$
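
The two WCET models can be compared side by side with a short sketch. The split of the execution time into a fixed and a speed-dependent part uses hypothetical numbers.

```python
def wcet_simplified(s: float, c1: float) -> float:
    """All of the execution time scales with speed: C(s) = C^1 / s."""
    return c1 / s


def wcet_realistic(s: float, c_fix: float, c_var: float) -> float:
    """Only the CPU-bound part scales with speed: C(s) = C_fix + C_var / s."""
    return c_fix + c_var / s


# Hypothetical task: 0.4 time units of speed-independent work, 0.6 that scale.
C_FIX, C_VAR = 0.4, 0.6
C1 = C_FIX + C_VAR  # execution time at s = 1

for s in (1.0, 0.5, 0.25):
    print(s, wcet_realistic(s, C_FIX, C_VAR), wcet_simplified(s, C1))
```

At $s = 1$ the two models coincide; at lower speeds the simplified model gives the larger value, because it also stretches the part of the work that does not depend on the CPU clock.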

  16. WCET scaling. Note, however, that:
      - $C_i(s) = C_i^{fix} + \frac{C_i^{var}}{s}$ is more precise, but it complicates the analysis;
      - $C_i(s) = \frac{C_i^1}{s}$ is safe, because it represents an upper bound of the previous model.

      In fact, since $C_i^1 = C_i(1) = C_i^{fix} + C_i^{var}$, for any $s \le 1$ we have:

      $$\frac{C_i^1}{s} = \frac{C_i^{fix}}{s} + \frac{C_i^{var}}{s} \ge C_i^{fix} + \frac{C_i^{var}}{s}$$

      Utilization scaling. Note that, if using the simplified model $C(s) = C^1 / s$:

      $$U(s) = \sum_{i=1}^{n} \frac{C_i(s)}{T_i} = \sum_{i=1}^{n} \frac{C_i^1}{s\,T_i} = \frac{U^1}{s}$$

      where

      $$U^1 = \sum_{i=1}^{n} \frac{C_i^1}{T_i}$$

      is the task set utilization at $s_{max} = 1$.

      [Figure: $U(s)$ versus $s$, decreasing from $s_{min}$ to $s_{max} = 1$, where it equals $U^1$.]
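
A small sketch of the utilization scaling. The `min_speed` helper goes one step beyond the slides: it assumes a feasibility test of the form $U(s) \le 1$, which holds for EDF with deadlines equal to periods but is an assumption here. The task set is hypothetical, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def utilization(tasks, s=Fraction(1)):
    """U(s) = sum of C_i^1 / (s * T_i), i.e. U^1 / s under the simplified model."""
    return sum(Fraction(c1) / (s * Fraction(t)) for c1, t in tasks)


def min_speed(tasks, s_min=Fraction(0)):
    """Smallest speed with U(s) <= 1, clamped to the lowest available speed s_min."""
    u1 = utilization(tasks)
    if u1 > 1:
        raise ValueError("task set is not feasible even at s = 1")
    return max(u1, Fraction(s_min))


# Hypothetical task set as (C_i^1, T_i) pairs.
TASKS = [(1, 6), (2, 8), (1, 12)]
print(utilization(TASKS))                    # U^1
print(min_speed(TASKS))                      # slowest speed keeping U(s) <= 1
print(utilization(TASKS, min_speed(TASKS)))  # utilization at that speed
```

Because $U(s) = U^1 / s$, the condition $U(s) \le 1$ reduces to $s \ge U^1$, so under this assumption the slowest admissible speed is simply the utilization at full speed.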
