
Power Management in Wireless SOCs - PowerPoint PPT Presentation

Jan M. Rabaey, Scientific Co-Director BWRC, Director GSRC, EECS Dept., Univ. of California, Berkeley. With contributions of M. Sheets and H. Qin.


  1. Power Management in Wireless SOCs
     Jan M. Rabaey, Scientific Co-Director BWRC, Director GSRC, EECS Dept., Univ. of California, Berkeley
     With contributions of M. Sheets and H. Qin

  2. The Leakage Challenge (1)
     [Figure: two plots over the years 2002 to 2016. One shows voltage [V] (V_DD and V_TH) together with technology node [nm]; the other shows power [µW/gate] for P_DYNAMIC and P_LEAK (subthreshold leak, i.e. active leakage).]
     Source: T. Sakurai, ISSCC 03

  3. The Leakage Challenge (2)
     [Figure: normalized frequency (0.9 to 1.4) versus normalized leakage (Isb, 0 to 20) for ~1000 samples in 0.18 micron, annotated "30%" and "20X".]
     Source: S. Borkar, Intel

  4. The Other Side of the Story: Leakage is good for you!
     [Figure: E_Op / E_Op^ref (nominal) versus E_Leakage / E_Switching on a log axis from 10^-2 to 10^1, with curves labeled nominal, parallel, and pipeline. Threshold annotations: V_th^ref -180mV, -95mV, -140mV. Supply annotations: 0.81 V_dd^max, 0.57 V_dd^max, 0.52 V_dd^max.]
     • Optimal designs have high leakage (E_Lk / E_Sw ≈ 0.5)
     • Must adapt to process variations and activity

  5. [Figure only.] Source: P. Gelsinger (DAC04)

  6. What to do about memory?
     “The data retention voltage (DRV)”
     [Figure: 4KB SRAM leakage current (µA, 0 to 60) versus supply voltage (V, 0 to 1), with the measured DRV range marked; a second plot versus DRV (mV, 0 to 500) with a vertical axis from 0 to 7000.]
     Data obtained from 4K bytes SRAM test-chip, implemented in 130 nm CMOS

  7. Calibrating for Process Variations
     Most variations are systematic, and can be adjusted for at start-up time using one-time calibration!
     • Relevant parameters: T_clock, V_dd, V_th
     • Can be easily extended to include leakage reduction and power-down in standby
     [Diagram: a Test Module exchanges test inputs and responses with the Module; labeled controls are T_clock, V_dd, and V_bb.]
     • Achieves the maximum power saving under the technology limit
     • Inherently improves the robustness of design timing
     • Minimum design overhead required over traditional design methodology
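
The one-time calibration idea on slide 7 can be sketched as a start-up loop. The slide does not say how the search is performed, so the step-down strategy, the function names, and the 0.72 V pass threshold of the fake die below are all assumptions made for illustration; only the notion of applying test inputs and checking responses at a given T_clock comes from the slide.

```python
from typing import Callable, Optional


def calibrate_vdd(v_nominal: float, v_floor: float, step: float,
                  passes_self_test: Callable[[float], bool]) -> Optional[float]:
    """Step the supply down from nominal; return the lowest passing voltage.

    Returns None when even the nominal voltage fails the self-test.
    """
    best = None
    v = v_nominal
    while v >= v_floor:
        if not passes_self_test(v):
            break                # first failure: keep the last passing setting
        best = v
        v = round(v - step, 6)   # rounding avoids floating-point drift
    return best


# Hypothetical die: test vectors pass at the target T_clock down to 0.72 V.
def fake_self_test(vdd: float) -> bool:
    return vdd >= 0.72


chosen_vdd = calibrate_vdd(v_nominal=1.0, v_floor=0.5, step=0.05,
                           passes_self_test=fake_self_test)
print(chosen_vdd)
```

The same loop shape could sweep V_bb (and hence V_th) instead of V_dd; a real calibration would likely search both jointly.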

  8. Adaptive Body Biasing
     [Figure only.] Source: P. Gelsinger (DAC04)

  9. Introducing “Power Domains (PDs)”
     Similar in concept to “Clock Domains”, but extended to include power-down (really!) and local supply and threshold voltage management.
     • Dynamic voltages for variable workload
     • Power gating or shut-off for leakage control
     • Lifetime extension exploiting battery attributes
     • Noise management
     [Diagram: a power source feeds an Active Power Network that supplies several loads.]

  10. Introducing “Power Domains (PDs)”: Who is in charge?
      Chip Supervisor (or Chip O/S)
      • Maintains global state and perspective
      • Maintains system timers
      • Alerts blocks of important events
      [Diagram labels: Timer subsystem, System supervisor, Timewheel, Alarm Table, Alarm Manager, Next Alarm, µ-coded state machine, System status, decoder, Power control, Power Scheduler, msgs w/IDs, Event msgs, bus, Functional units, Domain1, Domain2, Domain3.]

  11. A Case Study: Protocol Processor for Wireless Sensor Networks
      Target: < 50 µW average
      [Diagram: the “Charm” Processor. Labels: App/UI, User interface, Sensor/actuator interface, Aggregation/forwarding, Transport, Network, Locationing, DLL (MAC), Baseband, Reactive radio, RF (TX/RX), Antenna, Chip Supervisor, Energy train.]

  12. Charm Architecture
      • 1 V operational supply voltage
      • 16 MHz clock frequency
      • Simple processor aided with dedicated accelerators
      • Reactive inter- and intra-chip signaling
      • Aggressive use of power domains
      • Chip Supervisor manages activity
      [Diagram labels: DW8051 with 16kB CODE, 4kB XDATA, and 256 DATA; Chip Supervisor; Serial; interconnect network; FlashIF; SIF; ADC; MAC; GPIO; LocalHW; PHY.]

  13. Call a Plumber…This Thing Leaks!

      Block          Area (um^2)   Est. leakage @1V (uW)
                                   Logic     Memory
      Locationing    337990        39.9
      DW8051         63235         8.2       2880.0   (64KB SRAM for SW code and data)
      Interface      6098          0.8
      Neighborlist   21282         2.5       13.5
      Serial         2554          0.4
      NetQ           6296          0.7       108.0
      DLL            126846        17.4      13.5
      Supervisor     51094         6.4
      Total                        76.3      3015.0

      30X the target power…just in leakage!!
      Hey buddy, turn down the voltage!
      [Figure: Leakage vs. Supply Voltage, annotated "~15X" and "Data retention voltage reduction".]
      1/15 A * 0.3 V = 98% less leakage power
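
The totals and the closing estimate on slide 13 can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the variable and function names are illustrative, and the 1 V active supply is taken as the baseline because the table is quoted at 1 V. The per-block values, the 1/15 current ratio, and the 0.3 V standby voltage come from the slide.

```python
# Estimated leakage at 1 V in microwatts, copied from the slide's table.
logic_uw = {"Locationing": 39.9, "DW8051": 8.2, "Interface": 0.8,
            "Neighborlist": 2.5, "Serial": 0.4, "NetQ": 0.7,
            "DLL": 17.4, "Supervisor": 6.4}
memory_uw = {"DW8051": 2880.0, "Neighborlist": 13.5, "NetQ": 108.0, "DLL": 13.5}

total_logic = sum(logic_uw.values())
total_memory = sum(memory_uw.values())


def standby_fraction(current_ratio: float, v_standby: float,
                     v_active: float = 1.0) -> float:
    """Standby leakage power relative to active, using P = I * V."""
    return current_ratio * (v_standby / v_active)


fraction = standby_fraction(1 / 15, 0.3)
savings_percent = (1 - fraction) * 100

print(f"logic total:  {total_logic:.1f} uW")
print(f"memory total: {total_memory:.1f} uW")
print(f"standby leakage is {fraction:.3f} of active ({savings_percent:.0f}% less)")
```

Leakage current falls to 1/15 of its 1 V value and the supply falls to 0.3 of its 1 V value, so the product is 0.02 of the original leakage power, which is where the 98% figure comes from.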

  14. Gated Power Architecture
      [Diagram: rows of power rails labeled vdd, vvdd, and gnd, supplied from vddhi and vddlo.]
      • Vddhi – active mode voltage (nominal)
      • Vddlo – standby mode voltage, allows retention of state

  15. Power Switch Tile
      [Diagram: switch tile of standard-cell height between VDD (1V) and GND, with VVDD, STBY, and STBY_buf labeled and a 300mV annotation.]
      • Tile is easily incorporated into standard design flow
        – Cell has same pitch as std. cell library components
        – Switch tiles placed prior to other standard cells
        – One additional power strap added to power routing step
      • Switch design can be independent of block size
        – Built-in buffer distributes driver circuitry
        – Enables creation of a buffer tree during STBY signal routing

  16. Power Switch Sizing
      [Figure: Delay / Leakage Tradeoff. Delay overhead (1 to 3.5) and leakage (0 to 1) versus power switch width (0 to 50 um).]
      • Switch sizing enables trade-off between delay overhead and leakage
        – Delay scale normalized to un-gated design
        – Leakage scale normalized to case when switch size is 50 µm
      • Timing slack determines delay requirement
        – Control domains (DLL, processor): tolerant of delay overhead
        – Datapath domains (locationing): longer critical paths, less tolerant of delay overhead
      H. Qin

  17. System Supervisor
      [Diagram labels: Command/Event Dispatcher, Interface, Time subsystem, Clock subsystem, Power subsystem, Power Network, Clock Network, Power Domain A, Power Domain B, Power Domain C.]
      • How to control block activation/deactivation?
      • System supervisor centralizes power control
        – Power subsystem – gates block power rails
        – Clock subsystem – gates block clocks
        – Timer subsystem – system time-wheel and wake-up timers

  18. Power Subsystem
      [Diagram labels: Src Decoder, Dest Decoder, Connection Table, connection mask, Session Controller, Session Table, To/From Dispatcher, SYSCLK.]
      • Session controller – opens/closes sessions
      • Connection table – holds connectivity masks and performs port address translation
      • Session table – keeps track of open sessions
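
The division of labor among the three parts on slide 18 can be illustrated with a small software model. It is a sketch, not the chip's implementation: the class name, the dictionary used in place of hardware connectivity masks, and the three-domain wiring in the usage lines are all hypothetical.

```python
from typing import Dict, List, Tuple


class SessionController:
    """Opens and closes sessions after translating a local port to a domain."""

    def __init__(self, domains: List[str],
                 connections: Dict[Tuple[str, str], str]):
        self.index = {name: n for n, name in enumerate(domains)}
        # Connection table: (source domain, local port) -> destination domain.
        self.connections = connections
        size = len(domains)
        # Session table: sessions[src][dest] == 1 while a session is open.
        self.sessions = [[0] * size for _ in range(size)]

    def _translate(self, src: str, port: str) -> Tuple[int, int]:
        dest = self.connections[(src, port)]  # KeyError if the port is unwired
        return self.index[src], self.index[dest]

    def open(self, src: str, port: str) -> None:
        s, d = self._translate(src, port)
        self.sessions[s][d] = 1

    def close(self, src: str, port: str) -> None:
        s, d = self._translate(src, port)
        self.sessions[s][d] = 0


# Hypothetical wiring: CPU port A reaches RADIO, RADIO port A reaches CPU.
ctrl = SessionController(
    ["CPU", "RADIO", "SENSOR"],
    {("CPU", "A"): "RADIO", ("RADIO", "A"): "CPU"},
)
ctrl.open("CPU", "A")
after_open = [row[:] for row in ctrl.sessions]
ctrl.close("CPU", "A")
after_close = [row[:] for row in ctrl.sessions]
```

A request on an unwired port raises `KeyError` here; how the hardware rejects such a request is not described on the slide.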

  19. Charms Sub-blocks and Connectivity
      [Diagram labels: Sensor, I2C, SPI, RS-232, Serial Interface, Controller (DW8051), Neighborlist, Network Queues, Locationing, DLL, Baseband, with block ports lettered A through E.]
      Connectivity grid (columns: Block, Port A, Port B, Port C, Port D, Port E). The row layout did not survive extraction; entries in extraction order: BB DLL DLL NETQ BB LOC NL DW8051 DW8051 NETQ NL DLL SERIAL LOC DLL NL DW8051 RF-frontend NETQ DW8051 DLL NL DW8051 DLL LOC SERIAL DW8051

  20. Power Session Table
      Before a power domain can communicate with a neighbor, it must first open a session.
      [Diagram: session table of 0/1 entries, with rows indexed by Src Domain and columns by Dest Domain.]
      Power policy: a power domain can sleep if…
      1) It has closed all its sessions
      2) No other domain has a session open with it
      3) It wants to go to sleep
      A ‘1’ in row i means that power domain i has opened a session with another domain
      A ‘1’ in column k means that another domain opened a session with domain k
      A ‘1’ in entry (i, i) is domain i's self-sleep bit
      can_sleep(i) = reduction_nor(row i) and reduction_nor(col i)
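
The sleep rule on slide 20 maps directly onto a small function. This is a minimal sketch: the list-of-lists table and the three-domain example are hypothetical. Reading the slide's formula literally, a set self-sleep bit at entry (i, i) keeps domain i awake, since it makes both the row and the column non-zero; that interpretation is an assumption.

```python
from typing import List


def can_sleep(sessions: List[List[int]], i: int) -> bool:
    """Domain i may sleep when row i and column i hold only zeros."""
    row_clear = not any(sessions[i])                  # reduction_nor(row i)
    col_clear = not any(row[i] for row in sessions)   # reduction_nor(col i)
    return row_clear and col_clear


# Hypothetical table: rows are source domains, columns are destinations.
table = [
    [1, 1, 0],  # domain 0: self bit set, session open with domain 1
    [0, 0, 0],  # domain 1: opened nothing, but domain 0 has a session with it
    [0, 0, 0],  # domain 2: idle, and nobody has a session with it
]

sleepable = [can_sleep(table, i) for i in range(3)]
print(sleepable)
```

Domain 1 has opened nothing itself yet still cannot sleep, because domain 0 holds a session with it; this is condition 2 of the power policy.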

  21. Clocking Subsystem
      [Diagram labels: clock generator with N-stage chain (N even), RINGOSC_EN, 16 MHz, SYS_CLK_ASYNC, SYS_CLK_ROOT; REF_CLK_PIN, IBUF (pad), REF_CLK_ROOT, REF_CLK_CLOCKMAN; variable delay line, M-stage delay line, parallel phase detector, priority encoder + digital controller, phase DETECT; clock trees; phase-synchronous TIMERCLK and SYSCLK.]
      • Low-frequency external clock (32 kHz)
      • Generated, switchable, higher-frequency clock (16 MHz)
      • The two clocks are made phase-synchronous using a DLL
      • Control signals are generated by the system supervisor

  22. Timer Subsystem
      [Diagram labels: System Time-wheel (free-running counter, TIMERCLK); Alarm Manager with Alarm Entry #0 through Alarm Entry #N-1, alarm_time, new_alarm, comparator (=), beep_beep; Alarm Scheduler; To/From Dispatcher; SYSCLK.]
      • Centralized system time-wheel
        – Blocks schedule wake-up alarms
        – Eliminates other large counters so blocks can sleep
        – Allows power domains to sleep
      • Very low switching activity factor
        – SYSCLK is disabled during deep sleep
        – Serial (ripple) comparison starting with MSB
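
A software model can make the time-wheel and the MSB-first comparison from slide 22 concrete. It is a sketch under stated assumptions: the class and function names, the 8-bit counter width, and the one-shot alarm behavior are illustrative choices, and the bit-serial loop only imitates the hardware ripple comparator, where stopping at the first mismatch keeps switching activity low.

```python
from typing import Dict, List, Tuple


def msb_first_equal(a: int, b: int, width: int) -> Tuple[bool, int]:
    """Compare two width-bit values from the MSB down, stopping at a mismatch.

    Returns (equal, number_of_bits_examined).
    """
    for examined, bit in enumerate(range(width - 1, -1, -1), start=1):
        if (a >> bit) & 1 != (b >> bit) & 1:
            return False, examined
    return True, width


class TimeWheel:
    """Free-running counter plus a table of pending wake-up alarms."""

    def __init__(self, width: int = 16):
        self.width = width
        self.counter = 0                    # advanced once per TIMERCLK tick
        self.alarms: Dict[int, int] = {}    # alarm entry id -> alarm_time

    def schedule(self, entry: int, alarm_time: int) -> None:
        self.alarms[entry] = alarm_time % (1 << self.width)

    def tick(self) -> List[int]:
        """Advance the counter and return the entries whose alarm fired."""
        self.counter = (self.counter + 1) % (1 << self.width)
        fired = [entry for entry, t in self.alarms.items()
                 if msb_first_equal(t, self.counter, self.width)[0]]
        for entry in fired:
            del self.alarms[entry]          # one-shot alarms (an assumption)
        return fired


wheel = TimeWheel(width=8)
wheel.schedule(0, 3)
wheel.schedule(1, 5)
fired_log = [wheel.tick() for _ in range(5)]
print(fired_log)
```

Because every block registers its wake-up time in this one structure, only the shared counter has to keep running while the blocks themselves are powered down.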
