
Individual Voltage Scaling in Logic and Memory Circuits towards Runtime Energy Optimization in Processors



  1. Individual Voltage Scaling in Logic and Memory Circuits towards Runtime Energy Optimization in Processors
     Jun Shiomi, Tohru Ishihara, Hidetoshi Onodera
     Graduate School of Informatics, Kyoto University, Japan

  2. Energy Reduction by Dynamic Voltage Scaling
     • Supply voltage tuning ($V_{DD}$): DVFS (Dynamic Voltage and Frequency Scaling)
     • Threshold voltage tuning ($V_{th}$): ABB (Adaptive Body Biasing)
     [Figure: delay and energy versus supply voltage ($V_{DD}$) and versus threshold voltage ($V_{th}$); energy is split into dynamic energy and static energy]
     $V_{DD}$- and $V_{th}$-tuning technique for energy minimization
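As a rough guide to why the two knobs behave differently, the textbook first-order CMOS energy model can be written per clock cycle as below. This model is an assumption brought in for illustration and is not taken from the slides; $\alpha$ is the activity factor, $C$ the switched capacitance, $T_{cycle}$ the clock period, $n$ the subthreshold slope factor, and $v_T$ the thermal voltage.

$$
E_{dyn} \approx \alpha\, C\, V_{DD}^{2}, \qquad
E_{stat} \approx I_{leak}(V_{th})\, V_{DD}\, T_{cycle}, \qquad
I_{leak}(V_{th}) \propto e^{-V_{th}/(n\, v_T)}
$$

Under this model, lowering the supply voltage $V_{DD}$ cuts the dynamic term quadratically, raising the threshold voltage $V_{th}$ cuts the static term exponentially, and both changes slow the circuit. A block with a small $\alpha$ is dominated by the static term, so its best operating point sits at a higher $V_{th}$ than that of a block with a large $\alpha$.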

  3. Minimum Energy Point Tracking (MEP Tracking)
     Energy minimization by voltage scaling under a given frequency
     [Figure: MEPT example for a cell-based memory in Renesas SOTB 65-nm; supply voltage (0.4 to 1.2 V) versus body bias (0 to -2.0 V, small to large $V_{th}$) with a performance contour; annotated energies 140 pJ and 90 pJ]
     Minimum Energy Point (MEP): best combination of $V_{DD}$ and $V_{th}$
     Target: MEP tracking technique for processors

  4. Activity Factor Dependency of MEP Curves (Activity 100% → 10%)
     Issue: MEPs heavily depend on activity factors (toggle rates)
     [Figure: supply voltage (0.4 to 1.2 V) versus body bias (0 to -2.0 V, small to large $V_{th}$) with a performance contour; points labeled Optimized (12 pJ) and Unoptimized (20 pJ)]
     ➢ Activity factor: Important parameter determining MEPs

  5. Overview of This Work
     [Figure: supply voltage (0.4 to 1.2 V) versus body bias (0 to -2.0 V, small to large $V_{th}$) with a performance contour; MEP with 10% activity ≅ on-chip memory, MEP with 100% activity ≅ logic circuits]
     ➢ Individual voltage scaling problem in logic and memory circuits
     ➢ Heuristic algorithm for runtime optimization

  6. Outline
     • Background
     • Individual Voltage Scaling Problem
     • Silicon Measurement
     • Conclusion

  7. (Existing) Uniform Voltage Scaling Problem
     $\min E$ s.t. $D \le D_0$, with $V_{DD}, V_{th} \in \mathbb{R}$
     ($E$: circuit energy, $D$: circuit delay, $D_0$: target performance)
     [Figure: MEP curve and performance contour for $D = D_0$ in the $V_{DD}$ versus $V_{th}$ plane, with the solution marked]
     • Existing approach: Runtime MEP tracking [5]
       ➢ Tunes $V_{DD}$ and $V_{th}$ iteratively
       ➢ Requires only simple circuits
       ➢ Tracks MEPs at runtime even if the target performance, temperature, and activity change dynamically
     [Figure: iterative path from the initial point to the finish point on $D = D_0$, driven by energy and delay monitoring (MEP check)]

  8. Individual Voltage Scaling Problem
     $\min E_L + E_M$ s.t. $D_L + D_M \le D_0$, with $V_{DD,L}, V_{th,L}, V_{DD,M}, V_{th,M} \in \mathbb{R}$
     [Figure: logic block ($V_{DD,L}$, $V_{th,L}$, delay $D_L$) and memory block ($V_{DD,M}$, $V_{th,M}$, delay $D_M$) under the constraint $D_0$; power versus delay plots for logic and memory]
     No runtime algorithms exist, because the delay assignment between $D_L$ and $D_M$ is complex
     Voltage scaling in logic combined with voltage boost in memory: huge energy saving
     This work: Heuristic algorithm for runtime voltage scaling

  9. Various Strategies in Uniform Voltage Scaling
     [Figure: delay contour ($D_L + D_M = D_0$) in the $V_{DD}$ versus $V_{th}$ plane, with the memory MEP ($E_M$ min.), the processor MEP ($E_L + E_M$ min.), and the logic MEP ($E_L$ min.)]
     • Logic MEP: $E_L$ optimized, but $E_M$ NOT optimized
     • Processor MEP: $E_L$, $E_M$ balanced ⇒ Solution in uniform voltage scaling

  10. Concept of the Proposed Heuristic Algorithm
      [Figure: delay contour ($D_L + D_M = D_0$) in the $V_{DD}$ versus $V_{th}$ plane, with the memory MEP ($E_M$ min.), the processor MEP ($E_L + E_M$ min.), and the logic MEP ($E_L$ min.); separate markers for the logic voltages ($V_{DD,L}$, $V_{th,L}$) and the memory voltages ($V_{DD,M}$, $V_{th,M}$)]
      Point: $D_L$ and $D_M$ are constant over the delay contour
      ➢ Enables local minimum energy point operation

  11. Simple Heuristic Algorithm for Individual Voltage Scaling
      [Figure: delay contour $D_L + D_M = D_0$ with the initial point, the logic MEP, and the memory MEP]
      Step 1: Logic Energy Optimization
        1. Uniform voltage tuning in logic and memory (i.e., $V_{DD,L} = V_{DD,M}$ and $V_{th,L} = V_{th,M}$), which allows existing techniques to be applied
        2. Find the logic MEP
      Step 2: Memory Energy Optimization
        1. Fix the logic voltages and tune only the memory voltages $V_{DD,M}$ and $V_{th,M}$ (so $V_{DD,L} \ne V_{DD,M}$ and $V_{th,L} \ne V_{th,M}$)
        2. Find the memory MEP
      ➢ Enables runtime energy optimization
      ➢ Local minimum energy point operation
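The two steps above can be sketched in simulation as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the square-law delay shape, the exponential leakage term, every constant, and all function names are hypothetical, and the real technique monitors energy and delay on silicon instead of evaluating a model. The sketch also assumes logic and memory share one delay shape, so both blocks stay on the same delay contour, which mirrors the slide's point that $D_L$ and $D_M$ are constant over that contour.

```python
import math

# Toy first-order models. Every constant here is an illustrative assumption,
# not a value from the slides or from the measured chip.
SUBTHRESHOLD_SCALE = 0.04  # volts; how fast leakage falls as V_th rises

def delay_shape(vdd, vth):
    """Square-law delay shape shared by logic and memory (arbitrary units)."""
    return vdd / (vdd - vth) ** 2

def vdd_on_contour(vth, target):
    """Bisect for the V_DD that places (V_DD, V_th) on the delay contour."""
    lo, hi = vth + 1e-6, 5.0  # delay falls monotonically as V_DD rises
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if delay_shape(mid, vth) > target:
            lo = mid  # still too slow: raise V_DD
        else:
            hi = mid
    return hi

def energy(vdd, vth, activity, cap=1.0, leak=1.0):
    """Energy per cycle: activity-scaled dynamic term plus static (leakage) term."""
    dynamic = activity * cap * vdd ** 2
    static = leak * vdd * math.exp(-vth / SUBTHRESHOLD_SCALE)
    return dynamic + static

def mep_on_contour(target, cost):
    """Grid-search V_th along the contour; return the (V_DD, V_th) with least cost."""
    vth_grid = [i * 0.005 for i in range(121)]  # 0.0 V to 0.6 V
    points = [(vdd_on_contour(vth, target), vth) for vth in vth_grid]
    return min(points, key=lambda p: cost(*p))

def individual_voltage_scaling(target, act_logic, act_mem):
    # Step 1: tune both blocks uniformly along the contour, stop at the logic MEP.
    logic_v = mep_on_contour(target, lambda vdd, vth: energy(vdd, vth, act_logic))
    # Step 2: hold the logic voltages, move only the memory to its own MEP.
    mem_v = mep_on_contour(target, lambda vdd, vth: energy(vdd, vth, act_mem))
    return logic_v, mem_v

def uniform_voltage_scaling(target, act_logic, act_mem):
    # Baseline: one voltage pair for both blocks, minimising E_L + E_M.
    total = lambda vdd, vth: energy(vdd, vth, act_logic) + energy(vdd, vth, act_mem)
    return mep_on_contour(target, total)

TARGET = 4.0  # hypothetical delay-shape value defining the contour
logic_v, mem_v = individual_voltage_scaling(TARGET, act_logic=1.0, act_mem=0.01)
uni_v = uniform_voltage_scaling(TARGET, act_logic=1.0, act_mem=0.01)
e_individual = energy(*logic_v, 1.0) + energy(*mem_v, 0.01)
e_uniform = energy(*uni_v, 1.0) + energy(*uni_v, 0.01)
print(f"logic  (V_DD, V_th) = ({logic_v[0]:.3f}, {logic_v[1]:.3f})")
print(f"memory (V_DD, V_th) = ({mem_v[0]:.3f}, {mem_v[1]:.3f})")
print(f"individual / uniform energy = {e_individual / e_uniform:.3f}")
```

In this toy model the low-activity memory settles at a higher $V_{th}$ and $V_{DD}$ than the logic, and the summed energy is never worse than the uniform baseline because each block is minimised separately over the same contour. The size of the saving depends entirely on the assumed constants and says nothing about the measured 16%.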

  12. Outline
      • Background
      • Individual Voltage Scaling Problem
      • Silicon Measurement
      • Conclusion

  13. Case Study: 32-bit RISC Processor
      Target
      • Renesas SOTB 65-nm
      • On-chip memory
        - 4 kB I-Cache + TAG
        - 8 kB I-SPM
        - 16 kB D-SPM
        ➢ Standard-cell based memory
      Supply voltage & body bias
      • Individual in logic and mem.
        - Body bias for nMOSFETs in logic circuits is fixed at GND
      • No level converters between logic and memory
      [Figure: chip layout showing logic ($V_{DD,L}$), memory ($V_{DD,M}$), main memory (DCT loop), and I/O; body bias terminals $V_{BP,L}$, $V_{BN,M}$, $V_{BP,M}$]

  14. Activity Factor Dependency of Memory MEPs ($V_{DD,L} = V_{DD,M}$ & $V_{BB,L} = V_{BB,M}$)
      $\alpha_M$: Memory activity factor
      • $\alpha_M = 1$: Activate in each clock cycle
      • $\alpha_M = 0.1$: Activate once in 10 clock cycles
      • $\alpha_M = 0.01$: Activate once in 100 clock cycles
      [Figure: Fmax contour of the fabricated processor [MHz]; supply voltage (0.4 to 1.2 V) versus body bias (0 to -2.0 V, small to large $V_{th}$); logic MEP and memory MEPs for $\alpha_M = 0.01$, $0.1$, and $1$]
      ➢ MEPs move to the upper right as activity $\alpha_M$ decreases

  15. Measurement Results of the Proposed Algorithm ($\alpha_M = 0.01$)
      [Figure: Fmax contour of the fabricated processor [MHz]; supply voltage (0.4 to 1.2 V) versus body bias (0 to -2.0 V, small to large $V_{th}$); logic MEP and memory MEP marked]
      Step 1
        1. Uniform voltage scaling
        2. Find logic MEP
      Step 2
        1. Fix logic voltages at the logic MEP
        2. Tune only mem. voltages and find mem. MEP
      ➢ Individual voltage tuning achieved by the proposed algorithm

  16. Energy Reduction by Individual Voltage Scaling ($\alpha_M = 0.01$)
      [Figure: stacked bar chart of total energy consumption [pJ / cycle] (0 to 100) at Fmax = 4, 8, 20, and 29 MHz, split into memory static, memory dynamic, logic static, and logic dynamic energy; annotated reductions of −10%, −13%, −15%, and −16%]
      ➢ Up to 16% energy reduction by individual voltage scaling

  17. Conclusion & Future Work
      Conclusion
      • Individual voltage scaling problem in logic and memory presented
        - Key: Activity factor gap between logic and memory circuits
      • A heuristic algorithm proposed for runtime energy optimization
      • Case study using RISC processors in 65-nm process
        - Up to 16% energy reduction compared with uniform voltage scaling
      Future work
      • Energy overhead compared with the global solution
      • Energy overhead introduced by fine-grained voltage tuning, etc.


  19. Energy Reduction by Individual Voltage Scaling ($\alpha_M = 0.1$)
      [Figure: stacked bar chart of total energy consumption [pJ / cycle] (0 to 100) at Fmax = 4, 8, 20, and 29 MHz, split into memory static, memory dynamic, logic static, and logic dynamic energy; annotated reductions of −5%, −7%, −11%, and −9%]
      ➢ No energy improvement when $\alpha_M = 1$

  20. Definition of $\alpha_M$
      On-chip memory property
      • No clock gating circuits
      • Dynamic energy consumption at each clock cycle
      Implemented on-chip memory has a large activity factor
      • Parameter $\alpha_M$ implemented to scale the activity factor
      [Figure: measured value (measured memory energy, made up of dynamic energy and static energy, alongside the leakage energy) versus evaluated value (dynamic energy × $\alpha_M$, plus static energy)]
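One reading of the scaling described above is sketched below. It assumes that the dynamic part is obtained by subtracting the measured leakage energy from the measured memory energy, and that only that dynamic part is multiplied by $\alpha_M$. The function name and the sample numbers are hypothetical and are not measurements from the slides.

```python
def evaluated_memory_energy(measured_total, measured_leakage, activity):
    """Scale only the dynamic share of the measured memory energy by alpha_M.

    measured_total   : measured memory energy per cycle (dynamic + static)
    measured_leakage : measured leakage (static) energy per cycle
    activity         : alpha_M, between 0 and 1
    """
    dynamic = measured_total - measured_leakage  # assumed split of the measurement
    return activity * dynamic + measured_leakage

# Hypothetical numbers in pJ per cycle, for illustration only.
for alpha_m in (1.0, 0.1, 0.01):
    print(alpha_m, evaluated_memory_energy(10.0, 2.0, alpha_m))
```

With $\alpha_M = 1$ the evaluated value equals the measured value, and as $\alpha_M$ shrinks the evaluated energy approaches the leakage energy alone.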

  21. System-Level Optimization Problem
      The problem can be abstracted to system-level optimization
      [Figure: timeline up to a deadline (≃ $D_0$); a CPU with low activity (≃ memory), its own $V_{DD}$ and $V_{th}$, and CPU execution time (≃ $D_M$); a DSP with high activity (≃ logic), its own $V_{DD}$ and $V_{th}$, and DSP execution time (≃ $D_L$)]
      Future work: Applying the heuristic to system-level optimization
