1 Thermal Management Low Power Design Prof. Dr. J. Henkel CES - Chair for Embedded Systems KIT, Germany Thermal Management – Part 2 (Thomas Ebi) http://ces.itec.kit.edu T. Ebi, KIT, SS13
2 Overview
• Thermal modeling & simulation
• Multi-core architectures
  - Motivation (Part 2)
  - Reactive thermal management
  - Proactive thermal management
• 3D architectures
• Thermal management at CES
3 The RC-Model
[Figure: RC equivalent thermal circuit for a single component with heat dissipating, e.g. through packaging]
[Figure: RC equivalent thermal circuit for four components (P1 to P4) with heat dissipating to the outside through the package (Cp, Rp)]
• Voltage ≙ temperature
• Current ≙ heat dissipation
This gives us the thermal equation from last week as:
$$\frac{dT}{dt} + \frac{T}{RC} = \frac{P}{C}$$
[Shi, 2010]
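A minimal numerical sketch of the single-component RC equation, assuming the form C·dT/dt + T/R = P with T measured as the temperature rise above ambient. The function name `simulate_rc`, the step size and the values for P, R and C are illustrative assumptions, not values from the lecture.

```python
def simulate_rc(power_w, r_k_per_w, c_j_per_k, dt_s, steps, t0=0.0):
    """Forward-Euler integration of dT/dt = P/C - T/(R*C).

    T is the temperature rise above ambient in kelvin.
    Returns the list of temperatures, one per time step (including t0).
    """
    temps = [t0]
    t = t0
    for _ in range(steps):
        dT_dt = power_w / c_j_per_k - t / (r_k_per_w * c_j_per_k)
        t += dt_s * dT_dt
        temps.append(t)
    return temps


# Hypothetical values: 10 W, R = 2 K/W, C = 5 J/K, so the time constant R*C is 10 s.
trace = simulate_rc(power_w=10.0, r_k_per_w=2.0, c_j_per_k=5.0,
                    dt_s=0.01, steps=10000)
steady_state = 10.0 * 2.0  # P * R, the value the trace converges towards
```

After one time constant (step 1000) the trace has covered roughly 63% of the way to `steady_state`, which mirrors the charging curve of the electrical RC circuit.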
4 The RC Model (cont.)
[Skadron, 2004]
5 Thermal Simulation
• Thermal simulators such as HotSpot calculate the thermal distribution by solving the equations of the RC equivalent model
• Accuracy of the simulation depends on the granularity of the components
  - Block based: coarse granularity (CPU, cache, etc.), fast
  - Grid based: divides blocks into smaller parts; more accurate temperature distribution, but slower
• Accuracy also depends on the power input!
  - Instruction-based simulators count the execution of instructions and know the power consumption of each block
    - E.g. Wattch, m5+McPAT
    - Inaccurate but fast (Wattch inaccuracy up to 30%) [Brooks 2000]
  - Circuit-based simulators
    - Highly accurate but very slow
6 Thermal Sensors: Thermal Diodes
• Currently the most common method for on-chip thermal measurement
  - Used by Intel, AMD, Xilinx, etc.
  - Xilinx Virtex 5 FPGA datasheet: accuracy +/- 4°C
• Analog circuitry
  - Needs an A/D converter
  - Occupies a large chip area
[Long, 2008]
7 Thermal Sensors: Ring Oscillator
• Idea: analyze negative thermal side-effects to quantify temperature
  - Due to increased delay, ring oscillators oscillate more slowly at higher temperatures
  - The oscillation frequency is determined using a reference clock
• Provides relative temperature values
  - Challenge: must be calibrated to obtain absolute values
• Xilinx reference design: [src: Xilinx]
[Figure: inverter delay]
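A small sketch of how a ring-oscillator reading could be turned into an absolute temperature. It assumes the oscillator edges are counted during a fixed window of reference-clock cycles and that frequency falls linearly with temperature between two calibration points. The linearity, the function names and all numbers are assumptions for illustration, not taken from the Xilinx reference design.

```python
def count_to_frequency(count, ref_clock_hz, window_cycles):
    """Oscillator frequency from the edges counted during a window
    that lasts `window_cycles` cycles of the reference clock."""
    return count * ref_clock_hz / window_cycles


def make_calibration(f_low_hz, temp_low_c, f_high_hz, temp_high_c):
    """Two-point linear calibration: returns a function frequency -> temperature.

    (f_low_hz, temp_low_c) and (f_high_hz, temp_high_c) are measurements
    taken at two known temperatures.
    """
    slope = (temp_high_c - temp_low_c) / (f_high_hz - f_low_hz)

    def to_temperature(freq_hz):
        return temp_low_c + slope * (freq_hz - f_low_hz)

    return to_temperature


# Hypothetical calibration: 200 MHz at 25 degC, 188 MHz at 85 degC.
to_temp = make_calibration(200e6, 25.0, 188e6, 85.0)
# 19400 edges counted during 10000 cycles of a 100 MHz reference clock.
freq = count_to_frequency(19400, 100e6, 10000)
temperature_c = to_temp(freq)  # about 55 degC for these numbers
```

Without the calibration points only the frequency ratio between two readings is meaningful, which is why the sensor on its own provides relative values.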
8 Thermal Sensors: Leakage based
• Since leakage is temperature dependent, measuring leakage can also determine temperature
Idea: measure the time a capacitor takes to discharge through leakage current
1. Input switches from low to high
   - M1 transitions from "on" to "off"
   - Charge stored in CL should remain, but slowly decreases due to leakage current
2. When the voltage of CL falls below a threshold, the inverter M3-M4 produces a low-to-high transition
3. Temperature can be determined from the delay between the input and output transitions
[Ituero 2008]
9 Multi-core Motivation
[Figure: 16-tile chip (Tile 1 to Tile 16) with a temperature scale from cold to hot]
• Thermal hotspots!
• Spreading applications reduces thermal hotspots
10 Example Platform: Intel's SCC
• 24 tiles, each consisting of two Pentium cores
• Two thermal sensors per tile (same principle as ring oscillators)
• Frequency scaling per core (100-800 MHz)
• Voltage scaling per "voltage island" (4 tiles per island, 1 island for the on-chip mesh communication network, 208 voltage levels)
• Tile area: 18.7 mm²
• 1.3B transistors in a 45 nm process
[src: intel]
11 Sensors on the SCC
• Half of the cores running the program, half in idle state
Nikil Dutt and Jörg Henkel, Tutorial @ ASP-DAC 2013
12 Problems
• Mutual heating
  - Heat conducts to surrounding areas
• Thermal gradients
  - Variations of temperature across the chip
• Thermal cycling
  - Management may lead to periodic heating/cooling
13 Multi-core thermal management
• Classification of thermal management approaches:
  - Reactive approaches
    - Depend on the current temperature
  - Proactive approaches
    - Predict the temperature
• Aim to balance temperature to avoid hotspots
• Naïve reactive approach:
  - [Skadron, ISCA 2004] controls the temperature by:
    - Switching off the hottest core and turning on the coldest one,
  - but that leads to:
    - Thermal cycling and large spatial variations
    - Negative effect on performance
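The naïve reactive step can be sketched as follows. This is one illustrative reading of "switch off the hottest core and turn on the coldest one", not the implementation from [Skadron, ISCA 2004]: the function name, the data layout and the guard that skips the swap when no idle core is cooler are all assumptions.

```python
def migrate_hottest_to_coldest(temps, active):
    """Return the new set of active cores after one reactive step.

    temps:  list of current core temperatures, indexed by core id
    active: set of core ids that are currently switched on
    """
    idle = [i for i in range(len(temps)) if i not in active]
    if not active or not idle:
        return set(active)  # nothing to swap
    hottest = max(active, key=lambda i: temps[i])
    coldest = min(idle, key=lambda i: temps[i])
    if temps[coldest] >= temps[hottest]:
        return set(active)  # assumed guard: swapping would not help
    new_active = set(active)
    new_active.remove(hottest)  # switch off the hottest core
    new_active.add(coldest)     # turn on the coldest idle core
    return new_active


# Hypothetical readings in degC for four cores; cores 0 and 1 are running.
new_active = migrate_hottest_to_coldest([80.0, 60.0, 45.0, 50.0], {0, 1})
```

Because the decision uses only the current readings, the core that was just switched off cools down and soon becomes the coldest again, which is where the thermal cycling comes from.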
14 Reactive approaches (cont'd)
• [Coskun, 2007] proposed two OS-level methods that achieve temperature-aware task scheduling.
• First method: Coolest-FLP
  - Depends on the current temperature and the floorplan.
  - For each ready job:
    • Select the coolest processors
    • Give priority to processors whose neighbors are idle
  - Reduces hot spots.
• Second method: probabilistic method
  - Takes the analysis of the temperature history into consideration.
  - For each ready job:
    • Calculate the probability for each core to receive the incoming job: $P_n = P_{n-1} \pm W$, where $P_{n-1}$ is the previous probability and the weight $W$ depends on the core's history
  - Achieves better temperature balance and reduces the spatial variation in temperature
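Both methods can be sketched in a few lines. This is a simplified illustration, not the algorithm from [Coskun, 2007]: the grid floorplan, the temperature `margin` that defines "the coolest processors", the threshold rule that decides the sign of W, the clamping and normalization, and all function names are assumptions.

```python
import random


def neighbors(core, rows, cols):
    """Ids of the cores directly above, below, left and right on a grid floorplan."""
    r, c = divmod(core, cols)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            result.append(nr * cols + nc)
    return result


def coolest_flp(temps, busy, rows, cols, margin=1.0):
    """Pick an idle core for a ready job: coolest cores first,
    then the one with the most idle neighbors."""
    idle = [i for i in range(rows * cols) if i not in busy]
    if not idle:
        return None
    coolest = min(temps[i] for i in idle)
    candidates = [i for i in idle if temps[i] <= coolest + margin]

    def idle_neighbors(i):
        return sum(1 for n in neighbors(i, rows, cols) if n not in busy)

    return max(candidates, key=lambda i: (idle_neighbors(i), -temps[i]))


def update_probabilities(probs, history_avg, threshold, weight=0.1):
    """P_n = P_{n-1} + W for cores with a cool history, P_{n-1} - W otherwise,
    clamped at zero and normalized so the probabilities sum to one."""
    updated = []
    for p, avg in zip(probs, history_avg):
        p_new = p + weight if avg < threshold else p - weight
        updated.append(max(p_new, 0.0))
    total = sum(updated)
    if total == 0.0:
        return [1.0 / len(probs)] * len(probs)
    return [p / total for p in updated]


def pick_core(probs, rng=random):
    """Draw the core that receives the incoming job."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# 2x2 floorplan, core 2 is busy; cores 0 and 1 are within the margin.
chosen = coolest_flp([50.0, 50.5, 70.0, 60.0], busy={2}, rows=2, cols=2)
# Cores 1 and 2 have a cooler history than the hypothetical 65 degC threshold.
probs = update_probabilities([0.25] * 4, [70.0, 55.0, 60.0, 85.0], threshold=65.0)
```

In the example `coolest_flp` selects core 1 rather than the marginally cooler core 0, because both of core 1's neighbors are idle while core 0 sits next to the busy core 2.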
15 Reactive approaches (cont'd)
• [Coskun ASPDAC 2008] uses Integer Linear Programming (ILP):
  - Models the applications as task graphs
  - Results in an optimal task schedule for
    - a given set of tasks with deadlines and dependence constraints
    - given temperature profiles
  - Aims at reaching the best temporal and spatial distribution of temperature
16 Reactive approaches (cont'd)
Normal mode:
• Processing demand < certain threshold.
• Goal: maximize energy savings while meeting performance demands and thermal constraints.
Thermal balancing mode:
• Processing demand > certain threshold.
• Goal: prevent concentration of high power densities, then save energy.
[Flowchart: steps shown are task assignment to the cores, global frequency assignment, core-level frequency assignment and calculating processing demand; the system leaves normal mode when Demand > α and returns to it when Demand < β]
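The switching between the two modes can be sketched as a small state machine. It assumes that Demand > α moves the system from normal to thermal balancing mode, that Demand < β moves it back, and that β is chosen below α so the two thresholds form a hysteresis band. The slide does not state how α and β relate, so this and the numeric values are assumptions.

```python
NORMAL = "normal"
BALANCING = "thermal_balancing"


def next_mode(mode, demand, alpha, beta):
    """Return the mode for the next interval given the measured processing demand."""
    if mode == NORMAL and demand > alpha:
        return BALANCING
    if mode == BALANCING and demand < beta:
        return NORMAL
    return mode  # stay in the current mode


# Hypothetical thresholds and demand trace (fraction of peak processing capacity).
alpha, beta = 0.7, 0.5
mode = NORMAL
modes = []
for demand in [0.4, 0.8, 0.6, 0.45]:
    mode = next_mode(mode, demand, alpha, beta)
    modes.append(mode)
```

With these values a demand of 0.6 keeps the system in thermal balancing mode, since it lies between β and α; this gap avoids rapid toggling when the demand hovers around a single threshold.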
17 Proactive Approach
• [Coskun 2008] uses autoregressive moving average (ARMA) modeling to:
  - Predict the future temperature from history
  - Apply a thermal-aware job allocation method, which aims to:
    - Avoid reaching a set thermal threshold and balance the temperature across the chip
[Flow: Temperature data from thermal sensors → Predictor (ARMA), with ARMA model validation (update model if necessary) → Temperature at time (Tcurrent+tn) for all cores → Scheduler: temperature-aware allocation on cores]
18 Proactive Approach
• ARMA models autocorrelation in a time series
$$y_t + \sum_{i=1}^{p} (a_i\, y_{t-i}) = e_t + \sum_{i=1}^{q} (c_i\, e_{t-i})$$
  - $y_t$: value at time t
  - $e_t$: noise/error at time t
  - $a$: autoregressive coefficients
  - $c$: moving average coefficients
• Given a stationary stochastic process
  - $y_t$ can be predicted as a weighted sum of past values and a moving average of the error term
• Steps involved:
  - Identification: determine p and q
  - Estimation: determine coefficients a and c
  - Model checking: determine quality of estimated values
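A one-step ARMA prediction can be sketched as below, assuming the model is written as y_t + Σ a_i·y_(t-i) = e_t + Σ c_i·e_(t-i) and that the unknown future noise e_t is replaced by its expected value of zero. The coefficients and samples are made-up numbers (deviations from a mean temperature), and identification and estimation of p, q, a and c are not shown.

```python
def arma_predict(history, errors, a, c):
    """Predict y_t from past values and past prediction errors.

    history[-1] is y_(t-1), errors[-1] is e_(t-1).
    a holds the p autoregressive coefficients a_1..a_p,
    c holds the q moving average coefficients c_1..c_q.
    """
    ar = sum(a_i * history[-(i + 1)] for i, a_i in enumerate(a))
    ma = sum(c_i * errors[-(i + 1)] for i, c_i in enumerate(c))
    return -ar + ma  # rearranged model with e_t set to zero


def observe(history, errors, prediction, actual):
    """Record the measured value and the resulting prediction error e_t."""
    history.append(actual)
    errors.append(actual - prediction)


# Hypothetical model with p = 2, q = 1.
a = [-0.8, 0.1]
c = [0.5]
history = [2.0, 3.0]   # y_(t-2), y_(t-1)
errors = [0.1, -0.2]   # e_(t-2), e_(t-1)

prediction = arma_predict(history, errors, a, c)  # approximately 2.1
observe(history, errors, prediction, actual=2.4)  # sensor reading arrives
```

The recorded errors feed the moving average term of the next prediction, and a persistently large error is a natural signal that the model no longer fits the workload.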
19 Proactive Approach
• Benefits of the ARMA model
  - Model is generated through an automated process
  - Does not require in-depth thermal knowledge
  - High accuracy achievable with a large number of samples (>150)
• Shortcomings
  - Workloads vary over time → temperature is not a stationary function!
  - Solution: thermal sensors are used to check whether the model is still valid; if not, the model is updated at runtime
  - As such: requires thermal sensors on each core
20 Multicore Strategies & Scalability
• "Centralized" management scheme: the manager can use global knowledge but also forms a bottleneck for communication as well as computation → central point of failure, limited scalability
• "Fully distributed" scheme: no central bottlenecks. Management is limited by local knowledge → can result in local maxima/minima
• Hierarchical scheme: combines local management with access to global knowledge