A Statistical Framework for Designing On-chip Thermal Sensing - PowerPoint PPT Presentation

A Statistical Framework for Designing On-chip Thermal Sensing Infrastructure Yufu Zhang, Bing Shi, Ankur Srivastava University of Maryland, College Park {yufuzh, bingshi, ankurs}@umd.edu Outline Motivation/overview Fusion center design


  1. A Statistical Framework for Designing On-chip Thermal Sensing Infrastructure Yufu Zhang, Bing Shi, Ankur Srivastava University of Maryland, College Park {yufuzh, bingshi, ankurs}@umd.edu

  2. Outline
     - Motivation/overview
     - Fusion center design
     - Sensor design/compression
     - Noisy sensor behavior
     - Exploiting the correlation
     - Sensor placement
     - Overall flow and interplay
     - Results and conclusion

  3. Motivation
     - Thermal/power stress
     - Heavy task execution
     - Increasing chip density
     - Leakage power
     - Dynamic thermal management (DTM)
     - Essentially sacrificing performance for lower temperature
     - Need accurate runtime thermal information

  4. Motivation
     - Need sensors to provide accurate runtime thermal input
     - On-chip thermal sensors
     - On-chip sensors can sample the thermal state of the chip during runtime
     - [Figure: a simple ring oscillator-based thermal sensor. The sensor feeds a counter that is enabled (EN) for a fixed period of time $t_p$ and produces the output.]

  5. Motivation
     - Several problems for a naïve thermal sensing scheme:
     - Sensors cannot go everywhere
     - Sensors are subject to noise
     - Resources are limited
     - Our goal: a complete thermal sensing infrastructure that includes:
     - Sensor design/compression
     - Sensor placement
     - Data fusion

  6. Overall structure

  7. Fusion Center Design
     - Central register (finite size $M$)
     - Could be a single or multiple actual registers
     - Fusion algorithm
     - Model the thermal profile as a random vector $T$
     - Predict $T$ given the sensor observation vector $T_S$
     - Exploit statistical information (mean, variance, correlation, etc.)
     - Bayesian estimation philosophy
     - Scalar case: $E(T \mid T_S) = \mu_T + \rho_{TS} \frac{\sigma_T}{\sigma_S} (T_S - \mu_S)$
     - Vector case: $E(T \mid T_S) = \mu_T + \Sigma_{TS} \Sigma_{SS}^{-1} (T_S - \mu_S)$
     - [Figure: histogram of the number of samples versus temperature (degrees Celsius)]
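The scalar and vector estimators on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the small `solve` helper, and all numeric values are assumptions chosen only to exercise the formulas $E(T \mid T_S) = \mu_T + \rho_{TS}\,\sigma_T/\sigma_S\,(T_S - \mu_S)$ and $E(T \mid T_S) = \mu_T + \Sigma_{TS}\Sigma_{SS}^{-1}(T_S - \mu_S)$.

```python
def solve(a, b):
    """Solve a x = b for a small dense system (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def scalar_estimate(mu_t, mu_s, sigma_t, sigma_s, rho, t_s):
    """Scalar case: one location, one sensor."""
    return mu_t + rho * sigma_t / sigma_s * (t_s - mu_s)


def vector_estimate(mu_t, mu_s, cov_ts, cov_ss, t_s):
    """Vector case: cov_ts has one row per location, one column per sensor."""
    residual = [obs - mu for obs, mu in zip(t_s, mu_s)]
    w = solve(cov_ss, residual)  # w = Sigma_SS^{-1} (T_S - mu_S)
    return [mu + sum(c * x for c, x in zip(row, w))
            for mu, row in zip(mu_t, cov_ts)]


# Hypothetical numbers: one location estimated from two correlated sensors.
estimate = vector_estimate(
    mu_t=[90.0], mu_s=[92.0, 88.0],
    cov_ts=[[16.0, 8.0]],
    cov_ss=[[25.0, 10.0], [10.0, 25.0]],
    t_s=[97.0, 93.0],
)
print(estimate)
```

With a single sensor the vector form reduces to the scalar form, because $\Sigma_{TS} = \rho_{TS}\sigma_T\sigma_S$ and $\Sigma_{SS} = \sigma_S^2$.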

  8. Fusion Center Design
     - Given sensor input, the variance of $T$ is reduced to: $\Sigma_{TT \mid S} = E\big[ (T - E(T \mid T_S)) (T - E(T \mid T_S))^{\mathsf{T}} \big] = \Sigma_{TT} - \Sigma_{TS} \Sigma_{SS}^{-1} \Sigma_{ST}$
     - Diagonal elements: variance of the thermal estimates.
     - Reflects the fundamental uncertainty of our estimation (how far our estimates are from the real temperature).
     - Used to drive sensor placement.
     - A better metric to drive sensor placement?
     - Sensors are not like cameras
     - Generate the probability of capturing all hotspots
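As a worked special case of the reduced variance $\Sigma_{TT} - \Sigma_{TS}\Sigma_{SS}^{-1}\Sigma_{ST}$, consider one location and one sensor, so that $\Sigma_{TT} = \sigma_T^2$, $\Sigma_{SS} = \sigma_S^2$ and $\Sigma_{TS} = \Sigma_{ST} = \rho_{TS}\sigma_T\sigma_S$ (the conditional subscript below is notation assumed here for the reduced variance):

```latex
\Sigma_{TT \mid S}
  = \sigma_T^2 - \frac{(\rho_{TS}\,\sigma_T\,\sigma_S)^2}{\sigma_S^2}
  = \sigma_T^2 \left( 1 - \rho_{TS}^2 \right)
```

The remaining uncertainty therefore depends only on how strongly the sensor location correlates with the location being estimated: a correlation of 0.9 leaves 19% of the original variance, while an uncorrelated sensor leaves all of it.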

  9. Sensor Design
     - Noisy sensor behavior (Monte Carlo simulation)
     - Oscillation frequency: $f = \frac{1}{P} = \frac{1}{N (t_{PHL} + t_{PLH})}$
     - Propagation delay: $t_{PHL} = \frac{2C}{\mu_n C_{ox} (W/L)_n (V_{DD} - V_t)} \left[ \frac{V_t}{V_{DD} - V_t} + \frac{1}{2} \ln\left( \frac{3 V_{DD} - 4 V_t}{V_{DD}} \right) \right]$
     - Temperature dependence: $\mu_{n/p} = \mu_0 (T / T_0)^{-1.5}$ and $V_t = V_{t0} + 0.002 (T - T_0)$
     - [Figure: ring oscillator sensor feeding a counter that is enabled (EN) for a fixed period of time $t_p$]
     - [Figure: histograms of the number of samples versus sensor frequency (MHz) for T = 20°C, 40°C, 60°C, 80°C and 100°C]
     - Sensor readings are compressed as well, due to the central register size constraint
     - Hypothesis testing
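The temperature dependence of the ring oscillator can be illustrated with a noise-free sketch of the slide's delay model. Everything numeric here is an assumption for illustration only: the load capacitance, the lumped factor `mu0_cox_wl` (standing in for $\mu_0 C_{ox} (W/L)_n$), the supply and threshold voltages, the number of stages, the choice of Kelvin for $T$, and the simplification $t_{PLH} = t_{PHL}$. The threshold-voltage sign follows the slide as transcribed.

```python
import math


def t_phl(temp_k, c_load=5e-14, mu0_cox_wl=2e-4, v_dd=1.0, v_t0=0.3, t0=300.0):
    """High-to-low propagation delay at temperature temp_k (Kelvin assumed)."""
    mobility_scale = (temp_k / t0) ** -1.5      # mu = mu_0 (T / T_0)^-1.5
    v_t = v_t0 + 0.002 * (temp_k - t0)          # V_t = V_t0 + 0.002 (T - T_0)
    k = mu0_cox_wl * mobility_scale             # mu_n * C_ox * (W/L)_n
    drive = v_dd - v_t
    bracket = v_t / drive + 0.5 * math.log((3 * v_dd - 4 * v_t) / v_dd)
    return 2 * c_load / (k * drive) * bracket


def frequency(temp_k, stages=5):
    """f = 1 / (N (t_PHL + t_PLH)), assuming t_PLH equals t_PHL."""
    return 1.0 / (stages * 2 * t_phl(temp_k))


def counter_reading(temp_k, t_p=1e-6):
    """Cycles counted while the counter is enabled for t_p seconds."""
    return int(frequency(temp_k) * t_p)


for celsius in (20, 40, 60, 80, 100):
    kelvin = celsius + 273.15
    print(celsius, round(frequency(kelvin) / 1e6, 1), counter_reading(kelvin))
```

Under these assumptions the frequency falls as temperature rises, so the counter value can serve as a temperature reading. A Monte Carlo study like the one on the slide would additionally perturb the device parameters on each run.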

  10. Sensor Design
      - Target: minimize the expected prediction error: $\mathrm{Cost} = E(|T_{pred} - T_{real}| \mid T_{obs}) = \sum_{i=1}^{n} |T_{pred} - H_i| \cdot \mathrm{prob}(T_{real} = H_i \mid T_{obs})$
      - Bayes rule: $\mathrm{prob}(T_{real} = H_i \mid T_{obs}) = \frac{\mathrm{prob}(T_{obs} \mid T_{real} = H_i) \cdot P_i}{\mathrm{prob}(T_{obs})} = \frac{\mathrm{prob}(T_{obs} \mid T_{real} = H_i) \cdot P_i}{\sum_{j=1}^{n} \mathrm{prob}(T_{obs} \mid T_{real} = H_j) \cdot P_j}$
      - Optimal decision rule: $T_{pred} = \delta(T_{obs}) = \arg\min_{T_{pred} \in \{H_1, \ldots, H_n\}} E(|T_{pred} - T_{real}| \mid T_{obs})$
      - Implement as an encoder at the sensor output
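The decision rule on this slide can be sketched directly: compute the posterior over the hypotheses $H_1, \ldots, H_n$ with Bayes rule, then pick the hypothesis with the smallest expected absolute error. The function names and the example likelihoods and priors are hypothetical; in practice the likelihoods would come from the sensor noise model.

```python
def posterior(likelihoods, priors):
    """prob(T_real = H_i | T_obs) from prob(T_obs | T_real = H_i) and priors P_i."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)  # prob(T_obs)
    return [j / total for j in joint]


def decide(hypotheses, likelihoods, priors):
    """Return the hypothesis minimizing E(|T_pred - T_real| | T_obs)."""
    post = posterior(likelihoods, priors)

    def expected_abs_error(t_pred):
        return sum(abs(t_pred - h) * p for h, p in zip(hypotheses, post))

    return min(hypotheses, key=expected_abs_error)


# Hypothetical example: three temperature levels in degrees Celsius.
levels = [60, 80, 100]
print(decide(levels, likelihoods=[0.1, 0.6, 0.3], priors=[0.2, 0.5, 0.3]))
```

Because the observation space of a counter is finite, this rule could be precomputed for every possible reading and stored as a lookup table, which matches the idea of an encoder at the sensor output.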

  11. Sensor and fusion center co-design
      - How do we compress sensors so that:
      - They fit into the central register
      - Collectively they provide better accuracy: more compressed sensors vs. fewer non-compressed ones
      - Bit allocation problem:
      - Decide how to allocate a total of $M$ bits to $n$ sensors so that the overall expected estimation error is minimized (suppose $s_i$ is the number of bits allocated to sensor $i$):
      - Minimize $E(\mathrm{error}(s_1, s_2, \ldots, s_n))$ subject to $0 \le s_i \le b_i$ and $\sum_i s_i = M$
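For small instances, the bit allocation problem can be explored by exhaustive search. The sketch below is not the authors' method (the slides defer that to the paper). It assumes a hypothetical cost model in which each sensor contributes `weight * 2**(-bits)` to the expected error, standing in for $E(\mathrm{error}(s_1, \ldots, s_n))$, and it reads $b_i$ as the maximum number of bits sensor $i$ can use.

```python
from itertools import product


def allocate_bits(weights, max_bits, total_bits):
    """Search all allocations with 0 <= s_i <= b_i and sum(s_i) == total_bits."""

    def cost(alloc):
        # Hypothetical error model: each extra bit halves a sensor's error.
        return sum(w * 2.0 ** -s for w, s in zip(weights, alloc))

    candidates = (
        alloc
        for alloc in product(*(range(b + 1) for b in max_bits))
        if sum(alloc) == total_bits
    )
    best = min(candidates, key=cost)  # raises ValueError if nothing is feasible
    return best, cost(best)


# Hypothetical example: two sensors, at most 3 bits each, M = 4 bits in total.
print(allocate_bits(weights=[8.0, 2.0], max_bits=[3, 3], total_bits=4))
```

Exhaustive search grows exponentially with the number of sensors, so it is only useful for checking a faster method on small cases.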

  12. Sensor Compression
      - Target: reduce the overall expected error caused by sensor compression.
      - $\mathrm{TotalCost} = E(\mathrm{error}(s_1, s_2, \ldots, s_n)) = E\Big( \sum_{\forall i:\ \mathrm{grids}} \big| E(T_i \mid T_S^{c}) - E(T_i \mid T_S^{a}) \big| \Big)$
      - With $E(T \mid T_S) = \mu_T + \Sigma_{TS} \Sigma_{SS}^{-1} (T_S - \mu_S)$, this becomes $\mathrm{TotalCost} = E\Big( \sum_{\forall\ \mathrm{rows}\ i} \big| [\Sigma_{TS} \Sigma_{SS}^{-1} (T_S^{c} - T_S^{a})]_i \big| \Big)$
      - Different compression schemes lead to different overall errors.
      - Can be formulated as an optimization problem (see details in our paper).
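The step between the two cost expressions on this slide follows because the mean terms cancel when two conditional means are subtracted. A short derivation, assuming the superscripts $c$ and $a$ mark two versions of the sensor reading vector (for example compressed and uncompressed; the slide does not define them):

```latex
\begin{aligned}
E(T \mid T_S^{c}) - E(T \mid T_S^{a})
  &= \bigl[ \mu_T + \Sigma_{TS} \Sigma_{SS}^{-1} (T_S^{c} - \mu_S) \bigr]
   - \bigl[ \mu_T + \Sigma_{TS} \Sigma_{SS}^{-1} (T_S^{a} - \mu_S) \bigr] \\
  &= \Sigma_{TS} \Sigma_{SS}^{-1} \, (T_S^{c} - T_S^{a})
\end{aligned}
```

The estimation error caused by compression is therefore a fixed linear map of the reading error, so a sensor whose column in $\Sigma_{TS}\Sigma_{SS}^{-1}$ has large entries is more costly to compress.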

  13. Sensor Placement
      - Let $S$ and $T$ represent the set of sensor locations and all chip locations, respectively.
      - Problem formulation: choose $S \subset T$ with $|S| = n$ such that $\mathrm{trace}(\Sigma_{TT \mid S})$ is minimized.
      - As mentioned earlier, $\Sigma_{TT \mid S}$ represents the fundamental uncertainty/variance associated with our thermal estimates:
      - $\Sigma_{TT \mid S} = \Sigma_{TT} - \Sigma_{TS} \Sigma_{SS}^{-1} \Sigma_{ST}$
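One simple way to approach the placement objective is a greedy heuristic that adds, one at a time, the location whose observation removes the most total variance. This is an illustrative sketch and not the placement algorithm from the presentation, whose details are not in this transcript. It assumes noiseless sensors, so the covariance of the sensor readings is the matching sub-block of the chip covariance, and it uses the fact that conditioning on sensors one by one gives the same covariance as conditioning on them jointly.

```python
def place_sensors(cov, n):
    """Greedily pick n locations to reduce the trace of the conditional covariance."""
    sigma = [row[:] for row in cov]  # running conditional covariance
    size = len(sigma)
    chosen = []
    for _ in range(n):

        def gain(s):
            # Trace reduction from observing location s alone.
            return sum(sigma[j][s] ** 2 for j in range(size)) / sigma[s][s]

        candidates = [s for s in range(size)
                      if s not in chosen and sigma[s][s] > 1e-12]
        best = max(candidates, key=gain)
        col = [sigma[j][best] for j in range(size)]
        pivot = sigma[best][best]
        # Rank-one update: Sigma <- Sigma - col col^T / Sigma[best][best]
        sigma = [[sigma[i][j] - col[i] * col[j] / pivot for j in range(size)]
                 for i in range(size)]
        chosen.append(best)
    return chosen, sum(sigma[i][i] for i in range(size))


# Hypothetical 3-location covariance: locations 0 and 1 are strongly correlated.
cov = [[4.0, 3.0, 0.0],
       [3.0, 5.0, 0.0],
       [0.0, 0.0, 1.0]]
print(place_sensors(cov, 2))
```

Greedy selection does not guarantee the optimal subset, but each step costs only a rank-one update instead of a fresh matrix inversion.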

  14. Sensor Placement Algorithm

  15. Overall flow and interplay
      - [Flowchart] Inputs shown: design spec; statistical info; total size of CR = $M$
      - Sensor placement
      - Fusion center design
      - Bit allocation/sensor compression
      - Evaluate overall $E(\mathrm{error}) = \mathrm{trace}(\Sigma_{TT \mid S})$
      - Too much error? Yes: increase number of sensors. No: done.
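The outer loop of this flow can be sketched as a search over the number of sensors. The `evaluate_error` callable is a hypothetical stand-in for the whole inner pipeline (sensor placement, fusion center design, bit allocation, and error evaluation); its name, the default limits, and the toy error curve are assumptions for illustration.

```python
def design_loop(evaluate_error, max_error, start_sensors=1, max_sensors=64):
    """Increase the sensor count until the evaluated error is acceptable."""
    for n in range(start_sensors, max_sensors + 1):
        error = evaluate_error(n)  # place, allocate bits, evaluate E(error)
        if error <= max_error:     # "Too much error?" answered No
            return n, error
    raise ValueError("error target not met within max_sensors")


# Toy stand-in: pretend the error shrinks inversely with the sensor count.
print(design_loop(lambda n: 10.0 / n, max_error=2.5))
```

Because the register size $M$ is fixed, adding sensors also forces each one to be compressed further, so a real evaluation function need not decrease monotonically as this toy curve does.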

  16. Experimental Results
      - Fig. 1 Dynamic temperature tracking curves [plot of temperature (degrees Celsius) versus time (seconds), comparing actual temperature, our estimates, and range-based estimates]

  17. Experimental Results
      - Fig. 2 RMS error comparison when increasing the number of sensors

  18. Conclusion
      - We presented a unified statistical framework for designing a complete thermal sensing infrastructure.
      - Significant improvement in thermal sensing accuracy can be achieved with very small overhead.
      - Our methodology can trade off complexity for accuracy. It also takes into account various design considerations such as sensor noise and area constraints.

  19. Thank you!
