



  1. Selective Sensor Placement for Cost-Effective Online Aging Monitoring and Resilience Hao-Chun Chang, Li-An Huang, Kai-Chang Wu Department of Computer Science, National Chiao Tung University Yu-Guang Chen Department of Electrical Engineering, National Central University

  2. Outline Introduction • Background • Motivation • Contribution Preliminaries Proposed Framework Experimental Results Conclusion

  3. Background ⚫ Aging is a major challenge for reliability-aware IC/SoC design ⚫ The effects of device aging ⚫ Performance degradation ⚫ Potential failure

  4. Motivation ⚫ Traditional design methods adopt a guard band by adding extra timing margin ⚫ The Razor flip-flop is a well-known timing-speculation technique ⚫ It may need to be deployed widely throughout a circuit ⚫ The sensitization rate of critical paths is negligibly small ⚫ Many critical paths have common sub-paths in the forward section

  5. The Deployment [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3; Razor flip-flops RFF1 to RFF3 are marked.]

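  6. The Deployment [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3; a transition detector (TD) is marked at G3 alongside RFF1 to RFF3.]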

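  7. The Deployment [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3; a transition detector (TD) is marked at G2 alongside RFF1 to RFF3.]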

  8. Contribution ⚫ Reduction in hardware cost with insignificant performance loss ⚫ Tradeoff between extra hardware cost and performance by adjusting the weights of cost function

  9. Outline Introduction Preliminaries • Razor Flip-Flop • Transition Detector Proposed Framework Experimental Results Conclusion

  10. Razor Flip-Flop (RFF) ⚫ Consists of a regular main flip-flop, an additional shadow latch and some control logic

  11. Transition Detector (TD) ⚫ Detects whether a signal transition has occurred recently ⚫ Forecasts excessive aging from the forward gates ⚫ Sends a warning signal to stall for one clock cycle [Figure: TD schematic and timing diagram labeled with the monitored signal (D), Q, CLK, T1, T2, the detection window and the warning signal.]

  12. TD: Detection Window ⚫ Detects whether a signal transition has occurred recently ⚫ Reserves the execution time, under worst-case aging, for the remaining sub-paths ⚫ $DW_g = t + Tc - \varepsilon_w \times \mathrm{Max}(RD_{p,g})$, ∀ p monitored by $TD_g$ ⚫ $TD_g$ : Transition detector at gate g ⚫ $DW_g$ : Detection window on $TD_g$ ⚫ Tc : Clock period ⚫ t : Current clock cycle beginning time ⚫ $\varepsilon_w$ : (Worst-case) 10-year aging of the long path ⚫ $RD_{p,g}$ : The remaining delay of vulnerable path p after gate g

  13. Outline Introduction Preliminaries Proposed Framework • Problem Formulation • Exact Set Cover • Maximum Satisfiability Experimental Results Conclusion

  14. Problem Formulation ⚫ TD deployment to monitor vulnerable paths ⚫ A TD monitors the vulnerable paths passing through it ⚫ It’s a covering problem! [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3.]

  15. Exact Set Cover (ESC) ⚫ The universal set U : {1, 2, 3, 4} ⚫ Given a collection C of subsets of U ⚫ C : {{1, 3}, {2, 3}, {2, 4}, {1, 2}} ⚫ An exact cover is a sub-collection C* of C ⚫ Each element in U is covered by exactly one subset in C* ⚫ C* : {{1, 3}, {2, 4}}
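
The definition above can be checked on the slide's own example with a brute-force search. This is only a minimal Python sketch for illustration; the function name `exact_covers` is hypothetical, and exhaustive enumeration is practical only for tiny instances.

```python
from itertools import combinations

def exact_covers(universe, collection):
    """Yield every sub-collection that covers each element of U exactly once."""
    universe = frozenset(universe)
    subsets = [frozenset(s) for s in collection]
    for r in range(1, len(subsets) + 1):
        for combo in combinations(subsets, r):
            # Sizes summing to |U| plus union == U means no element is hit twice.
            same_size = sum(len(s) for s in combo) == len(universe)
            if same_size and frozenset().union(*combo) == universe:
                yield combo

U = {1, 2, 3, 4}
C = [{1, 3}, {2, 3}, {2, 4}, {1, 2}]
covers = [sorted(sorted(s) for s in combo) for combo in exact_covers(U, C)]
print(covers)  # [[[1, 3], [2, 4]]]
```

The only exact cover found is {{1, 3}, {2, 4}}, matching C* on the slide.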

  16. ESC: The Universal Set U ⚫ Each vulnerable path is an element of the universal set ⚫ U : {P1, P2, P3} [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3.]

  17. ESC: A Collection C of Subsets of U ⚫ Each gate on a vulnerable path (including the FFs and the PI) defines a set S_g, and S_g is a subset of the universal set ⚫ C : { S_PI, S_G1, S_G2, …, S_G8, S_FF1, …, S_FF3 } ⚫ S_G2 : {P1, P2, P3} ⚫ S_G3 : {P1, P2} ⚫ S_FF3 : {P3} [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3.]

  18. ESC: A Sub-collection C* of C ⚫ Each element in U is covered by exactly one subset in C* ⚫ C* : { S_G1 } = {{P1, P2, P3}} ⚫ C* : { S_G3, S_G6 } = {{P1, P2}, {P3}} ⚫ C* : { S_FF1, S_FF2, S_FF3 } = {{P1}, {P2}, {P3}} [Figure: example circuit with primary input PI, gates G1 to G8, flip-flops FF1 to FF3 and vulnerable paths P1 to P3.]
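
The mapping from the circuit to an exact-cover instance (slides 16 to 18) can be sketched in Python. The gate list for P2 follows the clause on slide 21; the lists for P1 and P3 (G4, G6, G7, and leaving G8 off the paths) are read off the figure labels and are assumptions. The names `PATHS`, `build_subsets` and `is_exact_cover` are hypothetical.

```python
# Gates visited by each vulnerable path (P1 and P3 are assumed, see above).
PATHS = {
    "P1": ["PI", "G1", "G2", "G3", "G4", "FF1"],
    "P2": ["PI", "G1", "G2", "G3", "G5", "FF2"],
    "P3": ["PI", "G1", "G2", "G6", "G7", "FF3"],
}

def build_subsets(paths):
    """S_g = set of vulnerable paths that pass through candidate g."""
    subsets = {}
    for path, gates in paths.items():
        for gate in gates:
            subsets.setdefault(gate, set()).add(path)
    return subsets

def is_exact_cover(chosen, subsets, universe):
    """True if every path is covered by exactly one chosen candidate."""
    covered = [p for g in chosen for p in subsets[g]]
    return sorted(covered) == sorted(universe)

S = build_subsets(PATHS)
U = set(PATHS)

for chosen in (["G1"], ["G3", "G6"], ["FF1", "FF2", "FF3"], ["G2", "G3"]):
    print(chosen, is_exact_cover(chosen, S, U))
```

The first three candidate sets are the exact covers listed on slide 18; {G2, G3} is rejected because P1 and P2 would be monitored twice.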

  19. Maximum Satisfiability ⚫ Find an assignment that makes the maximum number of clauses true ⚫ Weighted MAX-SAT solver ⚫ Minimizes the sum of costs of unsatisfied clauses ⚫ CNF: Conjunctive Normal Form ⚫ For each candidate g, introduce a new variable V_g ⚫ Hard clauses I & II ⚫ Boolean constraint ⚫ Soft clause ⚫ Cost function

  20. MAX-SAT: Hard Clause I ⚫ If two subsets S_g and S_h intersect, produce a hard clause ⚫ $(\neg V_g \vee \neg V_h)$, ∀ g, h; g ≠ h and S_g ∩ S_h ≠ ∅ ⚫ $(\neg V_{G2} \vee \neg V_{G3})$ [Figure: example circuit with a TD at G2; S_G2 = {P1, P2, P3} and S_G3 = {P1, P2} are annotated, and the example clause evaluates as F ∨ T.]

  21. MAX-SAT: Hard Clause II ⚫ For each vulnerable path p, produce a hard clause ⚫ $\bigvee_{g:\, p \in S_g} V_g$ ⚫ $V_{PI} \vee V_{G1} \vee V_{G2} \vee V_{G3} \vee V_{G5} \vee V_{FF2}$ [Figure: example circuit with RFF2 at FF2; the literals of the example clause are annotated F ∨ F ∨ F ∨ F ∨ F ∨ T.]
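
Both families of hard clauses can be generated mechanically from the sets S_g. The Python sketch below is illustrative only: a literal is written as a `(variable, polarity)` pair, the names `PATHS`, `hc1` and `hc2` are hypothetical, and the gate lists for P1 and P3 are assumptions read off the figure labels (only P2 is confirmed by the clause on slide 21).

```python
from itertools import combinations

PATHS = {
    "P1": ["PI", "G1", "G2", "G3", "G4", "FF1"],  # assumed
    "P2": ["PI", "G1", "G2", "G3", "G5", "FF2"],  # matches slide 21
    "P3": ["PI", "G1", "G2", "G6", "G7", "FF3"],  # assumed
}

# S[g] = vulnerable paths passing through candidate g.
S = {}
for path, gates in PATHS.items():
    for gate in gates:
        S.setdefault(gate, set()).add(path)

# Hard clause I: (not V_g or not V_h) for every pair of intersecting subsets,
# so no path is monitored by two sensors.
hc1 = [[(g, False), (h, False)]
       for g, h in combinations(sorted(S), 2) if S[g] & S[h]]

# Hard clause II: for each path, at least one candidate on it is selected.
hc2 = {p: [(g, True) for g in sorted(S) if p in S[g]] for p in PATHS}

print(len(hc1), "clauses of type I")
print("P2:", " v ".join(f"V_{g}" for g, _ in hc2["P2"]))
```

Together the two families force each path to be covered exactly once, which is the exact-cover condition.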

  22. MAX-SAT: Soft Clause ⚫ For each candidate g with COST(g), produce a soft clause ⚫ $(\neg V_g)$ with weight COST(g) ⚫ The minimum sum of costs of unsatisfied clauses ⚫ The minimum sum of costs of C*

  23. Definition: Cost ⚫ Cost function ⚫ $COST(g) = W_M \times \sum_{p \in S_g} M(g, p) + W_A \times A_g$ ⚫ W_M : The weight of misprediction rate ⚫ M(g, p): Misprediction rate at gate g on the vulnerable path p ⚫ W_A : The weight of area cost ⚫ A_g : Area cost for deployment ⚫ TD: 1.5 ⚫ RFF: 1

  24. Cost: Sensitization Rate ⚫ Probability that the consecutive side inputs from the starting point hold non-controlling values ⚫ SR(G2, P1) = SR(G2, P2) = SR(G2, P3) = 0.5*(1-0.7) ⚫ SR(G3, P1) = SR(G3, P2) = 0.15*(1-0.6) ⚫ Sensitization rate ⚫ Probability of side inputs being 1 ⚫ SR(g, p): Sensitization rate at gate g on path p [Figure: example circuit annotated with side-input probabilities 0.6, 0.1, 0.7, 0.3 and 0.4.]
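
The two products on this slide follow one pattern: the rate reaching a gate is the upstream rate times the probability that each side input holds a non-controlling value. A minimal Python sketch, assuming (as the slide's arithmetic suggests) that the non-controlling value is 0 so each factor is 1 minus the probability of the side input being 1; the function name `sensitization_rate` is hypothetical.

```python
def sensitization_rate(upstream_sr, side_input_probs_of_one):
    """Scale the upstream rate by P(side input is non-controlling) per input.

    Assumption: the non-controlling value is 0, so each factor is 1 - P(1).
    """
    sr = upstream_sr
    for p_one in side_input_probs_of_one:
        sr *= 1.0 - p_one
    return sr

sr_g2 = sensitization_rate(0.5, [0.7])     # 0.5 * (1 - 0.7)
sr_g3 = sensitization_rate(sr_g2, [0.6])   # 0.15 * (1 - 0.6)
print(round(sr_g2, 3), round(sr_g3, 3))    # 0.15 0.06
```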

  25. Cost: Misprediction Rate ⚫ Probability that a transition is sensitized from the starting point to gate g but does not propagate to the endpoint ⚫ $M(g, p) = SR(g, p) \times \left(1 - SR(FF, p) / SR(g, p)\right)$ ⚫ SR(G2, p) = 0.15, ∀ p ∈ S_G2 ⚫ SR(FF1, P1) = 0.06, SR(FF2, P2) = 0.006, SR(FF3, P3) = 0.042 ⚫ M(G2, P1): 0.09 ⚫ M(G2, P2): 0.144 ⚫ M(G2, P3): 0.108 ⚫ $\sum_{p \in S_{G2}} M(G2, p)$: 0.342 ⚫ Sensitization rate ⚫ SR(g, p): Sensitization rate at gate g on path p ⚫ M(g, p): Misprediction rate at gate g on the vulnerable path p
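
The figures on this slide can be reproduced directly from the definition M(g, p) = SR(g, p) × (1 − SR(FF, p) / SR(g, p)). A short Python check, using only the sensitization rates shown on the slide; the names `misprediction`, `SR_G2` and `SR_FF` are hypothetical.

```python
def misprediction(sr_gate, sr_ff):
    """M(g, p): sensitized up to gate g but not propagated to the endpoint FF."""
    return sr_gate * (1.0 - sr_ff / sr_gate)

SR_G2 = 0.15                                      # SR(G2, p) for every p in S_G2
SR_FF = {"P1": 0.06, "P2": 0.006, "P3": 0.042}    # SR(FF, p) at each endpoint

m_g2 = {p: misprediction(SR_G2, sr_ff) for p, sr_ff in SR_FF.items()}
total = sum(m_g2.values())

for p, m in m_g2.items():
    print(f"M(G2, {p}) = {m:.3f}")
print(f"sum = {total:.3f}")
```

Algebraically the expression reduces to SR(g, p) − SR(FF, p), so a sensor placed at the endpoint itself has a misprediction rate of zero.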

  26. The Input & Output of MAX-SAT ⚫ Finally, combine the three kinds of clauses ⚫ HC1 ∧ HC2 ∧ SC ⚫ The deployment is given by the output of the MAX-SAT solver ⚫ Expected: an acceptable solution exists ⚫ Or no more improvement ⚫ Unexpected: adjust W_M and W_A
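
To illustrate how the combined formula selects a deployment, the sketch below solves a tiny weighted partial MAX-SAT instance by exhaustive search in Python. It is not the solver used in the talk. The candidate set is restricted to G2, G3 and the three flip-flops (an assumption, since the slides give rates only for those), and the costs assume W_M : W_A = 5:1, with G2 computed as 5 × 0.342 + 1.5 = 3.21 and G3 and the RFFs taken from slide 28.

```python
from itertools import combinations, product

# S[g]: vulnerable paths covered by candidate g (restricted candidate set).
S = {"G2": {"P1", "P2", "P3"}, "G3": {"P1", "P2"},
     "FF1": {"P1"}, "FF2": {"P2"}, "FF3": {"P3"}}
COST = {"G2": 3.21, "G3": 1.77, "FF1": 1.0, "FF2": 1.0, "FF3": 1.0}

names = sorted(S)
paths = sorted(set().union(*S.values()))

# A clause is a list of (variable, required_value) literals.
hard = [[(g, False), (h, False)]
        for g, h in combinations(names, 2) if S[g] & S[h]]        # HC1
hard += [[(g, True) for g in names if p in S[g]] for p in paths]  # HC2
soft = [([(g, False)], COST[g]) for g in names]                   # SC

def satisfied(clause, assign):
    return any(assign[var] == val for var, val in clause)

best = None
for values in product([False, True], repeat=len(names)):
    assign = dict(zip(names, values))
    if not all(satisfied(c, assign) for c in hard):
        continue  # hard clauses must all hold
    penalty = sum(w for c, w in soft if not satisfied(c, assign))
    if best is None or penalty < best[0]:
        best = (penalty, sorted(g for g in names if assign[g]))

print(best)
```

Under these assumptions the search picks G3 and FF3, which agrees with the Case 2 deployment (TD at G3 plus RFF3).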

  27. Weight Adjustment: Case 1 ⚫ W_M : W_A = 1:10 ⚫ More consideration on hardware cost ⚫ Total cost: 15.342 [Figure: example circuit with a single TD at G2, annotated with cost 15.342.]

  28. Weight Adjustment: Case 2 ⚫ W_M : W_A = 5:1 ⚫ More consideration on performance ⚫ Total cost: 1.77+1 [Figure: example circuit with a TD at G3 and RFF3 at FF3.]

  29. Weight Adjustment: Case 3 ⚫ W_M : W_A = 10:1 ⚫ More consideration on performance ⚫ Total cost: 1+1+1 [Figure: example circuit with RFF1, RFF2 and RFF3 at the three flip-flops, each annotated with cost 1.]
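
The three cases can be cross-checked by evaluating COST(g) = W_M × Σ M(g, p) + W_A × A_g for each deployment under each weight ratio. A rough Python sketch: the summed misprediction rate 0.342 for a TD at G2 comes from slide 25, 0.054 for a TD at G3 is inferred from the 1.77 on slide 28 (5 × 0.054 + 1.5), and an RFF at an endpoint is taken to have zero misprediction. Only these three deployments are compared; the names `AREA`, `DEPLOYMENTS` and `total_cost` are hypothetical.

```python
AREA = {"TD": 1.5, "RFF": 1.0}

# Each sensor: (type, misprediction rate summed over the paths it covers).
DEPLOYMENTS = {
    "TD@G2": [("TD", 0.342)],
    "TD@G3 + RFF3": [("TD", 0.054), ("RFF", 0.0)],
    "RFF1 + RFF2 + RFF3": [("RFF", 0.0)] * 3,
}

def total_cost(sensors, w_m, w_a):
    """Sum of W_M * (summed M) + W_A * area over all sensors in a deployment."""
    return sum(w_m * m + w_a * AREA[kind] for kind, m in sensors)

winners = {}
for w_m, w_a in [(1, 10), (5, 1), (10, 1)]:
    costs = {name: total_cost(s, w_m, w_a) for name, s in DEPLOYMENTS.items()}
    winners[(w_m, w_a)] = min(costs, key=costs.get)
    print(f"W_M:W_A = {w_m}:{w_a}",
          {name: round(c, 3) for name, c in costs.items()},
          "->", winners[(w_m, w_a)])
```

Among these three options, the cheapest deployment under each ratio is the one shown for that case: a single TD at G2, then a TD at G3 with RFF3, then three RFFs.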

  30. Outline Introduction Preliminaries Proposed Framework Experimental Results Conclusion

  31. Setting ⚫ The benchmarks are chosen from IWLS’05 and ISCAS’89 ⚫ The technology used is the TSMC 65nm GP standard cell series ⚫ Timing reports are provided by ITRI ⚫ Workstation: ⚫ Linux ⚫ C++ ⚫ MAXHS 2.9

  32. Outline Introduction Preliminaries Proposed Framework Experimental Results Conclusion

  33. Conclusion ⚫ Cost-effective TD deployment for aging resilience with insignificant performance loss ⚫ Tradeoff between extra hardware cost and performance

  34. The End Thanks for listening!
