Stochastic Analysis of Bubble Razor
Guowei Zhang, Department of Microelectronics and Nanoelectronics, Tsinghua University, China
Peter A. Beerel, Ming Hsieh Electrical Engineering Department, Univ. of Southern California, USA
Why Bubble Razor?
• Notable delay variations in ICs
  – Process, temperature, and voltage variations
  – Aging effects
• Traditional synchronous design
  – Too much timing margin
  – Performance and energy loss
• Existing resilient designs: detect errors and change frequency or voltage
  – Canary circuits [Nakai 2005, Hirair 2012]
    • Delay chains mimic critical path
  – Razor I & II & Lite [Ernst 2003, Park 2013, Kim 2013, Tokunaga 2009]
    • In situ error detection and correction
    • Recover via architectural replay
26-Mar-14 Guowei Zhang, Tsinghua University and Peter A. Beerel, USC
Why Bubble Razor?
• Problem: Adoption challenge
  – Detection: Necessary to analyze inserted error signals
  – Correction: Necessary to implement replay
• Bubble Razor [Fojtik et al., ISSCC 2012] [Fojtik et al., JSSC 2013]
  – Architecture independent
  – Latch-based structure
    • Latches don't change together
    • No architectural replay requirement
  – Local correction mechanism
    • Single stall recovery
    • Suitable for large circuits (no global control requirement)
Our focus: Analysis of Bubble Razor
• Factors affecting performance are unclear
  – Number of stages in pipeline
  – Delay variance
  – Probability of a timing error
• Thus, effectiveness and sensitivities are not quantified
• Our proposed solution
  – Analyze performance using Markov Chains
    • Focus on an N-stage pipeline ring
    • Consider both normal and log-normal delays
  – Propose simplified model for other pipeline structures
Bubble Razor
• Mechanism
  – 2-stage Bubble Razor Ring
Quantifying Performance of Bubble Razor
• Clock Cycle Time (C)
  – Cycle time of global clock
• Effective Clock Cycle Time (EC)
  – Average time to process each instruction
  – EC ≥ C because of pipeline stalls
• Performance of Bubble Razor
  – Assume we can somehow find π(working)
    • Probability of a latch processing an instruction
  – Then, EC = C / π(working)
• Examples
  – No timing violations: π(working) = 1, so EC = C
  – Timing violation every instruction: π(working) = 1/2, so EC = 2C
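The relation EC = C / π(working) can be checked numerically against the two examples on this slide. The following is a minimal sketch; the function name `effective_cycle_time` and its argument names are illustrative and do not come from the original slides.

```python
def effective_cycle_time(C, pi_working):
    """Return EC = C / pi(working).

    C          -- global clock cycle time (any time unit)
    pi_working -- probability that a latch is processing an instruction
    """
    if not 0.0 < pi_working <= 1.0:
        raise ValueError("pi_working must lie in (0, 1]")
    return C / pi_working


if __name__ == "__main__":
    # No timing violations: every cycle does useful work.
    print(effective_cycle_time(1.0, 1.0))
    # A timing violation on every instruction: half the cycles are stalls.
    print(effective_cycle_time(1.0, 0.5))
```

Because π(working) cannot exceed 1, this form makes it explicit that EC can never drop below C.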
Markov Chain Analysis
• Circuit State
  – If modeled correctly, next circuit state depends on current state and probability of error p
  – System is then a Markov Chain
• Our approach
  – Define circuit state as combination of latch states
• Description of all latch states
  Working: Is a timing violation detected?
    No                    W
    Yes                   E
  Stalling: To whom does this latch send bubbles?
    Neither               N
    Right Neighbor (RN)   R
    Left Neighbor (LN)    L
    Both                  B
Markov Chain Analysis
• Circuit state transition rule
  – Easily derived from latch state transition rule
• Latch state transition rule
  LN RN Next State W L L, B E L (annihilation) B, R N (annihilation) B, R R E B W, E, N, R N, L W W with probability of (1-p) W E with probability of p
• Leads to stationary distribution of circuit states, and thus latch states [ π(working) = π(W) + π(E) ]
  – Expressed as a function of p and N
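To illustrate what "stationary distribution" means here, the sketch below computes one for a deliberately tiny, hypothetical two-state chain (working, stalled). This is not the authors' circuit-state chain: the states, the stall probability `q`, and the function names are all assumptions made for illustration. From the working state the toy chain stalls with probability q, and a stall always lasts exactly one step.

```python
def stationary(P, iters=500):
    """Approximate the stationary distribution of transition matrix P.

    Iterates on the lazy chain (P + I) / 2, which has the same stationary
    distribution as P but is aperiodic, so the iteration also converges
    for periodic chains.
    """
    n = len(P)
    lazy = [[0.5 * P[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * lazy[i][j] for i in range(n)) for j in range(n)]
    return pi


def toy_chain(q):
    """Hypothetical 2-state chain: state 0 = working, state 1 = stalled."""
    return [[1.0 - q, q],   # working -> working / stalled
            [1.0, 0.0]]     # stalled -> working (single-step stall)


if __name__ == "__main__":
    for q in (0.0, 0.5, 1.0):
        pi_working = stationary(toy_chain(q))[0]
        print(q, pi_working, 1.0 / pi_working)  # last column is EC / C
```

For this toy chain the stationary probability of working is 1 / (1 + q), so q = 1 gives π(working) = 1/2 and EC = 2C, matching the "timing violation every instruction" case.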
Markov Chain Analysis Results
• Closed-form formulas
[Figure: MC Model curves for N = 1, 2, 3, 4]
Markov Chain: State Explosion Problem
• Transition Probability Matrix (T)
  – Product of 2 transition matrices
    • Each represents one clock phase
  – Can be reduced
• Size of T
  N   Theoretical size   Optimized size               Final size
      = 6 ^ (2N)         (impossible states deleted)  (unreachable states deleted)
  1   36                 7                            5
  2   1,296              45                           21
  3   46,656             301                          95
  4   1,679,616          2,017                        449
• The problem
  – Not feasible for large N
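The theoretical column follows from six possible states per latch (W, E, N, R, L, B) and 2N latches in an N-stage ring. A short sketch that reproduces it; the function name `theoretical_size` is illustrative, and the optimized and final sizes are not computed here because they depend on the pruning of impossible and unreachable states.

```python
def theoretical_size(N, states_per_latch=6):
    """Number of circuit states before pruning: 6 ^ (2N) for an N-stage ring."""
    return states_per_latch ** (2 * N)


if __name__ == "__main__":
    for N in range(1, 5):
        print(N, f"{theoretical_size(N):,}")
```

The count grows by a factor of 36 per added stage, which is why the unpruned matrix becomes impractical so quickly.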
Our Solution: Simplified Analysis
• Stalling is related to all other latches
  – But, can assume they are independent
  – 2N independent possible sources of a stall
• Thus
  – Probability (Stalling) = 1 - ( 1 - p ) ^ (2N)
  – EC = C + C * Probability (Stalling) = C [ 2 - ( 1 - p ) ^ (2N) ]
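The simplified model is a one-line formula, so it is easy to evaluate directly. A minimal sketch, with illustrative function names that are not from the slides:

```python
def stall_probability(N, p):
    """Probability (Stalling) = 1 - (1 - p)^(2N), with 2N independent sources."""
    return 1.0 - (1.0 - p) ** (2 * N)


def simplified_ec(C, N, p):
    """EC = C + C * Probability (Stalling) = C * [2 - (1 - p)^(2N)]."""
    return C * (1.0 + stall_probability(N, p))


if __name__ == "__main__":
    for N in (1, 4):
        for p in (0.0, 0.05, 0.25, 1.0):
            print(N, p, simplified_ec(1.0, N, p))
```

At p = 0 the model returns EC = C, and at p = 1 it returns EC = 2C for any N, the same two endpoints as the examples given for EC = C / π(working).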
Markov Chain versus Simplified Models
[Figure: MC Model and Simplified Model curves for N = 1 and N = 4]
• EC ~ C, N, p
  – Simplified model is conservative
  – Results for small p are close
Delay Distribution
• Delay of pipeline stage (d) is a random variable
  – p = Probability (d > C/2)
  – Two kinds of delay distribution
    • Normal distribution
    • Log-normal distribution [Zhai 2005, Chandrakasan 2005]
  – μ: mean of a delay
  – σ: standard deviation of a delay
• Final results
  – EC is a function of
    • C: Clock cycle time
    • N: Number of stages
    • σ/μ: represents delay variance
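The definition p = Probability (d > C/2) can be evaluated for both distributions with the standard library. The sketch below assumes that μ and σ are the mean and standard deviation of the delay d itself, as the slide states, so the log-normal case first converts them to the parameters of ln(d). All function names are illustrative.

```python
import math


def normal_tail(x, mu, sigma):
    """P(d > x) for d ~ Normal(mu, sigma^2)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def lognormal_params(mu, sigma):
    """Mean m and std s of ln(d) for a log-normal d with mean mu and std sigma."""
    s2 = math.log(1.0 + (sigma / mu) ** 2)
    m = math.log(mu) - 0.5 * s2
    return m, math.sqrt(s2)


def lognormal_tail(x, mu, sigma):
    """P(d > x) for a log-normal d with mean mu and std sigma."""
    m, s = lognormal_params(mu, sigma)
    return 0.5 * math.erfc((math.log(x) - m) / (s * math.sqrt(2.0)))


def error_probability(C, mu, sigma, dist="normal"):
    """p = P(d > C/2) for the chosen delay distribution."""
    x = C / 2.0
    if dist == "normal":
        return normal_tail(x, mu, sigma)
    if dist == "lognormal":
        return lognormal_tail(x, mu, sigma)
    raise ValueError("dist must be 'normal' or 'lognormal'")


if __name__ == "__main__":
    mu = 0.5            # example mean delay
    sigma = 0.4 * mu    # example sigma/mu = 0.4
    for C in (1.0, 1.5, 2.0):
        print(C, error_probability(C, mu, sigma, "normal"),
              error_probability(C, mu, sigma, "lognormal"))
```

Feeding the resulting p into either the Markov Chain model or the simplified model gives EC as a function of C, N, and σ/μ.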
Systematic Error Rate
• How do N, σ/μ, distribution type (normal / log-normal) and model type (Markov Chain / Simplified) influence EC?
  – EC ~ C, N, σ/μ
• Is Bubble Razor better than traditional counterparts?
  – Condition
    • Constrained with same Systematic Error Rate
      – For example, 0.1%
  – Performance metric
    • EC(Bubble Razor) / EC(Traditional circuit)
    • Set μ = 0.5
Performance Analysis Results (1 of 3)
• Vary pipeline depth N
  – Set σ/μ = 0.4
  – Assume log-normal distribution
• Results
  – BR always better than sync
  – As N↑, benefits drop slightly (2%)
[Figure: curves for N = 1, 2, 3, 4]
Performance Analysis Results (2 of 3)
• Vary σ/μ
  – Assume N = 4
  – Assume log-normal distribution
• Results
  – σ/μ↑, benefit improves
  – For σ/μ = 0.5, EC is reduced by 40.2%
[Figure: curves for σ/μ = 20%, 30%, 40%, 50%]
Performance Analysis Results (3 of 3)
• Vary distribution and model
  – Assume N = 4
  – Assume σ/μ = 0.25
• Results
  – Distribution type (line color) impacts benefits
  – Simplified model tracks well ( < 5% )
    • p is usually < 25%
[Figure: curves for Log-Normal and Normal delays, each under the MC Model and the Simplified Model]
Optimizing Strategy
• Point A
  – Constrained by SER
  – Easy to realize
    • Increase f until PoFF
• Point B
  – Local minimum of Effective Cycle (EC) time
[Figure: curves for σ/μ = 20%, 30%, 40%, 50%]
           Better than traditional sync      Optimal (for every N)
  Point A  σ/μ ≥ 16% (moderate variance)     σ/μ ≥ 31% (high variance)
  Point B  σ/μ ≥ 3%                          σ/μ ≤ 28%
Summary and Conclusions
• Proposed analytical methods for analyzing Bubble Razor
  – Markov Chain & Simplified Model
  – The latter is a conservative, close approximation
• Bubble Razor indeed has better performance than traditional circuits, especially under high delay variance
• Setting clock cycle time as short as possible is often efficient
Thank you!
Guowei Zhang, Department of Microelectronics and Nanoelectronics, Tsinghua University, China
Peter A. Beerel, Ming Hsieh Electrical Engineering Department, Univ. of Southern California, USA