Does noise affect τ?
• Probability of escape from metastability does not change with Gaussian noise (Couranz and Wann 1975)
[Figure: trajectories; Volts (−0.7 to 0.7) vs. Time (0 to 2)]
Tutorial 7 April 2008

Noise can change the input time
[Figure: D flip-flop (D, Clock, Q) with D and Clock waveforms separated by Δt_in; Δt_in -> 0]
Or maybe not...

The normal case
[Figure: three probability distributions vs. time]
• P1(v): probability of initial difference due to noise component (t_n)
• P0(v): probability of initial difference due to input clock data overlap (T >> t_n)
• P(v): result of convolution

The malicious input
[Figure: three probability distributions vs. time]
• P1(v): probability of initial difference due to noise component (t_n)
• P0(v): probability of initial difference with zero input clock data overlap (T << t_n)
• P(v): result of convolution

Noise measurement
[Figure: probability of an output 1 as a function of input voltage difference; Probability (0 to 1) vs. Input (axis ticks −0.0030 to 0.0030)]
• A measurement of approximately 1.7 mV RMS at the input corresponds to about 0.6 mV total between latch nodes
• $\sqrt{4kT/C} \approx 0.7\ \text{mV}$
• This is equivalent to about 0.1 ps
• Typically this leads to a synchronization time about 11τ longer than the simple case for a malicious input.

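The thermal-noise figure above can be checked with a few lines of Python. This is only a sketch: the square-root form of the expression, the temperature of 300 K, and the function names are assumptions, since the slide gives only "4kT/C ≈ 0.7 mV" and no node capacitance.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def noise_rms_volts(c_farads, temp_k=300.0):
    """RMS noise voltage sqrt(4kT/C) for a node capacitance C (assumed form)."""
    return math.sqrt(4.0 * K_BOLTZMANN * temp_k / c_farads)


def capacitance_for_noise(v_rms, temp_k=300.0):
    """Capacitance that would give the stated RMS noise under the same formula."""
    return 4.0 * K_BOLTZMANN * temp_k / v_rms ** 2


# Capacitance implied by the slide's 0.7 mV at an assumed 300 K
c_implied = capacitance_for_noise(0.7e-3)
print(f"implied C = {c_implied * 1e15:.1f} fF")
```

Under these assumptions the implied capacitance is a few tens of femtofarads, which is at least a plausible size for a latch node.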
Outline
• What’s the problem? Why does it matter?
• Synchronizer and arbiter circuits
• Noise, and its effects
• Latency, and how to overcome it
• Metastability measurement 1 – Simple measurements
• Arbitration
• Metastability measurement 2 – Second order effects (which may matter)

Request and Acknowledge
[Figure: REQ (Data Available) passes through two flip-flops on the read clocks; ACK (Read done) passes through two flip-flops on the write clocks; DATA path alongside]

Latency
• It takes one to two receive clocks to synchronise the request
• Then one to two write clocks to acknowledge it
• Significant latency (1–3 clocks)
• Poor data rate (2–6 clocks)

FIFO
• Can improve data rate by using a FIFO
• But not latency (which gets worse)
• FIFO is asynchronous (usually RAM + read and write pointers)
[Figure: FIFO between DATA in and DATA out; Full is synchronized through two flip-flops (write clocks 1 and 2) to give Free to write; Not Empty is synchronized through two flip-flops (read clocks 1 and 2) to give Data Available; WRITE (Write Data) and READ (Read done) controls]

Timing regions can have predictable relationships
• Locked
  – Two clocks are not the same but phase linked. The relationship is known as mesochronous.
  – Two clocks from same source
  – Linked by PLL
  – One produced by dividing the other
  – Some asynchronous systems
  – Some GALS
• Not locked together
  – Phase difference can drift in an unbounded manner. This relationship is called plesiochronous.
  – Two clocks same frequency, but different oscillators
  – As above, same frequency ratio

Don’t synchronise when you don’t need to
• If the two clocks are locked together, you don’t need a synchroniser, just an asynchronous FIFO big enough to accommodate any jitter/skew
• FIFO must never overflow, so there is latency
[Figure: FIFO with DATA in and DATA out; REQ IN, ACK IN, REQ OUT, ACK OUT; Write Data, Data Available, Read done]

Mesochronous data exchange
• Intermediate X register used to retime data
• Need to find a place where write data is stable, and read register available
  – Greenstreet 2004
[Figure: registers W, X, R between DATA In and DATA Out; Write Clock, Read Clock and Controller]

Finding the place to clock X
• Provided that t_c > 2(t_h + t_s), at least one place is always available for data transfer, but we lose one cycle.
  – Write before read, or
  – Read before write
[Figure: Write Clock and Read Clock waveforms with t_h and t_s windows; OK regions marked RW and WR]

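The timing condition on the slide above can be written as a one-line check. This is a minimal sketch; the function name and the example numbers are hypothetical and are not taken from the tutorial.

```python
def transfer_slot_exists(t_c, t_h, t_s):
    """True if the clock period t_c leaves a safe place to clock X,
    i.e. t_c > 2 * (t_h + t_s), with t_h = hold time and t_s = setup time."""
    return t_c > 2.0 * (t_h + t_s)


# Hypothetical example: 1 ns clock period, 100 ps hold, 150 ps setup
print(transfer_slot_exists(1e-9, 100e-12, 150e-12))
```

With these example values the combined setup and hold windows occupy 500 ps of a 1 ns cycle, so the condition holds.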
Pre-synchronizing
• If the phase can vary with time (plesiochronous), synchronization still need not cause large latencies
• Synchronization problem known in advance
[Figure: Read Clock and Write Clock waveforms; "Detect conflict" marks a potential conflict (metastability issue) zone of width d either side of the clock edge; on a predicted conflict, delay read clock]

Conflict prediction
• Predict when clocks are going to conflict and delay synchronization
• Dike’s conflict detector
[Figure: two MUTEX elements (R1, R2, G1, G2) fed by WCLK and RCLK, with RCLK delayed by d; the conflict output marks a conflict region of d either side of RCLK]

Clock delay synchronizer (Ginosar 2004)
[Figure: DATA through REG; conflict detector on WCLK and RCLK; SYNC; 0/1 multiplexer selecting RCLK or RCLK delayed by t_KO; conflict region of d either side of RCLK]

Pre-synchronizer latency
• Nominally 0–1 clock cycle
• Relies on accurately predicting conflicts
• Clocks must remain stable over synchronisation time.
• Always lose t_KO of next computation stage
• Alternative: shift all conflicts to next read cycle
  – On average this loses 2d
  – 2d must be big enough to cover any clock drift/jitter over synchronization time

Speculation
• Mostly, the synchronizer does not need 30τ to settle
• Only e^-13 (0.00023%) need more than 13τ
• Why not go ahead anyway, and try again if more time was needed?

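The percentage quoted on the slide can be reproduced directly. This is a sketch that takes the slide at face value: the probability of needing more than n time constants is treated as simply e^-n, with any leading constant assumed to be 1. The function name is hypothetical.

```python
import math


def prob_needs_more_than(n_tau):
    """Probability that settling takes longer than n_tau time constants,
    using the slide's simple e^-n model (leading constant assumed to be 1)."""
    return math.exp(-n_tau)


p13 = prob_needs_more_than(13)
print(f"{p13 * 100:.5f} %")  # percentage of events needing more than 13 tau
```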
Low latency synchronization
• Data Available, or Free to write are produced early.
• If they prove to be in error, synchronization failed.
• Read Fail or Write Fail flag is then raised and the action can be repeated.
[Figure: FIFO between DATA in and DATA out; Full feeds a speculative synchronizer (Write clock) giving Free to write and Write Fail; Not Empty feeds a speculative synchronizer (Read Clock) giving Data Available and Read Fail; WRITE (Write Data) and READ (Read done) controls]

Q Flop
• With CLK low, both outputs are low
• With CLK high, Q becomes equal to D only after metastability
• Q and Qbar are both low until metastability resolved
• We can detect events that take longer than a half cycle
[Figure: Q Flop circuit with D, CLK, Q, Qbar and Gnd]

Was it OK?
• FF#1 is set after a half cycle − 2τ, FF#2 after a half cycle, FF#3 at a full cycle
• Latency is normally half a cycle = 15τ, but synchroniser fails often
• By the time we look at the Read Fail signal (a full cycle = 30τ) all signals are stable
[Figure: Not Empty drives a Q Flop clocked by Read Clock; Q and QBAR feed FF#1 (Early Synch), FF#2 (Speculative Synch, output Data Available), FF#3 and FF#4 (Final Synch), with 2τ delays; Read Fail output; DATA path]

When to recover
FF1 (Early) | FF2 (Speculative) | FF3, 4 (Fail) | Comment
Half cycle – 2τ / 13τ | Half cycle / 15τ | End of cycle / 30τ |
? | ? | metastable? | Unrecoverable error, probability low.
0 | 0 | 0 | No data was available
0 | 1 | 1 | Stable at the end of the cycle, but the speculative output may have been metastable. Return to original state
1 | 1 | 0 | Normal data transfer

Speculative synchronisation latency
• Recovery means restoring any corrupted registers, and may take some time, BUT
• Probability of recovery operation is e^-13, so little time lost on average.
• Can reduce average synchronization latency from one cycle to a half cycle

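The "little time lost on average" argument can be made concrete with a simple expected-value estimate. This is a sketch under stated assumptions: latency is a half cycle when speculation succeeds, a recovery costs a fixed extra number of cycles, and a recovery happens with probability e^-13. The recovery cost of 100 cycles is a hypothetical figure chosen only for illustration.

```python
import math


def average_latency_cycles(recovery_cycles, settle_tau=13.0, normal_cycles=0.5):
    """Expected synchronization latency in clock cycles when a recovery of
    `recovery_cycles` extra cycles is needed with probability e^-settle_tau."""
    p_recover = math.exp(-settle_tau)
    return normal_cycles + p_recover * recovery_cycles


# Hypothetical recovery cost of 100 cycles
avg = average_latency_cycles(100)
print(f"average latency = {avg:.6f} cycles")
```

Even with a deliberately pessimistic recovery cost, the average stays within a small fraction of a percent of the half-cycle figure.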
Comments
• Synchronization/arbitration requires special circuit elements
• They’re not digital!
• If there’s a real choice, and bounded time, you will have failures.
• The MTBF can be made longer than the life of the universe
• Design gets more difficult with small dimensions
• Latency is a problem, but not insuperable.
• Synchronizers are not deterministic.

Outline
• What’s the problem? Why does it matter?
• Synchronizer and arbiter circuits
• Noise, and its effects
• Latency, and how to overcome it
• Metastability measurement 1 – Simple measurements
• Arbitration
• Metastability measurement 2 – Second order effects (which may matter)

Testing synchronizers
[Figure: two oscillators (Osc 1, Osc 2) drive Data at 10.01 MHz and Clock at 10 MHz on flip-flop #1; Q output goes to the scope and provides the scope trigger; 100 ps window marked]
• Data and Clock are asynchronous
• Q only changes if Data and Clock edges are within 100 ps (1 in 1000)

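The "1 in 1000" figure follows from the ratio of the 100 ps window to the 100 ns clock period. The sketch below works this out and also shows how far the data edge slips against the clock edge each cycle for the two frequencies on the slide. The function names are hypothetical, and a uniform spread of data edges across the clock period is assumed.

```python
import math


def capture_fraction(window_s, f_clock_hz):
    """Fraction of clock edges with a data edge inside the window,
    assuming data edges are uniformly spread over the clock period."""
    return window_s * f_clock_hz


def slip_per_cycle(f_clock_hz, f_data_hz):
    """Change in data-to-clock edge offset from one cycle to the next."""
    return abs(1.0 / f_clock_hz - 1.0 / f_data_hz)


fraction = capture_fraction(100e-12, 10e6)   # 100 ps window, 10 MHz clock
slip = slip_per_cycle(10e6, 10.01e6)         # 10 MHz clock, 10.01 MHz data
print(fraction, slip)
```

With these frequencies the edges slide past each other by roughly 100 ps per cycle, so about one clock edge in a thousand lands inside the window.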
Event histogram
[Figure: flip-flop #1; histograms of number of events and log(number of events) vs. Q to clock delay]
• t = Clock to Q time
• Trigger from Q going high
• Observe clock, so scale is negative
• Log scale of events because
  $$\text{Events} = \frac{T_{\text{Elapsed}}}{\text{MTBF}} = T_{\text{Elapsed}}\, T_w\, f_c\, f_d\, e^{-t/\tau}$$

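The event-count relation on this slide can be sketched in Python as follows. The function names are hypothetical, and the numbers in the example (τ = 120 ps, T_w = 100 ps, a one-second run) are assumptions chosen only to exercise the formula, not measurements from the tutorial.

```python
import math


def mtbf_seconds(t, tau, t_w, f_c, f_d):
    """MTBF = e^(t/tau) / (T_w * f_c * f_d) for resolution time t."""
    return math.exp(t / tau) / (t_w * f_c * f_d)


def expected_events(elapsed, t, tau, t_w, f_c, f_d):
    """Events = T_elapsed / MTBF = T_elapsed * T_w * f_c * f_d * e^(-t/tau)."""
    return elapsed / mtbf_seconds(t, tau, t_w, f_c, f_d)


# Assumed example values: tau = 120 ps, T_w = 100 ps, 10 MHz clock, 10.01 MHz data
for t in (0.0, 1e-9, 2e-9):
    n = expected_events(1.0, t, 120e-12, 100e-12, 10e6, 10.01e6)
    print(f"t = {t * 1e9:.0f} ns: about {n:.3g} events per second")
```

Because the count falls by a factor of e for every τ of extra resolution time, the logarithm of the count is a straight line in t, which is why the histogram is plotted on a log scale.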
Experimental measurement set-up
• Two asynchronous oscillators are used to drive the data and the clock inputs of a D-type edge triggered Flip-Flop.
• With the slight difference in the frequency of the two oscillators, the clock rising edge may or may not produce a change in the Q output.
• Oscillators should produce constant probability of Data – Clock change with time (but may not: Cantoni 2007)

Altera FPGA measurements
• An Altera FLEX10K70 was used here, manufactured in a 0.45 µm CMOS process.
• The events were collected over a period of 4 hours.
• To calculate the value of τ (resolution time constant), the histogram of the trace density can be plotted in semi-log scale.

Altera FPGA plot
[Figure: Events (log scale, 1 to 1,000,000) vs. Time (0 to 1.6E-09); regions labelled Metastable region, Deterministic, Synchronous]
• The X-axis represents time from a triggering Q output back to the clock edge. Therefore increasing metastability time is shown from right to left.
• Here τ = 120 ps

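Extracting τ from a semi-log histogram amounts to fitting a straight line to ln(events) against time and taking τ = −1/slope. The sketch below does this with an ordinary least-squares fit using only the standard library. The function name is hypothetical, the data are synthetic (generated with τ = 120 ps, not measured), and time is taken as resolution time increasing to the right, unlike the plot above where it increases from right to left.

```python
import math


def fit_tau(times_s, counts):
    """Least-squares fit of ln(count) = a - t / tau over the metastable region.
    Returns tau in seconds. Bins with zero counts are skipped."""
    pts = [(t, math.log(c)) for t, c in zip(times_s, counts) if c > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic histogram: 10 bins, 100 ps apart, decaying with tau = 120 ps
times = [i * 100e-12 for i in range(10)]
counts = [1e6 * math.exp(-t / 120e-12) for t in times]
tau = fit_tau(times, counts)
print(f"fitted tau = {tau * 1e12:.1f} ps")
```

On real data the fit should be restricted to the metastable region of the histogram, since the deterministic region does not follow the exponential.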
Points to consider
• This type of measurement depends on
  – Uniform distribution of clock data overlaps
  – Often not true because the oscillators affect each other (Cantoni 2007)
• Uses an expensive oscilloscope to do the histograms
  – You don’t HAVE to use one. Counters and delays will do
• The theory only applies to simple FFs
  – FFs need to be predesigned, or laid out in a small area

Measurements in a bistable element
[Figure: D-type flip-flop with D, CLK and Q]
• A D-type edge triggered Flip-Flop constructed using NAND gates on the Altera FPGA.
• The master and the slave were placed very close to each other.

Components are close, but not in same cell
• Routing delays play a significant role in this experiment.
• Long metastability times due to the feedback loop delays.

Measurements in a bistable element (cont.)
[Figure: Events (log scale, 1 to 1,000,000) vs. Time (2.45E-08 to 3.15E-08)]
• From the histogram, a damped oscillation in the deterministic region can be observed.
• The value of τ is in the order of 5 nanoseconds, making this particular design unsuitable for any application.
• Circuits with feedback loops passing through LUTs can exhibit oscillation.

Too much delay
[Figure: D-type flip-flop with D, CLK and Q]
• It may not be easy to place elements close to each other
• Extra delay can cause the loop to become unstable

Complex response
• We put an extra gate in the feedback loop of the master FF here
• So the output oscillates, and causes ripples in the histogram
• Time between cycles is about 3 ns, so you get lots of outputs at more than 20 ns
• Demo by Nikolaos Minas

Outline
• What’s the problem? Why does it matter?
• Synchronizer and arbiter circuits
• Noise, and its effects
• Latency, and how to overcome it
• Metastability measurement 1 – Simple measurements
• Arbitration (11:00 – 12:00)
• Metastability measurement 2 – Second order effects (which may matter)

Arbitration
• Complex systems may require that some requests overtake others
• Here three input channels require access to a single output port
• Each request may have a different priority
• Priority can be topologically fixed, or determined by a function
[Figure: data switch with Data Output; lines 0 to 2, each with a line control block; dynamic priority arbiter with priorities P0 to P2, requests r0 to r2 and grants g0 to g2]

Types of arbiter
• Topologically fixed – priorities determined by structure, e.g. daisy-chain
[Figure: daisy chain of stages with requests r1 ... rn, grants g1 ... gn and done signals d1 ... dn; Start; order of polling]
• Static or dynamic priority – determined by fixed hardware, or priority data supplied

Static or dynamic priority
[Figure: requests enter a request lock register; priority logic with priority busses; control and interface; grants]

Metastability and priority
• Lock the request pattern
  – Incoming requests cause Lock to go high
  – Following MUTEX ensures that request wins or loses
• Evaluate priorities with a fixed request pattern
[Figure: MUTEX with Lock and request r as inputs; outputs s, w (wins) and l (loses)]

Static priority arbiter
[Figure: requests R1 to R3 enter a lock register of MUTEX elements (r1 to r3, s1 to s3) controlled by Lock; Priority Module; C-elements produce grants G1 to G3]

More than one request
• Priority needed if requests are competing
• Shared resource free – resolution required only if second request arrives before the lock signal due to first request
• Shared resource busy – further requests may accumulate, and one may be higher priority

Two more requests
[Figure: the same arbiter as above (requests R1 to R3, lock register of MUTEX elements, Priority Module, C-elements, grants G1 to G3) with two further requests arriving]

Outline
• What’s the problem? Why does it matter?
• Synchronizer and arbiter circuits
• Noise, and its effects
• Latency, and how to overcome it
• Metastability measurement 1 – Simple measurements
• Arbitration
• Metastability measurement 2 – Second order effects (which may matter)

Measured Histogram
[Figure: measured histogram vs. time, with the 0.6 mV / 0.1 ps level marked]
• 0.6 mV is about the level of thermal noise on a node in 0.18 µm

Why isn’t it straight?
[Figure: Vout, Volts (1.2 to 2) vs. time, ps (0 to 500); High Start and Low Start trajectories; 1.75 V level marked]
• The starting point makes a difference
• Early events are more affected than late ones

Histogram of events
[Figure: model response; Events (log scale, 0.00001 to 0.1) vs. Output time, ns (0 to 0.8); curves: Low Start, High Start, Slope]
• Probability of an event occurring within 10 ps of a particular output time

Metastability filters
• Affect response
• Inverters usually have a threshold close to the metastability level
[Figure: filter circuit with node levels 0, Vdd/2 and threshold Vt = Vdd/4]

MUTEX with low threshold output
[Figure: MUTEX (R1, R2) with low threshold inverter output; Events (log scale, 0.000001 to 10) vs. time, ns (0 to 1.2)]
• Starts high, needs to go low to give output
• Threshold about 100 mV low

MUTEX with filter
[Figure: MUTEX (R1, R2) with filter output; Events (log scale, 0.000001 to 10) vs. time, ns (0 to 1.2)]
• Needs more than 1 V difference to give output
• Slower, but slope more constant

What we know
• Things we know
  – Synchronizers are unreliable; the more there are, the more unreliable the system
  – How to measure reliability up to a few hours
• Things we know we don’t know
  – What reliability is at 3 years
  – How to measure it
  – Complex circuits give complex results; the simple MTBF formula may not apply
• Things we don’t know we don’t know
  – What happens on the back edge of the clock

74F5074 Histogram
[Figure: histogram with −4 ns and −7 ns marked]
• Slope, τ, is about 120 ps (in fast region)
• Typical delay time (most events) is 4 ns
• 99.9% of clock cycles do not cause useful events
• To get 1 event at 7 ns requires hours

Increasing the number of events
• Test FF is driven to metastability
• Every clock produces a metastable response
• Integrator ensures half outputs high, half low
[Figure: 10 MHz source; variable delay controlled by an integrator, plus a fast 100 ps variation, drives the Test FF followed by a Slave FF]

What you get
• Clock to D (Input) histogram
• Q to Clock (Output) histogram
[Figure: the two histograms, with scales of 200 ps and 3 ns]

Interpreting results
• Input time distribution is not flat
• Mapping output times to input times
[Figure: Events vs input time; Events (0 to 40000) vs. D to Clock, ps (50 to 350); balance point between output 0 and output 1]
[Figure: proportion of total inputs causing events; total input events normalized (0 to 1) vs. Input time, ps (200 to 350)]
[Figure: proportion of total output events vs output time; total output events normalized (0 to 1) vs. Output time, ns (3.50 to 5.50)]

100 ps variation
• ∆t is the time from the “balance point” of ~200 ps
• Similar to original graph BUT ∆t not events
• Much quicker to gather data
• Reliability results days not minutes
• ∆t does not depend on f_c and f_d or measurement time. Events do
  $$\text{MTBF} = \frac{1}{f_c\, f_d\, \Delta t}$$
[Figure: Delta t (log scale, 1.00E-17 to 1.00E-09) vs. Q time, ns (3.00 to 7.00)]

Deep metastability
• Minimum deviation is 7.6 ps
• 100/7.6 = 13 times as many events with small input times (weeks not days)
• They occur every 100 ns, too fast for the scope
• Only 1 in 1000 captured
• Most events still produce early output times
• Filter them out so that the event rate is much slower
• Results years not weeks
[Figure: Test FF output sampled by an Early FF at t1 (early) and a Late FF at t2 (late); scope input and scope trigger]

Results of all methods
[Figure: two plots of Delta t (log scale, 1.0E-20 to 1.0E-09) vs. output time, ns; left: 74F5074 Schottky bipolar, Output time 2.0 to 10.0 ns; right: 74ACT74 CMOS, Q to Clock time 3.0 to 10.0 ns; curves: 100 ps input variation, 7.6 ps noise, Deep metastability]
• Reliability measurements to 10^-20 seconds (MTBF ~ 11 days)
• Done in 3 minutes

Results
  $$\text{MTBF} = \frac{1}{f_c\, f_d\, \Delta t} = \frac{1}{10^{7} \cdot 10^{7} \cdot 10^{-20}} = 11\ \text{days}$$
• We can measure reliabilities of weeks not hours in a few minutes
• To get to 3 years reliability (10^-22 seconds input overlap?) the experiment is run for 5 hours
  – picoseconds 10^-12, femtoseconds 10^-15, attoseconds 10^-18, zeptoseconds 10^-21, yoctoseconds 10^-24
• More than two slopes on one sample: 350 ps, 120 ps and 140 ps
• We can see output events at up to 10 ns

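The arithmetic on this slide is easy to check. The sketch below evaluates MTBF = 1/(f_c · f_d · ∆t) for the slide's values and inverts it to get the input overlap needed for a 3-year MTBF. The function names are hypothetical, and a year is taken as 365 days.

```python
import math

SECONDS_PER_DAY = 86400.0


def mtbf_seconds(f_c, f_d, delta_t):
    """MTBF = 1 / (f_c * f_d * delta_t)."""
    return 1.0 / (f_c * f_d * delta_t)


def delta_t_for_mtbf(f_c, f_d, mtbf_s):
    """Input overlap delta_t that corresponds to a target MTBF."""
    return 1.0 / (f_c * f_d * mtbf_s)


mtbf_days = mtbf_seconds(1e7, 1e7, 1e-20) / SECONDS_PER_DAY
dt_3_years = delta_t_for_mtbf(1e7, 1e7, 3 * 365 * SECONDS_PER_DAY)
print(f"MTBF = {mtbf_days:.1f} days, delta_t for 3 years = {dt_3_years:.2e} s")
```

The first result is about 11.6 days, matching the slide's 11 days, and the second is close to 10^-22 seconds, matching the figure quoted for 3 years.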
When the clock goes low
[Figure: master latch (M) and slave latch (S), driven by Clock and Inverse Clock]
[Figure: Delta t (log scale, 1E-18 to 1E-13) vs. output time, ns (5.0 to 10.0); curves: No Back Edge, 4.5 Back Edge, 5.5 Back Edge]
• Clock goes high, master goes metastable
• Master output arrives at slave
  – Before slave clock high: transparent, gate delay t_d
  – As slave clock goes high: metastable, slightly longer delay
• Back edge of clock causes increased delay

Effect of clock low on 74F5074
[Figure: Input time (log scale, 1.00E-21 to 1.00E-11) vs. Output time (5.00E-09 to 1.20E-08); curves: 6 ns pulse, 5 ns pulse, 4 ns pulse, No back edge]
• 1–3 ns additional delay

On-chip metastability measurement
• Analog delay replaced by digital delay (VDL)
• Analog integrator replaced by counter
[Figure: original set-up with 100 MHz source, variable delay, integrator, Test FF and Slave FF; on-chip version with 100 MHz source, VDLs, Up/Down Counter, Test FF and Slave FF]

Variable delay stage
• Pair of current starved inverters
• Source current i variable in steps
• Delay changes can be as low as 0.1 ps
[Figure: inverter pair between Vdd and Gnd with current sources i; In and Out]

On-chip implementation
• Controlling circuit using standard cells based design
• Devices under test using full custom design
[Figure: layout of on-chip measurement circuit]

Devices Under Test
[Figure: Jamb Latch]

Devices Under Test
[Figure: Robust Synchronizer]
