On the Stability and Robustness of Non-Synchronous Circuits with Timing Loops
Matthias Függer, Gottfried Fuchs, Ulrich Schmid and Andreas Steininger
Vienna University of Technology, Embedded Computing Systems Group
{fuegger, fuchs, schmid, steininger}@ecs.tuwien.ac.at

The main messages
• sketch DARTS clocking scheme for SoC
• study systematic timing variations in asynchronous circuits
• influence of k-of-n voting (fault tolerance) on
  − tractability of analysis
  − stability
  − robustness

Clocking in SoCs
synchronous SoC
• global synchrony (< 1 tick)
  ☺ very powerful abstraction, efficient metastability-free communication
• single point of failure
• nanoscale requires fault tolerance for clocks as well [Seifert et al.]

Clocking in SoCs
• synchronous SoC: global synchrony (< 1 tick); single point of failure
• DARTS: global synchrony (> 1 tick); no single point of failure
• GALS: NO (inherent) global synchrony; no single point of failure

DARTS Principle & Implementation
Distributed Algorithms for Robust Tick Synchronization

(1) Initially:
(2)   send tick(0) to all; clock := 0;
(3) If received tick(m) from at least f+1 remote nodes and m > clock:
(4)   send tick(clock+1), …, tick(m) to all; clock := m;
(5) If received tick(m) from at least 2f+1 remote nodes and m >= clock:
(6)   send tick(m+1) to all; clock := m+1;

− asynchronous HW implementation
− two concurrent rules
− k-of-n thresholds for fault tolerance

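The two rules above can be mimicked in software. The following is a minimal sketch only, assuming reliable, fault-free message delivery and per-tick bookkeeping of senders; the names `DartsNode`, `receive` and `run` are hypothetical and do not describe the asynchronous hardware implementation.

```python
from collections import defaultdict, deque


class DartsNode:
    """Software model of one tick-generation node (hypothetical helper)."""

    def __init__(self, f):
        self.f = f                     # assumed bound on faulty nodes
        self.clock = 0                 # rule (2): clock := 0, tick(0) is sent by the driver
        self.seen = defaultdict(set)   # tick number m -> remote senders of tick(m)

    def receive(self, sender, m):
        """Record tick(m) from a remote node; return the ticks to broadcast."""
        self.seen[m].add(sender)
        out = []
        fired = True
        while fired:                   # re-evaluate both rules until none fires
            fired = False
            for k in sorted(self.seen):
                count = len(self.seen[k])
                # rules (3)-(4): catch up once f+1 remote nodes sent tick(k)
                if count >= self.f + 1 and k > self.clock:
                    out.extend(range(self.clock + 1, k + 1))
                    self.clock = k
                    fired = True
                # rules (5)-(6): advance once 2f+1 remote nodes sent tick(k)
                if count >= 2 * self.f + 1 and k >= self.clock:
                    out.append(k + 1)
                    self.clock = k + 1
                    fired = True
        return out


def run(n=5, f=1, max_tick=4):
    """Fault-free run with FIFO delivery; ticks above max_tick are not sent."""
    nodes = [DartsNode(f) for _ in range(n)]
    queue = deque((src, 0) for src in range(n))   # every node sends tick(0)
    while queue:
        src, m = queue.popleft()
        for dst in range(n):
            if dst == src:
                continue               # the rules count remote nodes only
            for t in nodes[dst].receive(src, m):
                if t <= max_tick:
                    queue.append((dst, t))
    return [node.clock for node in nodes]
```

In this fault-free setting with n = 5 and f = 1, every node's clock ends at `max_tick + 1`, because each node eventually collects tick(`max_tick`) from at least 2f+1 remote nodes.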
Properties of the DARTS Clock
• precision of a few clock cycles & bounded accuracy
  − can be guaranteed by formal proof [EDCC06, PODC09]
  − on condition of some (weak) routing constraints
• a system of 3f+2 nodes can tolerate f Byzantine faults (nodes and interconnect)
  − guaranteed by the same formal proof
? stability of clock frequency
  − important for many applications
  − BUT: adaptive systems cannot be completely stable
? robustness of clock frequency
  − important for nanotechnology
  − variations (tolerances, environment, membership) affect frequency

An Interesting Observation
Setup:
- 5 node DARTS system
- pronounced wire delays
[Figure: two plots of round times [s] (scale ×10^-9, range 7 to 9.5) over tick number (0 to 10)]
• permanent "oscillation" of round time
• systematic, not random effect!
• strong dependence on wire delays

A Closer Look…
• theoretical model
  − min/max/+ algebra difference equation
• simplify problem
  − simplify algorithm
  − simplify model topology
  − wait-for-all instead of k-of-n
  − 3 nodes only
  − no concurrency (one rule)
• can use max/+ algebra difference equation
• theory from
  − nonlinear control
  − game theory
  − asynchronous logic

An Example
[Figure: delay graph with edge weights, and plot of round times [s] over tick number k]
• longest paths of length k ending in node P determine P's round times

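The longest-path view can be illustrated with a max/+ recurrence. This is a sketch under stated assumptions: each node fires as soon as the ticks of all its predecessors have arrived (wait-for-all), and the 3-node ring `RING` with delays 1, 2 and 6 is a made-up example, not the graph on the slide.

```python
NEG_INF = float("-inf")   # marks "no link" in the delay matrix

# Hypothetical ring 0 -> 1 -> 2 -> 0; RING[j][i] is the delay from node j to node i.
RING = [
    [NEG_INF, 1, NEG_INF],
    [NEG_INF, NEG_INF, 2],
    [6, NEG_INF, NEG_INF],
]


def maxplus_step(x, delay):
    """One max/+ step: node i fires when its latest input tick has arrived."""
    n = len(x)
    return [max(x[j] + delay[j][i] for j in range(n)) for i in range(n)]


def round_times(delay, x0, ticks):
    """Per-node differences between consecutive tick times."""
    x = list(x0)
    history = []
    for _ in range(ticks):
        nxt = maxplus_step(x, delay)
        history.append([b - a for a, b in zip(x, nxt)])
        x = nxt
    return history


for k, rounds in enumerate(round_times(RING, [0, 0, 0], 4), start=1):
    print(k, rounds)
```

Starting from aligned initial ticks, the round times of each node in this example keep cycling through 6, 2, 1 instead of settling at the mean value 3, which is the kind of systematic oscillation discussed above.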
Another Example
[Figure: delay graph with edge weights, and plot of round times [s] over tick number k]
characteristics:
− length of initial transient phase: depends on delays & initial phase alignment
− mean rate ("cycle vector") during periodic phase: determined by cycle with maximum mean cycle weight
− peak-to-peak variation (including/excluding transient phase)

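The maximum mean cycle weight can be computed directly from the delay graph. The sketch below uses Karp's algorithm (in its maximum form), which is a standard technique swapped in here for illustration and not necessarily what the authors used; it assumes every node is reachable from node 0, and the function name `max_cycle_mean` is hypothetical.

```python
NEG_INF = float("-inf")   # marks "no link" in the delay matrix


def max_cycle_mean(delay):
    """Karp-style maximum cycle mean; delay[u][v] is the weight of edge u -> v."""
    n = len(delay)
    # D[k][v]: maximum weight of a walk with exactly k edges from node 0 to v
    D = [[NEG_INF] * n for _ in range(n + 1)]
    D[0][0] = 0
    for k in range(1, n + 1):
        for v in range(n):
            D[k][v] = max(D[k - 1][u] + delay[u][v] for u in range(n))
    best = NEG_INF
    for v in range(n):
        if D[n][v] == NEG_INF:
            continue   # no walk of length n ends in v
        candidates = [
            (D[n][v] - D[k][v]) / (n - k)
            for k in range(n)
            if D[k][v] != NEG_INF
        ]
        best = max(best, min(candidates))
    return best


# Hypothetical 3-node ring with delays 1, 2 and 6: its only cycle has mean 9 / 3.
ring = [
    [NEG_INF, 1, NEG_INF],
    [NEG_INF, NEG_INF, 2],
    [6, NEG_INF, NEG_INF],
]
print(max_cycle_mean(ring))
```

For a wait-for-all system modelled by such a graph, this value is the mean round time during the periodic phase; the transient length and the peak-to-peak variation are not captured by it.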
Is this Relevant for HW Designers?
• timing oscillations in all asynchronous architectures
  − with reasonable complexity
  − under some conditions that may not always apply
  − usually not an issue in asynchronous designs
  − still should be known and considered
  − theory largely available (max/+ algebra)
• the specific problems when fault tolerance is required
  − concurrent execution of two (or more) rules
  − use of k-of-n thresholds instead of wait-for-all
  − same principle but requires complex min/max/+ algebra

Our Current Status wrt Stability
• identified appropriate formalism
• developed simulation environment
• derived conditions for oscillation for simplified case (max/+)
• applying "Duality Conjecture" to explore complex case (min/max/+)
  ≈ stabilizing min (cycle vectors) …

Our Current Status wrt Robustness
• good robustness against delay variations
• wait-for-all causes saturation (masks "faster than slowest")
• k-of-n causes 2nd saturation ("also masks slower than k-th")
[Figure: panels labeled "wait for all" and DARTS (thresholds f+1 and 2f+1) for different settings]

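The two saturation effects can be shown with a small numeric sketch. It assumes a node fires at the latest input arrival under wait-for-all and at the k-th earliest arrival under a k-of-n threshold; the function names and the arrival times are made up for illustration.

```python
def fire_time_wait_for_all(arrivals):
    """Wait-for-all: the node fires when the slowest input has arrived."""
    return max(arrivals)


def fire_time_k_of_n(arrivals, k):
    """k-of-n threshold: the node fires when the k-th earliest input has arrived."""
    return sorted(arrivals)[k - 1]


# Hypothetical arrival times at a node with n = 5 inputs.
arrivals = [1.0, 2.0, 3.0, 4.0, 100.0]
faster = [0.1, 2.0, 3.0, 4.0, 100.0]    # fastest input gets faster
slower = [1.0, 2.0, 3.0, 4.0, 500.0]    # slowest input gets slower

# Wait-for-all masks changes of inputs that are faster than the slowest one.
print(fire_time_wait_for_all(arrivals), fire_time_wait_for_all(faster))

# 3-of-5 additionally masks changes of inputs slower than the 3rd arrival.
print(fire_time_k_of_n(arrivals, 3), fire_time_k_of_n(slower, 3))
```

In this sketch the wait-for-all firing time still follows the slowest input (100.0 versus 500.0), while the 3-of-5 firing time stays at 3.0 in all three cases.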
Still a Long Way to Go
• need efficient algorithms for min/max/+ algebra to characterize
  − mean rate
  − maximum swing
  − transient phase length
• explore more complicated cases
• consider real-world effects
  − noise and jitter
  − rise/fall asymmetry, …

Conclusion
• nanoelectronics needs
  − adaptive timing
  − fault tolerance
  − robustness
• round times in asynchronous loops can show systematic variance
• characterization possible for wait-for-all architectures (max/+ algebra)
• FT solutions need k-of-n architectures
  − this severely complicates the analysis (min/max/+ algebra)
  − k-of-n improves asynchronous designs' robustness

Thank you!

The DARTS Architecture
• modules FU i augmented with simple local clock unit (TG alg)
• TG algs communicate over dedicated bus (TG network) to generate local clocks
• need 3f+1 modules to tolerate f arbitrary clock faults
[Figure: FU 1, FU 2 and FU 3 on a data bus, their TG algs connected by the TG network; labels: synchronous solution (clock tree), distributed clock]

What we want…
[Figure: FU 1 with TG alg p and FU 2 with TG alg q; timelines of p and q showing tick(3) and tick(4)]

The DARTS clock-generation node
[Figure: clock inputs from the clock modules, threshold function, counter and tick generation, output]