Asynchronous system design flow based on Petri nets

SLIDE 1

Asynchronous system design flow based on Petri nets

Microelectronics System Design Research Group
School of Electrical, Electronic and Computer Engineering
University of Newcastle upon Tyne
DATE-2005

Asynchronous system design flow based on Petri nets – p.1/40

SLIDE 2

Outline

Motivation
BESST (BEhavioural Synthesis of Self-Timed Systems) design flow
Splitting of control and data paths
Synthesis of data path
Adding security features to data path
Direct mapping of control path from Labelled PNs
Direct mapping of low-latency control path from STGs
Logic synthesis of control path
Summary

SLIDE 3

Motivation: Asynchronous circuits

Asynchronous circuits address International Technology Roadmap for Semiconductors (ITRS) challenges

ITRS-2003: “As it becomes impossible to move signals across large die within one clock cycle or in a power-efficient manner, or to run control and dataflow processes at the same clock rate, the likely result is a shift to asynchronous design style.”

Modularity and scalability (productivity and reuse)
Low noise and electromagnetic emission (security)
Robustness to parametric variations
No clock skew

SLIDE 4

Motivation: Petri nets

Well-developed theory
Powerful modelling language
Simple to understand
Can be hidden from the designer (intermediate system representation in our design flow)

[Figure: example Petri net with places p0 to p8 and transitions a to i.]
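
The firing rule that makes Petri nets simple to understand can be sketched in a few lines of Python. The four-transition net below is a hypothetical example chosen for illustration (it is not the net drawn on the slide): a transition is enabled when every input place holds a token, and firing it moves tokens from input places to output places.

```python
from typing import Dict, List, Tuple

Marking = Dict[str, int]
# transition name -> (input places, output places); hypothetical example net
Net = Dict[str, Tuple[List[str], List[str]]]

NET: Net = {
    "a": (["p0"], ["p1", "p2"]),   # fork: one token in, two tokens out
    "b": (["p1"], ["p3"]),
    "c": (["p2"], ["p4"]),
    "d": (["p3", "p4"], ["p0"]),   # join: waits for both branches
}


def enabled(net: Net, marking: Marking) -> List[str]:
    """Return the transitions whose input places all hold a token."""
    return sorted(
        t for t, (ins, _) in net.items()
        if all(marking.get(p, 0) > 0 for p in ins)
    )


def fire(net: Net, marking: Marking, t: str) -> Marking:
    """Fire transition t and return the new marking (input is not modified)."""
    ins, outs = net[t]
    if any(marking.get(p, 0) == 0 for p in ins):
        raise ValueError(f"transition {t} is not enabled")
    new = dict(marking)
    for p in ins:
        new[p] -= 1
    for p in outs:
        new[p] = new.get(p, 0) + 1
    return new


m0: Marking = {"p0": 1}
m1 = fire(NET, m0, "a")            # b and c become concurrently enabled
m2 = fire(NET, fire(NET, m1, "b"), "c")
m3 = fire(NET, m2, "d")            # the token returns to p0
```

After `a` fires, `b` and `c` can fire in either order, which is how a Petri net expresses concurrency without enumerating interleavings.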

SLIDE 6

Asynchronous design flows

Syntax-driven translation (Tangram, Balsa)
  Computationally simple
  Local peephole optimisation
  Adopted by industry (Philips’ incubator company Handshake Solutions)
  Slow control circuit
Logic synthesis (PipeFitter, TAST, MOODs, CASH)
  Separate synthesis and optimisation of control and data paths
  Adopted by industry (Theseus Logic (NCL))
  Computationally hard for explicit logic synthesis

SLIDE 7

BESST Design flow

[Figure: BESST design flow diagram.
Stage labels: System Spec., Control/Data Splitting, Control Path Spec., Data Path Spec., Control Unit Synthesis (Direct Map., Logic Synt.), Datapath Synthesis (Coloured Petri Net Mapping), Control Path Impl., Data Path Impl., Control/Data Merging, System Implementation, Testbench Derivation, Functional Simulation, System Layout, Timing Extraction, Timing Simulation.
Representation labels: Behavioural Verilog, Labelled Petri Net, Coloured Petri Net, Verilog Netlist, Testbench, Placement & Routing Information, System Timing.
Tool labels: VeriSyn, PN2DCs, OptiMist, ConfRes, VeriSAT, VeriMap, conventional EDA tools; one step is marked "manual".
Input labels: RTL Library, Function Implementation.]

SLIDE 8

Greatest Common Divisor (GCD) spec.

    module gcd(x, y, z);
      input [7:0] x, y;
      output reg [7:0] z;
      reg [7:0] x_reg, y_reg;
      always
      begin
        // latch the operands
        x_reg = x; y_reg = y;
        // subtract the smaller value from the larger until both are equal
        while (x_reg != y_reg)
        begin
          if (x_reg < y_reg)
            y_reg = y_reg - x_reg;
          else
            x_reg = x_reg - y_reg;
        end
        z <= x_reg;
      end
    endmodule
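
As a cross-check of the specification, the same subtract-until-equal loop can be traced in software. The Python sketch below is only a behavioural model; the function name `gcd_trace` and the step labels are assumptions made for this illustration, and the positivity guard is added because the loop would not terminate for a zero operand.

```python
from typing import List, Tuple


def gcd_trace(x: int, y: int) -> Tuple[int, List[Tuple[str, int, int]]]:
    """Mirror the always-block body and record every subtraction step."""
    if x <= 0 or y <= 0:
        raise ValueError("operands must be positive")
    x_reg, y_reg = x, y
    steps: List[Tuple[str, int, int]] = []
    while x_reg != y_reg:
        if x_reg < y_reg:
            y_reg = y_reg - x_reg
            steps.append(("lt", x_reg, y_reg))   # y_reg was updated
        else:
            x_reg = x_reg - y_reg
            steps.append(("gt", x_reg, y_reg))   # x_reg was updated
    return x_reg, steps


z, steps = gcd_trace(12, 16)
```

For x=12 and y=16 the loop performs one "less than" subtraction followed by two "greater than" subtractions before the registers become equal.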

SLIDE 9

Splitting of control and data paths

[Figure: BESST design flow diagram, repeated from the "BESST Design flow" slide.]

SLIDE 10

Global PN for GCD

Initial PN for GCD module

[Figure: initial Petri net with a single transition labelled "always", shown next to the body of the always block from the GCD specification.]

SLIDE 11

Global PN for GCD

Always-statement refinement

[Figure: Petri net after refining the always statement; labels: x_reg=x, y_reg=y, while, z<=x_reg. Shown next to the always-block body.]

SLIDE 12

Global PN for GCD

While-statement refinement

[Figure: Petri net after refining the while statement; labels: x_reg=x, y_reg=y, if, z<=x_reg, with conditions x_reg==y_reg and x_reg!=y_reg. Shown next to the always-block body.]

SLIDE 13

Global PN for GCD

If-statement refinement

[Figure: Petri net after refining the if statement; labels: x_reg=x, y_reg=y, the comparison "x ? y" with outcomes x_reg>y_reg, x_reg==y_reg and x_reg<y_reg, the subtractions x_reg=x_reg−y_reg and y_reg=y_reg−x_reg, and z<=x_reg. Shown next to the always-block body.]

SLIDE 14

Global PN for GCD

Assignment-operation refinement

[Figure: global Petri net after refining the assignment operations. Node labels: x, y, cmp, gt, eq, lt, z, sub_gt, sub_lt, store_x, store_y, dum0, dum1, p0, px1, px2, py1, py2, pgt1, pgt2, plt1, plt2, peq1, cmp1, cmp2. Shown next to the always-block body.]

SLIDE 15

Syntax-driven translation

[Figure: circuit for GCD obtained by syntax-driven translation. Recoverable labels: x, y, z, guard, do, activate, a[0..7], b[0..7], aux:a, aux:b.]

SLIDE 16

Labelled PN for GCD control path

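[Figure: labelled Petri net for the GCD control path. Node labels: x, y, cmp, gt, eq, lt, z, sub_gt, sub_lt, store_x, store_y, dum0, dum1, p0, px1, px2, py1, py2, pgt1, pgt2, plt1, plt2, peq1, cmp1, cmp2. Signal labels: x_req, x_ack, y_req, y_ack, cmp_req, gt_ack, eq_ack, lt_ack, sub_gt_req, sub_lt_req, z_req, z_ack.]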

SLIDE 17

Coloured PN for GCD data path

[Figure: coloured Petri net for the GCD data path. Labels: REG_x, REG_y, MUX_x_0, MUX_x_1, MUX_y_0, MUX_y_1, CMP_xy, SUB_gt, SUB_lt, x, y, x_cur, y_cur, x_mux, y_mux, x_store, y_store, x−y, y−x, x>y, x=y, x<y, cmp, x_req, y_req, cmp_req, sub_gt_req, sub_lt_req, x_ack, y_ack, gt_ack, eq_ack, lt_ack.]

SLIDE 18

Control-data path interface

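[Figure: block diagram of the control path and the data path with the signals between them.]
Data ports: x, y, z
Request signals: x_req, y_req, cmp_req, sub_gt_req, sub_lt_req, z_req
Acknowledgement signals: x_ack, y_ack, gt_ack, eq_ack, lt_ack, z_ack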

SLIDE 19

Synthesis of data path

[Figure: BESST design flow diagram, repeated from the "BESST Design flow" slide.]

SLIDE 20

Coloured PN for GCD

[Figure: coloured Petri net for GCD, with the same labels as on the "Coloured PN for GCD data path" slide.]

SLIDE 21

Mapping Coloured PN into circuit

[Figure: table pairing each Coloured PN fragment with its circuit for four component types: Operation (OP), Comparator (CMP), Register (REG) and Multiplexer (MUX). Port labels: in, in1, in2, out, req, req1, req2, ack, eq_ack, lt_ack, gt_ack.]

SLIDE 22

GCD data path schematic

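[Figure: GCD data path schematic with two registers (REG), two multiplexers (MUX), two subtractors (SUB) and one comparator (CMP). Labels: x, y, z, x−y, y−x, "y ? x", x_mux, y_mux, x_store, y_store, x_req, y_req, cmp_req, sub_gt_req, sub_lt_req, x_ack, y_ack, gt_ack, eq_ack, lt_ack.]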

SLIDE 23

Dual-Rail Protocols

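Single spacer dual-rail protocol

Code words for "0" and "1": 01 and 10
All-zeroes spacer: 00
[Figure: code diagram and a C-element (C) example with signals a_0, a_1, a_go, b_0, b_1, b_go, c_0, c_1, c_done; waveform labels: sp, 1, sp.]

Alternating spacer dual-rail protocol

Code words for "0" and "1": 01 and 10
All-zeroes spacer: 00
All-ones spacer: 11
[Figure: code diagram and a C-element (C) example with signals a_0, a_1, a_go, b_0, b_1, b_go, c_0, c_1, c_done; waveform labels: sp0, 1, sp1, sp0.]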

SLIDE 24

Energy imbalance and exposure time

Single spacer protocol in 2-input dual-rail AND gate

[Figure: sequence all-zeroes spacer, code word, all-zeroes spacer, annotated with energy imbalance and exposure time.]

Alternating spacer protocol in 2-input dual-rail AND gate

[Figure: sequence all-zeroes spacer, code word, all-ones spacer, annotated with energy imbalance and exposure time.]
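
The difference between the two protocols can be illustrated by counting how often each rail of one dual-rail signal switches. The Python sketch below is an illustration under stated assumptions, not the measurement shown on the slide: the rail ordering (rail_1, rail_0), the function names and the use of switch counts as a rough proxy for data-dependent energy are all choices made for this example.

```python
from typing import List, Sequence, Tuple

Rails = Tuple[int, int]  # (rail_1, rail_0); this ordering is an assumption

SP0: Rails = (0, 0)  # all-zeroes spacer
SP1: Rails = (1, 1)  # all-ones spacer


def code_word(bit: int) -> Rails:
    """Exactly one rail is high in a code word."""
    return (1, 0) if bit else (0, 1)


def single_spacer(bits: Sequence[int]) -> List[Rails]:
    """Every code word is followed by the all-zeroes spacer."""
    seq = [SP0]
    for bit in bits:
        seq += [code_word(bit), SP0]
    return seq


def alternating_spacer(bits: Sequence[int]) -> List[Rails]:
    """Code words are separated by all-ones and all-zeroes spacers in turn."""
    seq = [SP0]
    for i, bit in enumerate(bits):
        seq += [code_word(bit), SP1 if i % 2 == 0 else SP0]
    return seq


def rail_switches(seq: Sequence[Rails]) -> Tuple[int, int]:
    """Count value changes on rail_1 and rail_0 along the sequence."""
    pairs = list(zip(seq, seq[1:]))
    r1 = sum(a[0] != b[0] for a, b in pairs)
    r0 = sum(a[1] != b[1] for a, b in pairs)
    return r1, r0


data = [1, 1, 1, 0]
single = rail_switches(single_spacer(data))
alternating = rail_switches(alternating_spacer(data))
```

With a single spacer only the rail that carries the value switches in each cycle, so the counts depend on the data. With alternating spacers each rail switches exactly once per cycle regardless of the data, which is the balancing property the alternating protocol aims for.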

SLIDE 25

Direct mapping of control path

[Figure: BESST design flow diagram, repeated from the "BESST Design flow" slide.]

SLIDE 26

Direct mapping of control path

Two approaches to direct mapping of the control path (implemented in the PN2DCs and OptiMist tools)

Tool               PN2DCs        OptiMist
Specification      Labelled PN   Signal Transition Graph (STG)
Abstraction level  Abstract      Detailed
Advantages         Smaller size  Lower latency

SLIDE 27

Labelled PN for GCD control path

Labelled PN for GCD control unit

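[Figure: labelled Petri net for the GCD control unit before optimisation. Node labels: x, y, cmp, gt, eq, lt, z, sub_gt, sub_lt, store_x, store_y, dum0, dum1, p0, px1, px2, py1, py2, pgt1, pgt2, plt1, plt2, peq1, cmp1, cmp2. Signal labels: x_req, x_ack, y_req, y_ack, cmp_req, gt_ack, eq_ack, lt_ack, sub_gt_req, sub_lt_req, z_req, z_ack.]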

SLIDE 28

Labelled PN for GCD control path

Optimised Labelled PN for GCD control unit

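[Figure: optimised labelled Petri net for the GCD control unit. Node labels: pin, pout, cmp, gt, eq, lt, sub_gt, sub_lt, store_x, store_y, dum0, dum1, pgt1, pgt2, plt1, plt2, cmp1, cmp2. Signal labels: x_req, x_ack, y_req, y_ack, cmp_req, gt_ack, eq_ack, lt_ack, sub_gt_req, sub_lt_req, z_req, z_ack.]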

SLIDE 29

Mapping of Labelled PN places into DCs

David Cell (DC): state-holding element for one token

[Figure: three views of a David cell: gate-level DC, its STG, and transistor-level DC (GasP section). Signal labels: s, r, a, r1, a1.]

Mapping of a Labelled PN place into a DC

[Figure: a place cur with neighbours pred1, pred2, succ1 and succ2 mapped into a DC. Labels: request logic (C; pred1_req, pred2_req, cur_req), acknowledgement logic (OR; succ1_ack, succ2_ack, cur_ack).]
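
The schematic labels on this slide include C-elements (marked C). As background, the standard behaviour of a Muller C-element can be sketched in Python: the output rises when all inputs are high, falls when all inputs are low, and otherwise holds its previous value. The class below is a generic behavioural model written for this illustration; it does not reproduce the David cell circuit itself.

```python
class CElement:
    """Behavioural model of a Muller C-element with any number of inputs."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial

    def update(self, *inputs: int) -> int:
        if not inputs:
            raise ValueError("a C-element needs at least one input")
        if all(inputs):
            self.out = 1          # all inputs high: output rises
        elif not any(inputs):
            self.out = 0          # all inputs low: output falls
        # mixed inputs: the previous output value is held
        return self.out


c = CElement()
trace = [c.update(1, 0), c.update(1, 1), c.update(0, 1), c.update(0, 0)]
```

The hold behaviour on mixed inputs is what lets a C-element wait for several request signals before passing the request on.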

SLIDE 30

GCD control path schematic

[Figure: GCD control path schematic built from David cells (DC), C-elements (C) and OR gates. Place labels: pin, pout, cmp1, cmp2, pgt1, pgt2, plt1, plt2. Signal labels: x_req, x_ack, y_req, y_ack, cmp_req, gt_ack, eq_ack, lt_ack, sub_gt_req, sub_lt_req, z_req, z_ack.]

SLIDE 31

Deriving STG for GCD control path

GCD global net

[Figure: global GCD Petri net. Node labels: x, y, cmp, gt, eq, lt, z, sub_gt, sub_lt, store_x, store_y, dum0, dum1, p0, px1, px2, py1, py2, pgt1, pgt2, plt1, plt2, peq1, cmp1, cmp2.]

Control unit STG (with explicit places)

[Figure: STG for the GCD control unit. Signal transitions: x_req+, x_req−, x_ack+ and x_ack− (instances /1 and /2), y_req+, y_req−, y_ack+ and y_ack− (instances /1 and /2), cmp_req+, cmp_req− (instances /1, /2 and /3), gt_ack+, gt_ack−, eq_ack+, eq_ack−, lt_ack+, lt_ack−, sub_gt_req+, sub_gt_req−, sub_lt_req+, sub_lt_req−, z_req+, z_ack+, z_req−, z_ack−. Place labels: px1 to px5, py1 to py5, pgt1 to pgt6, plt1 to plt6, peq1 to peq6, cmp1, cmp2. Other labels: dum1.]
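
The signal names in the STG, such as z_req+, z_ack+, z_req−, z_ack−, suggest a four-phase (return-to-zero) handshake on each request/acknowledge pair. Assuming that reading, the expected event order for one channel and a simple trace check can be sketched in Python. The function names are hypothetical, ASCII "+" and "-" stand in for the slide's signs, and the "/1", "/2" instance suffixes are omitted.

```python
from typing import List


def four_phase(channel: str) -> List[str]:
    """Event order of one four-phase handshake on the given channel."""
    return [
        f"{channel}_req+",   # request raised
        f"{channel}_ack+",   # acknowledged
        f"{channel}_req-",   # request released
        f"{channel}_ack-",   # acknowledge released (return to zero)
    ]


def follows_protocol(trace: List[str], channel: str) -> bool:
    """Check that the channel's events in a trace occur in four-phase order."""
    expected = four_phase(channel)
    # project the trace onto the events of this channel only
    events = [e for e in trace if e in expected]
    return all(e == expected[i % 4] for i, e in enumerate(events))


good = ["z_req+", "x_req+", "z_ack+", "x_ack+", "z_req-", "z_ack-"]
bad = ["z_req+", "z_req-", "z_ack+", "z_ack-"]
```

Events of other channels may interleave freely, so the check only looks at the projection of the trace onto one channel.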

SLIDE 32

Device-environment splitting

[Figure: control unit STG after device-environment splitting. Acknowledgement transitions appear in parentheses, e.g. (x_ack+/1), alongside condition labels such as x_ack=0 and x_ack=1; request transitions such as x_req+ and cmp_req−/1 appear without parentheses.]

SLIDE 33

Output exposure

[Figure: STG after output exposure, divided into TRACKER and BOUNCER parts. Each output (x_req, y_req, cmp_req, sub_gt_req, sub_lt_req, z_req) has labels of the form x_req+, x_req−, x_req=0, x_req=1; elsewhere the signal transitions appear in parentheses, e.g. (x_req+) and (x_ack+/1).]

SLIDE 34

Optimisation

[Figure: optimised TRACKER and BOUNCER parts. Remaining place labels: cmp1, px2, px4, py2, py4, pgt1, pgt4, pgt6, plt1, plt4, plt6, peq1, peq4, peq6x, peq6y; dum1 is also kept.]

SLIDE 35

Mapping into David cells and flip-flops

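Mapping tracker places into David cells

[Figure: a place cur with neighbouring places p1, p2, p3 and s1, s2, s3 mapped into a DC. Labels: request logic (C, AND, OR; p1_req, p2_req, p3_req, cur_req), acknowledgement logic (AND, OR; s1_ack, s2_ack, s3_ack, cur_ack); a to e also appear as labels.]

Mapping elementary cycles into flip-flops

[Figure: elementary cycle of signal a (a+, a−/1, a−/2 between a=0 and a=1) mapped into a flip-flop (FF) with a set function and a reset function built from AND and OR gates over s1, s2 and r1 to r4.]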

SLIDE 36

Low-latency GCD control circuit

[Figure: low-latency GCD control circuit built from David cells (DC), flip-flops (FF), AND and OR gates and a C-element. One FF is labelled for each output: cmp_req, x_req, y_req, sub_gt_req, sub_lt_req, z_req. DC labels: cmp1, px2, px4, py2, py4, pgt1, pgt4, pgt6, plt1, plt4, plt6, peq1, peq4, peq6x, peq6y.]

SLIDE 37

Logic synthesis of control unit

[Figure: BESST design flow diagram, repeated from the "BESST Design flow" slide.]

SLIDE 38

GCD control unit STG

GCD global net

[Figure: global GCD Petri net. Node labels: x, y, cmp, gt, eq, lt, z, sub_gt, sub_lt, store_x, store_y, dum0, dum1, p0, px1, px2, py1, py2, pgt1, pgt2, plt1, plt2, peq1, cmp1, cmp2.]

Control unit STG (with implicit places)

[Figure: STG for the GCD control unit with the same signal transitions as the explicit-place version; the only remaining non-signal labels are p0, cmp1, cmp2 and dum1.]

SLIDE 39

Reachability graph and CSC

< x_req, y_req, x_ack, y_ack, cmp_req, gt_ack, eq_ack, lt_ack, sub_gt_req, sub_lt_req, z_req, z_ack >

[Figure: reachability graph of the control unit STG. Each state is labelled with a 12-bit code in the signal order given above, and each arc with a signal transition such as x_req+ or gt_ack−. The legend marks the initial state and the conflicting states.]

Complete State Coding (CSC)
Unique State Coding (USC)
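
The two coding properties can be checked mechanically on a state graph. In the usual definitions, USC is violated when two distinct reachable states share the same binary code, and CSC is violated when two such states also enable different sets of output signals. The Python sketch below applies these definitions to a small hypothetical state table; the state names, codes and the `coding_conflicts` helper are assumptions for illustration and are not taken from the GCD reachability graph.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

# state name -> (binary code, set of enabled output signals)
States = Dict[str, Tuple[str, FrozenSet[str]]]
Pair = Tuple[str, str]


def coding_conflicts(states: States) -> Tuple[List[Pair], List[Pair]]:
    """Return (USC conflicts, CSC conflicts) as lists of state pairs."""
    by_code: Dict[str, List[str]] = defaultdict(list)
    for name, (code, _) in states.items():
        by_code[code].append(name)

    usc: List[Pair] = []
    csc: List[Pair] = []
    for names in by_code.values():
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                a, b = names[i], names[j]
                usc.append((a, b))              # same code, different states
                if states[a][1] != states[b][1]:
                    csc.append((a, b))          # outputs also behave differently
    return usc, csc


# Hypothetical example: s0 and s2 share a code but enable different outputs.
EXAMPLE: States = {
    "s0": ("00", frozenset({"req"})),
    "s1": ("10", frozenset()),
    "s2": ("00", frozenset()),
    "s3": ("10", frozenset()),
}
usc_pairs, csc_pairs = coding_conflicts(EXAMPLE)
```

Only CSC conflicts prevent logic synthesis, because the circuit cannot tell from the signal values alone which output to fire; they are typically resolved by inserting extra internal signals.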

SLIDE 40

Automatic CSC solution

[Figure: control unit STG after automatic CSC resolution, with the inserted signals csc0, csc1, csc2, csc3 and csc4 and the place labels p0 to p5, p7, cmp1 and cmp2.]

SLIDE 41

Semi-automatic CSC solution

[Figure: control unit STG before any CSC signal is inserted.]

SLIDE 42

Semi-automatic CSC solution

[Figure: the STG with Core_1, Core_2 and Core_3 marked and the signal csc_x (csc_x+, csc_x−) inserted.]

SLIDE 43

Semi-automatic CSC solution

[Figure: the STG with csc_x, csc_y and csc_lt inserted and Core_4 and Core_5 marked.]

SLIDE 44

Semi-automatic CSC solution

[Figure: the STG with csc_x, csc_y, csc_lt and csc_gt inserted and Core_6 marked.]

SLIDE 45

Result of semi-automatic CSC solution

[Figure: resulting STG with the inserted signals csc_x, csc_y, csc_eq, csc_gt and csc_lt.]

SLIDE 46

Complex gate impl. of GCD controller

[Figure: complex-gate implementation of the GCD controller. Labelled signals: x_req, y_req, z_req, cmp_req, sub_gt_req, sub_lt_req, csc_x, csc_y, csc_eq, csc_gt, csc_lt, x_ack, y_ack, gt_ack, eq_ack, lt_ack, z_ack, together with primed forms such as x_ack’ and csc_x’.]

SLIDE 47

Comparison: Control path

Method                                        Circuit size  Estimated latency  Computation time
                                              (trans.)      (neg. gates)       (s)
Direct mapping from Labelled PNs              55            4                  <1
Direct mapping from STGs                      174           3                  <1
Logic synthesis, automatic CSC solution       116           9                  18
Logic synthesis, semi-automatic CSC solution  120           5                  2

SLIDE 48

Comparison: GCD circuit

Tool         Area (µm²)  Speed (ns)  Speed (ns)    Computation
                         x=y         x=12, y=16    time (s)
Balsa        119,647     21          188           <1
PN2DCs       100,489     14          109           <1
Improvement  16%         33%         42%
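
The improvement row is consistent with the relative reduction (Balsa − PN2DCs) / Balsa applied to each column. A short Python check of that arithmetic, using only the figures from the table (the helper name `improvement` is an assumption for this sketch):

```python
def improvement(baseline: float, new: float) -> float:
    """Relative reduction of `new` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - new) / baseline


area = improvement(119_647, 100_489)      # area in square micrometres
speed_equal = improvement(21, 14)         # x=y, in ns
speed_12_16 = improvement(188, 109)       # x=12, y=16, in ns
rounded = [round(area), round(speed_equal), round(speed_12_16)]
```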

SLIDE 49

Conclusions

Coherent asynchronous circuit design flow
  Initial spec is in behavioural Verilog form
  Petri nets serve as the intermediate circuit representation
  Interface to conventional EDA tools for place-and-route and simulation
Control path synthesis allows a trade-off between
  Circuit size
  Output latency
  Computation time
Data path synthesis
  Direct mapping from Coloured PNs
  Optional security features
