
Automated Generation of Round-robin Arbitration and Crossbar - PowerPoint PPT Presentation



  1. Automated Generation of Round-robin Arbitration and Crossbar Switch Logic
     Eung S. Shin
     Advisor: Professor Vincent J. Mooney III
     School of Electrical and Computer Engineering, Georgia Institute of Technology

  2. Overview
     [Figure: Multiprocessor System-on-a-Chip (SoC). Two GPPs, two DSPs, a peripheral, and a custom logic core connect through an on-chip network (crossbar (Xbar) and arbiter, round-robin arbiter) to four memory modules.]
     (slide footer date: 2/12/2004)

  3. Arbiter Problems
     - A fast and powerful arbiter for an SoC
     - A fast arbiter for terabit switching speeds
     - A tedious and error-prone task
     [Figure: Network Switch (16x16). Input ports 0 to 16 hold VOQ(0,0) ... VOQ(16,16) and feed a (16x16)x16 crossbar switch fabric driving output ports 0 to 16. 16 (16x16 arbiter)s take req(0, 0) ... req(16, 16) and produce grant(0, 0-16) ... grant(16, 0-16).]

  4. Xbar Problems
     - Multiple communication channels are demanded in a multiprocessor SoC
     - Challenge: reducing the productivity gap
     - Productivity gap reduction techniques:
       - Enhancing IP core reusability
       - Developing a CAD tool

  5. Objective
     - To design a fast round-robin arbiter for a bus or a network switch and to automate the generation of its logic
       - The generated arbiter is employed in the crossbar (Xbar) switch arbitration logic
     - To automate Xbar generation, providing multiple communication paths among masters
       - The generated Xbar is customized according to user specifications

  6. Outline
     - Terminology
     - Origin and history of problems:
       - Arbiter design: PPE and PPA
       - Crossbar switch design: “Smart” Memory
     - Arbiter design
     - Arbiter experiments
     - RAG: Round-robin Arbiter Generator
     - X-Gt: Xbar Generator
     - Xbar experiments
     - Conclusion

  7. Terminology
     - MxN Switch: M-input by N-output switch
       - Example: a 32x32 switch is a 32-input by 32-output switch with 1024 (32^2) possible connections between input ports and output ports
     - Virtual Output Queues (VOQs): to remove possible output port contention (Head of Line (HOL) blocking)
     - VOQ(m, n): m is the input port index; n is the output port index
       - Example: VOQ(1, 0)
     [Figure: Network Switch (32x32). Input ports 0 to 31 hold VOQ(0,0) ... VOQ(31,31) and feed a (32x32)x32 crossbar switch fabric driving output ports 0 to 31. 32 (32x32 arbiter)s take req(0, 0) ... req(31, 31) and produce grant(0, 0-31) ... grant(31, 0-31).]

  8. Terminology (Continued)
     - (MxV)xN Switch:
       - M: the number of input ports of an MxN switch
       - V: the number of VOQs per input port
       - N: the number of output ports of an MxN switch
       - Typically, V = N
       - The total number of VOQs in an MxN switch: M ∗ N
     [Figure: Network Switch (32x32), same diagram as on the previous slide.]
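
The VOQ indexing and the M ∗ N count above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `voq_labels` is hypothetical and not part of the presentation, and it assumes the typical case V = N stated on the slide.

```python
def voq_labels(m_ports, n_ports):
    """Enumerate VOQ(m, n) index pairs for an MxN switch.

    Assumes V = N, i.e. each input port m keeps one queue per output port n.
    """
    return [(m, n) for m in range(m_ports) for n in range(n_ports)]


M, N = 32, 32
voqs = voq_labels(M, N)
total_voqs = len(voqs)  # equals M * N

print(total_voqs)
```

For the 32x32 example, the count matches the 1024 (32^2) possible connections quoted in the terminology.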

  9. Terminology (Continued)
     - (MxV)xN crossbar switch fabric:
       - Connections between (MxV) inputs and N outputs
     - MxM Switch Arbiter (SA):
       - Controlling M specific transmission gates between M VOQs and a particular output port
       - N MxM SAs in an MxN switch
     [Figure: (32x32)x32 Crossbar Switch Fabric. VOQ(0, 0) ... VOQ(31, 0) connect to output port 0 and VOQ(0, 31) ... VOQ(31, 31) connect to output port 31. Thirty-two 32x32 SAs (SA_0 ... SA_31) drive grant(0, 0) ... grant(31, 31).]
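
As a rough illustration of "N MxM SAs in an MxN switch", the sketch below regroups VOQ occupancy so that each output port's SA sees one request bit per input port. The function name `requests_per_sa` and the boolean `occupancy` matrix are assumptions made for this example, not names from the presentation.

```python
def requests_per_sa(occupancy):
    """Group requests by output port.

    occupancy[m][n] is True when VOQ(m, n) holds data for output port n.
    Returns one request vector per output port n; entry m of that vector
    is the request from input port m, i.e. what SA_n would arbitrate over.
    """
    m_ports = len(occupancy)
    n_ports = len(occupancy[0])
    return [[occupancy[m][n] for m in range(m_ports)] for n in range(n_ports)]


# Hypothetical 2-input, 3-output switch used only for illustration.
occupancy = [
    [True, False, True],   # input port 0: VOQ(0,0) and VOQ(0,2) non-empty
    [False, False, True],  # input port 1: VOQ(1,2) non-empty
]
per_sa = requests_per_sa(occupancy)
```

Here the third vector has two competing requests, which is the case an SA has to resolve.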

  10. Terminology (Continued)
     - MxM distributed SA (MxM hierarchical SA):
       - Equivalent to an MxM SA
       - Consisting of smaller switch arbiters in the form of a hierarchical tree structure
     - Bus Arbiter (BA): resolving bus conflicts
     [Figure: 8x8 hierarchical SA. A 2x2 root SA (req0, req1) sits above two 4x4 SAs (SA 0 and SA 1), each with req0[0..3]/grant0[0..3] or req1[0..3]/grant1[0..3], an ack-req signal, and a D counter; ack0[0], ack0[1], and clock are also shown.]
     [Figure: 4x4 BA. A ring counter (D-FFs, clock, reset, ack) produces token[0..3]; Priority Logic 0 to 3 blocks (each with EN, in[0..3], output[0..3]) take req[0..3] and produce grant[0..3].]
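
The following behavioral sketch shows one common way to model a round-robin arbiter and a two-level hierarchical arrangement like the 8x8 example (a 2x2 root over two 4x4 arbiters). It is a generic software model under stated assumptions, not the generated hardware: the class names are hypothetical, the rotating priority pointer stands in for the ring-counter token, and the pointer-update rule (advance past the granted requester) is an assumption rather than something stated on the slide.

```python
class RoundRobinArbiter:
    """Generic round-robin arbiter model (assumed behavior, not the RAG netlist)."""

    def __init__(self, n):
        self.n = n
        self.token = 0  # index that currently has the highest priority

    def arbitrate(self, req):
        """Return the granted index, or None when nobody requests."""
        for offset in range(self.n):
            idx = (self.token + offset) % self.n
            if req[idx]:
                # Assumed update rule: the winner becomes lowest priority next time.
                self.token = (idx + 1) % self.n
                return idx
        return None


class HierarchicalArbiter:
    """Tree of small arbiters: a root picks a group, a leaf picks within it."""

    def __init__(self, groups, group_size):
        self.group_size = group_size
        self.root = RoundRobinArbiter(groups)
        self.leaves = [RoundRobinArbiter(group_size) for _ in range(groups)]

    def arbitrate(self, req):
        gs = self.group_size
        chunks = [req[i * gs:(i + 1) * gs] for i in range(len(self.leaves))]
        # A group requests at the root if any of its members requests.
        group = self.root.arbitrate([any(c) for c in chunks])
        if group is None:
            return None
        return group * gs + self.leaves[group].arbitrate(chunks[group])


flat = RoundRobinArbiter(4)
flat_grants = [flat.arbitrate([True] * 4) for _ in range(5)]

tree = HierarchicalArbiter(groups=2, group_size=4)  # 8x8 from a 2x2 root and two 4x4 leaves
tree_grants = [tree.arbitrate([True] * 8) for _ in range(8)]
```

With every requester active, the flat model grants each of the four requesters in turn, and the tree model grants each of the eight requesters exactly once over eight rounds, although in a different order than a flat 8x8 arbiter would.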

  11. Requirements for a Terabit Switch Arbiter
     - Starvation free
     - Fast arbitration
     - Simple to implement
     - Low power:
       - Power budget of a single-rack router ~ 10 kW

  12. Outline
     - Terminology
     - Origin and history of problems:
       - Arbiter design: PPE and PPA
       - Crossbar switch design: “Smart” Memory
     - Arbiter design
     - Arbiter experiments
     - RAG: Round-robin Arbiter Generator
     - X-Gt: Xbar Generator
     - Xbar experiments
     - Conclusion

  13. History: Arbiter in PPE
     - Centralized Switch Arbiters:
       - Programmable Priority Encoder (PPE) implementing the iterative round-robin algorithm (iSLIP)
       - P. Gupta and N. McKeown, “Designing and Implementing a Fast Crossbar Scheduler,” IEEE Micro, 1999, pp. 20-28.
       - N. McKeown, P. Varaiya, and J. Walrand, “The iSLIP Scheduling Algorithm for Input-Queued Switches,” IEEE/ACM Transactions on Networking, 1999, pp. 188-201.
     [Figure: PPE block diagram. Signals: Req (n bits), P_enc (log2 n bits), tothermo, P_thermo (n bits), new_Req (n bits), Priority Encoder, Priority Encoder_thermo, Gnt_PE, Gnt_PE_thermo, any_Gnt_PE_thermo, Gnt (n bits).]
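
A programmable priority encoder of this kind can be modeled behaviorally as below. This is a hedged sketch: the variable names reuse the signal labels from the figure, but how those signals are wired together is inferred from the labels, and the thermometer-code direction (bit i set for every i at or above the pointer) is an assumption. Grants are returned as indices rather than one-hot vectors to keep the example short.

```python
def priority_encode(bits):
    """Fixed-priority encoder: index of the lowest-numbered set bit, or None."""
    for i, bit in enumerate(bits):
        if bit:
            return i
    return None


def to_thermo(p_enc, n):
    """Thermometer code for pointer p_enc (assumed: bit i is set when i >= p_enc)."""
    return [i >= p_enc for i in range(n)]


def ppe(req, p_enc):
    """Grant the first requester at or after p_enc, wrapping around if needed."""
    n = len(req)
    p_thermo = to_thermo(p_enc, n)
    # Mask off requesters below the pointer.
    new_req = [r and t for r, t in zip(req, p_thermo)]
    gnt_pe_thermo = priority_encode(new_req)  # winner among masked requests
    gnt_pe = priority_encode(req)             # winner among all requests (wrap-around)
    any_gnt_pe_thermo = gnt_pe_thermo is not None
    return gnt_pe_thermo if any_gnt_pe_thermo else gnt_pe


gnt_a = ppe([True, False, True, False], p_enc=1)   # requester 2 is first at or after 1
gnt_b = ppe([True, False, False, False], p_enc=1)  # nothing at or after 1, wraps to 0
```

The design choice worth noting is that both encoders are plain fixed-priority encoders; the rotation comes entirely from masking the request vector with the thermometer-coded pointer and falling back to the unmasked result.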

  14. History: Arbiter in PPA
     - Distributed Switch Arbiter:
       - Ping Pong Arbiter (PPA)
       - H. J. Chao, C. H. Lam, and X. Guo, “A Fast Arbitration Scheme for Terabit Packet Switches,” Proceedings of IEEE Global Telecommunications Conference, 1999, pp. 1236-1243.
     - Comparison: our generated SA is 2.3X faster than PPE and 1.8X faster than PPA
     [Figure: PPA tree over inputs 1 to 16 with layers 1 to 4 (leaf PPA, intermediate PPA, root PPA), and a 2x2 PPA cell with requests r0, r1, grants g0, g1, external grant signals Gg0, Gg1, flags Fi, Fo, and a clocked D flip-flop.]

  15. Why do we need an arbiter for an SoC?
     - Arbitration is required by all buses
     - Our arbiter is applicable anywhere arbitration is required
     - The generated arbiter is utilized in our Xbar
