
Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC



  1. BWCCA 2010, Fukuoka, Japan, November 4-6, 2010
     Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC
     Akram Ben Ahmed, Abderazek Ben Abdallah, Kenichi Kuroda
     The University of Aizu, School of Computer Science and Engineering, Adaptive Systems Laboratory, Aizu-Wakamatsu, Japan.
     Email: m5141153@u-aizu.ac.jp

  2. Outline
     • Introduction
     • 2D-OASIS-NoC Overview
     • Minimal Hop Routing Algorithm
     • 3D-OASIS-NoC Architecture
     • Design Results
     • Conclusion

  3. Introduction
     • Communication has become an essential part of current Systems-on-Chip (SoCs).
     • Networks-on-Chip (NoCs) overcome the problems of bus-based systems.
     • NoC features:
       – Simple and scalable architecture.
       – Connects processors, memories and other custom designs together.
       – Switches packets instead of switching wires.

  4. 2D-OASIS-NoC overview
     • 4x4 mesh topology
     • Wormhole switching
     • Stall-and-Go flow control
     • 76 bit flit
     [Figure: switch with NORTH, EAST, SOUTH and WEST ports and FIFO buffers]
     K. Mori, A. Ben Abdallah, K. Kuroda, Design and Evaluation of a Complexity Effective Network-on-Chip Architecture on FPGA, Proc. of The 19th Intelligent System Symposium (FAN 2009), pp.318-321, Sep. 2009.

  5. 2D-OASIS-NoC pipeline stages
     • XY routing
     • Bidirectional links
     • 4x4 mesh topology
     • 76 bit flit
     • 5-port switch
     [Figure: pipeline timing over cycles 1 to 10; a flit goes through the RC, SA and CT stages at each of nodes 03, 13 and 12]

  6. 2D-OASIS-NoC drawbacks
     • The advantages of a 2D-NoC become limited, and 3D-NoCs have shown better performance:
       – A 3D-NoC decreases the number of hops, which affects both latency and throughput.
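The hop reduction can be illustrated with a minimal sketch that compares the two 16-node meshes used later in the slides (4x4 and 2x2x4). It assumes minimal routing, so the hop count between two nodes is their Manhattan distance; the function name `mesh_hop_stats` is hypothetical, and the mean is taken over all ordered node pairs, including a node paired with itself.

```python
from itertools import product


def mesh_hop_stats(dims):
    """Return (max_hops, mean_hops) for a mesh with the given dimensions.

    Assumes minimal routing, i.e. hops = Manhattan distance between nodes.
    The mean covers all ordered (source, destination) pairs, self-pairs included.
    """
    nodes = list(product(*(range(d) for d in dims)))
    hops = [
        sum(abs(a - b) for a, b in zip(src, dst))
        for src in nodes
        for dst in nodes
    ]
    return max(hops), sum(hops) / len(hops)


if __name__ == "__main__":
    for name, dims in (("4x4 mesh", (4, 4)), ("2x2x4 mesh", (2, 2, 4))):
        worst, mean = mesh_hop_stats(dims)
        print(f"{name}: max hops = {worst}, mean hops = {mean:.2f}")
```

Under these assumptions the longest path shrinks from 6 hops in the 4x4 mesh to 5 hops in the 2x2x4 mesh, and the mean hop count drops from 2.5 to 2.25.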

  7. Contribution
     • An efficient routing algorithm named the minimal hop routing algorithm (MHRA).
     • A 3D architecture, its design and preliminary results.
     Goal: reduce overall traffic latency by minimizing the number of hops.

  8. Minimal hop routing algorithm
     [Flowchart: Start Route]
     • If xadr != xdst: Next_port = EAST when xadr < xdst, otherwise Next_port = WEST.
     • Else if yadr != ydst: Next_port = NORTH when yadr < ydst, otherwise Next_port = SOUTH.
     • Else if zadr != zdst: Next_port = UP when zadr < zdst, otherwise Next_port = DOWN.
     • Else: Next_port = LOCAL.
     Next_port is then passed to the switch allocator.
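The flowchart can be read as a dimension-ordered decision: x first, then y, then z. The following is a minimal software sketch of that reading, not the authors' Verilog; the function name `next_port` and the tuple representation of addresses are assumptions made for illustration.

```python
def next_port(cur, dst):
    """Pick the output port for one routing step.

    cur and dst are (x, y, z) address tuples (a hypothetical representation).
    The x address is compared first, then y, then z, as in the flowchart.
    """
    (xadr, yadr, zadr), (xdst, ydst, zdst) = cur, dst
    if xadr != xdst:
        return "EAST" if xadr < xdst else "WEST"
    if yadr != ydst:
        return "NORTH" if yadr < ydst else "SOUTH"
    if zadr != zdst:
        return "UP" if zadr < zdst else "DOWN"
    return "LOCAL"  # all three addresses match: deliver to the local node


if __name__ == "__main__":
    # Destination used in the slide example: xdst=001, ydst=001, zdst=011.
    print(next_port((0, 0, 0), (1, 1, 3)))
```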

  9. Minimal hop routing algorithm
     [Figure: 2x2x4 mesh topology with node addresses from 000 to 311; input port architecture with the Route module, receiving from the previous node and the switch allocator and forwarding to the next node]
     • Current node addresses: 000 000 000. Destination node addresses: 001 001 011.
     • Packet format: 1 | destination node addresses 001 001 011 | payload 000….1 | next port 0000100
     • At node 000: xaddr = 000 < xdst = 001, so Next_port = EAST (0000100) and the packet moves to node 001.

  10. Minimal hop routing algorithm
      • At node 001: xaddr = 001 = xdst = 001 and yaddr = 000 < ydst = 001, so Next_port = NORTH (0000010) and the packet moves to node 011.
      • Packet: 1 | 001 001 011 | 000….1 | 0000010

  11. Minimal hop routing algorithm
      • At node 011: xaddr = 001 = xdst = 001, yaddr = 001 = ydst = 001 and zaddr = 000 < zdst = 011, so Next_port = UP (0100000) and the packet moves to node 111.
      • Packet: 1 | 001 001 011 | 000….1 | 0100000

  12. Minimal hop routing algorithm
      • At the destination: xaddr = 001 = xdst = 001, yaddr = 001 = ydst = 001 and zaddr = 011 = zdst = 011, so Next_port = LOCAL (0000001).
      • Packet: 1 | 001 001 011 | 000….1 | 0000001
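The walk-through on slides 9 to 12 can be replayed hop by hop. The sketch below assumes that EAST, NORTH and UP each increase the x, y and z address by one (which is what the comparisons in the flowchart imply) and represents nodes as hypothetical (x, y, z) tuples. Only the four one-hot port codes that appear on the slides are listed; the codes for WEST, SOUTH and DOWN are not shown there and are left out.

```python
# Port pairs per dimension, in the order the flowchart tests them: x, y, z.
PORTS = (("EAST", "WEST"), ("NORTH", "SOUTH"), ("UP", "DOWN"))

# One-hot next-port codes that appear on the slides (others are not shown).
KNOWN_CODES = {
    "LOCAL": "0000001",
    "NORTH": "0000010",
    "EAST": "0000100",
    "UP": "0100000",
}


def trace(src, dst):
    """Return the (node, next_port) decision made at every node from src to dst."""
    node, steps = list(src), []
    while True:
        for dim, (plus, minus) in enumerate(PORTS):
            if node[dim] != dst[dim]:
                step = 1 if node[dim] < dst[dim] else -1
                steps.append((tuple(node), plus if step > 0 else minus))
                node[dim] += step  # move one hop along this dimension
                break
        else:
            # No dimension differs any more: the packet has arrived.
            steps.append((tuple(node), "LOCAL"))
            return steps


if __name__ == "__main__":
    for node, port in trace((0, 0, 0), (1, 1, 3)):
        print(node, port, KNOWN_CODES.get(port, "code not shown on slides"))
```

For the slide example the trace contains one EAST hop, one NORTH hop and three UP hops before the LOCAL decision; the slides show only the first UP decision and the final LOCAL one.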

  13. 3D-OASIS-NoC architecture: Switch architecture
      [Figure: switch with a local PE port and NORTH (N), EAST (E), SOUTH (S), WEST (W), UP (U) and DOWN (D) ports]

  14. 3D-OASIS-NoC architecture: Switch allocation
      [Figure: switch allocator with a Stall-Go flow control block and a round-robin scheduling block]
      Signals, with the widths shown on the slide: stop-in (7), data-sent (7), sw-req (7), port-req (49), tail-sent (49), grant-out (7), sw-cntrl (49)
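As an illustration of round-robin scheduling among the 7 ports of the 3D switch, here is a generic software sketch. It is not the authors' arbiter: the module list on slide 16 names a `Matrix_arb.v`, so the hardware may be organized differently. The class name `RoundRobinArbiter` and its interface are hypothetical.

```python
class RoundRobinArbiter:
    """Generic round-robin arbiter: the port granted last gets lowest priority next."""

    def __init__(self, n_ports=7):
        self.n_ports = n_ports
        # Start as if the last port had just been served, so port 0 goes first.
        self.last = n_ports - 1

    def grant(self, requests):
        """requests: one boolean per port. Return the granted port index, or None."""
        for offset in range(1, self.n_ports + 1):
            idx = (self.last + offset) % self.n_ports
            if requests[idx]:
                self.last = idx
                return idx
        return None  # nobody is requesting


if __name__ == "__main__":
    arb = RoundRobinArbiter(7)
    requests = [True, False, True, False, False, False, True]
    # With the same ports requesting every cycle, grants rotate among them.
    print([arb.grant(requests) for _ in range(4)])
```

The rotation keeps any single input port from starving the others while several ports compete for the same output.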

  15. 3D-OASIS-NoC architecture: Crossbar traversal
      [Figure: crossbar receiving data from the input ports and control from the switch allocator, and forwarding to the next node]

  16. Design results: Design methodology
      • Verilog HDL is used.
      • Quartus II
      • Target device: Stratix III
      • Modelsim

      | Module       | # code lines |
      |--------------|--------------|
      | Define.v     | 46           |
      | Route.v      | 80           |
      | Fifo.v       | 100          |
      | Input_port.v | 113          |
      | Stop_go.v    | 56           |
      | Matrix_arb.v | 111          |
      | Sw_alloc.v   | 109          |
      | Mux_out.v    | 55           |
      | Crossbar.v   | 45           |
      | Router.v     | 69           |
      | Network.v    | 158          |
      | Total        | 942          |

  17. Design results: Configuration parameters

      | Parameters   | 2D          | 3D          |
      |--------------|-------------|-------------|
      | Network size | 4x4 mesh    | 2x2x4 mesh  |
      | Buffer depth | 4           | 4           |
      | Flit size    | 28 bit      | 33 bit      |
      | Header       | 12 bit      | 17 bit      |
      | Payload      | 16 bit      | 16 bit      |
      | Switching    | Wormhole    | Wormhole    |
      | Flow control | Stall-Go    | Stall-Go    |
      | Scheduling   | Round-robin | Round-robin |
      | Routing      | X-Y         | MHRA        |
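The 33 bit 3D flit can be checked against the packet example on slide 9 with a small packing sketch. The field layout is an assumption inferred from that example: a 1 bit leading flag (its meaning is not stated on the slides), three 3 bit destination addresses, a 16 bit payload and a 7 bit one-hot next-port field, which gives a 17 bit header plus a 16 bit payload. All names below are hypothetical, and the payload value is arbitrary because the slides elide it.

```python
# Assumed field order and widths, most significant field first.
FIELDS = (
    ("flag", 1),        # leading bit of the slide example; meaning not stated
    ("xdst", 3),
    ("ydst", 3),
    ("zdst", 3),
    ("payload", 16),
    ("next_port", 7),   # one-hot, one bit per switch port
)
FLIT_BITS = sum(width for _, width in FIELDS)  # 1 + 9 + 16 + 7 = 33


def pack_flit(**values):
    """Pack named field values into a single integer flit."""
    flit = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        flit = (flit << width) | value
    return flit


def unpack_flit(flit):
    """Split an integer flit back into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = flit & ((1 << width) - 1)
        flit >>= width
    return fields


if __name__ == "__main__":
    # Destination 001 001 011 and next port EAST = 0000100, as on slide 9.
    flit = pack_flit(flag=1, xdst=0b001, ydst=0b001, zdst=0b011,
                     payload=1, next_port=0b0000100)
    print(format(flit, f"0{FLIT_BITS}b"))
    print(unpack_flit(flit))
```

The same accounting would fit the 2D case if the 12 bit header held the flag, two 3 bit addresses and a 5 bit next-port field, but that split is also an assumption.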

  18. Design results: Delay Analysis
      • Flit payloads are randomly generated.
      • A single destination node is used: 00 for OASIS-NoC and 000 for 3D-OASIS-NoC.

      | 2D node (Y-X) | 2D delay | 3D node (Z-Y-X) | 3D delay | Improvement (%) |
      |---------------|----------|-----------------|----------|-----------------|
      | 33            | 2200     | 311             | 1900     | 13.6            |
      | 23            | 2700     | 211             | 2100     | 22.2            |
      | 13            | 2600     | 111             | 1900     | 27              |
      | 03            | 2300     | 011             | 1700     | 26              |

      Overall: 22% improvement.
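The improvement column and the overall figure can be reproduced from the delay values in the table. This is a small arithmetic check, assuming the improvement is the relative delay reduction (2D delay minus 3D delay, divided by the 2D delay) and that the overall figure is the plain mean of the four rows; the names are hypothetical.

```python
# (2D source node, 2D delay, 3D source node, 3D delay), as listed in the table.
DELAYS = [
    ("33", 2200, "311", 1900),
    ("23", 2700, "211", 2100),
    ("13", 2600, "111", 1900),
    ("03", 2300, "011", 1700),
]


def improvement_pct(delay_2d, delay_3d):
    """Relative delay reduction of the 3D design, in percent."""
    return 100.0 * (delay_2d - delay_3d) / delay_2d


if __name__ == "__main__":
    gains = [improvement_pct(d2, d3) for _, d2, _, d3 in DELAYS]
    for (n2, _, n3, _), gain in zip(DELAYS, gains):
        print(f"{n2} vs {n3}: {gain:.1f}%")
    print(f"mean improvement: {sum(gains) / len(gains):.1f}%")
```

The per-row results match the table after rounding, and their mean is about 22%.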

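  19. Design results: Hardware Complexity

      | Architecture | Area (ALUTs) | Power (mW) | Speed (MHz), Balance | Speed (MHz), Speed | Speed (MHz), Area |
      |--------------|--------------|------------|----------------------|--------------------|-------------------|
      | 2D           | 11016        | 867.97     | 123.3                | 126.23             | 106.18            |
      | 3D           | 16812        | 883.10     | 113.51               | 114.23             | 97.68             |

      3D compared with 2D: area increased by 52%, power overhead of 1.74%, speed decreased by 8.5%.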
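The three summary percentages follow from the table values. The sketch below recomputes them, assuming the speed figure is the mean relative drop across the three speed columns; the helper name `pct_change` is hypothetical.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent (negative means a decrease)."""
    return 100.0 * (new - old) / old


# Values copied from the table: 2D row first, 3D row second.
AREA_ALUTS = (11016, 16812)
POWER_MW = (867.97, 883.10)
SPEED_MHZ = ((123.3, 113.51), (126.23, 114.23), (106.18, 97.68))

area_increase = pct_change(*AREA_ALUTS)
power_overhead = pct_change(*POWER_MW)
speed_change = sum(pct_change(old, new) for old, new in SPEED_MHZ) / len(SPEED_MHZ)

if __name__ == "__main__":
    print(f"area:  {area_increase:+.1f}%")
    print(f"power: {power_overhead:+.2f}%")
    print(f"speed: {speed_change:+.1f}%")
```

The area result comes out slightly above 52%, so the slide figure appears to be truncated rather than rounded.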

  20. Conclusion
      • Combining 3D integration with Networks-on-Chip offers a good opportunity for large multi-core SoC designs.
      • We presented a hardware design for the 3D OASIS Network-on-Chip.
      • 3D-OASIS-NoC achieves about 22% overall delay reduction compared with OASIS-NoC, with only 1.74% power overhead and 52% additional area.

  21. Future work
      • Test the design with larger workloads (such as a JPEG application).
      • Reduce the complexity of the routing algorithm.

  22. Thank you
