

A Novel Flit Serialization Strategy to Utilize Partially Faulty Links in Networks-on-Chip. Changlin Chen*, Ye Lu†, Sorin D. Cotofana*. †ECIT, QUB; *Computer Engineering, TU Delft. {c.chen-2, S.D.Cotofana}@tudelft.nl, ylu10@qub.ac.uk. NOCS 2012


  1. A Novel Flit Serialization Strategy to Utilize Partially Faulty Links in Networks-on-Chip
     Changlin Chen*, Ye Lu†, Sorin D. Cotofana*
     †ECIT, QUB; *Computer Engineering, TU Delft
     {c.chen-2, S.D.Cotofana}@tudelft.nl, ylu10@qub.ac.uk
     NOCS 2012

  2. Outline
     • Motivation
     • Related Works
     • Flit Serialization
     • Evaluation Results
     • Conclusion

  3. Motivation
     • NoC: Routers + Links + Network Interfaces
     • Links are prone to
       – Manufacturing defects
       – Chip wear-out effects
       – Process parameter variations
     • Faulty links can be isolated by a fault-tolerant routing algorithm
     • The remaining bandwidth of a partially faulty link is then wasted
     • Partially faulty links should be utilized

  4. Related works
     • Use pre-fabricated spare wires to replace faulty wires
       – Grecu et al.
       – Lehtonen et al.
     • Simple flit quad splitting (SFQS)
       – *Palesi et al.
       – Lehtonen et al.
     • Packet rebuilding/restoring
       – Yu et al.
     • Partially faulty link recovery mechanism (PFLRM)
       – †Vitkovskiy et al.
     [Figures: flit-splitting illustration (marked *) and PFLRM illustration (marked †), showing flits a, b, c split into sections 3..0]

  5. The remaining link bandwidth should be used more efficiently

  6. Flit Serialization
     • Proposed link fault-tolerant architecture
     • Links are diagnosed periodically, and the fault vector of each link is sent to the control logic at the TX and RX sides

  7. [Figure only; no slide text recovered]

  8. Flit Transmission Process
     [Figure: TX and RX datapaths of a partially faulty link with a timing diagram.
     TX-side labels: fault_vector, flit_serialize_ctrl, sel, flit_type, update_0, update_1, wait, mux, cyclic_reg_TX, link_reg_TX, data_from_crossbar, data_on_link.
     RX-side labels: fault_vector, flit_deserialize_ctrl, data_acceptable, flit_type, mux, cyclic_reg_RX, link_reg_RX, data_on_link, data_to_input_buffer, high_reg_state, low_reg_state, flit_1_recovered, flit_2_recovered.
     Timing: flits a, b, c, d, each split into sections 3..0; TX clock cycles clk1 to clk5, RX clock cycles clk2 to clk6.]

  9. Flit Transmission Process [Figure: next animation step of the diagram on slide 8]

  10. Flit Transmission Process [Figure: next animation step of the diagram on slide 8]

  11. Flit Transmission Process [Figure: next animation step of the diagram on slide 8]

  12. Flit Transmission Process [Figure: final animation step of the diagram on slide 8]
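The transmission process on slides 8 to 12 survives here only as figure labels, so the following Python sketch is an illustration under stated assumptions, not the authors' RTL. It assumes that each flit is split into equal link sections, that `fault_vector` marks faulty sections with `True`, and that sections of consecutive flits are packed back to back onto the fault-free sections. The function names `serialize` and `deserialize` are hypothetical.

```python
from typing import List, Optional, Sequence

Section = str  # one link-section-wide slice of a flit, e.g. "a3"


def serialize(flits: Sequence[Sequence[Section]],
              fault_vector: Sequence[bool]) -> List[List[Optional[Section]]]:
    """Pack flit sections onto the fault-free link sections, one word per cycle.

    Assumption: sections of consecutive flits are streamed back to back.
    """
    good = [i for i, faulty in enumerate(fault_vector) if not faulty]
    if not good:
        raise ValueError("link has no fault-free section")
    # Flatten all flits into one stream of sections (continuous transmission).
    stream = [section for flit in flits for section in flit]
    cycles: List[List[Optional[Section]]] = []
    for start in range(0, len(stream), len(good)):
        word: List[Optional[Section]] = [None] * len(fault_vector)
        for wire, section in zip(good, stream[start:start + len(good)]):
            word[wire] = section  # faulty sections stay unused (None)
        cycles.append(word)
    return cycles


def deserialize(cycles: Sequence[Sequence[Optional[Section]]],
                fault_vector: Sequence[bool],
                sections_per_flit: int) -> List[List[Section]]:
    """Rebuild flits at the RX side using the same fault vector."""
    good = [i for i, faulty in enumerate(fault_vector) if not faulty]
    stream = [word[w] for word in cycles for w in good if word[w] is not None]
    return [stream[i:i + sections_per_flit]
            for i in range(0, len(stream), sections_per_flit)]


flits = [[f"{name}{i}" for i in (3, 2, 1, 0)] for name in "abc"]
fault_vector = [False, True, False, False]  # section 1 of 4 is faulty
cycles = serialize(flits, fault_vector)
recovered = deserialize(cycles, fault_vector, sections_per_flit=4)
```

With one of four sections faulty, the three flits occupy four link cycles instead of three, which matches the 33.3% overhead that Table 1 lists for the proposed scheme with 4 sections and 1 fault.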

  13. Flit Serialization: Link Latency
      • Wire fault probability: $P = \binom{n}{k} p_e^{k} (1 - p_e)^{n-k} \approx \binom{n}{k} p_e^{k}$
        – Even distribution
        – Cluster faults: faults may stay in one link section
      • Link latency
        – $\mathrm{Latency}_{\mathrm{proposed}} = \left\lceil \dfrac{\mathit{section\_number} \times \mathit{flit\_number}}{\mathit{fault\_free\_section\_number}} \right\rceil$
        – $\mathrm{Latency}_{\mathrm{PFLRM}} = (\mathit{cluster\_size} + 1) \times \mathit{flit\_number}$
        – $\mathrm{Latency}_{\mathrm{SFQS}} = \dfrac{\mathit{section\_number} \times \mathit{flit\_number}}{\mathit{available\_section\_number}}$

      Table 1. Link latency overheads when flits are transmitted continuously

      | Fault number | Proposed, 4 sections | Proposed, 8 sections | PFLRM, best situation | PFLRM, worst situation | SFQS |
      |---|---|---|---|---|---|
      | 0 | 0% | 0% | 0% | 0% | 0% |
      | 1 | 33.3% | 14.3% | 100% | 100% | 100% |
      | 2 | 100% | 33.3% | 100% | 200% | 100% |
      | 3 | 300% | 60.0% | 100% | 300% | 300% |
      | 4 | -- | 100% | 100% | 400% | -- |
      | 5 | -- | 167% | 100% | 500% | -- |
      | 6 | -- | 300% | 100% | 600% | -- |
      | 7 | -- | 700% | 100% | 700% | -- |
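The overheads in Table 1 can be reproduced from the latency expressions on this slide. The Python sketch below assumes that the proposed scheme needs ceil(section_number × flit_number / fault_free_section_number) link cycles, that PFLRM needs (cluster_size + 1) × flit_number cycles, and that the PFLRM best and worst situations correspond to a cluster size of 1 and a cluster size equal to the fault number. That last mapping is inferred from the table values, not stated on the slide. The function names and the choice `FLIT_NUMBER = 840` (divisible by every fault-free section count, so the ceiling has no effect) are illustrative.

```python
from math import ceil
from typing import Optional

FLIT_NUMBER = 840  # divisible by 1..8, so the ceiling below is exact


def proposed_latency(section_number: int, fault_number: int,
                     flit_number: int = FLIT_NUMBER) -> Optional[int]:
    """Link cycles for the proposed serialization, or None if no section is left."""
    fault_free = section_number - fault_number
    if fault_free <= 0:
        return None
    return ceil(section_number * flit_number / fault_free)


def pflrm_latency(cluster_size: int, flit_number: int = FLIT_NUMBER) -> int:
    """Link cycles for PFLRM; cluster_size = 0 models a fault-free link."""
    return (cluster_size + 1) * flit_number


def overhead_percent(latency: int, flit_number: int = FLIT_NUMBER) -> float:
    """Latency overhead relative to one cycle per flit on a fault-free link."""
    return (latency - flit_number) / flit_number * 100


def proposed_overhead(section_number: int, fault_number: int) -> Optional[float]:
    latency = proposed_latency(section_number, fault_number)
    return None if latency is None else overhead_percent(latency)


for faults in range(8):
    four = proposed_overhead(4, faults)
    eight = proposed_overhead(8, faults)
    worst = overhead_percent(pflrm_latency(faults))
    print(faults,
          "--" if four is None else f"{four:.1f}%",
          "--" if eight is None else f"{eight:.1f}%",
          f"{worst:.0f}%")
```

SFQS is left out because its available section count depends on how the faulty quarters are grouped, which the slide text does not spell out.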

  14. ECC Integration
      • Two ways to collaborate with ECC
        [Figure, first option: Router, ECC coder, TX, link_section_1 ... link_section_N, RX, ECC decoder, Router]
        [Figure, second option: Router, TX, one ECC coder and ECC decoder pair on each of link_section_1 ... link_section_N, RX, Router]
      • The realization of other parts can be conventional

  15. Evaluation Results
      • Platform
        – NoC topology: 8 × 8 2D mesh
        – Router:
          • 3 pipeline stages
            – Look-ahead routing & VC/switch allocation
            – Switch traversal
            – Link traversal
          • 5 physical channels with 5 virtual channels in each
          • Each VC is 4 flits deep and 32 bits wide
        – Realized at RTL level using Verilog HDL
        – Synopsys Design Compiler, TSMC 65 nm, 500 MHz
