
Fault Tolerance of the Input/Output Ports in Massively Defective Multicore Processor Chips

The 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Second Workshop on Dependable and Secure Nanocomputing, Friday, June 27, 2008, Anchorage, AK, USA.


  1. The 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Second Workshop on Dependable and Secure Nanocomputing, Friday, June 27, 2008, Anchorage, AK, USA. Fault Tolerance of the Input/Output Ports in Massively Defective Multicore Processor Chips. Piotr Zając, Jacques Henri Collet, Jean Arlat, and Yves Crouzet. {firstname.lastname}@laas.fr

  2. From Multi-Cores Architectures To Multi-Multi-Cores Architectures. [Figure: "Now" and "Soon" chip illustrations. Source: Intel.] Multi-core: higher performance while coping with power dissipation issues (very high clock frequency). Still, smaller transistor size for including many such cores -> significant % of defective cores (more than 10%?). Current context: chips are sorted according to frequency; single-core processor = "downgraded" dual-core circuits … How to go further: on-line reconfiguration to cope with faults?

  3. Example Target Architecture (5x9-node Network, Connectivity: 4). [Figure: grid of nodes with one IOP. Legend: C = Core Processor; R = Router; IOP = I/O Port; Failed Core Processor; Inhibited Inter-router Link. Labels: Disconnected Zone; Single Connected Zone; Mutual Diagnosis; Bad Cores Isolated.] Reference: P. Zając, J. H. Collet, J. Arlat, Y. Crouzet, "Resilience through Self-Configuration in Future Massively Defective Nanochips", Supplemental Volume DSN2007, Edinburgh, Scotland, UK, pp. 266-271, 2007. The I/O Interface (IOP) is a hard core and a "bottleneck".

  4. Preliminary Analysis of Several Options: increase the number of I/O ports; consider redundant IOPs; extend IOP connectivity with the grid (adjacent nodes); …

  5. Increasing the Number of IOPs. Example of a 4-IOP Grid Including 14 Defective Cores.

  6. Redundant IOP Architecture. Example: 4-connect RIOP with R = 3 redundant I/O modules (Mi). Chip validation criterion: at least r out of R modules are fault-free at start-up in each RIOP. [Plot: validation probability, case of a 4-port chip for R = 5, 6, 7 and r = 3.]

  7. Modification of Grid Topology around each RIOP. [Figures: RIOP with connectivity $n_c = 4$, $n_c = 6$ and $n_c = 8$. Plot: probability that $k/n_c$ nodes adjacent to the RIOP are OK.] Example of overhead analysis: $N = 300$; $N_{IO} = 4$; $n_c = 8$. VC1: to protect the communication bandwidth of each RIOP, at least 3/8 neighboring nodes must be fault-free. VC2: validation yield threshold: $L(k, n_c, p_f, N) \geq 80\%$. Overhead: $Q = (R - 1)\,\dfrac{N_{IO} \cdot A_{IO}}{N \cdot A}$

  8. Concluding Remarks: study of the protection of the IOPs in multiport grid architectures; analysis of the dependability gain and overhead induced (redundancy, connectivity and chip area); grid topology and connectivity; self-diagnosis and coverage; application reconfiguration.
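
The target architecture of slide 3 (a 5x9 grid with connectivity 4, where bad cores are isolated and some nodes end up in a disconnected zone) can be illustrated with a small reachability sketch. This is a simplified model, not the authors' method: it assumes that a failed node blocks all traffic through it, that the IOP itself is fault-free, and the function name, IOP position and failed-node coordinates are hypothetical.

```python
from collections import deque

def reachable_nodes(rows, cols, failed, iop):
    """Return the set of fault-free nodes reachable from the IOP node
    over 4-neighbor links, treating failed nodes as fully isolated
    (a simplifying assumption)."""
    seen = {iop}
    queue = deque([iop])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

if __name__ == "__main__":
    # Hypothetical example: two failed nodes cut off the corner (0, 0).
    failed = {(0, 1), (1, 0)}
    zone = reachable_nodes(5, 9, failed, iop=(2, 0))
    disconnected = 5 * 9 - len(failed) - len(zone)
    print(len(zone), "nodes in the connected zone,", disconnected, "disconnected")
```

Fault-free nodes that are not in the returned set correspond to the disconnected zone of the figure: they work, but the IOP cannot reach them.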
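
The validation criterion of slide 6 (at least r out of R modules fault-free in each RIOP) lends itself to a short calculation. The sketch below assumes that modules fail independently with the same probability `p_f` and that the RIOPs are independent of each other; the slide does not state the fault model, so both are assumptions, and the function names and the value `p_f = 0.2` are illustrative only.

```python
from math import comb

def riop_ok_probability(R, r, p_f):
    """Probability that at least r of the R modules of one RIOP are
    fault-free, assuming independent failures with probability p_f."""
    p_ok = 1.0 - p_f
    return sum(comb(R, k) * p_ok**k * p_f**(R - k) for k in range(r, R + 1))

def chip_validation_probability(n_io, R, r, p_f):
    """Probability that all n_io RIOPs of the chip satisfy the
    r-out-of-R criterion (RIOPs assumed independent)."""
    return riop_ok_probability(R, r, p_f) ** n_io

if __name__ == "__main__":
    # 4-port chip with r = 3, for the values of R quoted on the slide.
    for R in (5, 6, 7):
        print(R, chip_validation_probability(n_io=4, R=R, r=3, p_f=0.2))
```

Under these assumptions the validation probability grows with R for a fixed r, which is the trade-off against chip area discussed on slide 7.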
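
The two quantities of slide 7 can be sketched in the same spirit. The first function is a simplified stand-in for criterion VC1 (at least k of the n_c nodes adjacent to a RIOP are fault-free), assuming independent node failures with probability `p_f`; it is not claimed to be the slide's function L. The second computes the area overhead as (R - 1) · N_IO · A_IO / (N · A), which is one reading of the slide's formula and should be treated as an assumption, as should the function names and the sample areas.

```python
from math import comb

def neighbors_ok_probability(k, n_c, p_f):
    """Probability that at least k of the n_c nodes adjacent to a RIOP
    are fault-free, assuming independent failures with probability p_f."""
    p_ok = 1.0 - p_f
    return sum(comb(n_c, j) * p_ok**j * p_f**(n_c - j) for j in range(k, n_c + 1))

def area_overhead(R, n_io, a_io, n_nodes, a_node):
    """Relative chip-area overhead of redundant I/O modules:
    (R - 1) * N_IO * A_IO / (N * A), an assumed reading of the slide."""
    return (R - 1) * n_io * a_io / (n_nodes * a_node)

if __name__ == "__main__":
    # Slide values: N = 300, N_IO = 4, n_c = 8, at least 3/8 neighbors OK.
    # p_f, R and the unit areas are illustrative, not taken from the slide.
    print(neighbors_ok_probability(k=3, n_c=8, p_f=0.3))
    print(area_overhead(R=3, n_io=4, a_io=1.0, n_nodes=300, a_node=1.0))
```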
