Joseph Paturel, Simon Rokicki, Olivier Sentieys Univ. Rennes, Inria, IRISA
Why care about Fault Tolerance?
• Modern technologies
– Lower node capacitances
– Denser layouts
– Increased frequencies
• Energy efficiency
– Lower supply and threshold voltages
• Result: high SET sensitivity
Vulnerability Analysis
• Fault injection, simulation or emulation most often:
– Only injects single-bit faults
– Does not model the microarchitecture
– Ignores combinational logic
• Memory/register fault injection is not enough
– Need to model microarchitecture
– Need to consider combinational logic [1]
• New technologies exhibit multi-bit error behaviors
– Need to model MBUs as well as SEUs
[1] N. N. Mahatme et al., "Comparison of Combinational and Sequential Error Rates for a Deep Submicron Process", IEEE Trans. on Nuclear Science, Dec. 2011
Contributions
• MBUs are present and are here to stay
• Fault injection methodology and flow (Part II)
– From gate to microarchitecture
– MBU-aware
– Fast and accurate
• Use case: Comet RISC-V processor core (Part I)
Part I: Comet, an HLS-designed RISC-V Core
• Traditional Processor Design Flow
– Maintain two coherent models: RTL and simulation (ISS) models
[Diagram labels: SW, HW Design, Validation & Verification, Compiler, Compiled code, ISS, Simulation, RTL, RTL Synthesis, Physical Design]
• Traditional Processor Design Flow
– Maintain two coherent models: RTL and simulation (ISS) models
• Proposed Flow
– Design the processor as well as its software validation flow from a single high-level model
[Diagram labels: C++ Model, HW Design, SW, Validation & Verification, HLS, Compiler, RTL, Compiled SW code, Compiled ISS, ISS Simulation, RTL Synthesis, Physical Design]
Explicitly Pipelined Simulator (1/2)
• Comet core
– 32-bit RISC-V instruction set RV32IM
– In-order 5-stage pipeline micro-architecture
• Pipelined stages are explicit
• Main loop is pipelined (II=1)
• Explicit stall mechanism
• Explicit forwarding
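The explicit style listed above can be illustrated with a minimal sketch. This is not Comet's source code: the toy instruction format and every name (`Instr`, `Core`, `step`, the pipeline register structs) are hypothetical, and only two operations are modeled (add-immediate and load). One call to `step` models one clock cycle, which is the loop body an HLS tool would be asked to pipeline with an initiation interval of 1.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction: rd = rs1 + imm, or rd = mem[rs1 + imm] for loads.
struct Instr {
  uint8_t rd = 0, rs1 = 0;
  int32_t imm = 0;
  bool isLoad = false, valid = false;
};

// Explicit pipeline registers between stages.
struct DecToEx { Instr in; int32_t rs1Val = 0; };
struct ExToMem { Instr in; int32_t result = 0; };
struct MemToWb { Instr in; int32_t result = 0; };

struct Core {
  std::array<int32_t, 32> regs{};
  std::array<int32_t, 16> dataMem{};
  std::vector<Instr> program;
  uint32_t pc = 0;
  Instr fetched;  // Fetch -> Decode register
  DecToEx dx;
  ExToMem xm;
  MemToWb mw;
  uint64_t cycles = 0, stalls = 0;
};

// One call models one clock cycle.
void step(Core& c) {
  // Write Back: commit first so Decode reads the updated register file.
  if (c.mw.in.valid && c.mw.in.rd != 0) c.regs[c.mw.in.rd] = c.mw.result;

  // Memory: loads read data memory, other instructions pass the ALU result on.
  MemToWb nextMw{c.xm.in, c.xm.result};
  if (c.xm.in.valid && c.xm.in.isLoad)
    nextMw.result = c.dataMem[static_cast<uint32_t>(c.xm.result) % c.dataMem.size()];

  // Execute: explicit forwarding from the two later pipeline registers.
  int32_t op = c.dx.rs1Val;
  if (c.dx.in.rs1 != 0) {
    if (c.xm.in.valid && !c.xm.in.isLoad && c.xm.in.rd == c.dx.in.rs1)
      op = c.xm.result;
    else if (c.mw.in.valid && c.mw.in.rd == c.dx.in.rs1)
      op = c.mw.result;
  }
  ExToMem nextXm{c.dx.in, op + c.dx.in.imm};

  // Decode: explicit stall on a load-use hazard (the load is still in Execute).
  bool stall = c.fetched.valid && c.dx.in.valid && c.dx.in.isLoad &&
               c.dx.in.rd != 0 && c.dx.in.rd == c.fetched.rs1;
  DecToEx nextDx{};  // bubble by default
  if (!stall) {
    assert(c.fetched.rs1 < 32 && c.fetched.rd < 32);
    nextDx = DecToEx{c.fetched, c.regs[c.fetched.rs1]};
  }

  // Fetch: hold the instruction and the PC while stalled.
  if (!stall) {
    bool more = c.pc < c.program.size();
    c.fetched = more ? c.program[c.pc] : Instr{};
    if (more) ++c.pc;
  } else {
    ++c.stalls;
  }

  // Clock edge: update all pipeline registers at once.
  c.mw = nextMw;
  c.xm = nextXm;
  c.dx = nextDx;
  ++c.cycles;
}
```

Filling `Core::program` with an add-immediate, a dependent add-immediate, a load and an instruction that uses the loaded value exercises both mechanisms: the second instruction gets its operand through forwarding, and the fourth is held in Decode for one cycle before the loaded value is forwarded to it.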
Explicitly Pipelined Simulator (2/2)
[Pipeline diagram. Stages: Fetch, Decode, Execute, Memory, Write Back. Blocks: Instruction Cache, Fetch, Decode, RegFile, Forward, ALU, Branch Unit, Mem, Data Cache]
Design and Validation Flow
Simulation performance
• 26 million cycles per second
• MiBench
• 8th-gen. Intel Core i7
[Diagram labels: core.c, C compiler, Simulator, Xilinx Vivado HLS, Mentor Catapult HLS, rtl.v, ASIC Flow, Floorplan, FPGA Flow, Bitstream]
What about quality of the hardware?
Synthesis Results
• Target technology is STMicro 28nm FDSOI
• All cores are configured for rv32i
[Pie chart: area by pipeline stage. Legend: Fetch, Decode, Execute, Memory, Writeback (includes RF). Slice values as extracted: 6%, 11%, 42%, 36%, 5%]
Advantages and Limitations
Advantages
• Improves readability, productivity, maintainability, and flexibility of the design
• Fast simulation (~20·10^6 cycles/s)
• Object-Oriented processor model can be easily modified, expanded and verified
Limitations
• Pipeline stages and some features (e.g. multi-cycle operators) have to be explicit
• HLS tools may have trouble synthesizing large multi-core systems…
Part II: Vulnerability Analysis Flow
Proposed Approach to Vulnerability Analysis
[Flow diagram labels: C++ Model, HLS, .v/.vhdl, Gate-level Analysis, Error Patterns, C++ Compilation, Workload, uArch Injection, Vulnerability Metrics]
1/ Gate-level Analysis
• Inject SETs in the gate-level netlist
Parameters:
• Resolution
• Duration
• Type
• N_inj
• N_sim
[Diagram labels: .v/.vhdl, Gate-level Analysis, Error Patterns, Technology library, Gate-level netlist, Fault injector, Error insertions, Input generation, TestBench, Logging, Log, Bit position, Error probability]
1/ Gate-level Analysis
• Logging patterns and error probability (SEUs + MBUs)
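One possible way to log error patterns and their probabilities is sketched below. It assumes a 32-bit stage output register and a fault-free (golden) value to compare against; the `PatternLog` type and its members are hypothetical names, not part of the presented flow. A pattern is the XOR of the golden and faulty values, so a single set bit corresponds to an SEU and several set bits to an MBU.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical log of error patterns seen on one 32-bit pipeline output register.
struct PatternLog {
  std::map<uint32_t, uint64_t> counts;  // error pattern (XOR mask) -> occurrences
  uint64_t injections = 0;

  // Compare the faulty register value with the fault-free (golden) one.
  void record(uint32_t golden, uint32_t faulty) {
    ++injections;
    uint32_t pattern = golden ^ faulty;   // set bits mark flipped positions
    if (pattern != 0) ++counts[pattern];  // pattern 0 means the SET was masked
  }

  // Estimated probability that one injection produces exactly this pattern.
  double probability(uint32_t pattern) const {
    assert(injections > 0);
    auto it = counts.find(pattern);
    return it == counts.end() ? 0.0
                              : static_cast<double>(it->second) / injections;
  }

  // Fraction of logged (non-masked) errors that flip more than one bit.
  double mbuShare() const {
    uint64_t errors = 0, mbus = 0;
    for (const auto& kv : counts) {
      errors += kv.second;
      if (std::bitset<32>(kv.first).count() > 1) mbus += kv.second;
    }
    return errors == 0 ? 0.0 : static_cast<double>(mbus) / errors;
  }
};
```

Keeping the full pattern rather than only a bit count is a design choice made here so that the same log can later drive an injector that replays realistic multi-bit patterns.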
Results: Comet Execution Stage
• 1 million injections
• Number of erroneous bits in output register: SEUs 94.9%, MBUs 5.1%
[Plot: output register per-bit error probability. Annotated regions: ALU outputs; Dest. Register, Opcode, Forwarding, etc.]
Influence of SET Width and Frequency on MBUs
• Fixed width (400 ps)
• Fixed frequency (500 MHz)
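As a rough intuition for why both parameters matter, a common first-order latching-window approximation can be sketched. This is an assumption for illustration, not the model used in the presented flow: it ignores setup/hold times and electrical and logical masking, and simply takes the chance that a pulse arriving at a uniformly random time overlaps a capturing clock edge. The function name `latchProbability` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// First-order latching-window estimate (an assumption, not the flow's model):
// probability that a transient pulse overlaps a capturing clock edge.
double latchProbability(double pulseWidthPs, double clockFreqMHz) {
  assert(pulseWidthPs >= 0.0 && clockFreqMHz > 0.0);
  double periodPs = 1.0e6 / clockFreqMHz;  // 1 MHz corresponds to a 1e6 ps period
  return std::min(1.0, pulseWidthPs / periodPs);
}
```

Under this approximation a wider pulse or a shorter clock period both raise the chance that a transient is captured, and a wider pulse can be captured by more flip-flops fed by the same logic cone, which is one plausible route to multi-bit errors.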
2/ Microarchitectural-Level Fault Injection
• Augmented simulator allows for injection of gate-level fault patterns
• Injection is guided by the area of the different pipeline stages
• Fault classes considered:
– Crashes and Hangs
– ISM, AOM, ISM & AOM
ISM: Internal State Mismatch
AOM: Application Output Mismatch
[Diagram labels: Error Patterns, Workload, uArch Injection, Vulnerability Metrics]
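Area-guided injection can be sketched as weighted random selection of the target stage, followed by applying a logged error pattern to that stage's output register. The sketch below is an illustration under assumptions: the `AreaGuidedInjector` type, the `applyPattern` helper and the area values passed in are hypothetical, and real weights would come from synthesis reports.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

enum Stage { Fetch, Decode, Execute, Memory, WriteBack, StageCount };

// Hypothetical injector: picks a pipeline stage with probability
// proportional to its (relative) area.
struct AreaGuidedInjector {
  std::mt19937 rng;
  std::discrete_distribution<int> pickStage;

  AreaGuidedInjector(const std::array<double, StageCount>& areas, uint32_t seed)
      : rng(seed), pickStage(areas.begin(), areas.end()) {
    double total = 0.0;
    for (double a : areas) {
      assert(a >= 0.0);
      total += a;
    }
    assert(total > 0.0);  // at least one stage must be selectable
  }

  // Choose the stage to hit for the next injection.
  Stage nextStage() { return static_cast<Stage>(pickStage(rng)); }
};

// Apply a gate-level error pattern to a stage output register:
// every set bit in the pattern flips the corresponding register bit.
inline uint32_t applyPattern(uint32_t reg, uint32_t pattern) {
  return reg ^ pattern;
}
```

Weighting by area reflects the idea that a larger block is more likely to be struck. A fixed seed keeps a run reproducible on a given standard library, which helps when comparing a faulty run against its golden reference.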
Comet Vulnerability Analysis Results
• Error class proportions
• Standard vs. proposed approach
[Bar chart: Standard vs. Proposed. Error classes: Masked, Crash + Hang, ISM + AOM. Benchmarks: matmul, qsort, blowfish, average. Vertical axis from 0 to 1.6]
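Error class proportions of this kind can be computed by classifying each faulty run against a fault-free run and tallying the outcomes. The sketch below is a hedged illustration: the `RunResult` fields, the priority order of the checks (crash, then hang, then mismatches) and all names are assumptions, not the authors' implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class Outcome { Masked, Crash, Hang, ISM, AOM, ISMAndAOM, Count };

// Hypothetical summary of one faulty run compared with the fault-free run.
struct RunResult {
  bool crashed = false;        // e.g. trap or illegal access
  bool timedOut = false;       // exceeded a cycle budget
  bool stateDiffers = false;   // internal state differs from the golden run
  bool outputDiffers = false;  // application output differs from the golden run
};

// Assumed priority: a crash or hang hides any mismatch information.
Outcome classify(const RunResult& r) {
  if (r.crashed) return Outcome::Crash;
  if (r.timedOut) return Outcome::Hang;
  if (r.stateDiffers && r.outputDiffers) return Outcome::ISMAndAOM;
  if (r.stateDiffers) return Outcome::ISM;
  if (r.outputDiffers) return Outcome::AOM;
  return Outcome::Masked;
}

// Counts outcomes over a campaign and reports per-class proportions.
struct Tally {
  std::array<unsigned long, static_cast<std::size_t>(Outcome::Count)> n{};
  unsigned long total = 0;

  void add(Outcome o) {
    ++n[static_cast<std::size_t>(o)];
    ++total;
  }

  double proportion(Outcome o) const {
    assert(total > 0);
    return static_cast<double>(n[static_cast<std::size_t>(o)]) / total;
  }
};
```

Summing the `Crash` and `Hang` proportions, or the three mismatch proportions, gives grouped classes like those shown on the chart.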
Conclusion on Vulnerability Analysis
• MBUs are present and are here to stay
• MBUs significantly impact AVF
– more than 50% critical errors (crashes & hangs)
• Fault injection methodology and flow
– From gate to microarchitecture
– Conscious of MBU patterns and error probability
– Fast and accurate
Conclusion & Roadmap on Comet
• Efficient processor core design (HW µarch + SW simulator) from a single C++ code
• Current projects
– Dynamic Binary Translation, Non-Volatile Processor, Fault-Tolerant Multicore, etc.
• Perspectives
– Automatic source-to-source transformations for HLS
• From ISS-like specification to HLS-optimized C code
– Support for floating point extension
– RTOS Support (process, interrupt controller, peripherals)
– Multi-core system with cache coherency (Q4 2019)
– Many-core system with NoC (2020)
Questions?
Thank you for your attention!
https://gitlab.inria.fr/srokicki/Comet