Implementation of Self-Healing Asynchronous Circuits at the Example of a Video-Processing Algorithm
T. Panhofer, W. Friesenbichler, A. Steininger
Vienna University of Technology

Outline
• Motivation & Objective
• Asynchronous Logic
• Self-Healing Concept
• Case Study: SH implementation of video processing algorithm
• Experimental Results (& Lessons Learnt)
• Conclusion & Outlook

The Nanoscale Challenges
• significant parameter variations
  • threshold voltages, delays, leakages, …
• increased rate of transient faults
  • lower voltage, smaller critical charge, …
• increasing danger of permanent faults
  • more functions/chip, higher temperature
• …

Resulting Needs
• significant parameter variations => need robust design methods that are inherently able to cope with these variations
• increased rate of transient faults => need fault tolerance or robustness
• increasing danger of permanent faults => need self-repair or "self-healing"
• …

Why Use Asynchronous Logic?
• "delay insensitive" operation
  • based on local handshaking (closed loop),
  • not on global clock (open loop)
  => high robustness in time domain
• two-rail coded data
  => high robustness in value domain

FSL – How does it work?
• dual-rail encoded data
• two representations for HI/LO
• tokens in alternating "phases"
• implicit request, explicit acknowledge
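To make the phase idea concrete, here is a minimal sketch of one possible dual-rail code with two representations per logic value. The exact FSL code table is not given on the slide, so the rail assignment used here (rail a carries the value, rail b carries value XOR phase) and the function names are assumptions for illustration only.

```python
def fsl_encode(value: int, phase: int) -> tuple[int, int]:
    """Encode one bit on two rails (assumed code table, not from the slides).

    Rail a carries the logic value; rail b carries value XOR phase,
    so each of HI and LO has one representation per phase.
    """
    return value, value ^ phase


def fsl_decode(a: int, b: int) -> tuple[int, int]:
    """Recover (value, phase) from the two rails."""
    return a, a ^ b


def send(bits):
    """Emit tokens in alternating phases, starting with phase 0."""
    phase = 0
    tokens = []
    for bit in bits:
        tokens.append(fsl_encode(bit, phase))
        phase ^= 1  # the next token uses the other phase
    return tokens


def rails_changed(prev: tuple[int, int], cur: tuple[int, int]) -> int:
    """Count how many rails toggled between two consecutive tokens."""
    return (prev[0] != cur[0]) + (prev[1] != cur[1])
```

With this assumed code, exactly one rail toggles between consecutive tokens, so a receiver can tell a new token from the old one by the phase change alone. That is one way to read "implicit request": no separate request wire is needed.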

How far does this get us?
• significant parameter variations => delay-insensitive logic has a robust timing that can tolerate (virtually) all variations
• increased rate of transient faults => two-rail coding, robust timing
• increasing danger of permanent faults => still need self-repair or "self-healing"

Requirements for "Self-Healing"
• detection of (permanent) error
  ☺ DI logic tends to stop working in this case
• identification of faulty cell
  ☺ handshake signals tend to point there
• fault removal
  ☺ temporal robustness makes re-routing easier

Self-Healing Concept (1)

Self-Healing Concept (2)
[Figure labels: Transformation, Self-Healing Cell]

What's the Benefit over TMR?
• both approaches tolerate first fault
  • TMR without interruption of service (2oo3)
  • self-healing possibly with interruption (1oo2)
• self-healing is more fine-grained
  • more options to bypass defective element
  • no need to rely on "luck" (next defect not in remaining operative nodes)
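For a rough feel of the 2oo3 versus 1oo2 comparison, the textbook reliability formulas can be evaluated as below. This is a sketch under strong assumptions that are not from the slides: independent, identical modules with reliability r, a perfect voter for TMR, and perfect detection and switching for the self-healing pair.

```python
def r_2oo3(r: float) -> float:
    """Reliability of 2-out-of-3 voting (TMR), ideal voter assumed."""
    return 3 * r**2 - 2 * r**3


def r_1oo2(r: float) -> float:
    """Reliability of a 1-out-of-2 pair, ideal detection and switching assumed."""
    return 1 - (1 - r) ** 2


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(f"r={r}: 2oo3={r_2oo3(r):.6f}  1oo2={r_1oo2(r):.6f}")
```

The numbers ignore the overhead and failure modes of the switches and the reconfiguration controller, which the later slides identify as the critical part.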

Why not use dynamic Reconfig.?
• for FPGAs only
• config interface = single point of failure
• how to derive new configuration?
  • static => too memory intensive: need config for each defect set
  • dynamic => too performance intensive: need PPR tool on mission

How to control Reconfiguration?
• Simple (= robust) solution: [initial idea]
  • "random repair" without diagnosis
  • bits of a counter control switches
  • count up upon watchdog timeout => new configuration
  • if defect not removed => circuit still halted => next timeout => new try
  • with first valid configuration circuit operation continues
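The counter-driven repair loop above can be sketched as a small simulation. The circuit model is hypothetical: each cell is assumed to have a primary unit (0) and a spare unit (1), bit i of the counter selects the unit for cell i, and the circuit halts whenever any selected unit is defective. The names `circuit_runs` and `random_repair` are illustrative, not from the presentation.

```python
def circuit_runs(config: int, n_cells: int, defects: set[tuple[int, int]]) -> bool:
    """Hypothetical model: the circuit runs iff no selected unit is defective.

    Bit i of config selects unit 0 (primary) or unit 1 (spare) of cell i.
    defects holds (cell, unit) pairs with a permanent fault.
    """
    return all(((cell, (config >> cell) & 1) not in defects)
               for cell in range(n_cells))


def random_repair(n_cells: int, defects: set[tuple[int, int]], start: int = 0):
    """Count up on every watchdog timeout until the circuit runs again.

    Returns (config, timeouts), or None once all configurations were tried.
    """
    config, timeouts = start, 0
    while not circuit_runs(config, n_cells, defects):
        timeouts += 1                      # circuit still halted: watchdog fires
        if timeouts >= 2 ** n_cells:       # every switch setting has been tried
            return None
        config = (config + 1) % (2 ** n_cells)  # counter bits drive the switches
    return config, timeouts


if __name__ == "__main__":
    # Primary unit of cell 1 is stuck: the counter must reach a value with bit 1 set.
    print(random_repair(3, {(1, 0)}))
```

In this model the number of timeouts before recovery can grow up to 2^n for n switches, which illustrates why a single large counter becomes impractical as the circuit grows.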

Application Study: GAIA VPU
Part of the video processing algorithm used in the ESA space mission GAIA
GAIA VPU = GAIA Video Processing Unit
[Figure labels: linear correction, dead column correction]

Why use this Application?
• real-world circuit structure and size
  • pipeline with forks, joins and loops
• typical space application
  • long mission time
  • extreme environment
  • high dependability required
  • no manual repair possible
=> self-healing is attractive

Environment for HW-Experiments
… embedded into the fault injection environment STEFAN
STEFAN = Synthesizeable Test Environment For Asynchronous Networks

HW Experiments – Results
• Autonomous reconfiguration
• Single stuck-at fault injected at internal acknowledge signal
• Counter used as reconfiguration controller

HW Experiments – Resources
• # of 4-input LUTs (Xilinx Virtex-4)
• Standard FPGAs can be used for prototyping of asynchronous logic, but are not efficient
• 207% resources but multiple fault tolerance
• Reconfiguration Unit might have significant impact

Lessons Learnt
• In principle the idea works, BUT
• reconfiguration controller problematic
  • counter causes overhead => use LFSR
  • too many values to try => split controllers
  • ineffective repair attempts may corrupt state => need diagnosis and systematic repair
• better solution:
  • block-wise diagnosis
  • with local "random" repair
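As an illustration of the "use LFSR" remark, the sketch below steps a 4-bit linear feedback shift register whose state could drive the switches instead of a binary counter. The register width, the feedback taps (the two lowest bits) and the function names are assumptions chosen for the example; the slides do not specify the LFSR that was considered.

```python
def lfsr_step(state: int) -> int:
    """Advance a 4-bit Fibonacci LFSR by one step (assumed taps: bits 0 and 1)."""
    feedback = (state ^ (state >> 1)) & 1      # XOR of the two lowest bits
    return (state >> 1) | (feedback << 3)      # shift right, feed back into bit 3


def lfsr_sequence(seed: int, count: int) -> list[int]:
    """Return the next `count` switch configurations following `seed`."""
    state, out = seed, []
    for _ in range(count):
        state = lfsr_step(state)
        out.append(state)
    return out


if __name__ == "__main__":
    # One full period starting from seed 1.
    print(lfsr_sequence(1, 15))
```

With these taps the register visits all 15 non-zero states before repeating, so it enumerates almost as many configurations as a 4-bit counter while needing only one XOR gate rather than an incrementer. The all-zero state is never reached from a non-zero seed, so in such a scheme it would have to be reserved, for example as the initial fault-free configuration.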

Conclusion
• asynchronous logic can solve some of the problems associated with nanoscale
• permanent faults require self-repair, asynchronous design aids in
  • detection
  • reconfiguration and
  • recovery
• fine-grain repair beneficial over component-level repair
• presented solution shown to work in principle but reconfiguration controller

Thank you for your attention!

Environment for Experiments
Self-Healing implementation…

SHC Reliability vs. Overhead
Example: fine/coarse granular SHC adder
• coarse grain: constant overhead
• fine grain: decreasing relative overhead of switches
