Defending Distributed Cyber-Physical Systems with Bounded Time Recovery
Brian Sandler, Neeraj Gandhi, Linh Thi Xuan Phan, Andreas Haeberlen
NSF/Intel CPS PI Meeting, July 2018

Machines in Control
• Vulnerable CPS can cause disaster.
  • Explosion
  • Equipment damage
  • Power outages
  • …
• Bellingham, WA: Oil pipeline explosion after the two controlling computers failed.
• Iran: Stuxnet destroyed centrifuges used for nuclear enrichment.
• Ivano-Frankivsk, Ukraine: The systems controlling the power grid were compromised, leaving residents in the dark.
We want to prevent disaster.
BTR - NSF/Intel PI Meeting - July 2018

Goal: General Defense
Byzantine faults: crashes, non-crash bugs, hacking

Example: Industrial Automation
Let’s take a simple example system…
[Figure: example system with nodes N1 to N4, S1 and S2, and A1 to A4]

Example: Industrial Automation
This system will run four applications.
[Figure: the example system (N1 to N4, S1, S2, A1 to A4) with labels 1 to 8 placed on it]

Example: Industrial Automation
We’ll focus on the burner control application…
[Figure: the example system (N1 to N4, S1, S2, A1 to A4) with labels 1 to 8 placed on it]

Example: Impact of Failures
What can go wrong?
• N4 can drop or delay messages and ruin the chemical processing.
• N4 can send an incorrect value to A1 and light the building on fire.
[Figure: the example system (N1 to N4, S1, S2, A1 to A4) with labels 1 to 8 placed on it]

State of the Art: Byzantine Fault Tolerance
• Benefits
  • Adversarial scenarios
  • Strong guarantees
  • Nice programming model

Is continuous perfection required?
• How bad is it if the adversary gains control?
• Many CPS have properties that resist quick changes
  • inertia
  • thermal capacity
• We don’t have to always be perfect
We can leverage this!
[Figure: chemical vat attached to N4]

For how long is faulty behavior okay?
• Different applications have different tolerances.

Application                          Tolerance
DC/DC converters (STM)               20 μs
Direct torque control (ABB)          25 μs
AC/DC converters                     50 μs
Electronic throttle control (Ford)   5 ms
Traction control (Ford)              20 ms
Micro-scale race cars                40 ms
Autonomous vehicle steering          50 ms
Energy-efficient building control    500 ms

Source: M. Morari. Fast model predictive control (MPC).
A time period usually exists where faulty behavior is OK, so long as the system returns to its correct behavior within that period.

Approach: Bounded Time Recovery
• BTR guarantees that the system recovers from any fault within a short period of time, so that the end goal will be met
• Weaker guarantee is often sufficient
[Figure: timeline showing Correct Operation, then Fault, then Recovery Period, then Recovered, then Correct Operation]
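
One way to read the BTR guarantee is as a check over an observed trace: every stretch of faulty behavior must end within the application's tolerance. The sketch below is illustrative only and is not taken from the talk; the trace format (time in milliseconds plus a correct/incorrect flag) and the function names are assumptions.

```python
from typing import List, Tuple

# Hypothetical trace format: time-ordered (timestamp_ms, behaving_correctly) samples.
Trace = List[Tuple[int, bool]]


def longest_faulty_interval(trace: Trace) -> int:
    """Return the longest span (ms) from the first incorrect sample of a
    faulty stretch to the next correct sample."""
    longest = 0
    fault_start = None
    for timestamp, correct in trace:
        if not correct and fault_start is None:
            fault_start = timestamp          # a faulty stretch begins
        elif correct and fault_start is not None:
            longest = max(longest, timestamp - fault_start)
            fault_start = None               # the system has recovered
    if fault_start is not None and trace:
        # Trace ended while still faulty: count up to the last sample.
        longest = max(longest, trace[-1][0] - fault_start)
    return longest


def satisfies_btr(trace: Trace, recovery_bound_ms: int) -> bool:
    """True if every faulty stretch in the trace ended within the bound."""
    return longest_faulty_interval(trace) <= recovery_bound_ms


trace = [(0, True), (10, False), (30, False), (45, True), (100, True)]
print(longest_faulty_interval(trace))   # 35
print(satisfies_btr(trace, 50))         # within a 50 ms tolerance
print(satisfies_btr(trace, 20))         # exceeds a 20 ms tolerance
```

With the tolerances listed on the previous slide, the same 35 ms outage would be acceptable for autonomous vehicle steering (50 ms) but not for traction control (20 ms).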

So, how do we make this happen?
REBOUND

REBOUND
1. Planning
• Before the system is compromised, think about what it should do.
• System operates in different modes for any given set of faults.
• Can drop less critical tasks as necessary.
[Figure: schedules for N1 to N4 before and after N2 fails]

REBOUND
2. Detection
Nodes watch over each other to detect faults.
[Figure: the example system; a log of SEND… and RECV… entries serves as evidence that N4 is faulty]

REBOUND
3. Consistency
Flood evidence throughout the system.
[Figure: the example system; the evidence "N4 is faulty" spreads from node to node]

REBOUND
4. Adaptation
Each node independently transitions to a new mode.
[Figure: the example system; each node switches from "All nodes OK" to "N4 is faulty"]

Outline
• Problem Introduction
• Bounded Time Recovery
• REBOUND
• Technical Components
  1. Planning
  2. Detection
  3. Consistency
  4. Adaptation
• Results

1. Planning
For every* mode, we have a precomputed schedule and plan for every node.
• Schedule generated offline
  • When tasks should run and where
  • Many constraints
  • Dependent scheduling problem
• Builds a tree
[Figure: tree of modes rooted at "No Faults", with children such as "Node 1 Faulty" and "Link 1-2 Faulty", and deeper modes such as "Nodes 1&4 Faulty"]
* Can limit the number of faults to improve computation time.
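
As a rough illustration of "a precomputed plan for every mode", the sketch below enumerates every set of up to f faulty nodes and stores one plan per set. It is not REBOUND's planner: the slide describes a constrained scheduling problem, whereas this stand-in uses a greedy placement that drops the least critical tasks. The task names other than the burner control application, the criticality values, and the per-node capacity are all hypothetical, and link faults are left out.

```python
from itertools import combinations


def enumerate_modes(nodes, max_faults):
    """Yield every set of faulty nodes with size 0..max_faults."""
    for k in range(max_faults + 1):
        for faulty in combinations(sorted(nodes), k):
            yield frozenset(faulty)


def plan_for_mode(nodes, faulty, tasks, capacity):
    """Greedy placeholder for the real scheduler (assumption, not REBOUND).

    tasks: list of (name, criticality). The most critical tasks are placed
    first on the least loaded healthy node; tasks that do not fit are dropped.
    """
    healthy = [n for n in sorted(nodes) if n not in faulty]
    assignment = {n: [] for n in healthy}
    dropped = []
    for name, _criticality in sorted(tasks, key=lambda t: -t[1]):
        target = min(healthy, key=lambda n: len(assignment[n]), default=None)
        if target is None or len(assignment[target]) >= capacity:
            dropped.append(name)
        else:
            assignment[target].append(name)
    return assignment, dropped


def build_plan_table(nodes, tasks, capacity, max_faults):
    """Compute one plan per mode, offline, keyed by the set of faulty nodes."""
    return {
        mode: plan_for_mode(nodes, mode, tasks, capacity)
        for mode in enumerate_modes(nodes, max_faults)
    }


NODES = ["N1", "N2", "N3", "N4"]
# Hypothetical task set; only "burner" corresponds to a task named in the talk.
TASKS = [("burner", 3), ("vat", 2), ("hvac", 2), ("logging", 1), ("ui", 0)]
table = build_plan_table(NODES, TASKS, capacity=1, max_faults=1)
assignment, dropped = table[frozenset({"N4"})]
print(assignment, dropped)
```

At run time a node would only need to look up the entry for the fault set it currently believes in, which is why the expensive part can be done before deployment.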

2. Detection
Omission Faults
• Declare link faulty if an expected message from a neighbor is not received
• Declaration causes other nodes to change mode.
• Leverage synchrony.
[Figure: the link between N1 and N2 is broken (X); a node declares the link to be faulty]
Commission Faults
• Witness/Audit Nodes and Replicas
• If fault found, log is used as a proof of misbehavior.
• Large improvement over PeerReview
  • Adding synchrony
[Figure: an audit/witness task runs a replica and checks logs of SEND… and RECV… entries]
Challenge: Bounding Time of Detection
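
The omission-fault rule can be sketched as a deadline check: with synchrony, a receiver knows when each message was scheduled and how long delivery may take, so a missing or late message is detectable in bounded time. The data layout and the `max_delay` parameter below are assumptions made for illustration; the talk does not give these details.

```python
def detect_omissions(expected, received, now, max_delay):
    """Return the set of links to declare faulty at time `now`.

    expected: dict mapping link (sender, receiver) -> scheduled send time
    received: dict mapping link -> arrival time (absent if nothing arrived)
    max_delay: assumed upper bound on delivery time over a healthy link
    """
    faulty = set()
    for link, send_time in expected.items():
        deadline = send_time + max_delay
        arrival = received.get(link)
        if arrival is None:
            # Nothing arrived: only declare once the deadline has passed.
            if now > deadline:
                faulty.add(link)
        elif arrival > deadline:
            # Message arrived, but too late to count as correct.
            faulty.add(link)
    return faulty


expected = {("N1", "N2"): 100, ("N3", "N2"): 100, ("N4", "N2"): 100}
received = {("N1", "N2"): 103, ("N4", "N2"): 120}
print(detect_omissions(expected, received, now=130, max_delay=5))
```

Because the check depends only on the schedule and a delay bound, the time to detect an omission is itself bounded by `max_delay` after the scheduled send time.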

3. Consistency
We need a solution where…
• Any two good nodes agree on the state of the system, or
• The two become aware they cannot communicate
Strawman: flood the system periodically with signed attestations of current mode
• Actual solution is more efficient
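
A minimal sketch of the strawman, under loud assumptions: HMAC with a shared demo key stands in for real digital signatures (a deployment would need per-node public-key signatures), the attestation encoding is invented here, and the two topologies are hypothetical. A node in `silent` receives messages but does not forward them, which models a faulty relay.

```python
import hashlib
import hmac
from collections import deque


def sign(key, node, round_no, mode):
    """Tag an attestation 'node is in `mode` during `round_no`'.
    HMAC is a stand-in for a digital signature (assumption)."""
    message = f"{node}|{round_no}|{','.join(sorted(mode))}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, node, round_no, mode, tag):
    """Check a tag in constant time."""
    return hmac.compare_digest(sign(key, node, round_no, mode), tag)


def flood(adjacency, origin, silent=frozenset()):
    """Return the nodes reached when every node forwards to its neighbors,
    except nodes in `silent`, which receive but never forward."""
    reached = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node in silent and node != origin:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached


KEY = b"demo-key"  # hypothetical shared key, for illustration only
tag = sign(KEY, "N1", 7, frozenset({"N4"}))
print(verify(KEY, "N1", 7, frozenset({"N4"}), tag))

ring = {"N1": ["N2", "N4"], "N2": ["N1", "N3"], "N3": ["N2", "N4"], "N4": ["N3", "N1"]}
line = {"N1": ["N2"], "N2": ["N1", "N3"], "N3": ["N2", "N4"], "N4": ["N3"]}
print(sorted(flood(ring, "N1", silent={"N4"})))   # a second path still exists
print(sorted(flood(line, "N1", silent={"N2"})))   # N3 and N4 are cut off
```

The two runs mirror the two acceptable outcomes on the slide: in the ring, all good nodes still hear the attestation; in the line, the good nodes on either side of the silent relay stop hearing from each other, which is the signal that they cannot communicate.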

4. Adaptation
• Each node individually transitions when its mode changes.
• When evidence is received, a mode change occurs within a bounded period of time.
[Figure: schedules for N1 to N4 across successive modes as N4, N1, and N2 fail; mode labels include "N2 Faulty", "N1 & N2 Faulty", and "N1, N2, N4 Faulty"]
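
The adaptation step can be pictured as a local table lookup: a node adds each validated piece of evidence to its set of known faults and switches to the schedule precomputed for that set. The sketch below is an assumption-laden illustration, not REBOUND's code; the `Node` class, the placeholder schedule strings, and the planning bound of two faults are invented for the example.

```python
from itertools import combinations

NODES = ["N1", "N2", "N3", "N4"]
# Hypothetical precomputed table: one placeholder schedule per fault set of size 0..2.
PLANS = {
    frozenset(c): "schedule[" + ",".join(c) + "]"
    for k in range(3)
    for c in combinations(NODES, k)
}


class Node:
    """A node that transitions on its own, using only local state."""

    def __init__(self, name, plans):
        self.name = name
        self.plans = plans
        self.known_faults = frozenset()

    def on_evidence(self, faulty_node, evidence_valid):
        """Apply one piece of evidence; return True if the mode changed."""
        if not evidence_valid or faulty_node in self.known_faults:
            return False
        candidate = self.known_faults | {faulty_node}
        if candidate not in self.plans:
            return False  # more faults than were planned for
        self.known_faults = candidate
        return True

    @property
    def schedule(self):
        return self.plans[self.known_faults]


a, b = Node("N1", PLANS), Node("N3", PLANS)
a.on_evidence("N4", True)
a.on_evidence("N2", True)
b.on_evidence("N2", True)   # same evidence, opposite order
b.on_evidence("N4", True)
print(a.schedule, b.schedule)
```

Since the mode is determined by the set of known faults rather than the order of arrival, two nodes that have seen the same evidence end up on the same schedule without coordinating.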

Challenges
• Bounding every step of the algorithms
• Overhead of periodic flood
  • Multisignatures drastically reduce traffic
• Handling equivocation
  • Different nodes notifying of different faults to their neighbors
• Proving everything
  • Correctness
  • Completeness
  • Bounded detection
  • Bounded stabilization
  • …
• Planning
  • Unique problem
  • …

Outline
• Problem Introduction
• Bounded Time Recovery
• REBOUND
• Technical Components
  1. Planning
  2. Detection
  3. Consistency
  4. Adaptation
• Results

Overhead of Schedule Tree
f = # of faulty nodes protected against
• Time depends on:
  • The number of nodes.
  • Degree of network.
  • Number of faulty nodes, f.
• Only compute once for the lifetime of the system.
• Subtrees easily parallelizable.
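
To get a feel for why the number of nodes and f drive planning time, one can count how many distinct sets of faulty nodes there are. This is a simplification made for illustration: it counts node faults only, so it ignores link faults (which is where the degree of the network would enter) and says nothing about the cost of scheduling each mode.

```python
from math import comb


def count_node_fault_modes(num_nodes: int, max_faults: int) -> int:
    """Number of distinct sets of at most `max_faults` faulty nodes,
    i.e. the sum of C(num_nodes, k) for k = 0..max_faults."""
    return sum(comb(num_nodes, k) for k in range(max_faults + 1))


for n, f in [(4, 1), (4, 2), (20, 3)]:
    print(n, f, count_node_fault_modes(n, f))
```

For the four-node example system this gives 5 modes at f = 1 and 11 at f = 2, while 20 nodes at f = 3 already give 1351, which is consistent with the slide's point that the tree is computed once, offline, and that subtrees can be built in parallel.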

Recovery
[Figure: Unprotected System, N2 Compromised]

Recovery
[Figure: Protected System, N2 Compromised, with the Recovery Period marked]

Recovery
[Figure: Protected System, N1, N2, N3 Compromised]

Key Idea: Period of Imperfection
Many CPS can tolerate a short period of faulty behavior.
Approach: Bounded Time Recovery
Bounded time recovery guarantees that the system quickly returns to correct behavior after a fault.
Solution: REBOUND
Algorithms and protocols to provide BTR for distributed systems.
Thank you.
