Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems
Viaceslav Izosimov, Paul Pop, Petru Eles, Zebo Peng
Embedded Systems Lab (ESLAB), Linköping University, Sweden

Motivation
• Hard real-time applications
  • Predictable
  • Timing constraints
  • Cost constraints
• Faults
  • Transient
  • Intermittent
• Hardware solutions vs. software solutions
  • Hardware solutions: MARS, TTA, X-by-Wire
    • Permanent faults
    • Costly for transient faults
  • Software solutions
    • Re-execution/rollback recovery
    • Checkpointing/rollback recovery
    • Replication, primary-backup…
• Online preemptive vs. off-line non-preemptive
  • Online preemptive: flexible
  • Off-line non-preemptive: predictable

Outline
• Motivation
• System architecture and fault-model
• Fault-tolerance techniques
• Problem formulation
• Motivational examples
• Tabu-search optimization strategy
• Experimental results
• Contributions and message

Fault-Tolerant Time-Triggered Systems
• Transient faults
• Processes: re-execution and replication; static cyclic scheduling
• Messages: fault-tolerant protocol; static schedule table
• Time Triggered Protocol (TTP)
  • Bus access scheme: time-division multiple-access (TDMA)
  • Schedule table located in each TTP controller: message descriptor list (MEDL)
[Figure: TDMA bus timeline showing a slot, a TDMA round of slots S1 to S4, and a cycle of two rounds]

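The TDMA bus access scheme on this slide can be illustrated with a small sketch. It is not the TTP controller logic or the MEDL format; it assumes, for illustration only, that every node owns exactly one slot per round and that all slots have the same length. The names `TdmaRound`, `owner_at` and `next_slot_start` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TdmaRound:
    """One TDMA round (assumption: one equal-length slot per node)."""
    slot_owners: tuple  # node that owns each slot, in bus order
    slot_length: int    # duration of one slot, in time units

    @property
    def round_length(self):
        return len(self.slot_owners) * self.slot_length

    def owner_at(self, t):
        """Return the node allowed to transmit at time t."""
        offset = t % self.round_length
        return self.slot_owners[offset // self.slot_length]

    def next_slot_start(self, node, t):
        """Return the earliest time >= t at which a slot of `node` begins."""
        idx = self.slot_owners.index(node)
        round_start = (t // self.round_length) * self.round_length
        start = round_start + idx * self.slot_length
        if start < t:
            # The slot in the current round has already begun: wait a round.
            start += self.round_length
        return start


bus = TdmaRound(slot_owners=("N1", "N2", "N3", "N4"), slot_length=10)
print(bus.owner_at(25))               # node owning the bus at time 25
print(bus.next_slot_start("N2", 11))  # next send opportunity for N2
```

Because the slot order is fixed off-line, a message that misses its sender's slot must wait for the next round, which is why message placement matters for schedulability.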
Fault-Tolerant Techniques
• Re-execution
• Replication
• Re-executed replicas
[Figure: executions of process P1 on nodes N1, N2 and N3 under each technique]

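A rough way to compare the first two techniques for a single process is to look at its worst-case finishing time. The sketch below is a simplification under stated assumptions: tolerating k transient faults needs k + 1 executions, re-executions run back to back on one node, replicas run in parallel on different nodes, and recovery overhead and bus communication are ignored. The function name `worst_case_finish` and the sample execution times are hypothetical.

```python
def worst_case_finish(wcets, k, policy):
    """Worst-case finishing time of one process under k transient faults.

    wcets  -- execution time of the process on each candidate node
    k      -- number of transient faults to tolerate
    policy -- "re-execution" or "replication"
    """
    if policy == "re-execution":
        # One node, up to k re-executions: k + 1 executions in sequence.
        return (k + 1) * wcets[0]
    if policy == "replication":
        # k + 1 replicas on distinct nodes, started at the same time.
        if len(wcets) < k + 1:
            raise ValueError("replication needs k + 1 distinct nodes")
        # In the worst case only the slowest replica survives.
        return max(wcets[: k + 1])
    raise ValueError(f"unknown policy: {policy}")


print(worst_case_finish([40, 50], k=1, policy="re-execution"))
print(worst_case_finish([40, 50], k=1, policy="replication"))
```

Replication finishes earlier here but occupies a second node and, in a real system, adds bus traffic, which is the trade-off the later examples explore.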
Problem Formulation
• Given
  • Fault model: number of transient faults in the system period
  • System architecture: time-triggered system
  • Application: set of process graphs
  • WCETs, message sizes, periods, deadlines
• Determine
  • Schedulable and fault-tolerant design implementation
    • Fault-tolerance policy assignment
    • Mapping of processes and messages
    • Schedule tables for processes and messages

Static Scheduling [Kandasamy et al. 03]
• Transparent re-execution
• Contingency schedules
• Recovery slack
[Figure: schedules of processes P1 to P5 and messages m1, m2 on nodes N1, N2 and N3; tree of root schedules and contingency schedules, S1 to S18]

Re-execution vs. Replication
• Two example applications, A1 and A2, mapped on nodes N1 and N2 with a TTP bus (slots S1, S2)
• Re-execution is better: for one application, re-execution meets the deadline while replication misses it
• Replication is better: for the other application, replication meets the deadline while re-execution misses it
[Figure: schedules of P1, P2, P3 and messages m1, m2 for both applications under each technique]
Execution times from the figure (rows: processes, columns: nodes):
       N1   N2
  P1   40   50
  P2   40   50
  P3   60   70

Fault-Tolerant Policy Assignment
• No fault-tolerance: application crashes
• Two of the fault-tolerant alternatives shown in the figure miss the deadline
• Optimization of fault-tolerance policy assignment: deadline met
[Figure: schedules of P1 to P4 and messages m1, m2, m3 on nodes N1, N2 and the TTP bus (slots S1, S2) for each alternative]
Execution times from the figure (rows: processes, columns: nodes):
       N1   N2
  P1   40   50
  P2   60   80
  P3   60   80
  P4   40   50

Mapping and Fault-Tolerance
• Best mapping without considering fault-tolerance: deadline missed once fault-tolerance is applied
• Simultaneous mapping and fault-tolerance: deadline met
[Figure: schedules of P1 to P4 and messages m2, m4 on nodes N1, N2 and the TTP bus (slots S1, S2) for both cases]
Execution times from the figure (rows: processes, columns: nodes):
       N1   N2
  P1   40   X
  P2   60   70
  P3   60   70
  P4   40   X

Optimization Strategy
• Design optimization:
  • Fault-tolerance policy assignment and mapping of processes and messages: tabu-search
  • Root schedules: list scheduling
• Three tabu-search optimization algorithms:
  1. Mapping and Fault-Tolerance Policy assignment (MRX): re-execution, replication or both
  2. Mapping and only Re-Execution (MX)
  3. Mapping and only Replication (MR)

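One way to picture the difference between the three algorithms is the set of fault-tolerance policies each may assign to a process. The sketch below is an illustration, not the authors' implementation. It assumes a policy is a pair (replicas, re-executions) whose sum is k + 1, where k is the number of transient faults to tolerate; the function name `allowed_policies` is hypothetical.

```python
def allowed_policies(k, strategy):
    """List the (replicas, re_executions) pairs a strategy may assign.

    Assumption: any split of k + 1 executions between replicas and
    re-executions tolerates k transient faults.
    """
    every_split = [(r, k + 1 - r) for r in range(1, k + 2)]
    if strategy == "MRX":
        # Re-execution, replication, or a combination of both.
        return every_split
    if strategy == "MX":
        # Only re-execution: a single copy, re-executed up to k times.
        return [(1, k)]
    if strategy == "MR":
        # Only replication: k + 1 replicas, no re-execution.
        return [(k + 1, 0)]
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("MRX", "MX", "MR"):
    print(name, allowed_policies(2, name))
```

Under this reading, MX and MR search a subset of the MRX design space, so MRX can never do worse than either if the search is given enough time.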
MRX Tabu-Search Example
• Current solution, with Tabu and Wait counters per process:
         P1   P2   P3   P4
  Tabu    1    2    0    0
  Wait    1    0    1    1
• Design transformations: four candidate moves, each with updated Tabu and Wait counters, classified as
  • tabu move or non-tabu move
  • better than best-so-far or worse than best-so-far
[Figure: schedules of P1 to P4 and messages m1, m2 on nodes N1, N2 and the TTP bus (slots S1, S2) for the current solution and each candidate move]
Execution times from the figure (rows: processes, columns: nodes):
       N1   N2
  P1   40   50
  P2   60   75
  P3   60   75
  P4   40   50

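The classification of moves in this example (tabu or non-tabu, better or worse than best-so-far) suggests a selection rule along the following lines. This is a sketch of a common tabu-search scheme, not necessarily the exact rule used in MRX: it assumes a tabu move is accepted only if it beats the best-so-far solution, that otherwise the best non-tabu move is taken, and that the Wait counter records how long a process has gone unmoved. The move costs, the tenure value and the function names are hypothetical; the initial counters are those of the current solution on the slide.

```python
def select_move(moves, tabu, best_so_far):
    """Pick a move from a list of (process, cost) candidates.

    A move better than best-so-far is accepted even if its process is
    tabu; otherwise the cheapest non-tabu move is chosen.
    """
    improving = [m for m in moves if m[1] < best_so_far]
    if improving:
        return min(improving, key=lambda m: m[1])
    non_tabu = [m for m in moves if tabu[m[0]] == 0]
    if non_tabu:
        return min(non_tabu, key=lambda m: m[1])
    return None  # every candidate is tabu and none improves


def update_counters(tabu, wait, moved, tenure):
    """Update the Tabu and Wait counters in place after moving `moved`."""
    for p in tabu:
        if p == moved:
            tabu[p] = tenure  # moved process becomes tabu
            wait[p] = 0       # and stops waiting
        else:
            tabu[p] = max(0, tabu[p] - 1)
            wait[p] += 1


tabu = {"P1": 1, "P2": 2, "P3": 0, "P4": 0}
wait = {"P1": 1, "P2": 0, "P3": 1, "P4": 1}
moves = [("P1", 120), ("P2", 90), ("P3", 110), ("P4", 130)]  # made-up costs

chosen = select_move(moves, tabu, best_so_far=100)
update_counters(tabu, wait, chosen[0], tenure=2)
print(chosen, tabu, wait)
```

A full search would also use the Wait counters to favour processes that have not been moved for a long time, which is one way to diversify the search.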