Real-Time Edge Computing
Chenyang Lu
Industrial Internet of Things (IIoT)
- Synergizing sensing, analytics, and control
  - Cloud computing for high capacity
  - Edge computing for timely performance
[Figure: IIoT architecture. Labels: wireless sensor network (e.g., in a wind farm); Edge 1, Edge 2, ..., Edge N; IIoT applications/services (condition monitoring, emergency response, predictive maintenance, ...); private cloud for training and storage (machine learning training, database).]
Research challenge #1: timeliness
- Timing constraints:
  - IIoT applications have latency requirements
  - Events carrying physical data have temporal semantics
Contribution #1: Cyber-Physical Event Processing Architecture
  - latency differentiation
  - time consistency enforcement
Application example: condition monitoring
Image source: https://www.maintwiz.com/what-is-condition-monitoring/
Research challenge #2: loss-tolerance
- An IIoT service must deliver messages reliably, but
  - fault-tolerant systems can be slow or costly
  - heterogeneous traffic and platforms can increase pessimism
Contribution #2: Fault-Tolerant Real-Time Messaging Architecture
  - co-scheduling fault-tolerant real-time activities
  - traffic/platform-aware service configuration
[Figure: IIoT devices, a Primary service and a Backup service, edge applications, and cloud applications.]
Research challenge #3: efficiency
- Efficiency atop loss-tolerance and timeliness:
  - costly to back up many small in-band computations
  - costly to recompute for fault recovery
Contribution #3: Adaptive Real-Time Reliable Edge Computing
  - selective lazy data replication
  - proactive cleanup of obsolete data
Example of in-band computations: AWS Lambda function for IIoT inference
Image source: https://aws.amazon.com/lambda/
Contributions
- Three new IIoT middleware designs and implementations:
  - Real-time cyber-physical event processing (CPEP)
  - Fault-tolerant real-time messaging (FRAME)
  - Adaptive real-time reliable edge computing (ARREC)
All have been implemented and validated within the TAO real-time event service [1].
[Figure: the original TAO event channel (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) with CPEP, FRAME, and ARREC placed along three axes: timeliness, loss-tolerance, and efficiency.]
[1] Harrison, T.H., Levine, D.L. and Schmidt, D.C., 1997. The design and performance of a real-time CORBA event service. ACM SIGPLAN Notices, 32(10), pp.184-200.
Outline
- CPEP: real-time cyber-physical event processing
- FRAME: fault-tolerant real-time messaging
- ARREC: adaptive real-time reliable edge computing
[Figure: the original TAO event channel (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) next to the channel with CPEP, on axes of timeliness, loss-tolerance, and efficiency.]
Cyber-physical event processing model
[Figure: IIoT devices (s1-s5) feed an IIoT event service (operators o1-o7), which delivers to IIoT applications: c1 (high priority), c2 (middle priority), c3 and c4 (low priority).]
o_i: operations (filtering, transformation, encryption, ...)
- Temporal semantics
  - Absolute time consistency: a bound on an event's elapsed time since its creation
  - Relative time consistency: a bound on the difference between events' creation times
Real-time event processing
- Processing in the order of priorities propagated from the applications
  [Figure: event-processing graph (s1-s5, o1-o7); c1 high priority, c2 middle priority, c3 and c4 low priority.]
- Temporal semantics enforcement and shedding:
  - Absolute time consistency
    [Figure: timeline t1-t7 with events from S1, S2, S3 and consumer C2.]
  - Relative time consistency
    - Track both the earliest and the latest event creation times, per operator
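A minimal sketch of priority-ordered processing, assuming each pending operator invocation already carries the priority propagated from its consumer. The class names, the numeric priority encoding, and the FIFO tie-break are illustrative assumptions, not CPEP's actual implementation.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Any

# Assumption of this sketch: a lower number means a higher priority.
HIGH, MIDDLE, LOW = 0, 1, 2


@dataclass(order=True)
class PendingOp:
    priority: int
    seq: int                                # FIFO tie-breaker within one priority level
    operator: str = field(compare=False)    # hypothetical operator name, e.g. "o1"
    event: Any = field(compare=False)


class PriorityProcessor:
    """Hands out pending operator invocations in priority order."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()

    def submit(self, priority: int, operator: str, event: Any) -> None:
        heapq.heappush(self._heap, PendingOp(priority, next(self._seq), operator, event))

    def drain(self):
        """Pop every pending invocation, highest priority first."""
        order = []
        while self._heap:
            op = heapq.heappop(self._heap)
            order.append((op.operator, op.event))
        return order


def demo_order():
    p = PriorityProcessor()
    p.submit(LOW, "o4", "e_s5")
    p.submit(HIGH, "o1", "e_s1")
    p.submit(MIDDLE, "o2", "e_s2")
    p.submit(HIGH, "o5", "e_s1")
    return [name for name, _ in p.drain()]
```

Here `demo_order()` yields the high-priority operators first, in submission order, then the middle- and low-priority ones.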
The CPEP processing architecture
[Figure: CPEP architecture over the event-processing graph (s1-s5, o1-o7); c1 high priority, c2 middle priority, c3 and c4 low priority.]
Both workers and movers are further prioritized, enabling an appropriate ordering of activities.
Enforcing Absolute Time Consistency
- Tracking the earliest end time of the validity interval
  [Figure: events e_s1, e_s2, e_s3 flowing through the event-processing graph (s1-s5, o1-o7, c1-c4).]
- Responses to a consistency violation
  - Marking: deferring the handling to consumers (improving efficiency)
  - Shedding: cancelling all subsequent processing
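The tracking and the two responses can be sketched as follows. This assumes each event carries a single expiry time (creation time plus validity interval) and that an event derived from several inputs inherits the earliest expiry among them. All names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Policy(Enum):
    MARK = "mark"   # flag the event and let the consumer decide
    SHED = "shed"   # cancel all subsequent processing


@dataclass
class Event:
    payload: object
    expires_at: float           # earliest end of validity interval among the inputs
    inconsistent: bool = False


def new_event(payload, created_at: float, validity: float) -> Event:
    """A source event is valid until created_at + validity."""
    return Event(payload, created_at + validity)


def combine(payload, inputs: List[Event]) -> Event:
    """A derived event is only as fresh as its stalest input."""
    return Event(
        payload,
        min(e.expires_at for e in inputs),
        any(e.inconsistent for e in inputs),
    )


def enforce(event: Event, now: float, policy: Policy) -> Optional[Event]:
    """Return the event to keep processing, or None if it is shed."""
    if now <= event.expires_at:
        return event
    if policy is Policy.SHED:
        return None
    event.inconsistent = True   # marking: the consumer handles the violation
    return event
```

Keeping only the minimum expiry means each operator does one comparison per event, regardless of how many source events contributed to it.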
Enforcing Relative Time Consistency
- Maintaining an ordered list of events' timestamps
  - One timestamp per event type
  - Comparing the maximum time difference with the validity interval
  [Figure: events e_s1, e_s2, e_s3 flowing through the event-processing graph (s1-s5, o1-o7, c1-c4).]
- Responses to a consistency violation
  - Marking: deferring the handling to consumers (improving efficiency)
  - Shedding: cancelling all subsequent processing
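A minimal sketch of the relative check, assuming an operator keeps the latest creation timestamp per event type and compares the spread between the earliest and latest of them against the validity interval. The tracker class and its method names are illustrative only.

```python
from typing import Dict


class RelativeConsistencyTracker:
    """Keeps one creation timestamp per event type for a single operator."""

    def __init__(self, validity: float) -> None:
        self.validity = validity
        self._latest: Dict[str, float] = {}

    def observe(self, event_type: str, created_at: float) -> None:
        # One timestamp per event type: a newer event replaces the older one.
        self._latest[event_type] = created_at

    def spread(self) -> float:
        """Maximum difference between the tracked creation times."""
        ordered = sorted(self._latest.values())
        return ordered[-1] - ordered[0] if ordered else 0.0

    def is_consistent(self) -> bool:
        return self.spread() <= self.validity
```

For example, with a validity interval of 10 time units and timestamps 1000, 1004, and 1008 for three event types, the set is consistent; replacing the first timestamp with 1020 widens the spread to 16 and violates the bound.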
Experiment design
- IIoT workload:
  - Filtering
  - Data transform
  - Encryption
  [Figure: workload graph with suppliers s1-s8, operators EKF 1-4, FFT 1-3, CAT 1, and AES 1-3, and consumers c1-c3; rates of 200 Hz, 100 Hz, and 50 Hz; high, middle, and low priority levels.]
- Test-bed configuration: Machine 1 (Suppliers), Machine 2 (CPEP), Machine 3 (Consumers)
- Comparison baseline:
  - Apache Flink stream processing framework [1]
[1] https://flink.apache.org
Latency performance
[Figure: 99th percentile latency (unit: ms) for high, middle, and low priority levels.]
- CPEP maintained high-priority latency as workload increased.
- CPEP differentiated latency performance according to priority level.
Benefits of shedding inconsistent events
- Improves the throughput of consistent events.
- Reduces CPU utilization.
Effectiveness of CPEP Sharing
- Experiment setup
  [Figure: workload graph with suppliers s1-s8, operators EKF 1-4, FFT 1-4, CAT 1-3, and AES 1-4, and consumers c1-c3; 100 Hz rates; high, middle, and low priority levels.]
- Results of sharing vs. non-sharing
  [Figure: latency of low-priority processing.]
  CPEP sharing helped reduce latency.
Effectiveness of Sharing
- Results of sharing vs. non-sharing
  [Figures: latency of low-priority processing; CPU utilization.]
Outline
- CPEP: Real-time cyber-physical event processing
- FRAME: Fault-tolerant real-time messaging
- ARREC: Adaptive real-time reliable edge computing
[Figure: the original TAO event channel (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) next to the channel with FRAME, on axes of timeliness, loss-tolerance, and efficiency.]
Message loss-tolerance requirement
- Application-specific requirements for an IIoT service
  - L_i: the tolerable number of consecutive losses for topic i

  Value of L_i    Application examples
  0               emergency response; predictive maintenance
  k > 0           condition monitoring

(Within the tolerable number, applications may use estimates for the missing data.)
Image source: https://www.originlab.com/doc/Origin-Help/Math-Inter-Extrapoltate-YfromX
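One way to check a delivery trace against this requirement, as a sketch: find the longest run of consecutive losses and compare it with the topic's tolerable number. The function names and the boolean trace format are assumptions made for illustration.

```python
from typing import Iterable


def longest_loss_run(delivered: Iterable[bool]) -> int:
    """Length of the longest run of consecutive lost messages (False entries)."""
    longest = run = 0
    for ok in delivered:
        run = 0 if ok else run + 1
        longest = max(longest, run)
    return longest


def meets_loss_tolerance(delivered: Iterable[bool], tolerable_losses: int) -> bool:
    """True if no run of consecutive losses exceeds the tolerable number."""
    return longest_loss_run(delivered) <= tolerable_losses
```

A topic with a tolerable number of 0 (e.g., emergency response) fails this check on any single loss, while a condition-monitoring topic with k > 0 passes as long as no loss run is longer than k.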
Fault-tolerance model
- A crash failure may happen to an IIoT service host (fail-stop)
- Lost messages may be recovered
  1. via retransmissions from message publishers
  2. via a backup service [1]
[Figure: IIoT devices, a Primary service and a Backup service, edge applications, and cloud applications.]
[1] Budhiraja, N., Marzullo, K., Schneider, F.B. and Toueg, S., 1993. The primary-backup approach. Distributed systems, 2, pp.199-216.
Fault-tolerant real-time processing
- Specify provable deadlines for message replication and dispatch
- Co-schedule replication and dispatch using, e.g., earliest-deadline-first (EDF)
[Figure: IIoT devices send to the Primary service, which dispatches to edge and cloud applications and replicates to the Backup service.]
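A minimal sketch of EDF co-scheduling, assuming dispatch and replication jobs share one queue ordered by absolute deadline. The job fields and the example deadlines are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Job:
    deadline: float                          # absolute deadline; earlier runs first
    seq: int                                 # tie-breaker for equal deadlines
    kind: str = field(compare=False)         # "dispatch" or "replication"
    message_id: str = field(compare=False)


class EdfQueue:
    """One queue for both activities, so neither starves the other."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = 0

    def add(self, kind: str, message_id: str, deadline: float) -> None:
        heapq.heappush(self._heap, Job(deadline, self._seq, kind, message_id))
        self._seq += 1

    def pop(self) -> Optional[Job]:
        return heapq.heappop(self._heap) if self._heap else None


def demo_schedule():
    q = EdfQueue()
    # Hypothetical deadlines: m1 has a tight dispatch deadline,
    # m2 has a tight replication deadline.
    q.add("dispatch", "m1", 5.0)
    q.add("replication", "m1", 20.0)
    q.add("dispatch", "m2", 8.0)
    q.add("replication", "m2", 6.0)
    order = []
    job = q.pop()
    while job is not None:
        order.append((job.message_id, job.kind))
        job = q.pop()
    return order
```

In `demo_schedule()`, the replication of m2 runs before its own dispatch because its deadline is earlier, which a fixed "dispatch first" policy would not allow.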
Necessary condition for a message loss
- A message may be lost only if both
  1. the publisher has deleted its copy
  2. a copy of the message has not been replicated to the Backup
Events between message creation and its delivery: [figure not recovered]
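The condition can be written as a small predicate. This sketch assumes the publisher retains a fixed number of its most recent messages, identified by sequence number; the function and parameter names are hypothetical.

```python
def publisher_still_holds(seq: int, latest_seq: int, retained: int) -> bool:
    """True if message `seq` is among the `retained` most recent messages."""
    return latest_seq - seq < retained


def may_be_lost(seq: int, latest_seq: int, retained: int, replicated: bool) -> bool:
    """Necessary condition for loss: publisher copy deleted AND no backup copy."""
    deleted_by_publisher = not publisher_still_holds(seq, latest_seq, retained)
    return deleted_by_publisher and not replicated
```

If either copy exists, the message can be recovered after a crash of the primary: from the publisher by retransmission, or from the Backup.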
Deadlines for dispatch and replication
- Deadline for dispatch:
  [Formula garbled in extraction; the recoverable terms carry the subscripts PB and BS.]
  [Figure: Publisher, Primary Broker, Subscriber.]
- Deadline for replication:
  [Formula garbled in extraction; recoverable terms: (N_i + L_i) T_i, x, and terms with the subscripts PB and BB.]
  [Figure: Publisher, Primary Broker (crash), Backup Broker, Subscriber.]
The deadline specifications help in configuring IIoT traffic/platform parameters.
Symbols:
  T_i: topic's sending period
  N_i: number of most-recent messages a publisher can retransmit
  (symbol lost in extraction): latency requirement
  L_i: loss-tolerance requirement
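The following is a sketch under stated assumptions, not a reconstruction of the slide's formulas, whose operators were lost. It assumes the PB, BS, and BB terms are latencies between Publisher and Primary Broker, Broker and Subscriber, and Broker and Backup Broker, that x is an additional time offset, and that all of them are subtracted from the available budget. Only the term (N_i + L_i) T_i comes directly from the slide; the parameter names are hypothetical.

```python
def dispatch_deadline(latency_requirement: float,
                      pb_latency: float,
                      bs_latency: float) -> float:
    """Assumed form: latency requirement minus the two transit latencies."""
    return latency_requirement - pb_latency - bs_latency


def replication_deadline(n_i: int, l_i: int, t_i: float,
                         pb_latency: float,
                         bb_latency: float,
                         x: float) -> float:
    """Assumed form: (N_i + L_i) * T_i minus transit latencies and x.

    Intuition: the publisher can still retransmit its N_i most recent
    messages and the application tolerates L_i consecutive losses, so a
    message need not reach the Backup before (N_i + L_i) periods elapse.
    """
    return (n_i + l_i) * t_i - pb_latency - bb_latency - x
```

With N_i = 2, L_i = 1, T_i = 10, latencies of 1 each, and x = 3, the assumed replication deadline is 25 time units after message creation. Raising N_i or L_i relaxes the replication deadline, which is how these specifications can guide traffic/platform configuration.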