Transparent Checkpoint of Closed Distributed Systems in Emulab
Anton Burtsev, Prashanth Radhakrishnan, Mike Hibler, and Jay Lepreau
University of Utah, School of Computing
Emulab
• Public testbed for network experimentation
• Complex networking experiments within minutes
Emulab — precise research tool
• Realism:
  – Real dedicated hardware
    • Machines and networks
  – Real operating systems
  – Freedom to configure any component of the software stack
  – Meaningful real-world results
• Control:
  – Closed system
    • Controlled external dependencies and side effects
  – Control interface
  – Repeatable, directed experimentation
Goal: more control over execution
• Stateful swap-out
  – Demand for physical resources exceeds capacity
  – Preemptive experiment scheduling
    • Long-running experiments
    • Large-scale experiments
  – No loss of experiment state
• Time-travel
  – Replay experiments
    • Deterministically or non-deterministically
  – Debugging and analysis aid
Challenge
• Both controls should preserve fidelity of experimentation
• Both rely on transparency of distributed checkpoint
Transparent checkpoint
• Traditionally, semantic transparency:
  – Checkpointed execution is one of the possible correct executions
• What if we want to preserve performance correctness?
  – Checkpointed execution is one of the correct executions closest to a non-checkpointed run
• Preserve measurable parameters of the system
  – CPU allocation
  – Elapsed time
  – Disk throughput
  – Network delay and bandwidth
Traditional view
• Local case
  – Transparency = smallest possible downtime
  – Several milliseconds [Remus]
  – Background work
  – Harms realism
• Distributed case
  – Lamport checkpoint
    • Provides consistency
  – Packet delays, timeouts, traffic bursts, replay buffer overflows
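For context, a compact sketch of the marker-based distributed snapshot ("Lamport checkpoint") the slide refers to; the stubs are hypothetical stand-ins for the real message plumbing, and the point is that any message crossing the cut must be logged for later replay, which is exactly the overhead criticized above.

```c
#include <stdbool.h>
#include <stdio.h>

#define NCHANNELS 8

struct node {
    bool state_recorded;
    bool marker_seen[NCHANNELS];   /* marker already received on this channel? */
    /* ...application state and per-channel message logs would live here... */
};

/* Stubs standing in for the real system's plumbing. */
static void record_local_state(struct node *n)            { (void)n; puts("snapshot node state"); }
static void send_marker_on_all_outgoing(struct node *n)   { (void)n; puts("send markers"); }
static void log_in_flight_message(struct node *n, int ch) { (void)n; printf("log message on channel %d\n", ch); }

/* Called for every incoming message on channel ch. */
void on_receive(struct node *n, int ch, bool is_marker)
{
    if (is_marker) {
        if (!n->state_recorded) {
            record_local_state(n);            /* first marker: snapshot now        */
            send_marker_on_all_outgoing(n);   /* propagate the consistent cut      */
            n->state_recorded = true;
        }
        n->marker_seen[ch] = true;            /* channel ch is now drained         */
    } else if (n->state_recorded && !n->marker_seen[ch]) {
        /* The message crossed the cut: it must be logged and replayed on
         * restore -- the in-flight state that causes the problems above.          */
        log_in_flight_message(n, ch);
    }
}

int main(void)
{
    struct node n = { 0 };

    on_receive(&n, 0, true);    /* marker arrives first on channel 0        */
    on_receive(&n, 1, false);   /* in-flight message on channel 1: logged   */
    on_receive(&n, 1, true);    /* marker drains channel 1                  */
    return 0;
}
```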
Main insight
• Conceal checkpoint from the system under test
  – But still stay on the real hardware as much as possible
• "Instantly" freeze the system
  – Time and execution
  – Ensure atomicity of checkpoint
    • Single non-divisible action
• Conceal checkpoint by time virtualization
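A minimal sketch of the time-virtualization idea, assuming a single per-node downtime offset (names are illustrative, not the actual Emulab/Xen code): guest-visible time is host time minus the total time spent frozen, so the system under test never observes the pause.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t total_downtime_ns;   /* sum of all checkpoint pauses      */
static uint64_t suspend_start_ns;    /* host time when the freeze began   */

static uint64_t real_time_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

void on_suspend(void) { suspend_start_ns = real_time_ns(); }

void on_resume(void)
{
    /* Everything that elapsed while frozen is hidden from the guest. */
    total_downtime_ns += real_time_ns() - suspend_start_ns;
}

/* Value fed to the guest's clock sources instead of raw host time. */
uint64_t guest_time_ns(void)
{
    return real_time_ns() - total_downtime_ns;
}

int main(void)
{
    on_suspend();
    /* ...checkpoint would run here while the guest is frozen... */
    on_resume();
    printf("guest time: %llu ns\n", (unsigned long long)guest_time_ns());
    return 0;
}
```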
Contributions
• Transparency of distributed checkpoint
• Local atomicity
  – Temporal firewall
• Execution control mechanisms for Emulab
  – Stateful swap-out
  – Time-travel
• Branching storage
Challenges and implementation
Checkpoint essentials
• State encapsulation
  – Suspend execution
  – Save running state of the system
• Virtualization layer
  – Suspends the system
  – Saves its state
  – Saves in-flight state
  – Disconnects/reconnects to the hardware
First challenge: atomicity
• Permanent encapsulation is harmful
  – Too slow
  – Some state is shared
    • Must be encapsulated upon checkpoint
• Externally to VM
  – Full memory virtualization
  – Needs declarative description of shared state
• Internally to VM
  – Breaks atomicity
Atomicity in the local case
• Temporal firewall
  – Selectively suspends execution and time
  – Provides atomicity inside the firewall
• Execution control in the Linux kernel
  – Kernel threads
  – Interrupts, exceptions, IRQs
• Conceals checkpoint
  – Time virtualization
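A rough Linux-kernel-style sketch of the temporal-firewall idea. It assumes the kernel's freezer API (freeze_processes()/freeze_kernel_threads()) as a stand-in for the execution-control mechanism, uses a hypothetical hidden_ns accumulator for the time side, and omits the interrupt/IRQ handling the slide mentions; the implementation in the paper differs.

```c
#include <linux/freezer.h>
#include <linux/ktime.h>

static u64 firewall_frozen_at;
static u64 hidden_ns;                  /* fed to guest-visible clocks elsewhere */

/* Enter the firewall: stop user tasks and kernel threads, note the time. */
static int temporal_firewall_enter(void)
{
    int err;

    firewall_frozen_at = ktime_get_ns();
    err = freeze_processes();          /* user-space tasks */
    if (err)
        return err;
    err = freeze_kernel_threads();     /* kernel threads */
    if (err)
        thaw_processes();
    return err;
}

/* Leave the firewall: hide the frozen interval, then resume execution. */
static void temporal_firewall_leave(void)
{
    hidden_ns += ktime_get_ns() - firewall_frozen_at;
    thaw_kernel_threads();
    thaw_processes();
}
```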
Second challenge: synchronization
• Lamport checkpoint
  – No synchronization
  – System is partially suspended
  – Preserves consistency
    • Logs in-flight packets
    • Once logged, it's impossible to remove
  – Unsuspended nodes experience timeouts
Synchronized checkpoint
• Synchronize clocks across the system
• Schedule checkpoint
• Checkpoint all nodes at once
• Almost no in-flight packets
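A minimal sketch of the "checkpoint all nodes at once" step, assuming clocks are already synchronized (e.g., via NTP) and that a coordinator has distributed an agreed absolute deadline; checkpoint_node() is a hypothetical stand-in for the local checkpoint entry point.

```c
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Stand-in for the node's local checkpoint entry point. */
static void checkpoint_node(void)
{
    printf("checkpointing at %ld\n", (long)time(NULL));
}

/* Block until an agreed absolute wall-clock deadline, then checkpoint.
 * With synchronized clocks, every node fires at (nearly) the same instant,
 * leaving almost no packets in flight between them. */
void checkpoint_at(time_t deadline_sec)
{
    struct timespec deadline = { .tv_sec = deadline_sec, .tv_nsec = 0 };

    while (clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &deadline, NULL) == EINTR)
        ;   /* retry if interrupted by a signal */

    checkpoint_node();
}

int main(void)
{
    checkpoint_at(time(NULL) + 2);   /* demo: checkpoint two seconds from now */
    return 0;
}
```

Using an absolute deadline rather than a relative sleep avoids accumulating scheduling drift between nodes.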
Bandwidth-delay product
• Large number of in-flight packets
  – e.g., a 1 Gbps link with 100 ms delay holds roughly 12.5 MB in flight
• Slow links dominate the log
• Faster links wait for the entire log to complete
• Per-path replay?
  – Unavailable at Layer 2
  – Requires an accurate replay engine on every node
Checkpoint the network core
• Leverage Emulab delay nodes
  – Emulab links are no-delay
  – Link emulation is done by delay nodes
• Avoid replay of in-flight packets
• Capture all in-flight packets in the core
  – Checkpoint delay nodes
Efficient branching storage
• To be practical, stateful swap-out has to be fast
• Mostly read-only FS
  – Shared across nodes and experiments
• Deltas accumulate across swap-outs
• Based on LVM
  – Many optimizations
Evaluation
Evaluation plan
• Transparency of the checkpoint
• Measurable metrics
  – Time virtualization
  – CPU allocation
  – Network parameters
Time virtualization
• Benchmark: do { usleep(10 ms); gettimeofday(); } while ()
  – Sleep + overhead = 20 ms per iteration
• Checkpoint every 5 sec (24 checkpoints)
• Timer accuracy is 28 μsec
• Checkpoint adds ±80 μsec error
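The micro-benchmark above can be reconstructed roughly as follows (a sketch, not the authors' exact harness): the gap between successive gettimeofday() samples should stay near 20 ms, and any checkpoint that is not fully concealed shows up as an outlier.

```c
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    struct timeval prev, now;

    gettimeofday(&prev, NULL);
    for (;;) {
        usleep(10 * 1000);                        /* nominal 10 ms sleep       */
        gettimeofday(&now, NULL);

        long gap_us = (now.tv_sec - prev.tv_sec) * 1000000L
                    + (now.tv_usec - prev.tv_usec);
        printf("%ld\n", gap_us);                  /* ~20 ms incl. overhead     */
        prev = now;
    }
}
```

The CPU-allocation test on the next slide has the same shape, with the sleep replaced by a fixed CPU-bound workload.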
CPU allocation
• Benchmark: do { stress_cpu(); gettimeofday(); } while ()
  – Stress + overhead = 236.6 ms per iteration
  – Normally within 9 ms of the average
• Checkpoint every 5 sec (29 checkpoints)
• Checkpoint adds 27 ms error
• For comparison: ls /root adds 7 ms overhead; xm list adds 130 ms
Network transparency: iperf
• Setup:
  – 1 Gbps, 0-delay network
  – iperf between two VMs
  – tcpdump inside one of the VMs
  – Averaging over 0.5 ms
• Checkpoint every 5 sec (4 checkpoints)
• Average inter-packet time: 18 μsec
• Checkpoint adds: 330 to 5801 μsec
• No TCP window change, no packet drops
• Throughput drop is due to background activity
Network transparency: BitTorrent
• Setup: 100 Mbps, low delay; 1 BitTorrent server + 3 clients; 3 GB file
• Checkpoint every 5 sec (20 checkpoints)
• Checkpoint preserves average throughput
Conclusions
• Transparent distributed checkpoint
  – Precise research tool
  – Fidelity of distributed system analysis
• Temporal firewall
  – General mechanism to change the system's perception of time
  – Conceals various external events
• Future work: time-travel
Thank you aburtsev@flux.utah.edu
Backup
Branching storage
• Copy-on-write as a redo log
• Linear addressing
• Free block elimination
• Read-before-write elimination
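A toy in-memory sketch of the copy-on-write redo-log idea (not the actual LVM-based implementation): reads fall through to the shared read-only base unless a block has been remapped, and each first write claims the next linear slot in the delta, so a whole-block write never needs to read the base first.

```c
#include <stdint.h>
#include <string.h>

#define NBLOCKS 1024
#define BLKSIZE 4096

struct cow_dev {
    uint8_t base[NBLOCKS][BLKSIZE];    /* shared, read-only image            */
    uint8_t delta[NBLOCKS][BLKSIZE];   /* per-experiment redo log            */
    int32_t map[NBLOCKS];              /* block -> delta slot, -1 = unmapped */
    int32_t next_slot;                 /* linear allocation in the log       */
};

void cow_init(struct cow_dev *d)
{
    memset(d->map, -1, sizeof d->map); /* all bytes 0xff == -1 per entry     */
    d->next_slot = 0;
}

const uint8_t *cow_read(const struct cow_dev *d, int blk)
{
    /* Unmodified blocks are served from the shared base image. */
    return d->map[blk] >= 0 ? d->delta[d->map[blk]] : d->base[blk];
}

void cow_write(struct cow_dev *d, int blk, const uint8_t *data)
{
    if (d->map[blk] < 0)
        d->map[blk] = d->next_slot++;  /* first write: append a new slot     */
    /* Whole-block write: no read-before-write of the base is needed.        */
    memcpy(d->delta[d->map[blk]], data, BLKSIZE);
}

int main(void)
{
    static struct cow_dev d;           /* static: too large for the stack    */
    uint8_t blk[BLKSIZE] = { 42 };

    cow_init(&d);
    cow_write(&d, 7, blk);
    return cow_read(&d, 7)[0] == 42 ? 0 : 1;
}
```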