Deterministic Process Groups in
Tom Bergan, Nicholas Hunt, Luis Ceze, Steven D. Gribble
University of Washington
A Nondeterministic Program
global x = 0
Thread 1: t := x; x := t + 1
Thread 2: t := x; x := t + 1
What is x? Three interleavings are shown, ending with x == 2, x == 2, and x == 1.
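The outcomes on this slide can be checked by brute force. The sketch below is illustrative Python, not code from the talk, and `final_values` is a hypothetical name. It enumerates every interleaving of the four operations that keeps each thread's program order and collects the final values of x.

```python
from itertools import combinations

def final_values():
    """Return the set of final values of x over all interleavings."""
    results = set()
    # Pick which 2 of the 4 time slots belong to thread 1; thread 2 gets the rest.
    for t1_slots in combinations(range(4), 2):
        x = 0
        t = {1: None, 2: None}   # each thread's private temporary t
        step = {1: 0, 2: 0}      # index of the next operation in each thread
        for slot in range(4):
            tid = 1 if slot in t1_slots else 2
            if step[tid] == 0:
                t[tid] = x       # t := x
            else:
                x = t[tid] + 1   # x := t + 1
            step[tid] += 1
        results.add(x)
    return results

print(sorted(final_values()))
```

The result contains both 1 and 2: x ends at 1 whenever both threads read x before either writes it.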
Nondeterministic IPC
Process 0: send(msg=A), send(msg=B)
Process 1: recv(..)
Process 2: recv(..)
Who gets msg A?
• Process 1 gets recv(msg=A) and Process 2 gets recv(msg=B), or
• Process 2 gets recv(msg=A) and Process 1 gets recv(msg=B)
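A minimal sketch of the same race, assuming both receivers pull from one shared FIFO channel (the slide does not specify the channel type, and `deliveries` is a hypothetical name). Whichever process calls recv first gets msg A.

```python
from collections import deque
from itertools import permutations

def deliveries():
    """Return every possible (Process 1 msg, Process 2 msg) outcome."""
    outcomes = set()
    # Try both orders in which the two receivers can reach recv().
    for order in permutations(["Process 1", "Process 2"]):
        channel = deque(["A", "B"])          # sent by Process 0, in order
        got = {}
        for receiver in order:
            got[receiver] = channel.popleft()  # recv(..)
        outcomes.add((got["Process 1"], got["Process 2"]))
    return outcomes

print(sorted(deliveries()))
```

Both assignments appear in the output, so the result depends only on which receiver is scheduled first.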
Nondeterminism In Real Systems
• shared-memory
‣ why nondeterministic: multiprocessor hardware is unpredictable
• disks
‣ why nondeterministic: drive latency is unpredictable
• network
‣ why nondeterministic: packets arrive from external sources
• IPC (e.g. pipes)
‣ why nondeterministic: multiprocessor hardware is unpredictable
• posix signals
‣ why nondeterministic: unpredictable scheduling, also can be triggered by users
• . . .
The Problem
• Nondeterminism makes programs . . .
➡ hard to test
‣ same input, different outputs
➡ hard to debug
‣ leads to heisenbugs
➡ hard to replicate for fault-tolerance
‣ replicas get out of sync
• Multiprocessors make this problem much worse!
Our Solution
• OS support for deterministic execution
➡ of arbitrary programs
➡ attack all sources of nondeterminism (not just shared-memory)
➡ even on multiprocessors
New OS abstraction: Deterministic Process Group (DPG)
[Figure: a deterministic box containing Process A and Process B, which run Thread 1, Thread 2, and Thread 3.]
Key Questions
1 What can be made deterministic?
- distinguish internal vs. external nondeterminism
2 What can we do about the remaining sources of nondeterminism?
Internal nondeterminism
• arises from scheduling artifacts (hw timing, etc)
• NOT fundamental: can be eliminated!
External nondeterminism
• arises from interactions with the external world (networks, users, etc)
• Fundamental: cannot be eliminated
Internal Determinism / External Nondeterminism
[Figure, built up over four slides: a deterministic box holds a programmer-defined process group (Process 1, Process 2, Process 3) whose members communicate through shared memory, pipes, and private files. Users, real time, and the network are outside the box. A later build adds Process 4 with a pipe and a shared file, marked "?". The final build places a shim program between the box and the external world.]
• Inside the box: internal determinism
• Outside the box: external nondeterminism
Shim program: precisely controls all external inputs
• value of input data
• time input data arrives
Internal Determinism / External Nondeterminism
[Figure: the deterministic box now holds user-space apps and the operating system (a virtual machine); users, real time, and the network remain outside.]
An entire virtual machine could go inside the deterministic box!
- too inflexible
- too costly
Deterministic Process Groups
[Figure: a deterministic box containing Process A and Process B (Thread 1, Thread 2, Thread 3); a shim program sits between the box and the network and user I/O.]
OS ensures:
• internal nondeterminism is eliminated (for shared-memory, pipes, signals, local files, ...)
• external nondeterminism is funneled through the shim program
Shim Program:
• user-space program that precisely controls all external nondeterministic inputs
Contributions
Conceptual:
- identify internal vs. external nondeterminism
- key: internal nondeterminism can be eliminated!
Abstraction:
- Deterministic Process Groups (DPGs)
- control external nondeterminism via a shim program
Implementation:
- dOS, a modified version of Linux
- supports arbitrary, unmodified binaries
Applications:
- deterministic parallel execution
- record/replay
- replicated execution
Outline
• Example Uses
➡ a parallel computation
➡ a webserver
• Deterministic Process Groups
➡ system interface
➡ conceptual model
• dOS: our Linux-Based Implementation
• Evaluation
A Parallel Computation
[Figure: a parallel program inside a deterministic box, reading local input files.]
This program executes deterministically!
• even on a multiprocessor
• supports parallel programs written in any language
‣ no heisenbugs!
‣ test input files, not interleavings
A Webserver
[Figure: a webserver (many threads/processes) inside a deterministic box; a shim sits between the box and the network, etc.]
Deterministic Record/Replay
• implement in shim program
• requires no webserver modification
Advantages
‣ significantly less to log (internal nondeterminism is eliminated)
‣ log sizes 1,000x smaller!
A Webserver
[Figure: two replicas, each a webserver inside a deterministic box behind its own shim; the shims connect to the network, etc.]
Fault-tolerant Replication
• implement replication protocol in shim programs (paxos, virtual synchrony, etc)
Advantage
‣ easy to replicate multithreaded servers (internal nondeterminism is eliminated)
A Webserver
Using DPGs to construct applications
• deterministic part (in a DPG): request processing
• nondeterministic part (in a shim): low-level network I/O (bundle into requests)
The webserver
• behaves deterministically w.r.t. requests rather than packets
Shim program defines the nondeterministic interface
Deterministic Process Groups: System Interface
• New system call creates a new DPG: sys_makedet()
‣ DPG expands to include all child processes
• Just like ordinary Linux processes
‣ same system calls, signals, and hw instruction set
‣ can be multithreaded
Deterministic Process Groups
Two questions:
• What are the semantics of internal determinism?
• How do shim programs work?
Deterministic Process Groups: Internal Determinism
• OS guarantees internal communication is scheduled deterministically
• Conceptually: executes as if serialized onto a logical timeline
‣ implementation is parallel
Internal Determinism
[Figure: operations of Thread 1 and Thread 2 placed on a logical timeline.]
t=1 wr x
t=2 rd x (always reads same value of x)
t=3 read(pipe) (blocking call begins)
t=4 rd y
t=5 rd z
t=6 read(pipe) returns (always blocks for 3 time steps, always returns same data)
t=7 wr y
Each DPG has a logical timeline
‣ instructions execute as if serialized onto the logical timeline
‣ internal events are deterministic
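A toy model of the logical timeline, offered as a sketch only. It assumes a fixed round-robin order over threads, which is a simplification chosen here and not a policy stated on the slide; `run_on_logical_timeline`, `thread1`, and `thread2` are hypothetical names. Each operation gets the next logical time step, so every run produces the same trace regardless of physical timing.

```python
def run_on_logical_timeline(threads):
    """Serialize thread operations round-robin; return (trace, memory)."""
    memory = {}
    trace = []
    live = list(threads.items())   # fixed thread order keeps the schedule deterministic
    t = 0
    while live:
        still_live = []
        for name, ops in live:
            try:
                op, var, value = next(ops)
            except StopIteration:
                continue           # this thread has finished
            t += 1                 # one operation per logical time step
            if op == "wr":
                memory[var] = value
                trace.append((t, name, "wr", var, value))
            else:
                trace.append((t, name, "rd", var, memory.get(var)))
            still_live.append((name, ops))
        live = still_live
    return trace, memory

def thread1():
    yield ("wr", "x", 1)
    yield ("rd", "y", None)

def thread2():
    yield ("rd", "x", None)
    yield ("wr", "y", 2)

trace, memory = run_on_logical_timeline({"T1": thread1(), "T2": thread2()})
for event in trace:
    print(event)
```

In this model the read of x at t=2 always observes the write from t=1, mirroring the "always reads same value of x" annotation.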
Internal Determinism
[Figure: the same logical timeline for Thread 1 and Thread 2: wr x (t=1), rd x (t=2), read(pipe) (t=3), rd y (t=4), rd z (t=5), read(pipe) returns (t=6), wr y (t=7). Arbitrary delays in physical time are possible between steps, including during the blocking call.]
Physical time is not deterministic
‣ deterministic results, but not deterministic performance
External Nondeterminism
[Figure, three builds: Thread 1 and Thread 2 on a logical timeline next to physical time. A blocking read(socket) waits on an external channel, and the logical time at which it returns depends on when the packet arrives in physical time.]
Build 1: wr x (t=1), rd x (t=2), read(socket) (t=3), rd y (t=4), rd z (t=5), read(socket) returns (t=6), wr y (t=7)
Build 2, earlier packet arrival: wr x (t=1), rd x (t=2), read(socket) (t=3), rd y (t=4), read(socket) returns (t=5), wr y (t=6), rd z (t=7)
Build 3, earliest packet arrival: wr x (t=1), rd x (t=2), read(socket) (t=3), read(socket) returns (t=4), wr y (t=5), rd y (t=6), rd z (t=7)
Two sources of nondeterminism:
• data returned by read()
• blocking time of read()
External Nondeterminism
[Figure: the same timeline with a shim program in place of the raw external channel: wr x (t=1), rd x (t=2), read(socket) (t=3), rd y (t=4), rd z (t=5), read(socket) returns (t=6), wr y (t=7). The packet arrival now passes through the shim.]
Two sources of nondeterminism:
• data returned by read()
‣ the what
• blocking time of read()
‣ the when
Shim Example: Read Syscall
[Figure, two builds: a logical timeline (t=2 through t=11) with columns for the DPG thread, the shim program, and the OS. The thread calls read() at t=2, the data "hello" passes through the shim, and return("hello") reaches the thread at t=10.]
Shim can either . . .
1 Monitor call (e.g., for record)
2 Control call (e.g., for replay)
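The two shim modes can be sketched as a pair of small classes. This is an illustrative model, not the dOS shim interface: `RecordingShim`, `ReplayingShim`, and `on_read` are hypothetical names, and the logical return time is passed in by the caller as an assumption. Monitoring logs both the what (data) and the when (logical time); controlling serves both from the log.

```python
class RecordingShim:
    """Monitor mode: let the real call run, then log its data and logical time."""

    def __init__(self):
        self.log = []

    def on_read(self, do_real_read, return_time):
        data = do_real_read()                 # the real external input
        self.log.append((return_time, data))  # record the when and the what
        return return_time, data


class ReplayingShim:
    """Control mode: never touch the real channel; serve inputs from a log."""

    def __init__(self, log):
        self._entries = iter(list(log))

    def on_read(self, do_real_read=None, return_time=None):
        # Both arguments are ignored: the log dictates data and return time.
        return next(self._entries)


recorder = RecordingShim()
recorded = recorder.on_read(lambda: "hello", return_time=10)

replayer = ReplayingShim(recorder.log)
replayed = replayer.on_read()
print(recorded, replayed)
```

Because internal nondeterminism is eliminated inside the DPG, a log of external inputs like this is all a replay needs in this model.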
Shim Example: Replication
Key idea:
• protocol delivers (time,msg) pairs to replicas
• ensure replicas see same input at same logical time
We have implemented this idea (see paper)
[Figure: a replication protocol connects three shims; each shim fronts a multithreaded server running in its own DPG (Replica 1, Replica 2, Replica 3).]
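A minimal sketch of the key idea, under assumptions not stated on the slide: each replica buffers the (time, msg) pairs it receives and applies them in logical-time order, and the replication protocol itself (paxos, virtual synchrony, etc) is not modeled. `Replica` and `replicate` are hypothetical names.

```python
class Replica:
    """Toy replica that applies inputs strictly in logical-time order."""

    def __init__(self):
        self.pending = []
        self.state = []

    def deliver(self, time, msg):
        self.pending.append((time, msg))   # may arrive in any physical order

    def run(self):
        for time, msg in sorted(self.pending):
            self.state.append((time, msg)) # input is seen at its logical time
        self.pending.clear()
        return self.state


def replicate(pairs, arrival_orders):
    """Deliver the same pairs to each replica in a different arrival order."""
    replicas = [Replica() for _ in arrival_orders]
    for replica, order in zip(replicas, arrival_orders):
        for index in order:
            replica.deliver(*pairs[index])
    return [replica.run() for replica in replicas]


pairs = [(3, "GET /a"), (6, "GET /b")]
states = replicate(pairs, [[0, 1], [1, 0], [0, 1]])
print(states)
```

All three replicas end in the same state even though the second one received its inputs in the opposite physical order.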