Towards Scalable Parallelization of Functional System Simulation with SimuBoost
GI Fachgruppentreffen Betriebssysteme (BS) 2016
Marc Rittinghaus, Frank Bellosa
Operating Systems Group, Department of Computer Science
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu
[Figure: SimuBoost overview with a virtualization node, simulation nodes, an analysis node, a management node, and central storage. Labels: Phase 1 to Phase 3; checkpoints/logs, non-deterministic events, and trace data (SimuTrace) for Interval 0 to Interval 7; custom analysis on Core 0 and Core 1.]
Motivation
• Study properties of redundant memory contents [Miller13]
    Origin? Lifetime? Sharing possible?
    Analyze memory contents after each modification
    But: Analysis should not affect workload
• Analyze memory access patterns on system interfaces [Jurczyk13, Wilhelm15]
    Detect vulnerabilities in Windows 8 and Xen (CVE-2015-8550)
    Trace individual memory reads and writes
We want detailed runtime information
Marc Rittinghaus - SimuBoost, Operating Systems Group, Department of Computer Science
Motivation
• Operating system research
• Debugging
• Application, OS, and hardware interaction
• Malware and vulnerabilities
Functional Full System Simulation
But: It is slow

Type            System   Slowdown
Virtualization  KVM      ~ 1x
Simulation      QEMU     ~ 100x
Simulation      Simics   ~ 1000x
Average slowdowns for: kernel build, SPECint_base06, LAMMPS

• Not practical for long-running workloads
• Loss of interactivity (users and remote hosts)
Basic Acceleration Approach
(1) Split simulation into time intervals
(2) Simulate intervals simultaneously
Does not trade accuracy for speed
Applicable to single-CPU simulations
Scales with run-time of workload
• How to bootstrap the simulation of i[1..n]?
• Still no interactivity
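The two steps above can be sketched as follows. This is a minimal illustration, not the SimuBoost implementation: `split_into_intervals`, `simulate_interval`, and `simulate_parallel` are hypothetical names, and the "simulation" is a stand-in that only reports the span it covers. The sketch deliberately ignores the open question from the slide, namely where the initial state of each interval comes from.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_intervals(total_ms, interval_ms):
    """Step (1): return (start, end) pairs that cover [0, total_ms)."""
    return [(start, min(start + interval_ms, total_ms))
            for start in range(0, total_ms, interval_ms)]


def simulate_interval(bounds):
    """Hypothetical stand-in for a simulator run over one interval."""
    start, end = bounds
    return {"start_ms": start, "end_ms": end, "simulated_ms": end - start}


def simulate_parallel(total_ms, interval_ms, workers):
    """Step (2): hand all intervals to a pool of workers at once."""
    intervals = split_into_intervals(total_ms, interval_ms)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves interval order in the returned results
        return list(pool.map(simulate_interval, intervals))


if __name__ == "__main__":
    for result in simulate_parallel(total_ms=10000, interval_ms=4000, workers=3):
        print(result)
```

Because every interval is an independent job, the number of jobs grows with the run-time of the workload, which is why the approach scales with workload length.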
SimuBoost
[Diagram: a virtualization node (vNode) runs intervals i[0] ... i[k] ... i[n] over time t; Node 0 simulates i[0], Node k simulates i[k], Node n simulates i[n]]
• Leverage fast virtualization
• Checkpoints at interval boundaries bootstrap simulations
• Hardware virtualization provides full interactivity
• Speed difference drives parallelization
Challenges: Preserve interactivity and speedup
(1) Fast Checkpoint Creation: <100ms [RbMiller68]
(2) Fast Checkpoint Distribution
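To see why the speed difference drives parallelization, consider an idealized timing model. The assumptions are mine, not the slides': the checkpoint for interval j is available at time j * L, simulating one interval takes slowdown * L, checkpointing and distribution cost nothing, and the workload length is a multiple of L. `completion_time` is a hypothetical helper name.

```python
import heapq
import math


def completion_time(workload_s, interval_s, slowdown, nodes):
    """Idealized finish time of the parallel simulation (assumptions above)."""
    n_intervals = math.ceil(workload_s / interval_s)
    free_at = [0.0] * nodes            # time at which each node becomes idle
    heapq.heapify(free_at)
    finish = 0.0
    for j in range(n_intervals):
        ready = j * interval_s         # checkpoint j exists from this time on
        start = max(ready, heapq.heappop(free_at))
        end = start + slowdown * interval_s
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    # 100 s workload, 2 s intervals (assumed), ~100x slowdown as quoted for QEMU
    print(completion_time(100, 2, 100, nodes=1))
    print(completion_time(100, 2, 100, nodes=100))
```

In this model a single node needs the full serial simulation time, while with at least as many nodes as the slowdown factor every interval starts as soon as its checkpoint exists, so the run finishes one simulated interval after the virtualization reaches the last interval.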
Stop-And-Copy
[Diagram: at the boundary between intervals i[k] and i[k+1], virtualization is suspended while VM RAM is copied into a checkpoint]
Stop-And-Copy
[Diagram: virtualization suspended between i[k] and i[k+1]; the suspension is the downtime]
[Plot: pts_build_linux_kernel, downtime (ms) over memory size (MiB); annotation: 30% speedup loss]

Memory Size (MiB)   Downtime (ms)
256                 101
512                 193
1024                301
2048                825
4096                1555
8192                2667
16384               4321

• Downtime depends on VM size
• Not suited for interactive use
• Limited parallelization
We need to drastically speed up checkpointing
Incremental Stop-And-Copy
[Diagram: VM RAM during virtualization of intervals i[k] and i[k+1]]
Observation: Only some data modified per interval
• pts_build_linux_kernel: 22000 pages/s (85 MiB/s)
• spec_jbb: 53000 pages/s (200 MiB/s)
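The page rates and bandwidths above are related by the page size. A small check, assuming 4 KiB pages (the slide does not state the page size, and both helper names are hypothetical):

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages


def dirty_rate_mib_per_s(pages_per_s, page_size=PAGE_SIZE_BYTES):
    """Convert a page modification rate into MiB/s."""
    return pages_per_s * page_size / (1024 * 1024)


def dirty_mib_per_interval(pages_per_s, interval_ms, page_size=PAGE_SIZE_BYTES):
    """Rough upper bound on data to save per interval, treating the rate as constant."""
    return dirty_rate_mib_per_s(pages_per_s, page_size) * interval_ms / 1000


if __name__ == "__main__":
    for name, rate in [("pts_build_linux_kernel", 22000), ("spec_jbb", 53000)]:
        print(name, round(dirty_rate_mib_per_s(rate), 1), "MiB/s")
```

Under this assumption the results land close to the rounded figures on the slide. The per-interval estimate is only a rough bound, since a page written several times within one interval needs to be saved once.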
Incremental Stop-And-Copy
[Diagram: virtualization suspended between i[k] and i[k+1] while modified parts of VM RAM are copied into the checkpoint]
Idea: Save only modified data
• Track dirty pages via page protections
• Use previous checkpoints to get unmodified data
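The idea can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the authors' implementation: guest RAM is a Python list, the dirty set is maintained in `write()` where a real hypervisor would use write-protection faults, and all names (`GuestMemory`, `take_checkpoint`, `restore`) are hypothetical.

```python
class GuestMemory:
    """Toy guest RAM that records which pages changed since the last checkpoint."""

    def __init__(self, num_pages):
        self.pages = [b"\x00"] * num_pages
        # The first checkpoint has no predecessor, so every page counts as dirty.
        self.dirty = set(range(num_pages))

    def write(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)  # stands in for a write-protection fault


def take_checkpoint(mem, history):
    """Save only dirty pages; the VM is assumed to be suspended meanwhile."""
    delta = {p: mem.pages[p] for p in mem.dirty}
    history.append(delta)
    mem.dirty.clear()
    return len(delta)


def restore(history, index, num_pages):
    """Rebuild full RAM for checkpoint `index`, newest delta first."""
    pages = [None] * num_pages
    for delta in reversed(history[: index + 1]):
        for page_no, data in delta.items():
            if pages[page_no] is None:  # keep the most recent version only
                pages[page_no] = data
    return pages


if __name__ == "__main__":
    mem, history = GuestMemory(4), []
    print(take_checkpoint(mem, history))  # full first checkpoint
    mem.write(2, b"A")
    print(take_checkpoint(mem, history))  # only the modified page
    print(restore(history, 1, 4))
```

The work done while the VM is suspended is proportional to the number of pages modified in the interval rather than to the VM size, at the cost of consulting earlier checkpoints when a full memory image is needed.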
Incremental Stop-And-Copy
[Diagram: virtualization suspended between i[k] and i[k+1]; the shorter suspension is marked as saved downtime]
[Plot: pts_build_linux_kernel (interval = 16000 ms), downtime (ms, 0 to 200) over memory size (MiB, 256 to 16384)]
• Reduced downtime
• Less dependent on VM size
Incremental Stop-And-Copy
[Plot: downtime (ms, 0 to 400) over interval length (ms, 100 to 16000) for idle, pts_build_linux_kernel, spec_jbb, pts_apache, pts_postmark, stress]
• Reduced downtime
• Less dependent on VM size
But: Downtime depends on
• Interval length
• Workload
Incremental Stop-And-Copy
[Plot: spec_jbb (interval = 500 ms), downtime (ms, 0 to 300) over checkpoint index; mean = 77; annotations: "25% above 100ms", "60% above 100ms"]
• Reduced downtime
• Less dependent on VM size
But: Downtime depends on
• Interval length
• Workload
But: Downtime strongly fluctuates
We need to further speed up checkpointing
Incremental Copy-On-Write
[Diagram: virtualization continues across i[k] and i[k+1]; pages in VM RAM are write-protected, and a page fault copies the affected page into the checkpoint]
Idea: Save modified pages asynchronously
• Use write-protection to prevent modification
• Copy and release protection on page fault
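A single-threaded toy model of the copy-on-write scheme follows. It is a sketch under stated assumptions, not the authors' implementation: the write fault is emulated by a check in `write()`, the asynchronous copier is driven by explicit calls to `background_copy_step()`, and the class and method names are hypothetical.

```python
class CowCheckpointer:
    """Toy copy-on-write checkpointing over a list of pages."""

    def __init__(self, num_pages):
        self.pages = [b"\x00"] * num_pages
        self.dirty = set()      # pages written during the current interval
        self.protected = set()  # pages still owed to the pending checkpoint
        self.pending = {}       # checkpoint that is being filled

    def begin_checkpoint(self):
        """Short pause at the interval boundary: mark pages, copy nothing."""
        self.protected = set(self.dirty)
        self.dirty = set()
        self.pending = {}

    def write(self, page_no, data):
        if page_no in self.protected:  # emulated write-protection fault
            self._save(page_no)        # copy the old content first
        self.pages[page_no] = data
        self.dirty.add(page_no)

    def background_copy_step(self):
        """Copy one still-protected page; return False once none are left."""
        if not self.protected:
            return False
        self._save(next(iter(self.protected)))
        return True

    def _save(self, page_no):
        self.pending[page_no] = self.pages[page_no]
        self.protected.discard(page_no)  # release the protection


if __name__ == "__main__":
    vm = CowCheckpointer(4)
    vm.write(0, b"A")
    vm.write(1, b"B")
    vm.begin_checkpoint()        # guest keeps running after this returns
    vm.write(0, b"X")            # fault: old content b"A" is saved first
    while vm.background_copy_step():
        pass
    print(vm.pending)            # state as of the interval boundary
```

The guest only waits while pages are marked and, later, for individual faulting pages. The bulk of the copying happens alongside execution, so the pause at the interval boundary no longer includes the copy itself.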
Incremental Copy-On-Write
[Plot: downtime (ms, 0 to 100) over interval length (ms) for idle, pts_build_linux_kernel, spec_jbb, pts_apache, pts_postmark, stress]
• Drastically reduced downtime
• Page faults do not impede interactivity
Less dependent on
• Interval length
• Workload
Incremental Copy-On-Write
[Plot: spec_jbb (interval = 500 ms), downtime (ms, 0 to 100) over checkpoint index; mean = 7]
• Drastically reduced downtime
• Page faults do not impede interactivity
Less dependent on
• Interval length
• Workload
Almost constant downtime
We can do checkpointing fast enough
Checkpoint Distribution – The Naïve Way
Nodes request full checkpoints from central server
[Diagram: Node 1 to Node 4 each fetch a checkpoint (1 to 4) from the virtualization node, which is marked as the bottleneck]
But: Central server becomes bottleneck
• Limits parallelization and speedup