PLOVER: Fast, Multi-core Scalable Virtual Machine Fault-tolerance
Cheng Wang, Xusheng Chen, Weiwei Jia, Boxuan Li, Haoran Qiu, Shixiong Zhao, and Heming Cui The University of Hong Kong
Virtual machines are pervasive in datacenters
[Diagram: a physical machine running multiple guest VMs on top of a VMM]
[Diagram: the physical machine (guest VMs on top of a VMM) hit by a hardware failure]
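Remus [NSDI’08]
[Diagram: a client talks to the primary (guest VM running the service, VMM with an output buffer, memory pages); the primary replicates to the backup (guest VM running the service, VMM, memory pages); the backup returns an ACK]
Synchronize primary/backup every 25ms:
1. The primary transmits all changed state (e.g., memory pages) to the backup.
2. The backup sends an ACK when the complete state has been received.
3. The output held in the primary's output buffer is released.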
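The synchronization cycle on this slide can be sketched in a few lines. This is a minimal, hypothetical model and not Remus's actual code: the names `Primary`, `Backup`, `checkpoint`, and `receive` are illustrative assumptions, and memory pages are modeled as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Hypothetical backup: holds a copy of the primary's memory pages."""
    pages: dict = field(default_factory=dict)

    def receive(self, changed: dict) -> bool:
        # Apply the complete checkpoint, then acknowledge it.
        self.pages.update(changed)
        return True  # ACK


@dataclass
class Primary:
    """Hypothetical primary: tracks dirty pages and buffers network output."""
    pages: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)
    output_buffer: list = field(default_factory=list)

    def write(self, page_no: int, data: bytes) -> None:
        self.pages[page_no] = data
        self.dirty.add(page_no)

    def send(self, packet: str) -> None:
        # Output is held back until the state that produced it is replicated.
        self.output_buffer.append(packet)

    def checkpoint(self, backup: Backup) -> list:
        """Run one synchronization epoch; return the packets released to clients."""
        changed = {n: self.pages[n] for n in self.dirty}
        if not backup.receive(changed):
            return []  # no ACK yet: keep buffering
        self.dirty.clear()
        released, self.output_buffer = self.output_buffer, []
        return released


primary, backup = Primary(), Backup()
primary.write(1, b"x=5")
primary.send("reply: OK")
released = primary.checkpoint(backup)
```

In this model a client never sees a reply whose underlying state exists only on the primary, which is why the output buffer sits on the critical path of every request.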
Redis latency with varied # of clients (4 vCPUs per VM)
[Chart: latency (us) vs. number of concurrent clients (16, 48, 80); series: unreplicated, Remus (25ms synchronization interval)]

# of concurrent clients    Page transfer size (MB)
16                         20.9
48                         68.4
80                         110.5
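[Diagram: primary and backup each run a guest VM with a KVS on a VMM (the primary has an output buffer) and hold a copy of a memory page. One node is labelled "Outdated primary" and the other "New primary"; client1 and client2 are connected to different nodes, and the two nodes hold different values for the same key (x=5 and x=7).]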
[Diagram: client1 and client2 send requests to a service replicated on one primary and two backups; each replica keeps a consensus log]
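The idea behind the per-replica consensus log can be illustrated with a toy example: if every replica replays the same requests in the same order on a deterministic service, all replicas end in the same state. The function `apply_log` and the SET-only key-value store below are assumptions made for illustration; they are not taken from the slides.

```python
def apply_log(log):
    """Replay an ordered log of (key, value) SET commands on an empty store."""
    store = {}
    for key, value in log:
        store[key] = value
    return store


# Requests from both clients are placed in one agreed order.
consensus_log = [("x", 5), ("x", 7)]

primary_state = apply_log(consensus_log)
backup_states = [apply_log(consensus_log) for _ in range(2)]
```

Because the order is agreed before execution, no replica can end up with x=5 while another holds x=7.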
Primary/backup approach
Pros:
Cons: … large amount of state
State machine replication
Pros: … execution states
Cons:
PLOVER needs to copy and transfer only a small portion of the memory
… memory pages?
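[Diagram: PLOVER architecture with three nodes: primary, backup, and witness. A client sends requests to the primary. Every node runs a consensus module. The primary and the backup each also keep a log, run the service in a guest VM on a VMM with a VM Sync component and an output buffer, and hold memory pages. The nodes are connected over RDMA (<10us).]
Labelled components: RDMA-based input consensus; RDMA-based VM synchronization.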
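The input-consensus step among primary, backup, and witness can be approximated by a majority-acknowledgement rule. The sketch below is a simplified in-memory model, not PLOVER's RDMA protocol: the transport is abstracted away, and `Replica`, `propose`, and `majority` are hypothetical names.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


class Replica:
    """Hypothetical replica that appends client requests to its log."""

    def __init__(self, name: str, alive: bool = True):
        self.name = name
        self.alive = alive
        self.log = []

    def append(self, request: str) -> bool:
        if not self.alive:
            return False  # a failed replica never acknowledges
        self.log.append(request)
        return True


def propose(leader: Replica, followers: list, request: str) -> bool:
    """Return True if the request is stored on a majority of replicas."""
    leader.log.append(request)
    acks = 1 + sum(f.append(request) for f in followers)
    return acks >= majority(1 + len(followers))


primary = Replica("primary")
backup = Replica("backup")
witness = Replica("witness", alive=False)
committed = propose(primary, [backup, witness], "SET x 5")

lonely = Replica("primary")
not_committed = propose(
    lonely,
    [Replica("backup", alive=False), Replica("witness", alive=False)],
    "SET x 7",
)
```

With three nodes, one failure is tolerated: the request commits with the witness down, but not when the primary is the only node left.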
[Diagram: primary and backup, each with a guest VM running the service, a VM Sync component, a VMM, and memory pages]
Issue of not choosing synchronization timing carefully
Synchronize when processing is almost finished!
… processing
… nearly zero
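A timing policy in the spirit of "synchronize when processing is almost finished" might look like the sketch below. Everything here is an assumption made for illustration: the slides do not say how idleness is detected, so the in-flight request counter, the `idle_threshold` parameter, and the `max_interval_ms` fallback are all hypothetical.

```python
def should_sync(in_flight: int, dirty_pages: int, ms_since_last_sync: float,
                idle_threshold: int = 0, max_interval_ms: float = 100.0) -> bool:
    """Decide whether to start a VM synchronization now (hypothetical policy)."""
    if dirty_pages == 0:
        return False  # nothing changed since the last synchronization
    if in_flight <= idle_threshold:
        return True   # the service has (almost) finished processing
    # Fallback so a permanently busy service is still synchronized eventually.
    return ms_since_last_sync >= max_interval_ms


busy = should_sync(in_flight=12, dirty_pages=500, ms_since_last_sync=20.0)
idle = should_sync(in_flight=0, dirty_pages=500, ms_since_last_sync=20.0)
overdue = should_sync(in_flight=12, dirty_pages=500, ms_since_last_sync=150.0)
clean = should_sync(in_flight=0, dirty_pages=0, ms_since_last_sync=150.0)
```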
… synchronization if network outputs from two VMs are the same
Service                        Program type                Benchmark     Workload
Redis                          Key value store             self          50% SET, 50% GET
SSDB                           Key value store             self          50% SET, 50% GET
MediaTomb                      Multimedia storage server   ApacheBench   Transcoding videos
pgSQL                          Database server             pgbench       TPC-B
DjCMS (Nginx, Python, MySQL)   Content management system   ApacheBench   Web requests on a dashboard page
Tomcat                         HTTP web server             ApacheBench   Web requests on a shopping store page
lighttpd                       HTTP web server             ApacheBench   Watermark image with PHP
Node.js                        HTTP web server             ApacheBench   Web requests on a messenger bot
lighttpd + PHP

PLOVER:
Interval   Dirty pages   Same   Transfer
86ms       33.9K         97%    2.8ms

Remus:
Sync interval                Dirty pages   Transfer
25ms (Remus-Xen default)     33.3K         53.5ms
100ms (Remus-KVM default)    33.9K         55.7ms

Analysis: PLOVER needs to transfer only 33.9K * 3% = 1.0K pages, but Remus, STR, and COLO need to transfer all or most of the 33K dirty pages. E.g., since most network outputs from two VMs differ, COLO has to do synchronizations for almost every output packet.
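The arithmetic in the analysis, and one way to find the pages that actually differ, can be sketched as follows. This is an illustration under stated assumptions: the slides do not say how identical pages are detected, so comparing per-page digests is an assumed technique, SHA-256 is chosen only for convenience, and `page_digest` and `divergent_pages` are hypothetical helpers.

```python
import hashlib


def page_digest(page: bytes) -> str:
    """Digest of one memory page (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(page).hexdigest()


def divergent_pages(primary_dirty: dict, backup_digests: dict) -> list:
    """Page numbers whose content on the primary differs from the backup's."""
    return sorted(
        n for n, data in primary_dirty.items()
        if backup_digests.get(n) != page_digest(data)
    )


# Toy example: 100 dirty pages, 3 of which differ between the two VMs.
backup_pages = {n: b"same" for n in range(100)}
primary_dirty = dict(backup_pages)
for n in (3, 42, 77):
    primary_dirty[n] = b"differs"

backup_digests = {n: page_digest(d) for n, d in backup_pages.items()}
to_send = divergent_pages(primary_dirty, backup_digests)

# Numbers from the lighttpd + PHP slide: 33.9K dirty pages, 97% identical.
estimated_transfer = round(33_900 * (1 - 0.97))
```

Only the digests and the divergent pages cross the network in this model, which is where the reduction from roughly 33.9K pages to about 1.0K comes from.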
PLOVER is slower than COLO on pgSQL
most network outputs from two VMs are the same
… Remus, 1.0X higher than COLO, 1.4X higher than STR … average