Disk-Durable Protocols
Update: the leader forwards the client's update (A=2) to the followers, and every node fsyncs it to disk.
Committed: fsync completed on a majority.
Recovery: just read from local disk. If a node ack'd anyone, the data is on its disk, so this is safe. The recovered node is ready immediately, even if it is lagging (A=1 only).
Diagram: Follower, Leader, Follower, each holding A=1 and A=2 after fsync.
Safe and available.
But poor performance due to fsync: 50x on HDDs, 2.5x on SSDs.
8 OSDI ‘18

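The disk-durable rule above (fsync before ack, commit on a majority, recover by reading the local disk) can be sketched as follows. This is a minimal single-process illustration, not code from any real replication system: the per-node log files and the helper names `durable_append`, `replicate`, and `recover` are assumptions made for the example.

```python
import os
import tempfile


def majority(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def durable_append(log_path, record):
    """Append one record and fsync before acknowledging.

    Returning True is the ack: by then the record is on disk.
    """
    with open(log_path, "ab") as log:
        log.write(record + b"\n")
        log.flush()             # push Python's buffer to the OS
        os.fsync(log.fileno())  # push the OS cache to the device
    return True


def replicate(log_paths, record):
    """Write to every node's log; committed once a majority has fsynced."""
    acks = sum(durable_append(path, record) for path in log_paths)
    return acks >= majority(len(log_paths))


def recover(log_path):
    """Recovery just reads the local disk; the node is ready immediately."""
    if not os.path.exists(log_path):
        return []
    with open(log_path, "rb") as log:
        return log.read().splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        logs = [os.path.join(tmp, f"node{i}.log") for i in range(3)]
        print(replicate(logs, b"A=1"), replicate(logs, b"A=2"))
        print(recover(logs[0]))
```

The `os.fsync` call on every update is where the performance cost quoted on the slide comes from.
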
Memory-Durable Protocols (Oblivious Recovery)
Update: every node buffers the client's update (A=2) in memory only.
Committed: buffered on a majority.
Recovery: oblivious. A recovering node doesn't realize it lost data on failure; it comes back with empty memory and is ready immediately.
Diagram: Follower, Leader, Follower, each holding A=1 and A=2 in memory.
e.g., ZooKeeper with forceSync = false (practitioners do use this config!)
Performant.
But can lead to data loss.

Data Loss Example in Oblivious Approach
1. A=1 is committed on a majority; two nodes are slow or failed and never receive it.
2. One of the nodes holding A=1 crashes and recovers. It loses its data but, being oblivious, immediately joins.
3. The lagging nodes along with the recovered node form a majority. This majority does not know of the previously committed update, so the committed update is lost.

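The data-loss sequence above can be replayed with a small in-memory simulation. This is a sketch under stated assumptions: a five-node cluster (which is what the node counts on the slide imply), and a hypothetical `Node` class and helper names that do not come from any real system.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {}  # volatile state, lost on crash
        self.up = True

    def crash(self):
        self.up = False
        self.memory = {}  # nothing was written to disk

    def recover_oblivious(self):
        # Oblivious recovery: rejoin at once, without noticing the loss.
        self.up = True


def replicate(nodes, reachable, key, value):
    """Buffer the update on reachable nodes; committed on a majority."""
    acks = 0
    for node in nodes:
        if node.up and node.name in reachable:
            node.memory[key] = value
            acks += 1
    return acks >= majority(len(nodes))


def quorum_knows(quorum, key):
    """True if at least one member of the quorum still holds the key."""
    return any(key in node.memory for node in quorum)


def run_scenario():
    nodes = [Node(f"n{i}") for i in range(1, 6)]
    n3, n4, n5 = nodes[2], nodes[3], nodes[4]

    # Step 1: n4 and n5 are slow, so only n1..n3 buffer A=1.
    committed = replicate(nodes, {"n1", "n2", "n3"}, "A", 1)

    # Step 2: n3 crashes, loses its memory, and immediately rejoins.
    n3.crash()
    n3.recover_oblivious()

    # Step 3: the recovered node plus the two lagging nodes form a majority.
    new_majority = [n3, n4, n5]
    assert len(new_majority) >= majority(len(nodes))
    return committed, quorum_knows(new_majority, "A")


if __name__ == "__main__":
    print(run_scenario())
```

`run_scenario` returns whether the update was committed and whether the new majority still knows about it; a committed update that the new majority cannot see is exactly the loss described on the slide.
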
Memory-Durable Protocols (Loss-Aware Recovery)
Update: every node buffers the client's update (A=2) in memory only.
Committed: buffered on a majority.
Recovery: loss-aware. A recovering node realizes it lost data, enters the recovering state, and waits for responses from a majority; only after collecting majority responses is it ready.
Diagram: Follower, Leader, Follower, each holding A=1 and A=2 in memory.
e.g., Viewstamped replication
Avoids loss (unlike oblivious) but can lead to unavailability.

Unavailability Example in Loss-Aware Approach
1. A=1 is committed; two nodes have crashed.
2. Another node crashes and recovers. It cannot collect majority responses although a majority is up: unavailable.
3. The failed nodes recover, but the recovering nodes stay in the recovering state: unavailable even after all nodes recover.

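The stuck-recovery sequence above can be traced with a few lines of state tracking. This is a sketch, not an implementation of Viewstamped replication: it assumes a five-node cluster and assumes that only nodes in the normal state answer a recovery request, which is the reading under which the slide's outcome follows. All names are hypothetical.

```python
NORMAL, RECOVERING, DOWN = "normal", "recovering", "down"


def majority(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def try_finish_recovery(states, node):
    """Move a recovering node to normal only if a majority responds.

    Assumption: only nodes in the normal state respond.
    """
    responders = sum(1 for state in states.values() if state == NORMAL)
    if states[node] == RECOVERING and responders >= majority(len(states)):
        states[node] = NORMAL
    return states[node]


def run_scenario():
    states = {f"n{i}": NORMAL for i in range(1, 6)}

    # Step 1: two nodes have crashed.
    states["n4"] = DOWN
    states["n5"] = DOWN

    # Step 2: n3 crashes and recovers; it realizes the loss and waits.
    states["n3"] = RECOVERING
    nodes_up = sum(1 for state in states.values() if state != DOWN)
    after_first_try = try_finish_recovery(states, "n3")

    # Step 3: the failed nodes come back, also in the recovering state.
    states["n4"] = RECOVERING
    states["n5"] = RECOVERING
    for _ in range(3):  # retrying does not help
        for node in ("n3", "n4", "n5"):
            try_finish_recovery(states, node)
    return nodes_up, after_first_try, states


if __name__ == "__main__":
    print(run_scenario())
```

With only two normal nodes left, no recovering node can ever gather three responses, so the cluster stays unavailable even though every node is running.
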
Outline
Introduction
Distributed updates and crash recovery
Situation-aware updates and crash recovery
    SAUCR insights, guarantees, and overview
    situation-aware updates
    situation-aware crash recovery
Results
Summary and conclusion

SAUCR Intuition and Insight
Existing protocols are static in nature: they do not adapt to failures.
    Memory-durable: always buffer, even with many failures; poor reliability.
    Disk-durable: always persist, even when no failures; poor performance.
Insight: reacting to failures and adapting to the situation can achieve both reliability and performance.
    Common case, when many or all nodes are up (no or few failures): memory-durable, buffer in memory.
    With failures, when only the minimum is up: disk-durable, flush to disk.

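The situation-aware choice above can be written as a small decision function. This is a sketch of the idea only: the slide says "when many or all up" and "when only minimum up" without giving an exact rule, so the threshold used here (disk-durable at exactly a bare majority, memory-durable above it) is an assumption, and `choose_mode` is a hypothetical name.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def choose_mode(cluster_size, nodes_up):
    """Pick a durability mode from the current number of live nodes.

    Assumed rule: buffer in memory while more than a bare majority is up,
    flush to disk once only the minimum (a bare majority) remains.
    """
    bare_majority = majority(cluster_size)
    if nodes_up < bare_majority:
        return "unavailable"      # no majority, cannot commit at all
    if nodes_up == bare_majority:
        return "disk-durable"     # one more failure would lose the majority
    return "memory-durable"       # common case: many or all nodes up


if __name__ == "__main__":
    for up in range(5, 1, -1):
        print(up, choose_mode(5, up))
```

A static protocol corresponds to returning the same mode for every value of `nodes_up`; the point of the slide is that the answer should depend on the situation.
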
Guarantees Depend upon Simultaneity of Failures
With non-simultaneous failures, a gap exists between failures; SAUCR can react and ensures durability.
    independent: the likelihood of many nodes failing together is negligible
    correlated: many nodes fail together, but not necessarily simultaneously; in most cases the failures are non-simultaneous
With simultaneous correlated failures, there is no gap; SAUCR cannot react and becomes unavailable.
We conjecture that these are extremely rare: a gap exists between failures.
    correlated failures are a few seconds apart [Ford et al., OSDI ‘10]
    analysis reveals a gap of 50 ms or more almost always

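The distinction between simultaneous and non-simultaneous failures can be illustrated with a toy check on failure timestamps. This is a deliberately simplified model, not the analysis from the talk: it treats a set of failures as non-simultaneous when every pair of consecutive failures is separated by at least the time the system needs to react. The `reaction_time_ms` parameter and both function names are hypothetical; the slide supplies only the observed gaps (a few seconds, or 50 ms or more).

```python
def min_gap_ms(failure_times_ms):
    """Smallest gap between consecutive failures, in milliseconds."""
    times = sorted(failure_times_ms)
    return min(later - earlier for earlier, later in zip(times, times[1:]))


def can_react(failure_times_ms, reaction_time_ms):
    """True if every gap between failures leaves enough time to react.

    Assumption: reacting (for example, flushing to disk) takes at most
    reaction_time_ms, and a single failure always leaves time to react.
    """
    if len(failure_times_ms) < 2:
        return True
    return min_gap_ms(failure_times_ms) >= reaction_time_ms


if __name__ == "__main__":
    # Correlated failures a few seconds apart versus nearly simultaneous ones.
    print(can_react([0, 2000, 4100], reaction_time_ms=50))
    print(can_react([0, 10], reaction_time_ms=50))
```
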