Things break. riak bends. Justin Sheehy justin@basho.com
Perfection is Unattainable A system cannot perform as well during a storm of component failure as it can on a sunny day.
Know How You Degrade Plan it and understand it before your users do. You might prevent whole system failure if you’re lucky and good, but what happens during partial failure? 3
Know How You Degrade Plan it and understand it before your users do. You think you know which parts will break. You are wrong. 5
Harvest and Yield. Harvest: a fraction, data available / complete data. Yield: a probability, queries completed / queries requested. In tension with each other: (harvest * yield) ~ constant. Goal: failures cause a known linear reduction to one of these. 6
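A small worked example may make the two ratios concrete. This is only a sketch: the four-partition cluster, the query counts, and the function names are assumptions made up for illustration, not figures from the talk.

```python
def harvest(data_available: int, complete_data: int) -> float:
    """Fraction of the complete data reflected in an answer."""
    return data_available / complete_data


def yield_(queries_completed: int, queries_requested: int) -> float:
    """Fraction of requested queries that were completed."""
    return queries_completed / queries_requested


# Hypothetical cluster: 4 equal partitions, 1 of them is down.

# Workload A (search-like): every query scans all partitions and returns
# whatever the 3 survivors have. Every query completes, with partial data.
search = (harvest(3, 4), yield_(1000, 1000))

# Workload B (key lookup): each query touches exactly one partition, and
# we assume 1 in 4 of the 1000 queries lands on the dead one and fails.
lookup = (harvest(4, 4), yield_(750, 1000))

print("search-like: harvest=%.2f yield=%.2f" % search)
print("key lookup:  harvest=%.2f yield=%.2f" % lookup)
```

In both hypothetical workloads, losing a quarter of the machines costs a quarter of exactly one of the two measures, which is the "known linear reduction" the slide asks for.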
Harvest and Yield. Traditional design demands 100% harvest, but the success of modern applications is often measured in yield. Plan ahead, know when you care! 7
Perfection is Unattainable Failures will happen. A system cannot perform as well during a storm of component failure as it can on a sunny day.
Resilience is Attainable Assume that failures will happen. Designing whole systems and components with individual failures in mind is a plan for predictable success. 10
Resilience is Attainable Layered, multi-scale resilience is key! Designing whole systems and components with individual failures in mind is a plan for predictable success. 11
Component Failure: reboot of live database Worst case: whole DB corrupted! Typical mitigation: write-ahead logging for repair Drawbacks: logging adds I/O, repair can be slow 13
Component Failure: reboot of live database Alternative: append-only main storage ("log-structured" databases). Example: Bitcask 14
Bitcask simple append-only file format 15
Bitcask simple append-only file format as a component of something bigger 16
Component Failure: reboot during record write What about a half-written write? Two problems: detection, minimization. Detection: a minimum-length check and a CRC check per record. 18
Component Failure: reboot during record write What about a half-written write? Two problems: detection, minimization. Minimization: invalidate only the failed record at the end, not the whole file. 19
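The detection and minimization steps can be sketched roughly as follows. The record layout here (a 4-byte CRC32, a 4-byte length, then the payload) is a simplified assumption for illustration and is not Bitcask's actual on-disk format; `encode_record` and `scan` are hypothetical names.

```python
import struct
import zlib

# Hypothetical header: CRC32 of the payload, then the payload length.
HEADER = struct.Struct(">II")


def encode_record(payload: bytes) -> bytes:
    """Frame one payload as header + payload."""
    return HEADER.pack(zlib.crc32(payload), len(payload)) + payload


def scan(log: bytes):
    """Return (valid payloads, byte length of the valid prefix).

    Scanning stops at the first record that fails a check. In an
    append-only file interrupted by a reboot, that record is the tail.
    """
    records = []
    offset = 0
    while offset < len(log):
        # Minimum-length check: is there room for a whole header?
        if len(log) - offset < HEADER.size:
            break
        crc, length = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        end = start + length
        # Minimum-length check: is the whole payload present?
        if end > len(log):
            break
        payload = log[start:end]
        # CRC check: does the payload match what was framed?
        if zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = end
    return records, offset


# Simulate a reboot halfway through the third write.
good = encode_record(b"a") + encode_record(b"bb")
torn = good + encode_record(b"ccc")[:-2]
records, valid_length = scan(torn)
print(records, valid_length)
```

Only the torn tail is discarded; a caller could truncate the file to `valid_length` and keep appending, so every earlier record survives.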
Zoom Out: Bitcask is one part of Riak 20
Component Failure: internal subsystem crash Bugs can lurk anywhere. Unpredictability, eek. Typical mitigation: complex exception-management 21
Component Failure: internal subsystem crash Stronger mitigation: supervision trees and "let it crash" Added bonus: simpler and clearer code 22
Zoom Out: Virtual Nodes Many storage instances per server. If one fails, whole system is okay. Also good for operational sanity when adding or removing hosts. ... 24
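One way to picture why many small storage instances help when hosts are added is the sketch below. It is a minimal illustration under stated assumptions, not Riak's actual ring or claim algorithm: the ring size of 64, the round-robin initial layout, and the names `partition_for`, `initial_owners`, and `add_host` are all hypothetical.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 64  # hypothetical ring size, fixed for the cluster's lifetime


def partition_for(key: bytes) -> int:
    """Map a key to a partition; this never changes when hosts come or go."""
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest, "big") % NUM_PARTITIONS


def initial_owners(hosts):
    """Deal partitions out to hosts round-robin."""
    return {p: hosts[p % len(hosts)] for p in range(NUM_PARTITIONS)}


def add_host(owners, new_host):
    """Give the new host a fair share, taking only from hosts above the target."""
    owners = dict(owners)
    counts = Counter(owners.values())
    target = NUM_PARTITIONS // (len(counts) + 1)
    moved = 0
    for p in sorted(owners):
        if moved == target:
            break
        if counts[owners[p]] > target:
            counts[owners[p]] -= 1
            owners[p] = new_host
            moved += 1
    return owners


before = initial_owners(["a", "b", "c", "d"])
after = add_host(before, "e")
moved = [p for p in before if before[p] != after[p]]
print(len(moved), "of", NUM_PARTITIONS, "partitions changed hosts")
```

Because keys map to partitions rather than to hosts, adding a host in this sketch relocates a handful of whole partitions and leaves every other key where it was.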
Zoom Out: Riak is a Distributed System 25
Component Failure: reboot during record write What about a half-written write? Two problems: detection, minimization. Invalidate only the failed record at the end, not the whole file. Isn't this still a busted record? 26
Mitigation: quorum reads {ok, Value} {ok, Value} {error, not_found} 27
Mitigation: quorum reads {ok, Value} client {ok, Value} {ok, Value} {error, not_found} 28
Mitigation: quorum reads {ok, Value} client {ok, Value} {ok, Value} helps with nearly any local error: {error, bad_crc} 29
Mitigation: read-repair {ok, Value} client {ok, Value} {ok, Value} {error, bad_crc} {ok, Value} 30
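The quorum read and read-repair pictured on these slides can be sketched as below. This is a toy model under loud assumptions: the `Replica` class, the `quorum_get` function, and the `r_not_met` error are invented for illustration, and agreement is decided by a simple vote among replies, whereas a real system needs version information to decide which value is current.

```python
from collections import Counter


class Replica:
    """Hypothetical in-memory replica that can simulate a local error."""

    def __init__(self, data=None, fail=None):
        self.data = dict(data or {})
        self.fail = fail  # e.g. "bad_crc"

    def get(self, key):
        if self.fail:
            return ("error", self.fail)
        if key not in self.data:
            return ("error", "not_found")
        return ("ok", self.data[key])

    def put(self, key, value):
        self.data[key] = value
        self.fail = None  # assumption: rewriting the record clears the error
        return "ok"


def quorum_get(replicas, key, r):
    """Succeed if at least r replicas agree; then repair the others."""
    replies = [rep.get(key) for rep in replicas]
    votes = Counter(value for status, value in replies if status == "ok")
    if not votes:
        return ("error", "r_not_met")
    value, count = votes.most_common(1)[0]
    if count < r:
        return ("error", "r_not_met")
    # Read-repair: push the agreed value to any replica that answered
    # with an error or with a different value.
    for rep, reply in zip(replicas, replies):
        if reply != ("ok", value):
            rep.put(key, value)
    return ("ok", value)


replicas = [
    Replica({"k": "v"}),
    Replica({"k": "v"}),
    Replica(fail="bad_crc"),
]
result = quorum_get(replicas, "k", r=2)
print(result, replicas[2].get("k"))
```

The client sees `("ok", "v")` despite the bad record, and the damaged replica has been rewritten by the time the next read arrives.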
Component Failure: server down! From a distributed system's point of view, a whole server can be seen as "a component." Computers fail all the time. How can the overall system continue to perform? 31
Mitigation: quorum reads {ok, Value} client {ok, Value} {ok, Value} X 32
What about writes? PUT Value client ok ok X 33
Mitigation: sloppy quorum PUT Value client ok ok ok X ok 34
Mitigation: sloppy quorum {ok, Value} client {ok, Value} {ok, Value} X {ok, Value} sloppy quorums work for reads too! 35
sloppy quorums are sloppy {ok, Value} {ok, Value} ? {ok, Value} 36
Mitigation: hinted handoff also a fix for inconsistent view of membership! 38
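A sloppy-quorum write followed by hinted handoff might look roughly like this. It is a minimal sketch, not Riak's implementation: `Node`, `sloppy_put`, `handoff`, and the shape of a hint are hypothetical, and reads from stand-in nodes are left out to keep it short.

```python
class Node:
    """Hypothetical storage node."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended owner name, key, value)


def sloppy_put(preference_list, fallbacks, key, value, w):
    """Write to each owner, or to a live stand-in when an owner is down."""
    acks = 0
    spare = (n for n in fallbacks if n.up)
    for owner in preference_list:
        if owner.up:
            owner.data[key] = value
            acks += 1
        else:
            stand_in = next(spare, None)
            if stand_in is not None:
                # The hint records who the write was really meant for.
                stand_in.hints.append((owner.name, key, value))
                acks += 1
    return "ok" if acks >= w else "error"


def handoff(holder, nodes_by_name):
    """Deliver held writes to owners that are back; keep the rest."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.data[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining


a, b, c, d = (Node(n) for n in "abcd")
c.up = False  # one of the three owners is down
status = sloppy_put([a, b, c], [d], "k", "v", w=3)
hints_while_down = list(d.hints)

c.up = True  # the owner returns
handoff(d, {n.name: n for n in (a, b, c, d)})
print(status, hints_while_down, c.data, d.hints)
```

The write still collects three acknowledgements while `c` is down, and once `c` returns the stand-in hands the value over and forgets it.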
Zoom out: multiple clusters 39
Component Failure: datacenter-level outage X 40
Mitigation: masterless replication X still live! 41
Mitigation: masterless replication (will catch up later) 42