4/13/2008

Amitanand S. Aiyer, Lorenzo Alvisi, Allen Clement, Mike Dahlin, Jean-Philippe Martin, Carl Porth
Award Paper in the 20th ACM Symposium on Operating Systems Principles (SOSP 2005).
Presented to: Dr. Ayman Abdel-Hamid
By: Shaimaa Lazem

Outline
- Overview
- Byzantine-Altruistic-Rational (BAR) model
- System Architecture
- Principles of Operations
- Level 1: BART State Machine
- Level 2: Partitioning Work
- Level 3: The Application
- BAR-B

4/14/2008

Overview
- Cooperative services in Multiple Administrative Domains (MAD):
  - Nodes collaborate to provide a service that benefits each node, but no central authority controls the nodes’ actions (e.g., Internet routing, cooperative backup).
- Problem:
  - Nodes may depart from protocols: failed or broken nodes, security compromises, selfish nodes.
  - It is not sufficient to verify experimentally that a protocol tolerates a collection of attacks identified by the protocol’s creator.
  - It is necessary to design protocols that provably meet their goals, no matter what strategies nodes may concoct.

Contributions
- A formal model for reasoning about systems in which nodes deviate from the protocol (the BAR model).
- A general architecture and a set of design principles that together make it possible to build and reason about BAR-tolerant systems.
- An implementation of BAR-B, a cooperative backup system within the BAR model.

Byzantine-Altruistic-Rational (BAR) model
- Three classes of nodes:
  - Rational nodes participate in the system to gain some net benefit and can depart from a proposed program in order to increase their net benefit.
  - Byzantine nodes can depart arbitrarily from a proposed program, whether it benefits them or not.
  - Altruistic nodes execute a proposed program even if the rational choice is to deviate.

Byzantine-Altruistic-Rational (BAR) model (cont.)
- Two classes of protocols:
  - Incentive-Compatible Byzantine Fault Tolerant (IC-BFT): a protocol is IC-BFT if it guarantees the specified set of safety and liveness properties and if it is in the best interest of all rational nodes to follow the protocol exactly.
  - Byzantine Altruistic Rational Tolerant (BART): a protocol is BART if it guarantees the specified set of safety and liveness properties in the presence of all rational deviations from the protocol.
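
The three node classes and the IC-BFT condition can be illustrated with a toy decision rule. This is only a sketch: the names `NodeClass`, `choose_action`, and `is_incentive_compatible`, the flat action-to-benefit table, and the rule that ties go to the proposed action are all assumptions made for illustration, not definitions from the paper.

```python
from enum import Enum


class NodeClass(Enum):
    BYZANTINE = "byzantine"
    ALTRUISTIC = "altruistic"
    RATIONAL = "rational"


def choose_action(node_class, proposed, net_benefit):
    """Pick an action given a map from action name to net benefit.

    `proposed` is the action the protocol prescribes. Byzantine behavior is
    arbitrary, so it cannot be predicted and None is returned.
    """
    if node_class is NodeClass.ALTRUISTIC:
        # Altruistic nodes follow the program even when deviating pays more.
        return proposed
    if node_class is NodeClass.RATIONAL:
        best = max(net_benefit.values())
        # Assumption: deviate only for a strict gain; ties go to the protocol.
        if net_benefit[proposed] == best:
            return proposed
        return max(net_benefit, key=net_benefit.get)
    return None


def is_incentive_compatible(proposed, net_benefit):
    """True if no alternative gives a rational node strictly more benefit."""
    return net_benefit[proposed] == max(net_benefit.values())
```

With the table `{"store": 1, "drop": 3}` and `"store"` as the proposed action, a rational node in this toy model picks `"drop"`, so the step is not incentive compatible; raising the benefit of `"store"` to match or exceed `"drop"` fixes that.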

Replicated State Machine (RSM)
- A technique for supporting service replication.
- The service is written as a deterministic state machine and replicated on several machines.
- An RSM substrate coordinates the behavior of the separate state machines so that their executions proceed consistently, even if some of the computers fail.
- A key task of the RSM substrate is to establish a task ordering.

Replicated State Machine (RSM) (cont.)
Figure: A typical RSM-based client-server computer system [2].
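
The reason the ordering matters can be shown with a minimal sketch. The `Counter` state machine and the `run_replicas` helper are hypothetical; the sketch assumes the ordered log has already been agreed on and does not model the agreement protocol or failures.

```python
class Counter:
    """A deterministic state machine: same commands in same order, same state."""

    def __init__(self):
        self.value = 0

    def apply(self, command):
        op, amount = command
        if op == "add":
            self.value += amount
        elif op == "mul":
            self.value *= amount
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.value


def run_replicas(ordered_log, replica_count):
    """Apply one agreed-upon log to every replica and return the final states."""
    replicas = [Counter() for _ in range(replica_count)]
    for command in ordered_log:
        for replica in replicas:
            replica.apply(command)
    return [replica.value for replica in replicas]
```

Every replica that applies `[("add", 2), ("mul", 5), ("add", 1)]` ends in the same state, while a replica that saw the same commands in a different order could diverge, which is why the substrate must fix a single order.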

Replicated State Machine (RSM) (cont.)
Figure: RSM timing diagram [2].

System Model Assumptions
- The BART protocols do not depend on the existence of altruistic nodes in the system.
- A trusted authority controls which nodes may enter the system.
- Each member has a unique identity corresponding to a cryptographic public key.
- Nodes have an incentive to stay as synchronized as possible through a “penance” mechanism.

System Model Assumptions (cont.)
- Rational nodes:
  - Receive a long-term benefit from participating in the protocol.
  - Are conservative when computing the impact of Byzantine nodes on their utility.
  - Colluding nodes are classified as Byzantine.
- Byzantine nodes:
  - Exhibit arbitrary behavior: they may crash, lose data, alter data, and send incorrect protocol messages.
  - At most (n-2)/3 of the nodes in the system are Byzantine.
- Every non-Byzantine node is rational.

System Architecture
- Level 1 provides key abstractions for reliable distributed services.
  - The RSM gives the abstraction of a correct (reliable and altruistic) node.
- Level 2 builds a system in which work can be assigned to specific nodes instead of being executed by all replicas in the RSM.
- Level 3 implements the desired service using the levels underneath.
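
The bound on Byzantine nodes in the assumptions above can be turned into a quick calculation. This sketch assumes the bound is read with integer (floor) division, which the slides do not state explicitly; the function names are made up for illustration.

```python
def max_byzantine(n):
    """Largest number of Byzantine nodes tolerated among n nodes.

    Assumption: (n - 2) / 3 is rounded down to a whole number of nodes.
    """
    return max(0, (n - 2) // 3)


def min_nodes(f):
    """Smallest system size that satisfies f <= (n - 2) / 3."""
    return 3 * f + 2
```

Under this reading, tolerating one Byzantine node takes at least five nodes, and a system of eight nodes tolerates two.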

Principles of Operations
- Accountability: if nodes are accountable for their behavior, then rational peers have an incentive to behave correctly.
- Strong identities and restricted membership are part of the solution.
- How should a system detect and react to incorrect behavior?
- Aggressively Byzantine nodes are easy to address:
  - Example: a node signs a promise to store a file with a particular cryptographic hash and then responds to a request to read the file with a signed message that contains the wrong data. The two signed messages together prove the misbehavior.

Principles of Operations (cont.)
- Passive-aggressive nodes are harder to address:
  - A node may decline to send a message that it should send. The receiver is in a position to accuse the node of wrongdoing, but it becomes a case of “he said/she said”.
  - A node may exploit non-determinism to provide incomplete information that interferes with the protocol’s operation but is difficult to conclusively prove wrong.
  - Example: a node is expected to transmit a signed copy of a request but, for liveness, it is permitted to transmit a signed timeout message instead.
  - Self-interested nodes may choose to send the timeout message rather than transmit the request.

Principles of Operations - Addressing the challenges
- Level 1 (primitives)
  - Nodes unilaterally deny service to nodes that fail to send expected messages. This low-level, local tit-for-tat technique provides incentives for cooperation without requiring a third party to judge which node is to blame.
  - The protocol balances costs so that when nodes have a choice between two messages, there is no incentive to choose the “wrong” one.
  - Nodes can unilaterally impose extra work (called penance) when they judge that another node’s response is not timely.

Principles of Operations - Addressing the challenges (cont.)
- Level 2 (work assignment)
  - If a node fails to reply to a request issued via the underlying state machine, then a quorum of nodes in the state machine generates a proof of misbehavior (POM) against the node.
- Level 3 (application)
  - Applications make use of reliable work assignment: each request is bound to a reply or a timeout.
  - The application protocol must be designed so that requests and responses include sufficient information for any node to judge the validity of a request/response pair.
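
The Level 3 requirement that any node can judge a request/response pair can be sketched with the stored-file example: a storage promise carries a hash, so a later read response can be checked by anyone who holds the promise. The dictionary layout, the function names, and the choice of SHA-256 are assumptions for illustration; in a real system both messages would also be signed, and signature checking is omitted here.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of the data (hash function chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


def make_store_promise(node_id: str, data: bytes) -> dict:
    """Record that node_id promised to store data with this hash."""
    return {"node": node_id, "hash": digest(data)}


def response_is_valid(promise: dict, response: dict) -> bool:
    """Any node can run this check; no trust in either party is needed."""
    return (
        response["node"] == promise["node"]
        and digest(response["data"]) == promise["hash"]
    )
```

A promise paired with a response that fails this check plays the role of evidence against the storing node, because the mismatch is visible to every node that sees both messages.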

Level 1: BART State Machine
- Terminating Reliable Broadcast (TRB)
  - Each TRB instance is organized in a series of turns.
  - The sender for instance i is the first leader for instance i.
  - If nodes receive the messages on time, they accept the value; otherwise, nodes send a “set-turn” message.
  - Nodes other than the sender are selected round-robin for the leader role.
  - Each participant thus has a periodic opportunity to propose values to the state machine (this ensures a long-term benefit).
  - To limit non-determinism, an instance can terminate in only two ways: with the sender’s value or with a default value.

Level 1: BART State Machine (cont.)
[Figure]
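
The leader rotation within a TRB instance can be sketched as follows. The function name `leader_for_turn`, the zero-based turn numbering, and the choice to rotate through the other nodes in list order starting just after the sender are assumptions; the slides only say that the sender leads first and the remaining nodes take the role round-robin.

```python
def leader_for_turn(nodes, sender, turn):
    """Return the leader for a given turn of one TRB instance.

    Turn 0 is led by the instance's sender. Later turns rotate through the
    other nodes (assumed order: list order, starting after the sender).
    """
    if sender not in nodes:
        raise ValueError("sender must be a member of nodes")
    if turn < 0:
        raise ValueError("turn must be non-negative")
    if turn == 0:
        return sender
    start = nodes.index(sender)
    others = nodes[start + 1:] + nodes[:start]
    if not others:
        return sender  # Single-node system: nobody else can lead.
    return others[(turn - 1) % len(others)]
```

For nodes `["a", "b", "c", "d"]` with sender `"b"`, the leaders for turns 0 through 4 are `b`, `c`, `d`, `a`, `c` under these assumptions.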

Level 1: BART State Machine (cont.)
- Message Queue
  - The message queue used by x contains entries for the messages that x intends to send to y, interleaved with “bubbles”.
  - A bubble must be filled with an appropriate message from y before x can proceed to send the messages behind it in the queue.
  - This gives rational nodes an incentive to send the messages expected by the protocol.
- Balanced Messages
  - Whenever a node has the opportunity to choose the message to send next, the intended message is never more expensive than the alternatives.

Level 1: BART State Machine (cont.)
- Penance
  - Each node maintains an untimely vector that tracks its perception of other nodes’ timeliness.
  - A node is considered untimely if any timeout message electing a new leader arrives significantly earlier or later than expected according to the receiver’s local clock.
  - When a node x becomes the sender, it includes its untimely vector with the value it proposes.
  - After agreeing on the proposal, all nodes except the sender expect a penance message from each node indicted in the untimely vector.
  - Because of the message queues, the untimely nodes must send the penance message to all non-sender nodes.
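
The message queue with bubbles described above can be sketched as a small data structure. The class `BubbleQueue` and its methods are hypothetical, and the sketch assumes a bubble can only be filled once it reaches the head of the queue and that the expected message is matched by simple equality.

```python
from collections import deque


class BubbleQueue:
    """Outgoing queue from x to y; a bubble blocks until y's message arrives."""

    def __init__(self):
        self._entries = deque()

    def enqueue_message(self, payload):
        self._entries.append(("msg", payload))

    def enqueue_bubble(self, expected):
        self._entries.append(("bubble", expected))

    def receive(self, message):
        """Fill the bubble at the head of the queue if message matches it."""
        if self._entries and self._entries[0] == ("bubble", message):
            self._entries.popleft()
            return True
        return False

    def drain(self):
        """Return the messages x may send now, stopping at the first bubble."""
        sendable = []
        while self._entries and self._entries[0][0] == "msg":
            sendable.append(self._entries.popleft()[1])
        return sendable
```

If x queues `m1`, a bubble expecting `ack1`, and then `m2`, it can send `m1` immediately but holds `m2` until y supplies `ack1`. A node y that withholds the expected message therefore stops receiving service from x, which is the local tit-for-tat incentive.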