distributed systems principles and paradigms

Distributed Systems Principles and Paradigms, Maarten van Steen, VU



  1. Distributed Systems Principles and Paradigms
     Maarten van Steen
     VU Amsterdam, Dept. Computer Science, Room R4.20, steen@cs.vu.nl
     Chapter 13: Distributed Coordination-Based Systems
     Version: December 2, 2009

  2. Contents
     Chapter 01: Introduction
     Chapter 02: Architectures
     Chapter 03: Processes
     Chapter 04: Communication
     Chapter 05: Naming
     Chapter 06: Synchronization
     Chapter 07: Consistency & Replication
     Chapter 08: Fault Tolerance
     Chapter 09: Security
     Chapter 10: Distributed Object-Based Systems
     Chapter 11: Distributed File Systems
     Chapter 12: Distributed Web-Based Systems
     Chapter 13: Distributed Coordination-Based Systems

  3. Coordination models (Coordination-Based Systems, 13.1 Coordination Models)
     Essence: We are trying to separate computation from coordination; coordination deals with all aspects of communication between processes, as well as their cooperation.
     Couplings: Make a distinction between
     - Temporal coupling: Are cooperating/communicating processes alive at the same time?
     - Referential coupling: Do cooperating/communicating processes know each other explicitly?

  4. Coordination models (13.1 Coordination Models)

                               Temporally coupled    Temporally decoupled
     Referentially coupled     Direct                Mailbox
     Referentially decoupled   Meeting oriented      Generative communication

  5. Architectures: Overview (13.2 Architectures)
     Essence: A data item is described by means of attributes. When made available, it is said to be published. A process interested in reading an item must provide a subscription: a description of the items it wants. Middleware must match published items and subscriptions.
     [Figure: a publisher and two subscribers attached to publish/subscribe middleware. Labels: Data item, Subscription, Notification, Read/Delivery, Match]
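
The matching step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Middleware`, `matches`, `subscribe`, and `publish` are hypothetical, a data item is modeled as a dictionary of attributes, and a subscription is modeled as a dictionary of per-attribute predicates.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical model: a data item is a set of attribute/value pairs,
# and a subscription maps attribute names to predicates on their values.
Item = dict[str, Any]
Subscription = dict[str, Callable[[Any], bool]]


def matches(item: Item, subscription: Subscription) -> bool:
    """True if every attribute named in the subscription exists and passes its predicate."""
    return all(attr in item and pred(item[attr]) for attr, pred in subscription.items())


@dataclass
class Middleware:
    # Each entry pairs a subscription with the callback that receives notifications.
    subscribers: list = field(default_factory=list)

    def subscribe(self, subscription: Subscription, callback: Callable[[Item], None]) -> None:
        self.subscribers.append((subscription, callback))

    def publish(self, item: Item) -> int:
        """Match the item against all subscriptions; return how many subscribers were notified."""
        delivered = 0
        for subscription, callback in self.subscribers:
            if matches(item, subscription):
                callback(item)
                delivered += 1
        return delivered


mw = Middleware()
received = []
mw.subscribe({"type": lambda v: v == "quote", "price": lambda v: v > 100}, received.append)
mw.publish({"type": "quote", "symbol": "ABC", "price": 120})  # matches
mw.publish({"type": "quote", "symbol": "ABC", "price": 80})   # does not match
```

Publisher and subscriber never name each other here; only the middleware sees both sides, which is the referential decoupling the slide describes.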

  6. Example: Jini/JavaSpaces (13.2 Architectures)
     Coordination model: Temporal and referential uncoupling by means of JavaSpaces, a tuple-based storage system.
     - A tuple is a typed set of references to objects
     - Tuples are stored in a JavaSpace in serialized, that is, marshaled form
     - To read a tuple, construct a template, with some fields left open
     - Match a template against a tuple through a field-by-field comparison

  7. Example: Jini/JavaSpaces (13.2 Architectures)
     [Figure: three processes and a JavaSpace holding tuple instances. Write A: insert a copy of A. Write B: insert a copy of B. Read T: look for a tuple that matches T, return C (and optionally remove it)]
     - Write: A copy of a tuple (tuple instance) is stored in a JavaSpace
     - Read: A template is compared to tuple instances; the first match returns a tuple instance
     - Take: A template is compared to tuple instances; the first match returns a tuple instance and removes the matching instance from the JavaSpace
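
The write, read, and take operations can be illustrated with a toy in-memory tuple space. This is a sketch, not the JavaSpaces API: the class `TupleSpace` is hypothetical, tuples are plain Python tuples rather than typed object references, `None` is assumed to mark a template field left open, and operations return `None` instead of blocking when nothing matches.

```python
WILDCARD = None  # assumption: None marks a template field that is left open


class TupleSpace:
    """Toy, single-process stand-in for a tuple space (hypothetical, non-blocking)."""

    def __init__(self):
        self._instances = []

    def write(self, tup):
        # Store a copy of the tuple (a tuple instance), not the caller's object.
        self._instances.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        # Field-by-field comparison; open fields match any value.
        return len(template) == len(tup) and all(
            t is WILDCARD or t == v for t, v in zip(template, tup)
        )

    def read(self, template):
        # Return the first matching instance and leave it in the space.
        for tup in self._instances:
            if self._matches(template, tup):
                return tup
        return None

    def take(self, template):
        # Return the first matching instance and remove it from the space.
        for i, tup in enumerate(self._instances):
            if self._matches(template, tup):
                return self._instances.pop(i)
        return None


space = TupleSpace()
space.write(("A", 1))
space.write(("B", 2))
first = space.read(("B", WILDCARD))   # instance stays in the space
second = space.take(("B", WILDCARD))  # instance is removed
third = space.read(("B", WILDCARD))   # nothing left to match
```

After `take`, a later `read` with the same template finds nothing, which is the only difference between the two operations.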

  8. Example: TIB/Rendezvous (13.2 Architectures)
     Coordination model: Use of subject-based addressing ⇒ publish-subscribe system.
     - Receiving a message on subject X is possible only if the receiver has subscribed to X
     - Publishing a message on subject X ⇒ message is sent to all (currently running) subscribers to X
     [Figure: five hosts on a network, each running an RV library and an RV daemon. Publishers on subjects A and B; subscribers to A and to B. A message on A is multicast to the subscribers of A; a message on B is multicast to the subscribers of B]
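
Subject-based addressing can be sketched as a map from subject names to the inboxes of currently subscribed receivers. The class `SubjectBus` and its methods are hypothetical and do not reflect the TIB/Rendezvous API; multicast over a network is replaced by appending to in-memory lists.

```python
from collections import defaultdict


class SubjectBus:
    """Toy subject-based bus: a message reaches only the current subscribers of its subject."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # subject -> list of inboxes

    def subscribe(self, subject, inbox):
        self._subscribers[subject].append(inbox)

    def unsubscribe(self, subject, inbox):
        self._subscribers[subject].remove(inbox)

    def publish(self, subject, message):
        # Nothing is stored: a process that subscribes later does not see this message.
        receivers = self._subscribers.get(subject, [])
        for inbox in receivers:
            inbox.append(message)
        return len(receivers)


bus = SubjectBus()
early, late = [], []
bus.subscribe("A", early)
bus.publish("A", "first")   # delivered to `early` only
bus.subscribe("A", late)
bus.publish("A", "second")  # delivered to both
bus.publish("B", "other")   # no subscribers, message is dropped
```

Because nothing is buffered, the sketch is referentially decoupled but temporally coupled: `late` never receives `"first"`.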

  9. Example: Lime (13.2 Architectures)
     Lime: Every node has its own dataspace:
     - When P and Q are in each other's proximity, dataspaces become shared
     - Published data items are stored locally, until removed
     - P can publish data items from a specific process
     - Reactions describe what to do when a match is found
     [Figure: three processes, each with a local dataspace, connected by wireless links and together forming a transient, shared dataspace]
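
The idea of a transient, shared dataspace can be sketched as a view computed over the local dataspaces of nodes that are currently connected. Everything here is a simplifying assumption: the class `Node` is hypothetical, proximity is modeled as an explicit `connect`/`disconnect` call, sharing is limited to direct neighbors, and reactions are left out.

```python
class Node:
    """Toy mobile node: items live in the local dataspace; sharing is only a view."""

    def __init__(self, name):
        self.name = name
        self.local = []         # locally stored data items
        self.neighbors = set()  # nodes currently in proximity (assumed symmetric)

    def publish(self, item):
        self.local.append(item)  # stored locally until removed

    def connect(self, other):
        self.neighbors.add(other)
        other.neighbors.add(self)

    def disconnect(self, other):
        self.neighbors.discard(other)
        other.neighbors.discard(self)

    def shared_view(self):
        # Own items first, then neighbors' items in name order for determinism.
        items = list(self.local)
        for node in sorted(self.neighbors, key=lambda n: n.name):
            items.extend(node.local)
        return items


p, q = Node("P"), Node("Q")
p.publish("p-item")
q.publish("q-item")
alone = p.shared_view()
p.connect(q)
together = p.shared_view()
p.disconnect(q)
apart = p.shared_view()
```

No item is copied when nodes meet, so when the link disappears each node is left with exactly what it published.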

  10. Content-based routing (13.4 Communication)
      Observation: When a coordination-based system is built across a wide-area network, we need an efficient routing mechanism (centralized solutions won't do).
      Solution:
      - Naive: Broadcast subscriptions to all nodes in the system and let servers prepend the destination addresses when a data item is published
      - Refinement: Forward subscriptions to all routers and let them compute and install filters
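
The refinement, in which routers install filters derived from subscriptions, can be sketched as follows. The `Router` class, the predicate representation of a filter, and the two-router topology are all assumptions made for illustration; the sketch also assumes an acyclic router network, since it only avoids sending an item back over the link it arrived on.

```python
class Router:
    """Toy content-based router: forwards an item only on links whose filter matches."""

    def __init__(self, name):
        self.name = name
        self.filters = {}  # link (a Router or a subscriber id) -> list of predicates

    def install(self, link, predicate):
        self.filters.setdefault(link, []).append(predicate)

    def route(self, item, came_from=None):
        """Return the ids of all subscribers the item is delivered to."""
        delivered = []
        for link, predicates in self.filters.items():
            if link is came_from:
                continue  # never send an item back where it came from
            if not any(p(item) for p in predicates):
                continue  # filter rejects the item: this link is not used at all
            if isinstance(link, Router):
                delivered.extend(link.route(item, came_from=self))
            else:
                delivered.append(link)
        return delivered


def build_example():
    """Two routers; subscriber s1 hangs off R1, subscriber s2 off R2 (hypothetical setup)."""
    r1, r2 = Router("R1"), Router("R2")
    wants_expensive = lambda item: item.get("price", 0) > 100
    wants_news = lambda item: item.get("type") == "news"
    r1.install("s1", wants_expensive)  # local delivery at R1
    r2.install(r1, wants_expensive)    # R2 learns what lies behind R1
    r2.install("s2", wants_news)       # local delivery at R2
    r1.install(r2, wants_news)         # R1 learns what lies behind R2
    return r1, r2


r1, r2 = build_example()
news_path = r1.route({"type": "news"})                      # crosses R1 -> R2
quote_path = r2.route({"type": "quote", "price": 150})      # crosses R2 -> R1
dropped = r1.route({"type": "quote", "price": 50})          # matches no filter
```

An item that matches no filter never leaves the router where it was published, which is the saving over broadcasting every item.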

  11. Content-based routing: naive solution (13.4 Communication)
      [Figure: naive content-based routing, showing routers R1 and R2; the numeric labels (1 to 5) cannot be placed without the original image]

  12. Replication: Static approaches (13.7 Consistency and Replication)
      Note: Replicating data items to all machines implies broadcasting removals.
      [Figure: (a) a process doing a write broadcasts the tuple over the network; (b) a process doing a take examines its local JavaSpace. Other labels: Tuple broadcast, Tuple delete, Subspaces, Network]

  13. Balancing read/write operations (13.7 Consistency and Replication)
      Problem: Find a balance between the costs for reads, and writes/removals ⇒ organize dataspace as 2D grid
      Example: A writes a data item; B wants to read it.
      [Figure: 2D grid of machines with nodes A, B, and C. A broadcasts the tuple to one set of machines; B broadcasts the template to another set of machines]
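
A small simulation can make the 2D-grid trade-off concrete. The transcript does not say which machines each broadcast reaches, so the sketch assumes one common arrangement: a writer broadcasts the tuple to all machines in its own row, and a reader broadcasts its template to all machines in its own column, so the two sets always intersect in exactly one machine. The class `GridSpace` is hypothetical.

```python
class GridSpace:
    """Toy dataspace on a rows x cols grid of machines, addressed as (row, col)."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.store = {(r, c): [] for r in range(rows) for c in range(cols)}

    def write(self, writer, tup):
        # Assumption: the tuple is replicated on every machine in the writer's row.
        row, _ = writer
        targets = [(row, c) for c in range(self.cols)]
        for machine in targets:
            self.store[machine].append(tup)
        return len(targets)  # number of machines contacted

    def read(self, reader, predicate):
        # Assumption: the template is sent to every machine in the reader's column.
        _, col = reader
        contacted = [(r, col) for r in range(self.rows)]
        for machine in contacted:
            for tup in self.store[machine]:
                if predicate(tup):
                    return tup, len(contacted)
        return None, len(contacted)


grid = GridSpace(rows=3, cols=4)
write_cost = grid.write((1, 0), ("temperature", 21))
found, read_cost = grid.read((2, 3), lambda t: t[0] == "temperature")
```

With 12 machines, the write contacts 4 and the read contacts 3, compared with 12 for a full broadcast of either operation.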

  14. Dynamic replication (13.7 Consistency and Replication)
      Observation: Not all data items are equal
      - Decide on replication on a per-type basis
      - Refinement: Let a central component observe read/write patterns and decide on replication strategy (self-replication)
      [Figure: components between the application and the local OS/network. Labels: Application, Distribution manager, Policy table, Invocation handler, Dataspace slice, Local OS, To network]

  15. Fault tolerance (13.8 Fault Tolerance)
      Observation: In many cases, fault tolerance is achieved by using a primary-backup approach for a central dataspace server.
      Refinement: Decide per data type the required availability, and replicate based on availability of nodes:
      - MTTF: mean time to failure
      - MTTR: mean time to repair
      - Node availability = MTTF / (MTTF + MTTR)
      Let nodes estimate MTTF and MTTR by logging the current time.
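
The availability formula can be worked through in a short sketch. The log format (pairs of start and crash timestamps recovered from a node's time log) and the function names are assumptions. The helper `replicas_needed` goes beyond the slide: it assumes that node failures are independent and that a data item is available as long as at least one replica is up.

```python
def availability(mttf, mttr):
    """Node availability = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


def estimate(uptime_intervals):
    """Estimate (MTTF, MTTR) from a chronological list of (up_at, down_at) timestamps.

    Assumed log format: each pair is the time a node came up and the last
    time it logged before failing; the gap to the next pair is repair time.
    """
    if len(uptime_intervals) < 2:
        raise ValueError("need at least two uptime intervals to observe a repair")
    uptimes = [down - up for up, down in uptime_intervals]
    repairs = [
        uptime_intervals[i + 1][0] - uptime_intervals[i][1]
        for i in range(len(uptime_intervals) - 1)
    ]
    return sum(uptimes) / len(uptimes), sum(repairs) / len(repairs)


def replicas_needed(node_availability, target):
    """Smallest k such that the chance of all k replicas being down is at most 1 - target.

    Assumption (not from the slides): nodes fail independently.
    """
    if not 0 < node_availability <= 1 or not 0 <= target < 1:
        raise ValueError("expected 0 < node_availability <= 1 and 0 <= target < 1")
    k = 1
    while (1 - node_availability) ** k > 1 - target:
        k += 1
    return k


mttf, mttr = estimate([(0, 90), (100, 190), (200, 290)])  # up 90, down 10, repeated
a = availability(mttf, mttr)
k = replicas_needed(a, target=0.995)
```

In this example the node is up for 90 time units and down for 10, so its availability is about 0.9, and three replicas are enough to reach a 0.995 target under the independence assumption.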

  16. Security (13.9 Security)
      Dilemma: We wanted anonymity between processes, but security requires that we authenticate publishers and subscribers ⇒ we need to trust the servers that establish the matching between the two.
      - Information confidentiality: the middleware is not allowed to see what data is published. In practice, only a restricted number of fields can be used.
      - Subscription confidentiality: the middleware is not allowed to see what subscriptions look like. Solution: Match on encrypted data fields, although this alone will often reveal too much information about publishers and subscribers.
      - Publication confidentiality: ensure that specific processes are not even allowed to see certain messages.

  17. Secure decoupling (13.9 Security)
      Solution: Let an accounting service manage keys, and re-encrypt a data item before it is forwarded to a subscriber ⇒ (1) routers work on encrypted data, (2) publisher and subscriber need not share a key.
      [Figure: publisher, broker, and subscriber connected through publish/subscribe middleware, with an accounting service (AS). Labels: Provide encryption key, Obtain encryption key, Transform, Message encrypted with publisher's key, Message encrypted with subscriber's key]
      Dilemma: Is security the show-stopper for publish/subscribe systems?
