A Group Membership Service for Large-Scale Grids


1. A Group Membership Service for Large-Scale Grids*
   Fernando Castor Filho (1,4), Raphael Y. Camargo (2), Fabio Kon (3), and Augusta Marques (4)
   (1) Informatics Center, Federal University of Pernambuco
   (2) School of Arts, Sciences, and Humanities, University of São Paulo
   (3) Department of Computer Science, University of São Paulo
   (4) Department of Computing and Systems, University of Pernambuco
   *Supported by CNPq/Brazil, grants #481147/2007-1 and #550895/2007-8

2. Faults in Grids
   - An important problem:
     - Wastes computing and network resources
     - Wastes time (resources might need to be reserved again)
   - Scale worsens matters: failures become common events
   - Opportunistic grids: a shared infrastructure where nodes leave or fail frequently
   - Fault tolerance can allow for more efficient use of the grid

3. Achieving Fault Tolerance
   - First step: detecting failures... and then doing something about them
   - Other grid nodes must also be aware; otherwise, progress might be hindered
   - More generally, each node should have an up-to-date view of group membership,
     in terms of correct and faulty processes

4. Requirements for Group Membership in Grids
   1. Scalability
   2. Autonomy
   3. Efficiency
   4. Capacity to handle dynamism
   5. Platform independence
   6. Distribution (decentralization)
   7. Ease of use

5. Our Proposal
   - A group membership service that addresses the aforementioned requirements:
     - Very lightweight
     - Assumes a crash-recovery fault model
     - Deployable on any platform that has an ANSI C compiler
   - Leverages recent advances in:
     - Gossip/infection-style information dissemination
     - Accrual failure detectors

6. Gossip/Infection-Style Information Dissemination
   - Based on the way infectious diseases spread or, alternatively, on how gossip is disseminated
   - Periodically, each participant randomly infects some of its neighbors
     - Infects = passes information that (potentially) modifies the receiver's state
   - Weakly-consistent protocols, sufficient for several practical applications
   - Highly scalable and robust
   (A minimal gossip round is sketched below.)
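To make the dissemination step concrete, here is a minimal Lua sketch of one gossip round: a node picks a few random peers from its membership table and sends them its current state. The names (`members`, `send_state`, `fanout`) are illustrative, not the service's actual API.

```lua
-- Minimal sketch of an infection-style dissemination round (illustrative only).
math.randomseed(os.time())

-- Pick up to `fanout` distinct random peers from the membership table.
local function pick_peers(members, fanout)
  local pool = {}
  for id in pairs(members) do pool[#pool + 1] = id end
  -- Fisher-Yates shuffle, then take the first `fanout` entries.
  for i = #pool, 2, -1 do
    local j = math.random(i)
    pool[i], pool[j] = pool[j], pool[i]
  end
  local chosen = {}
  for i = 1, math.min(fanout, #pool) do chosen[#chosen + 1] = pool[i] end
  return chosen
end

-- One gossip round: "infect" the chosen peers with this node's current state.
local function gossip_round(self_state, members, fanout, send_state)
  for _, peer in ipairs(pick_peers(members, fanout)) do
    send_state(peer, self_state)   -- transport (e.g., the ORB) is abstracted away here
  end
end
```

What is actually carried in the exchanged state, and how often rounds run, is described in the later slides.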

7. Accrual Failure Detectors
   - Decouple monitoring and interpretation
   - Output values on a continuous scale: a suspicion level
   - Eventually strongly accurate failure detectors
   - Heartbeat interarrival times define a probability distribution function
   - Several thresholds can be set, each triggering different actions
   - As good as "regular" adaptive FDs, but more flexible and easier to use
   (The decoupling is sketched below.)
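The following Lua sketch illustrates the decoupling: one method records heartbeat arrivals (monitoring), another turns the time since the last heartbeat into a suspicion level on a continuous scale (interpretation). The exponential model used here is a deliberate simplification, not the estimator used by the service.

```lua
-- Sketch of the accrual idea: recording arrivals is separate from interpreting them.
local AccrualMonitor = {}
AccrualMonitor.__index = AccrualMonitor

function AccrualMonitor.new()
  return setmetatable({ last = nil, mean = nil, count = 0 }, AccrualMonitor)
end

-- Monitoring side: called whenever a heartbeat arrives.
function AccrualMonitor:heartbeat(now)
  if self.last then
    local gap = now - self.last
    self.count = self.count + 1
    if self.mean then
      self.mean = self.mean + (gap - self.mean) / self.count   -- incremental running mean
    else
      self.mean = gap
    end
  end
  self.last = now
end

-- Interpretation side: suspicion in [0, 1); higher means "more likely crashed".
function AccrualMonitor:suspicion(now)
  if not (self.last and self.mean) then return 0 end
  local elapsed = now - self.last
  return 1 - math.exp(-elapsed / self.mean)   -- assumes exponential interarrival times
end
```

Different thresholds (e.g., 0.85, 0.90, 0.95) can then be checked against the same suspicion value, each triggering its own action.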

8. Architecture of the Group Membership Service
   [Architecture diagram: Node1..Node4, each running an instance of the service. On each
   node, a Failure Detector (failure monitor plus accrual failure detector) feeds Failure
   Handlers 1..N, an Information Dissemination module, and a Membership Management module;
   the Failure Detector observes the monitored process.]
   - Each computer runs an instance of the group membership service

9. Membership Management
   - Handles membership requests
   - Disseminates information about new members and informs them about existing members
   - Removes failed members from the group
   - Failed processes can also rejoin
     - Epoch mechanism: only 32 extra bits in each heartbeat message (sketched below)
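A minimal sketch of how a 32-bit epoch counter can let a recovered process rejoin: an update carrying a higher epoch for the same identifier supersedes a stale "failed" record. The table layout below is hypothetical.

```lua
-- Each heartbeat carries a 32-bit epoch; a rejoined process bumps its epoch, so its
-- new incarnation overrides any stale "failed" record kept for the old one.
local members = {}   -- id -> { epoch = <number>, alive = <boolean> }

local function apply_update(id, epoch, alive)
  local cur = members[id]
  if cur == nil or epoch > cur.epoch then
    members[id] = { epoch = epoch, alive = alive }   -- newer incarnation wins
  elseif epoch == cur.epoch and cur.alive and not alive then
    cur.alive = false                                -- failure report for the current epoch
  end
  -- Updates with older epochs are ignored: they refer to a previous incarnation.
end
```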

10. Failure Detector
   - Collects data about k processes
     - Push heartbeats, gossiped periodically (every T_hb)
     - If p1 monitors p2, then there is a TCP connection between them
   - Accrual Failure Detector
     - Keeps track of the last m interarrival times for a given process
     - Derives a probability that a process has failed
     - The calculation is performed in O(log|S|) steps (see the sketch below)
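As an illustration of the O(log|S|) claim, the sketch below assumes the last m interarrival times are kept sorted; the suspicion level is then the fraction of recorded gaps not larger than the time elapsed since the last heartbeat, obtained with a binary search. This is one possible estimator, not necessarily the one used in the paper.

```lua
-- Window-based suspicion estimate over a sorted table of interarrival times.
local function suspicion(sorted_gaps, elapsed)
  if #sorted_gaps == 0 then return 0 end
  local lo, hi = 1, #sorted_gaps
  while lo <= hi do
    local mid = math.floor((lo + hi) / 2)
    if sorted_gaps[mid] <= elapsed then lo = mid + 1 else hi = mid - 1 end
  end
  return hi / #sorted_gaps   -- hi = number of recorded gaps <= elapsed
end

print(suspicion({1.8, 1.9, 2.0, 2.1, 2.3, 4.0}, 2.2))  --> 0.666..., i.e. roughly 67% suspicion
```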

11. Collecting Enough Information
   - Adaptive FDs need to receive information about monitored processes regularly;
     this also applies to accrual FDs
   - Traditional gossip protocols are not regular
   - Solution: persistent monitoring relationships between processes
     - Established randomly
     - Exhibit the desired properties of gossip protocols
   (A sketch of maintaining these relationships follows.)
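The sketch below shows how persistent monitoring relationships could be maintained: targets are drawn at random once and kept until they leave the group, so heartbeats keep flowing at the regular rate the accrual detector needs. All names are illustrative.

```lua
-- Keep a persistent, randomly chosen set of k monitoring targets; only replace a
-- target when it is removed from the group, instead of re-randomizing every round.
local function refresh_targets(targets, members, k, pick_random_member)
  -- Drop targets that are no longer group members.
  for i = #targets, 1, -1 do
    if not members[targets[i]] then table.remove(targets, i) end
  end
  -- Top the set back up to k with fresh random choices; existing links are kept.
  while #targets < k do
    local candidate = pick_random_member(members, targets)  -- hypothetical helper
    if not candidate then break end
    targets[#targets + 1] = candidate
  end
  return targets
end
```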

12. Failure Handlers
   - For each monitored process, a set of thresholds is set
     - For example: 85, 90, and 95%
   - A handler is associated with each one
   - Several handling strategies are possible, each executed when the corresponding
     threshold is reached
   - It is easy to define application-specific handlers (see the sketch below)
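A small Lua sketch of per-threshold handlers, using the 85/90/95% values mentioned above; the actions themselves are placeholder examples.

```lua
-- One handler per threshold; each fires at most once per monitored process.
local handlers = {
  { threshold = 0.85, action = function(p) print("warn: suspicion rising for " .. p) end },
  { threshold = 0.90, action = function(p) print("reschedule tasks running on " .. p) end },
  { threshold = 0.95, action = function(p) print("remove " .. p .. " from the group") end },
}

local fired = {}   -- fired[p][i] = true once handler i has run for process p

-- Called whenever the suspicion level for process `p` is re-evaluated.
local function check_thresholds(p, suspicion_level)
  fired[p] = fired[p] or {}
  for i, h in ipairs(handlers) do
    if not fired[p][i] and suspicion_level >= h.threshold then
      fired[p][i] = true
      h.action(p)   -- each threshold triggers its own, possibly application-specific, action
    end
  end
end
```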

13. Information Dissemination
   - Responsible for gossiping information
     - About failed nodes (specific messages), which is important for failure handling
     - About correct members (piggybacked in heartbeat messages)
   - Dissemination speed is based on the parameter j
     - j should be O(log(N)) (see the sketch below)
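Two illustrative fragments: a fanout that grows logarithmically with the group size (the constant is a guess; the slide only requires j to be O(log N)), and a heartbeat message with piggybacked membership information (hypothetical layout).

```lua
-- Fanout that scales as O(log N); the base of the logarithm only changes the constant.
local function fanout(n)
  return math.max(1, math.ceil(math.log(n)))
end

-- Heartbeat message with piggybacked membership information (hypothetical field names).
local function make_heartbeat(self_id, epoch, known_members)
  return { sender = self_id, epoch = epoch, members = known_members }
end

print(fanout(140))   --> 5 for a 140-process group under this particular choice
```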

14. Implementation
   - Written in Lua: compact, efficient, extensible, and platform-independent
   - The service is packaged as a reusable Lua module
   - Uses a lightweight CORBA ORB (OiL) for IPC, also written in Lua
   - Approximately 80 KB of source code

15. Initial Evaluation
   - Main goal: to assess scalability and resilience to failures
   - 20-140 concurrent nodes, distributed across three machines equipped with 1 GB of RAM
   - 100 Mbps Fast Ethernet network
   - Emulated WAN: latency = 500 ms and jitter = 250 ms
   - Parameters: T_hb = 2 s, k = 4, j = 6

16. Initial Evaluation
   - Two situations:
     - When no failures occur: 20, 40, 60, 80, 100, 120, and 140 processes
     - When processes fail, including realistically large numbers of simultaneous failures:
       140 processes; 10, 20, 30, and 40% of failures
   - Number of sent messages per process as a measure of scalability

17. Scenario 1: No failures

18. Scenario 2: 10-40% of process failures
   - No process became isolated
   - Almost 95% were still monitored by at least k - 1 processes

19. Scenario 2: 40% of process failures

20. Concluding Remarks
   - Main contribution: combining gossip-based information dissemination and accrual FDs
     - while guaranteeing that the AFD collects enough information;
     - scalably; and
     - in a timely and fault-tolerant way
   - Ongoing work:
     - More experiments
     - Self-organization for better resilience and better scalability
     - Periodic dissemination of failure information

21. Thank You! Contact: Fernando Castor, fcastor@acm.org
