
Ken Birman, Cornell University. CS5410 Fall 2008. Gossip and Network Overlays



  1. Ken Birman, Cornell University. CS5410 Fall 2008.

  2. Gossip and Network Overlays
     - A topic that has received a lot of recent attention
     - Today we'll look at three representative approaches
     - Scribe, a topic-based pub-sub system that runs on the Pastry DHT (slides by Anne-Marie Kermarrec)
     - Sienna, a content-subscription overlay system (slides by Antonio Carzaniga)
     - T-Man, a general purpose system for building complex network overlays (slides by Ozalp Babaoglu)

  3. Scribe
     - Research done by the Pastry team, at the MSR lab in Cambridge, England
     - Basic idea is simple
     - Topic-based publish/subscribe
     - Use topic as a key into a DHT
     - Subscriber registers with the "key owner"
     - Publisher routes messages through the DHT owner
     - Optimization to share load
     - If a subscriber is asked to forward a subscription, it doesn't do so and instead makes note of the subscription. Later, it will forward copies to its children
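The "topic as a key" idea and the load-sharing optimization above can be sketched as follows. This is a minimal illustration, not Scribe's actual code: `topic_key`, `Node`, `handle_join` and `forward` are hypothetical names, SHA-1 truncated to 128 bits is an assumed choice of hash, and the rule "forward a join only if this node is not yet in the topic's tree" is an assumed reading of the last bullet.

```python
import hashlib


def topic_key(topic: str, bits: int = 128) -> int:
    """Map a topic name to a key in a 2**bits identifier space (assumed hash choice)."""
    digest = hashlib.sha1(topic.encode("utf-8")).digest()  # 160-bit digest
    return int.from_bytes(digest, "big") >> (160 - bits)


class Node:
    """Hypothetical overlay node that keeps per-topic child lists."""

    def __init__(self, name: str):
        self.name = name
        self.subscribed = set()   # topic keys this node itself subscribes to
        self.children = {}        # topic key -> set of child node names

    def handle_join(self, key: int, child: str) -> bool:
        """Note the subscription; return True only if the join must travel further."""
        already_in_tree = key in self.subscribed or key in self.children
        self.children.setdefault(key, set()).add(child)
        return not already_in_tree

    def forward(self, key: int, message: str):
        """Return one (child, message) copy per recorded child for this topic."""
        return [(c, message) for c in sorted(self.children.get(key, ()))]


key = topic_key("stocks")
relay = Node("relay")
first = relay.handle_join(key, "b")    # relay was not in the tree: join goes on
second = relay.handle_join(key, "c")   # relay already in the tree: join absorbed
copies = relay.forward(key, "tick")
```

Because later joins are absorbed by nodes already in the tree, the key owner does not have to track every subscriber itself.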

  4. Architecture (layered diagram, top to bottom)
     - SCRIBE: scalable communication service (subscription management, event notification)
     - PASTRY: P2P location and routing layer (DHT)
     - Internet: TCP/IP
     Slide footer date: 20/12/2002

  5. Design
     - Construction of a multicast tree based on the Pastry network
     - Reverse path forwarding
     - Tree used to disseminate events
     - Use of Pastry route to create and join groups

  6. SCRIBE: Tree Management
     - Create: route to groupId
     - Join: route to groupId
     - Tree: union of Pastry routes from members to the root
     - Multicast: from the root down to the leaves
     - Low link stress
     - Low delay
     [Figure labels: Root; join(groupId); Multicast(groupId); "Forwards two copies"]
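As a rough sketch of "tree = union of routes from members to the root", the snippet below builds a child table from a list of join paths and then walks it from the root down. The paths are hand-written stand-ins for Pastry routes (no real routing is performed), the node names are invented, and `build_tree` and `multicast` are hypothetical helpers. It also assumes the paths are consistent, i.e. every node always routes towards the root via the same next hop.

```python
from collections import defaultdict, deque


def build_tree(routes):
    """routes: list of paths [member, hop, ..., root]. Returns parent -> set of children."""
    children = defaultdict(set)
    for path in routes:
        # Each consecutive pair on a join path becomes a child -> parent tree edge.
        for child, parent in zip(path, path[1:]):
            children[parent].add(child)
    return children


def multicast(children, root):
    """Return the order in which nodes receive a message sent from the root (breadth-first)."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(sorted(children.get(node, ())))
    return order


# Illustrative join paths only; real paths would come from Pastry routing.
routes = [["m1", "a", "root"], ["m2", "a", "root"], ["m3", "b", "root"]]
tree = build_tree(routes)
delivery_order = multicast(tree, "root")
```

Here `a` sends two copies (to `m1` and `m2`) although the root sent it only one, which is the sharing of forwarding load that the slide's figure illustrates.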

  7. SCRIBE: Tree Management
     [Figure: the same multicast tree shown in the name space and in the proximity space. Node labels: d467c4 (root), d471f1, d13da3, 65a1fc, 26b20d]

  8. Concerns?
     - Pastry tries to exploit locality, but could these links send a message from Ithaca… to Kenya… to Japan…?
     - What if a relay node fails? Subscribers it serves will be cut off
     - They refresh subscriptions, but it is unclear how often this has to happen to ensure that the quality will be good
     - (Treat subscriptions as "leases" so that they evaporate if not refreshed… no need to unsubscribe…)
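The lease idea in the last bullet can be sketched in a few lines. This is an illustration under assumptions, not Scribe's mechanism: `LeaseTable`, its `ttl` parameter and the explicit `now` timestamps are hypothetical, chosen so the example is deterministic.

```python
class LeaseTable:
    """Hypothetical per-topic subscription table where entries lapse unless refreshed."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.expiry = {}  # subscriber -> time at which its lease lapses

    def refresh(self, subscriber: str, now: float) -> None:
        """Subscribing and refreshing are the same operation: (re)start the lease."""
        self.expiry[subscriber] = now + self.ttl

    def expire(self, now: float):
        """Drop every lease that has lapsed by `now`; return the dropped subscribers."""
        dead = [s for s, t in self.expiry.items() if t <= now]
        for s in dead:
            del self.expiry[s]
        return sorted(dead)

    def active(self):
        return sorted(self.expiry)


table = LeaseTable(ttl=30)
table.refresh("a", now=0)
table.refresh("b", now=0)
table.refresh("a", now=20)          # only "a" keeps refreshing
dropped = table.expire(now=30)      # "b" evaporates without an unsubscribe
remaining = table.active()
```

The open question on the slide corresponds to the choice of `ttl` and refresh period: short leases clean up quickly after failures but cost more refresh traffic.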

  9. SCRIBE: Failure Management
     - Reactive fault tolerance
     - Tolerate root and node failures
     - Tree repair: local impact
     - Fault detection: heartbeat messages
     - Local repair
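A minimal sketch of heartbeat-based detection followed by local repair, under stated assumptions: the parent is assumed to send periodic heartbeats to each child, a child suspects its parent after `timeout` seconds of silence, and repair is modelled by a hypothetical `find_new_parent` callback standing in for re-routing a join towards the group. None of these names come from the slides.

```python
class ChildState:
    """What a child remembers about its tree parent (hypothetical structure)."""

    def __init__(self, parent: str, timeout: float):
        self.parent = parent
        self.timeout = timeout
        self.last_heard = 0.0

    def on_heartbeat(self, now: float) -> None:
        self.last_heard = now

    def parent_suspected(self, now: float) -> bool:
        """True once the parent has been silent for longer than the timeout."""
        return now - self.last_heard > self.timeout


def repair(state: ChildState, now: float, find_new_parent) -> bool:
    """If the parent is suspected, attach to a new one. Only this child's link changes."""
    if not state.parent_suspected(now):
        return False
    state.parent = find_new_parent()
    state.last_heard = now
    return True


state = ChildState(parent="p1", timeout=10.0)
state.on_heartbeat(now=5.0)
no_repair = repair(state, now=12.0, find_new_parent=lambda: "p2")   # 7 s of silence
repaired = repair(state, now=16.0, find_new_parent=lambda: "p2")    # 11 s of silence
```

The repair touches only the affected child and its new parent, which is one way to read "local impact" on the slide.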

  10. Scribe: performance
     - 1500 groups, 100,000 nodes, 1 msg/group
     - Low delay penalty
     - Good partitioning and load balancing
     - Number of groups hosted per node: 2.4 (mean), 2 (median)
     - Reasonable link stress:
     - Mean msg/link: 2.4 (0.7 for IP)
     - Maximum link stress: 4*IP

  11. Topic distribution
     [Figure: group size versus topic rank. Labelled points: Windows Update, Stock Alert, Instant Messaging]

  12. Concern about this data set
     - Synthetic, may not be terribly realistic
     - In fact we know that subscription patterns are usually power-law distributions, so that's reasonable
     - But it is unlikely that the explanation corresponds to a clean Zipf-like distribution of this nature (indeed, totally implausible)
     - Unfortunately, this sort of issue is common when evaluating very big systems using simulations
     - Alternative is to deploy and evaluate them in use… but only feasible if you own Google-scale resources!

  13. Delay penalty
     [Figure: cumulative number of topics (0 to 1500) versus delay penalty relative to IP (0 to 4.5). Mean = 1.66, median = 1.56]
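To make the plotted metric concrete, the sketch below assumes "delay penalty relative to IP" means the ratio of the overlay delivery delay to the IP delay for the same source and receivers. The delay values are made-up illustrative inputs, not data from the slides, and `delay_penalty` and `cumulative_count` are hypothetical helpers.

```python
from statistics import mean, median


def delay_penalty(overlay_delay_ms: float, ip_delay_ms: float) -> float:
    """Ratio of overlay delay to IP delay (assumed definition); 1.0 means no penalty."""
    return overlay_delay_ms / ip_delay_ms


def cumulative_count(penalties, threshold: float) -> int:
    """Number of topics whose penalty is at most `threshold` (one point on a cumulative curve)."""
    return sum(1 for p in penalties if p <= threshold)


# Made-up (overlay, IP) delay pairs in milliseconds, one per topic.
samples = [(30.0, 20.0), (50.0, 25.0), (12.0, 12.0)]
penalties = [delay_penalty(o, i) for o, i in samples]

mean_penalty = mean(penalties)
median_penalty = median(penalties)
topics_within_1_5 = cumulative_count(penalties, 1.5)
```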

  14. Node stress: 1500 topics
     [Figure: number of nodes versus total number of children table entries. Mean = 6.2, median = 2]

  15. Scribe: Link stress
     [Figure: number of links (0 to 40000) versus link stress (1 to 10000), with curves for Scribe and IP Multicast and a "Maximum stress" marker. Mean = 1.4, median = 0]

  16. Anycast
     - Supports highly dynamic groups
     - Suitable for decentralized resource discovery (can add predicate during DFS)
     - Results (100k nodes/.5M network):
     - Join: 4.1 msgs (empty group); avg 3.5 msgs (2,500 members)
     - 1,000 anycasts: 4.1 msgs (empty group); avg 2.3 msgs (2,500 members)
     - Locality: For >90% of anycasts, <7% of members were closer than the receiver
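The "predicate during DFS" remark can be illustrated with a depth-first search over a group tree that returns the first member satisfying a caller-supplied predicate. This is a simplified sketch: it searches only the subtree below the starting node, the tree and member names are invented, and `anycast` is a hypothetical function rather than Scribe's API.

```python
def anycast(children, members, start, predicate=lambda node: True):
    """Depth-first search from `start`; return the first member accepted by `predicate`."""
    stack = [start]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in members and predicate(node):
            return node
        # Push children in reverse so they are visited in sorted order.
        stack.extend(sorted(children.get(node, ()), reverse=True))
    return None


# Invented group tree: parent -> set of children.
children = {"root": {"a", "b"}, "a": {"m1", "m2"}, "b": {"m3"}}
members = {"m1", "m2", "m3"}

any_member = anycast(children, members, "root")
# Resource discovery: only accept members passing an application-level test.
filtered = anycast(children, members, "root", predicate=lambda n: n == "m3")
nobody = anycast(children, members, "root", predicate=lambda n: False)
```

Entering the tree near the sender and stopping at the first match is what would keep the receiver close to the sender, in line with the locality result quoted above.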

  17. Fireflies (Fireflies.ppt)

  18. T-Man
