
Wide Area Placement of Data Replicas for Fast and Highly Available Data Access



  1. Wide Area Placement of Data Replicas for Fast and Highly Available Data Access Fan Ping Xiaohu Li, Christopher McConnell Rohini Vabbalareddy, Jeong-Hyon Hwang State University of New York - Albany

  2. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  3. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  4. Data Intensive Distributed Systems • Google, Amazon, Facebook, Microsoft…

  5. Data Intensive Distributed Systems • Google, Amazon, Facebook, Microsoft… • Dynamo, Cassandra, PNUTS…

  6. Data Replica Placement • Given a replication degree (e.g., 3), where should we put the data replicas to improve overall data access speed and availability? • Challenges - Scalability - Meeting a given SLA

  7. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  8. Network Coordinate Systems - Nodes are embedded into a virtual space based on the network latencies measured between them, so that their distances in this space approximate those latencies. - E.g., Vivaldi, RNP
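The embedding idea above can be illustrated with a simplified spring-relaxation update in the style of Vivaldi. This is only a minimal sketch, not the full Vivaldi algorithm (it has no adaptive timestep or height vector). The latency matrix, the fixed step size `delta`, and all function names are assumptions made for illustration.

```python
import math
import random

def embed(latency, dims=2, rounds=2000, delta=0.05, seed=0):
    """Place each node in a `dims`-dimensional space so that Euclidean
    distances approximate the measured latencies (simplified, fixed step)."""
    rng = random.Random(seed)
    n = len(latency)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in range(n)]
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)           # pick a random pair of nodes
        diff = [a - b for a, b in zip(coords[i], coords[j])]
        dist = math.sqrt(sum(d * d for d in diff)) or 1e-9
        error = latency[i][j] - dist             # > 0 means the nodes are too close
        # Move node i along the line through node j, proportionally to the error.
        for k in range(dims):
            coords[i][k] += delta * error * diff[k] / dist
    return coords

def mean_abs_error(latency, coords):
    """Average gap between measured latency and distance in the virtual space."""
    n = len(latency)
    errs = [abs(latency[i][j] - math.dist(coords[i], coords[j]))
            for i in range(n) for j in range(i + 1, n)]
    return sum(errs) / len(errs)

# Hypothetical latencies (ms), derived from four points on a 30 x 40 rectangle.
points = [(0, 0), (30, 0), (0, 40), (30, 40)]
latency = [[math.dist(p, q) for q in points] for p in points]
coords = embed(latency)
print(mean_abs_error(latency, coords))
```

Once nodes have coordinates, the latency between any two of them can be estimated with a distance computation instead of a new measurement.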

  9. Network Coordinate Systems

  10. Network Coordinate Systems

  11. Network Coordinate Systems

  12. Network Coordinate Systems

  13. Network Coordinate Systems

  14. Network Coordinate Systems

  15. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  16. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  17. Servers on the map

  18. Servers in the coordinate system

  19. Clients in the coordinate system

  20. Cluster the clients in the coordinate system

  21. Cluster the clients in the coordinate system

  22. Centroids of the clusters

  23. Servers near centroids of the clusters
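Slides 17 to 23 outline the placement procedure: embed servers and clients in the coordinate system, cluster the clients, and pick the servers nearest the cluster centroids. A minimal sketch follows, assuming plain k-means (Lloyd's algorithm) as the clustering method, which the slides do not specify. All names and coordinates are hypothetical.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm on coordinate tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

def place_replicas(servers, clients, k):
    """Return k server names: for each client cluster, the nearest unused server."""
    chosen = []
    for centroid in kmeans(clients, k):
        candidates = [name for name in servers if name not in chosen]
        chosen.append(min(candidates, key=lambda n: math.dist(servers[n], centroid)))
    return chosen

# Hypothetical coordinates: two client groups and three candidate servers.
clients = [(0, 0), (2, 1), (1, 3), (100, 100), (98, 101), (101, 97)]
servers = {"A": (1, 1), "B": (99, 98), "C": (50, 50)}
replicas = place_replicas(servers, clients, k=2)
print(sorted(replicas))
```

Here the replication degree corresponds to `k`: one replica is placed per client cluster.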

  24. Simulation Settings • Java simulator • Trace from ~200 PlanetLab nodes as input • A certain number of nodes are selected as servers • The other nodes are used as clients

  25. Performance vs. Number of Replicas

  26. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

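  27. Conditional Failure vs. Angle [Diagram: paths from C to S1 and from C to S2, both through R]

  28. Conditional Failure vs. Angle [Diagram: paths from C to S1 and from C to S2, both through R, with two 0.05% labels]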

  29. Conditional Failure vs. Angle • The conditional probability that (C, R, S2) fails, given that (C, R, S1) has failed, is more than 50%! [Diagram: paths from C to S1 and from C to S2, both through R, with two 0.05% labels]
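One way to see why this conditional probability can exceed 50% is to work through a simple model. The sketch below assumes that 0.05% is the independent failure probability of the shared link C-R and of each last-hop link (R-S1 and R-S2). The slide does not state which links the labels belong to, so this assignment and the function name are assumptions.

```python
def conditional_path_failure(p_shared, p_last):
    """P(path C-R-S2 fails | path C-R-S1 fails), assuming independent link failures."""
    # Path 1 fails if the shared link or its own last hop fails.
    p_path1_fails = 1 - (1 - p_shared) * (1 - p_last)
    # Both paths fail if the shared link fails, or it works and both last hops fail.
    p_both_fail = p_shared + (1 - p_shared) * p_last ** 2
    return p_both_fail / p_path1_fails

prob = conditional_path_failure(p_shared=0.0005, p_last=0.0005)
print(f"{prob:.4f}")
```

Under these assumptions the result is slightly above 0.5: when path 1 has failed, the shared link is the culprit about half the time, and in that case path 2 fails as well.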

  30. Conditional Failure vs. Angle [Diagram: C, S1, and S2 in the coordinate space with angle θ; coordinate labels (30,35), (0,-20), (-10,50)]
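These slides relate conditional failure to the angle θ. Given network coordinates, an angle at the client between the directions to two servers could be computed as sketched below. The assignment of the extracted coordinates to nodes (S1 at (30,35), C at (0,-20), S2 at (-10,50)) is an assumption, as is the function name.

```python
import math

def angle_at_client(c, s1, s2):
    """Angle in degrees at client c between the directions to servers s1 and s2."""
    v1 = (s1[0] - c[0], s1[1] - c[1])
    v2 = (s2[0] - c[0], s2[1] - c[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on (|cross|, dot) is numerically stable and yields a value in [0, 180].
    return math.degrees(math.atan2(abs(cross), dot))

# Assumed assignment of the coordinates shown on the slide.
theta = angle_at_client(c=(0, -20), s1=(30, 35), s2=(-10, 50))
print(round(theta, 1))
```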

  31. Conditional Failure vs. Angle

  32. Conditional Failure vs. Angle

  33. Conditional Failure vs. Angle θ

  34. Conditional Failure vs. Angle

  35. Estimations for Latency and Availability • Per-client latency: $L(c, S) = \mathrm{dist}(c, s)$ • Per-client availability: $A(c, S) = 1 - F(c, S_1) \cdot F(c, S_2 \mid S_1) \cdot \ldots \cdot F(c, S_r \mid S_{r-1})$ • Utility function to combine latency and availability: $U = \frac{B}{M}$
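The two per-client estimates above can be sketched in a few lines. The slide does not say which replica s is used in dist(c, s), so this example assumes the nearest one. The function names and sample numbers are hypothetical, and the utility function is left out because its terms are not defined on the slide.

```python
import math

def latency(client, replicas):
    """Assumed reading of L(c, S): distance from c to its nearest replica in S."""
    return min(math.dist(client, s) for s in replicas)

def availability(failure_chain):
    """A(c, S) = 1 - F(c,S_1) * F(c,S_2|S_1) * ... * F(c,S_r|S_{r-1}).

    `failure_chain` holds those (conditional) failure probabilities in order.
    """
    return 1 - math.prod(failure_chain)

# Hypothetical values: two replicas, with a highly correlated second path.
est_latency = latency((0, 0), [(3, 4), (6, 8)])
est_availability = availability([0.001, 0.5])
print(est_latency, est_availability)
```

With a conditional failure probability of 0.5 for the second replica, the second copy only halves the unavailability. If the two paths failed independently, it would be reduced by a factor of a thousand.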

  36. Simulation Settings • Java simulator • Traceroute and ping data collected from ~100 PlanetLab nodes for a month • Randomly select some nodes as servers • The rest are clients

  37. Unavailability vs. Number of Replicas

  38. Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

  39. Conclusion and Future Work • Conclusion - Improves the average user access latency by 35% - Improves the overall availability - Designs a utility function that takes both latency and availability into account • Future work - A more realistic dataset - A better utility function - A non-exponential algorithm (Greedy…) - Taking inter-datacenter cost into account

  40. Questions? THANK YOU!
