Wide Area Placement of Data Replicas for Fast and Highly Available - PowerPoint PPT Presentation

Wide Area Placement of Data Replicas for Fast and Highly Available Data Access • Fan Ping, Xiaohu Li, Christopher McConnell, Rohini Vabbalareddy, Jeong-Hyon Hwang • State University of New York - Albany

Outline • Background • Network Coordinate System • Data Replication - Data Replication for Performance - Data Replication for Performance and Availability • Conclusion

Data Intensive Distributed Systems • Google, Amazon, Facebook, Microsoft… • Dynamo, Cassandra, PNUTS…

Data Replica Placement • Given a replication degree (e.g., 3), where should we put those data replicas in order to effectively improve the overall data access speed and availability? • Challenges - Scalability - Certain SLA

Network Coordinate Systems - Nodes are embedded into a virtual space based on their pairwise network latencies, so that the distance between two nodes in the virtual space approximates the network latency between them. - E.g., Vivaldi, RNP
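
The embedding idea can be illustrated with a single Vivaldi-style update step. This is a minimal sketch, not the authors' implementation: the function name `vivaldi_step` and the fixed step size `delta` are assumptions made here, and the published Vivaldi algorithm additionally adapts the step size from error estimates, which is omitted.

```python
import math

def vivaldi_step(own, peer, rtt, delta=0.25):
    """Move `own` so that its distance to `peer` gets closer to the measured `rtt`.

    own, peer: coordinates in the virtual space (lists of floats).
    rtt:       measured latency between the two nodes.
    delta:     fixed step size (an assumption; Vivaldi proper adapts it).
    """
    diff = [a - b for a, b in zip(own, peer)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        # Coincident points have no direction; pick an arbitrary axis.
        unit = [1.0] + [0.0] * (len(own) - 1)
    else:
        unit = [d / dist for d in diff]
    # Positive error: the nodes are too close in the virtual space, so push away.
    error = rtt - dist
    return [a + delta * error * u for a, u in zip(own, unit)]

# Hypothetical numbers: the nodes sit 3 units apart but the measured latency is 5.
print(vivaldi_step([0.0, 0.0], [3.0, 0.0], rtt=5.0))
```

Repeating this step over many latency samples pulls the virtual distances toward the measured latencies.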

Servers on the map

Servers in the coordinate system

Clients in the coordinate system

Cluster the clients in the coordinate system

Centroids of the clusters

Servers near centroids of the clusters
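
The sequence of slides above (cluster the clients, take the cluster centroids, pick servers near the centroids) can be sketched as follows. The slides do not name the clustering algorithm, so plain k-means is an assumption here, as are the function names and the rule that a server is chosen at most once.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans(clients, k, rounds=20):
    """Plain Lloyd's k-means; the first k clients seed the centroids."""
    centroids = list(clients[:k])
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for c in clients:
            nearest = min(range(k), key=lambda i: dist(c, centroids[i]))
            clusters[nearest].append(c)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [centroid(cl) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids

def place_replicas(clients, servers, k):
    """For each cluster centroid, pick the closest server not yet chosen.

    Assumes len(servers) >= k.
    """
    chosen = []
    for cen in kmeans(clients, k):
        candidates = [s for s in servers if s not in chosen]
        chosen.append(min(candidates, key=lambda s: dist(s, cen)))
    return chosen

# Hypothetical 2-D coordinates: two well-separated groups of clients.
clients = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
servers = [(2, 2), (9, 9), (5, 5), (20, 0)]
print(place_replicas(clients, servers, k=2))
```

Here `k` plays the role of the replication degree: one replica is placed per client cluster.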

Simulation Settings • Java simulator • Trace from ~200 PlanetLab nodes as input • A certain number of nodes are selected as servers • The other nodes are used as clients

Performance vs. Number of Replicas

Conditional Failure vs. Angle • Figure: nodes C, R, S1, and S2, with two labels of 0.05% • The conditional probability of the failure of (C, R, S2) given the failure of (C, R, S1) is more than 50%!
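
A short calculation shows how a figure above 50% can arise. This is a sketch under assumptions that the slide does not state: the two paths C-R-S1 and C-R-S2 share the link C-R, every link fails independently, and each link has a failure probability of 0.05%. The function name is hypothetical.

```python
def conditional_path_failure(p_shared, p_own):
    """P(path via S2 fails | path via S1 fails) when both paths share one link.

    p_shared: failure probability of the shared link (assumed to be C-R).
    p_own:    failure probability of each server-specific link (R-S1, R-S2).
    All links are assumed to fail independently.
    """
    # A path fails if its shared link or its own link fails.
    p_path = 1 - (1 - p_shared) * (1 - p_own)
    # Both paths fail if the shared link fails,
    # or if it survives and both server-specific links fail.
    p_both = p_shared + (1 - p_shared) * p_own * p_own
    return p_both / p_path

# 0.05% per link, as assumed above.
print(conditional_path_failure(0.0005, 0.0005))
```

Under these assumptions the result is slightly above 0.5: once the first path has failed, the shared link is the culprit about half the time, and in that case the second path fails as well.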

Conditional Failure vs. Angle • Figure: S1, S2, and C with angle Ɵ; coordinate labels (30,35), (0,-20), (-10,50) • Plots: conditional failure vs. Ɵ
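
The slides do not say how Ɵ is computed. One plausible reading, assumed here, is the angle at the client between the directions to the two servers in the coordinate space. The sketch below uses hypothetical coordinates, not the ones labeled in the figure.

```python
import math

def angle_at_client(client, s1, s2):
    """Angle in degrees (0 to 180) between the directions client->s1 and client->s2."""
    a1 = math.atan2(s1[1] - client[1], s1[0] - client[0])
    a2 = math.atan2(s2[1] - client[1], s2[0] - client[0])
    deg = abs(math.degrees(a1 - a2)) % 360
    # Fold reflex angles back into the 0..180 range.
    return 360 - deg if deg > 180 else deg

# Hypothetical coordinates: servers at right angles as seen from the client.
print(angle_at_client((0, 0), (1, 0), (0, 1)))
```

Intuitively, a small angle suggests that the two paths leave the client in similar directions and are more likely to share links, which would fit the shared-link example on the earlier slide.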

Estimations for Latency and Availability • Per-client latency: L(c, S) = dist(c, s) • Per-client availability: A(c, S) = 1 - F(c, S_1) × F(c, S_2 | S_1) × … × F(c, S_r | S_{r-1}) • Utility function U to combine latency and availability
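
The availability estimate is a chain of conditional failure probabilities: the client is cut off only if every replica is unreachable. A minimal sketch, with a hypothetical function name and made-up probabilities:

```python
def availability(first_failure, conditional_failures):
    """A(c, S) = 1 - F(c, S_1) * F(c, S_2 | S_1) * ... * F(c, S_r | S_{r-1}).

    first_failure:        F(c, S_1), failure probability of the first replica.
    conditional_failures: [F(c, S_2 | S_1), ..., F(c, S_r | S_{r-1})].
    """
    p_all_fail = first_failure
    for f in conditional_failures:
        p_all_fail *= f
    return 1 - p_all_fail

# Hypothetical numbers: 1% failure for the first replica, then 50% and 20%
# conditional failure for the second and third replicas.
print(availability(0.01, [0.5, 0.2]))
```

The example makes the trade-off visible: a second replica whose path is strongly correlated with the first (a conditional failure of 50%) adds far less availability than an independent one would.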

Simulation Settings • Java simulator • Traceroute and ping data collected from ~100 PlanetLab nodes for a month • Randomly select some nodes as servers • The rest are clients

Unavailability vs. Number of Replicas

Conclusion and Future Work • Conclusion - Improves the average user access latency by 35% - Improves the overall availability - Designs a utility function that takes into account both latency and availability • Future work - More realistic dataset - Better utility function - Non-exponential algorithm (Greedy…) - Take inter-datacenter cost into account

Questions? THANK YOU!
