
GlobalFS: A Strongly Consistent Multi-Site Filesystem

Leandro Pacheco, Raluca Halalai, Valerio Schiavoni, Fernando Pedone, Etienne Rivière, Pascal Felber. RainbowFS Workshop, May 3rd, 2017.


  1. GlobalFS: A Strongly Consistent Multi-Site Filesystem. Leandro Pacheco, Raluca Halalai, Valerio Schiavoni, Fernando Pedone, Etienne Rivière, Pascal Felber. RainbowFS Workshop, May 3rd, 2017.

  8. Distributed applications. Distributed storage: SQL databases, NoSQL databases, key-value storage, caches, coordination systems, file systems. File systems: easy interoperability for existing applications.

  9. Global infrastructure: Amazon’s AWS global infrastructure.

  10. CAP theorem. Weak consistency: lower latency; higher availability; possibly incorrect/unexpected results. Strong consistency: clear semantics and guarantees; easier to reason about; block instead of providing incorrect results.

  11. What is GlobalFS? A geographically distributed filesystem with a familiar interface (POSIX), strong consistency, fault-tolerance through replication, and flexible performance through locality.

  12. Overall design: separate data and metadata; partial replication; metadata protocol exploiting atomic multicast; causal reads.

  13. Separate data and metadata. Metadata: controls file contents, properties and filesystem structure. Immutable data: variable sized blobs. Metadata refers to data blobs (1 | 2 | 3 | 4 | …).
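
The split between mutable metadata and immutable blobs can be illustrated with a minimal Python sketch. It is not code from the GlobalFS prototype (which is written in Java): `BlobStore`, `FileMetadata` and `read_file` are hypothetical names, and deriving blob ids from a content hash is an assumption, since the slides do not say how ids are chosen.

```python
import hashlib
from dataclasses import dataclass, field


class BlobStore:
    """Hypothetical immutable data store: blobs are written once, never modified."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Assumption: the blob id is a content hash (not stated in the slides).
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(blob_id, data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


@dataclass
class FileMetadata:
    """Hypothetical metadata record: properties plus an ordered list of blob ids."""
    path: str
    mode: int = 0o644
    blob_ids: list = field(default_factory=list)


def read_file(meta: FileMetadata, store: BlobStore) -> bytes:
    # File contents are the concatenation of the blobs the metadata refers to.
    return b"".join(store.get(blob_id) for blob_id in meta.blob_ids)
```

Because blobs never change in this sketch, only the small metadata record needs consistent ordering of updates; the blobs themselves can be copied freely.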

  14. Partial replication. Immutable data is simple to replicate consistently. Metadata is partitioned between replica groups (i.e., partitions).

  20. Partial replication (diagram: regions EU, US and SA; a directory tree / with www, bin, etc and home; entries alice, bob and mark labelled US, SA and EU). Global multicast (global replication): costly updates, fast local reads. Local multicast: fast updates, local or remote reads.
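
The trade-off between global and local multicast can be sketched as a lookup from a path to the regions that replicate it. This is a hedged illustration in Python, not the prototype's mechanism: the `PLACEMENT` table, the longest-prefix rule, and the assignment of alice, bob and mark to US, SA and EU are all assumptions made for the example.

```python
# Hypothetical placement table: path prefix -> regions replicating that subtree.
PLACEMENT = {
    "/": ("EU", "US", "SA"),      # assumed globally replicated
    "/home/alice": ("US",),       # assumed local to one region
    "/home/bob": ("SA",),
    "/home/mark": ("EU",),
}


def regions_for(path):
    """Return the regions of the longest placement prefix that covers `path`."""
    best = "/"
    for prefix in PLACEMENT:
        covers = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if covers and len(prefix) > len(best):
            best = prefix
    return PLACEMENT[best]


def multicast_kind(path):
    """Updates to a single-region subtree use local multicast, others global."""
    return "local" if len(regions_for(path)) == 1 else "global"
```

Under these assumptions, an update to `/home/alice/notes.txt` stays within one region (fast update, remote read from elsewhere), while an update to `/etc/hosts` must reach all three regions (costly update, fast local read everywhere).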

  21. Partial ordering. GlobalFS exploits atomic multicast: atomic delivery to groups of processes. Partial ordering: messages for different groups don’t have to be ordered among themselves. Partial ordering is critical for scalability.
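
A small Python check can make the ordering requirement concrete. It is a simplified sketch, assuming hypothetical per-process delivery logs: it only verifies that any two processes deliver their common messages in the same relative order, which is a necessary condition and weaker than the full acyclic-order property of atomic multicast.

```python
from itertools import combinations


def consistent_order(logs):
    """Check that every pair of processes agrees on the order of shared messages.

    `logs` maps a process name to the list of message ids it delivered, in order.
    Messages delivered by only one of the two processes are unconstrained.
    """
    for log_a, log_b in combinations(logs.values(), 2):
        common = set(log_a) & set(log_b)
        if [m for m in log_a if m in common] != [m for m in log_b if m in common]:
            return False
    return True
```

For example, if `m1` goes only to one group and `m2` only to another, their relative order is never compared; only messages addressed to overlapping groups need to be ordered, which is the source of the scalability benefit the slide mentions.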

  22. Architecture. Components: application, client (FUSE), metadata replicas (atomic multicast), data store. The client sends read or update commands to the metadata replicas, and inserts or fetches immutable data in the data store.

  24. Consistent update operations. Step 1: write data blobs to the data store. Step 2: issue a metadata update. Three cases (request, atomic multicast, execution, reply): single-partition, e.g., write to file in G1; uncoordinated multi-partition, e.g., write to file in {G1, G2}; coordinated multi-partition, e.g., move file from G1 to G2.
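
The three cases on this slide can be told apart by the set of groups a metadata command addresses. The following is a minimal Python sketch, assuming a hypothetical dict-based command format (`type`, `groups`, `src`, `dst`) that is not taken from the GlobalFS prototype. Treating every move across groups as the coordinated case is a reading of the slide's single example, not a statement of the actual protocol.

```python
def destination_groups(op):
    """Groups that must deliver the metadata command (hypothetical command format)."""
    if op["type"] == "write":
        return frozenset(op["groups"])
    if op["type"] == "move":
        return frozenset({op["src"], op["dst"]})
    raise ValueError(f"unknown operation type: {op['type']}")


def classify(op):
    """Map a command to one of the three cases named on the slide."""
    groups = destination_groups(op)
    if len(groups) == 1:
        return "single-partition"
    if op["type"] == "move":
        # Assumption: the groups hold different state and must exchange it.
        return "coordinated multi-partition"
    # Assumption: every group can apply the same command independently.
    return "uncoordinated multi-partition"
```

In this sketch the data blobs are assumed to be already written (step 1), so the command only carries the metadata change of step 2.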

  27. Causal read operations. Causally related updates are seen in the same order, e.g., operations done by the same client. Example: Client A creates an image cat.jpg, then modifies a page pets.html to include the image cat.jpg. Without this guarantee, Client B could open the pets.html page and find a broken image reference: where is the cat?

  28. Causal read operations. Step 1: contact a metadata replica for a list of blob ids. Step 2: get the data from the data store. The approach is inspired by vector clocks: the vector is composed of one counter per replica group.
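
One way to picture a vector with one counter per replica group is sketched below in Python. The slides do not spell out the protocol, so the rule used here is an assumption: a replica may answer a read only once it has applied at least as many updates per group as the client has already observed, and the client then merges the replica's counters into its own. The names `can_serve` and `merge` are hypothetical.

```python
def can_serve(replica_applied, client_seen):
    """True if the replica has applied everything the client has already seen.

    Both arguments map a replica group name to a counter. Only the groups the
    replica tracks are compared; a group the client never saw counts as 0.
    """
    return all(
        replica_applied[group] >= client_seen.get(group, 0)
        for group in replica_applied
    )


def merge(client_seen, replica_applied):
    """Return the client's vector after a read: the per-group maximum."""
    merged = dict(client_seen)
    for group, counter in replica_applied.items():
        merged[group] = max(merged.get(group, 0), counter)
    return merged
```

With such a rule, a client that has seen the updated pets.html would not be served by a replica that has not yet applied the creation of cat.jpg; the replica would have to wait or the client would have to contact another one.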

  29. Evaluation. Complete prototype in Java (https://github.com/pacheco/GlobalFS). Filesystem in Userspace (FUSE). URingPaxos for atomic multicast. Global deployment using Amazon EC2.

  30. Maximum throughput by operation (figure: throughput in operations/sec for GlobalFS and CalvinFS, by operation: read 1KB, local create 1KB, local write 1KB, global create 1KB, global write 1KB). 3-region deployment: US west, US east and Europe.

  31. Geographical scalability (figure: normalized throughput for read 1KB, create and write 1KB with 1, 3, 6 and 9 regions, compared with the ideal). Normalized throughput per region as more regions are added. The 9-region deployment uses all EC2 regions available at the time.

  33. GlobalFS: Summary. Strong consistency at global scale. Simple and familiar API (POSIX). Flexible performance through partial replication and locality. Cheap causal read operations. Thank you! Leandro Pacheco, pachecol@usi.ch
