

1. Approche Algorithmique des Systèmes Répartis (AASR)
Guillaume Pierre, guillaume.pierre@irisa.fr
Based on a slide set by Maarten van Steen, VU Amsterdam, Dept. Computer Science
07b: Consistency & Replication (2/2)

2. Contents
Chapter 01: Introduction
02: Architectures
03: Processes
04: Communication (1/2)
04: Communication (2/2)
05: Naming (1/2)
05: Naming (2/2)
06: Synchronization (1/2)
06: Synchronization (2/2)
07: Consistency & Replication (1/2)
07: Consistency & Replication (2/2)

3–5. Web applications [figure slides]

6. Scaling relational databases
Relational databases have many benefits:
- A very powerful query language (SQL)
- Strong consistency
- Mature implementations
- Well understood by developers
- Etc.
But also a few drawbacks:
- Poor elasticity (the ability to change processing capacity easily)
- Poor scalability (the ability to process arbitrary levels of load)
- Poor behavior in the presence of network partitions

7. Elasticity of relational databases
Relational databases were designed in the 1970s:
- Designed for mainframes (a single expensive machine)
- Not for clouds (many weak machines being created/stopped at any time)
Master-slave replication:
- 1 master database processes and serializes all updates
- N slaves receive updates from the master and process all reads
- Designed mostly for fault tolerance, not performance
How can we add a replica at runtime?
1. Take a snapshot of the database (very well supported by relational databases)
2. Copy the snapshot onto the new replica
3. Apply all updates received since the snapshot
4. Add the new replica to the load-balancing group
This may take hours, depending on the size of the database.
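The four replica-addition steps can be simulated with a minimal in-memory sketch. All names here (`Master`, `Replica`, `add_replica`) are illustrative, not a real replication API:

```python
# In-memory simulation of adding a read replica at runtime.

class Master:
    def __init__(self):
        self.data = {}
        self.log = []          # serialized update log

    def update(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # Step 1: a consistent snapshot plus its position in the update log
        return dict(self.data), len(self.log)

class Replica:
    def __init__(self):
        self.data = {}

def add_replica(master, pool):
    replica = Replica()
    snap, pos = master.snapshot()        # 1. take a snapshot
    replica.data = snap                  # 2. copy it onto the new replica
    for key, value in master.log[pos:]:  # 3. replay updates since the snapshot
        replica.data[key] = value
    pool.append(replica)                 # 4. join the load-balancing group
    return replica
```

In a real system step 2 alone can move hundreds of gigabytes over the network, which is why the whole procedure can take hours.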

8. Scalability of relational databases
Assuming an unlimited number of machines, can we process arbitrary levels of load?
[Figure: throughput (transactions/second) vs. number of server machines, comparing CloudTPS+HBase+DAS3 against PostgreSQL+DAS3]
Problem: full replication
- Each replica must process every update
Solution: partial replication
- Each server contains a fraction of the total data
- Updates can be confined to a small number of machines

9. Sharding
Sharding = shared-nothing architecture:
- The programmer splits the database into independent partitions
- Customers A–M → Database server 1
- Customers N–Z → Database server 2
Advantage: scalability
- Each partition can work independently, without processing the updates of other partitions
Drawback: all the work is left to the developer
- Defining the partition criterion
- Routing requests to the correct servers
- Implementing queries which span multiple partitions
- Implementing elasticity
- Etc.
Implementing sharding correctly is very difficult!
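The alphabetic partition criterion from the slide can be sketched as a small request router; the server names are hypothetical:

```python
# Illustrative request router for an alphabetic sharding scheme:
# Customers A-M -> server 1, Customers N-Z -> server 2.

def shard_for(customer_name: str) -> str:
    """Route a customer to a database server by first letter."""
    initial = customer_name[0].upper()
    if "A" <= initial <= "M":
        return "db-server-1"
    if "N" <= initial <= "Z":
        return "db-server-2"
    raise ValueError(f"no shard defined for customer {customer_name!r}")
```

A query spanning both ranges (e.g., "all customers who ordered this week") gets no help from this router: the application must fan out to every server and merge the results itself, which is exactly the work sharding pushes onto the developer.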

10. The CAP Theorem
In a distributed system we want three important properties:
1. Consistency: readers always see the result of previous updates
2. Availability: the system always answers client requests
3. Partition tolerance: the system doesn't break down if the network gets partitioned
Brewer's theorem: you cannot get all three simultaneously
- You must pick at most two out of three
- Relational databases usually implement AC

11. NoSQL turns the problem upside down
NoSQL is designed with scalability in mind:
- The database must be elastic
- The database must be fully scalable
- The database must tolerate machine failures
- The database must tolerate network partitions
What's the catch?
- NoSQL must choose between AP and CP
- Most NoSQL systems choose AP: they do not guarantee strong consistency
- NoSQL systems do not support complicated queries
- They do not support the SQL language: only very simple operations!
Different NoSQL systems apply these principles differently.

12. NoSQL data stores rely on DHT techniques
NoSQL data stores split data across nodes...
- Excellent elasticity and scalability
...and replicate each data item on m nodes
- For fault tolerance
If the network gets partitioned: serve requests within each partition
- The system remains available
- But clients will miss updates issued in the other partitions (bad consistency)
- When the partition is resolved, updates from different partitions get merged
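A common DHT-style placement scheme hashes keys onto a ring of nodes and replicates each item on the m successor nodes. The sketch below is illustrative only: a production system would add virtual nodes and handle node churn.

```python
# Consistent-hashing ring placing each key on m successor nodes.
import hashlib
from bisect import bisect

def _h(s: str) -> int:
    """Map a string to a point on the ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, m=3):
        self.m = m
        self.ring = sorted((_h(n), n) for n in nodes)  # (position, node)

    def replicas(self, key: str):
        """The m nodes responsible for a key: its successor on the ring
        plus the next m-1 nodes (wrapping around)."""
        start = bisect(self.ring, (_h(key), "")) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(min(self.m, len(self.ring)))]
```

Because placement depends only on hashes, adding or removing a node moves just the keys adjacent to it on the ring, which is what gives these stores their elasticity.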

13. The two meanings of "Consistency"
1. For database experts: consistency = referential integrity within a single database
- To make things simple: unique keys are really unique, foreign keys map onto something, etc.
- This is the "C" from ACID
2. For distributed-systems experts: consistency = a property of replicated data
- To make things simple: all copies of the same data item seem to have the same value at any time

14. Flexible consistency models
Some NoSQL data stores let users choose the level of consistency they want:
- Replicate each data item over N servers
- Associate each data item with a timestamp
- Issue writes on all servers; consider a write successful when m servers have acknowledged it
- Read data from at least n servers (and return the freshest version to the client)
If m + n > N then we have strong consistency
- For example: m = N, n = 1
- But other possibilities exist: m = 1, n = N
- Or anything in between: m = ⌊N/2⌋ + 1, n = ⌊N/2⌋ + 1
If m + n ≤ N then we have weak consistency
- Faster
- Example: Amazon Dynamo
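A toy version of this m/n quorum scheme makes the overlap argument concrete: when m + n > N, every read quorum intersects every write quorum in at least one replica, so reads always see the freshest acknowledged write. The class below is a deliberately simplified sketch, not any real store's API:

```python
# Toy quorum replication: writes succeed after m acks, reads consult n
# replicas and keep the value with the highest timestamp.

class QuorumStore:
    def __init__(self, N, m, n):
        assert 1 <= m <= N and 1 <= n <= N
        self.m, self.n = m, n
        self.replicas = [dict() for _ in range(N)]  # key -> (timestamp, value)
        self.clock = 0

    def write(self, key, value):
        self.clock += 1
        # Only the first m replicas acknowledge before we report success;
        # the remaining N-m would be updated asynchronously (omitted here).
        for rep in self.replicas[:self.m]:
            rep[key] = (self.clock, value)

    def read(self, key):
        # Read from the last n replicas; with m + n > N this set overlaps
        # the write quorum, so the freshest version is always among them.
        versions = [rep[key] for rep in self.replicas[-self.n:] if key in rep]
        return max(versions)[1] if versions else None
```

With N=5, m=3, n=3 (majority quorums) a read is guaranteed to observe the latest write; with m=1, n=1 the read set can miss the write set entirely, which is the weak-consistency case.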

15. Why do people use NoSQL? [figure slide]

16. Flexible data schemas
In NoSQL data stores there is no need to impose a strict data schema
- The data store treats each row as a (key, value) pair anyway
- No requirement on the structure of the value ⇒ no fixed data schema
- Not the same as empty values!
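Plain dictionaries are enough to illustrate the difference between a schemaless value and an empty column (the field names below are made up):

```python
# Schemaless rows: each value may carry different fields, with no
# ALTER TABLE needed when a new field appears.

store = {}
store["customer:1"] = {"name": "Alice", "email": "alice@example.com"}
store["customer:2"] = {"name": "Bob", "phone": "+33 1 23 45 67 89",
                       "loyalty_points": 120}  # extra fields, same "table"

# A missing field is genuinely absent, not a NULL/empty column:
assert "phone" not in store["customer:1"]
```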

17. Scaling the database tier
[Table comparing Repl. SQL (e.g., MySQL), Sharding, and NoSQL (e.g., Bigtable) on four criteria: Scalability, Complex queries, Fault tolerance, Consistency]

18. Consistency issues in NoSQL databases
NoSQL databases scale because of heavy data partitioning
- Minimum coordination between partitions
Consistency (worst case): eventual consistency
- Updates become visible at some point in the future
- Multiple updates are propagated independently of each other
- E.g., Amazon's SimpleDB
Consistency (best case): single-row transactions
- Transactional updates to a single database row
- No support for multi-row transactions
- E.g., Google's Bigtable, Cassandra, etc.
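The "best case" can be pictured as a row that applies updates atomically under its own lock; a sketch in that spirit (the `Row`/`transact` names are hypothetical, not a real store's API):

```python
# Single-row "transaction": atomic update of one row's fields.
import threading

class Row:
    def __init__(self, fields):
        self.fields = dict(fields)
        self._lock = threading.Lock()

    def transact(self, fn):
        """Apply fn atomically to this row's fields."""
        with self._lock:
            fn(self.fields)

account = Row({"balance": 100})
account.transact(lambda f: f.update(balance=f["balance"] - 30))
```

Note the limit this illustrates: debiting one row and crediting another would require two independent single-row transactions, and nothing makes the pair atomic, which is exactly the multi-row case these stores do not support.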

19. Position
We can guarantee multi-row transactions in NoSQL databases without compromising their scalability or fault-tolerance properties.
The secret: exploit the properties of Web applications
- Transactions are short-lived
- Transactions span a limited number of well-identified data items
Question: in fact, this statement cannot be entirely true. Why?

20. Availability vs. Consistency
Strictly speaking, it is impossible to fulfill my promises entirely
- The CAP theorem states that one cannot support strong Consistency and high Availability in the presence of network Partitions
- A scalable system necessarily faces occasional partitions
NoSQL databases favor high availability
- And deliver best-effort consistency
CloudTPS focuses on consistency first
- At the cost of unavailability in extreme failure/partition cases
- Note: a machine failure is not an extreme case...

21–23. System Model [figure slides]

24. Atomicity
Atomicity: all operations succeed or none of them does
- No partially executed transactions!
Solution: 2-phase commit across the LTMs (local transaction managers) which hold the relevant data items

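The 2-phase commit solution can be sketched as follows; recovery logs, timeouts, and coordinator failure are omitted, and all names are illustrative:

```python
# Minimal two-phase commit across the transaction managers that hold a
# transaction's data items.

class LTM:
    """A local transaction manager holding one partition of the data."""
    def __init__(self):
        self.data, self.pending = {}, {}

    def prepare(self, tid, writes):
        # Phase 1: stage the writes and vote commit/abort.
        self.pending[tid] = writes
        return True  # a real LTM could vote False (e.g., conflict detected)

    def commit(self, tid):
        # Phase 2: make the staged writes visible.
        self.data.update(self.pending.pop(tid))

    def abort(self, tid):
        self.pending.pop(tid, None)

def two_phase_commit(tid, participants):
    """participants: list of (ltm, writes) pairs for one transaction."""
    if all(ltm.prepare(tid, writes) for ltm, writes in participants):
        for ltm, _ in participants:   # every participant voted yes
            ltm.commit(tid)
        return True
    for ltm, _ in participants:       # any "no" vote aborts everywhere
        ltm.abort(tid)
    return False
```

Either every LTM applies its writes or none does, which is precisely the all-or-nothing guarantee atomicity demands.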

  30. Consistency Consistency: Each transaction leaves the database in an internally consistent state “Consistency” in this context means: logical consistency of different data items Very different than consistency of a single replicated data item Solution: we assume that transactions are semantically correct 30 / 39

31. Isolation
Isolation: the system behaves as if transactions were processed sequentially
- If the system allows concurrent transactions, then conflicting transactions must be serialized
Solution: timestamp ordering

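A basic form of timestamp ordering can be sketched as follows: each transaction receives a timestamp when it starts, and any operation that would arrive "in the past" of a younger transaction's access is aborted. This is a textbook sketch, not CloudTPS's actual implementation:

```python
# Basic timestamp-ordering concurrency control.

class TOAbort(Exception):
    """Raised when an operation would violate timestamp order."""

class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:           # a younger transaction already wrote
        raise TOAbort()
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        raise TOAbort()              # would invalidate a younger access
    item.write_ts, item.value = ts, value
```

Conflicting operations are thus forced into timestamp order, yielding an execution equivalent to running the transactions sequentially by timestamp.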
