Towards Automated Polyglot Persistence Michael Schaarschmidt, Felix - PowerPoint PPT Presentation

  1. Towards Automated Polyglot Persistence Michael Schaarschmidt, Felix Gessert, Norbert Ritter gessert@informatik.uni-hamburg.de

  2. Polyglot Persistence: Current best practice
     [Diagram: the application layer keeps each kind of data in its own system: billing data, nested application data, session data, files, friend network, cached data & metrics, search index, recommendation engine. Labelled services include Google Cloud Storage and Amazon Elastic MapReduce.]

  3. Polyglot Persistence: Current best practice
     Research Question: Can we automate the mapping problem (data → database)?
     [Diagram: same application-layer picture as on slide 2.]

  4. Vision: Schemas can be annotated with requirements
     - Write Throughput > 10,000 RPS
     - Read Availability > 99.9999%
     - Scans = true
     - Full-Text-Search = true
     - Monotonic Read = true
     [Diagram: Schema → DBs → Tables → Fields]
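
One way to picture such an annotated schema is a small tree of nodes that each carry a list of requirements. This is only a minimal sketch, not the authors' data model: the class names `Requirement` and `SchemaNode`, the metric names, and the example table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Requirement:
    """One annotation such as 'write throughput > 10000' (hypothetical shape)."""
    metric: str
    operator: str                 # one of ">", "<", "="
    value: Union[float, bool]


@dataclass
class SchemaNode:
    """A database, table, or field that can carry annotations."""
    name: str
    kind: str                     # "database", "table", or "field"
    requirements: List[Requirement] = field(default_factory=list)
    children: List["SchemaNode"] = field(default_factory=list)


# The five example annotations from the slide, attached to a made-up table.
table = SchemaNode(
    name="ExampleTable",
    kind="table",
    requirements=[
        Requirement("write_throughput_rps", ">", 10_000),
        Requirement("read_availability", ">", 0.999999),
        Requirement("scans", "=", True),
        Requirement("full_text_search", "=", True),
        Requirement("monotonic_read", "=", True),
    ],
)
schema = SchemaNode(name="ExampleDB", kind="database", children=[table])
```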

  5. Vision: The Polyglot Persistence Mediator chooses the database
     [Diagram: the application sends data and operations to the Polyglot Persistence Mediator, which uses the annotated schema (e.g. Latency < 30ms) and database metrics to choose among db1, db2, db3.]

  6. Towards Automated Polyglot Persistence: Necessary steps
     Goal:
     ◦ Extend classic workload management to polyglot persistence
     ◦ Leverage heterogeneous (NoSQL) databases
     Steps:
     1. Requirements: Tenant specifies requirements as Service-Level-Agreements
     2. Resolution: Find or provision a suitable combination of databases
     3. Mediation: Mediate data and database operations

  7. Service Level Agreements: Expressing application requirements
     Functional Service Level Objectives
     ◦ Guarantee a "feature"
     ◦ Determined by database system
     ◦ Examples: transactions, join
     Non-Functional Service Level Objectives
     ◦ Guarantee a certain quality of service (QoS)
     ◦ Determined by database system and service provider
     ◦ Examples:
       - Continuous: response time (latency), throughput
       - Binary: Elasticity, Read-your-writes

  8. Service Level Agreements: Refining the utility of each SLO
     Utility expresses the "value" of a continuous non-functional requirement: $f_{utility}: metric \to [0,1]$
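
A utility function of this kind could look as follows for latency. This is a sketch under assumptions: the slides do not give the shape of the function, so the piecewise-linear form and the thresholds `best_ms` and `worst_ms` are hypothetical and are not meant to reproduce the utility values shown later in the deck.

```python
def latency_utility(latency_ms: float, best_ms: float = 10.0, worst_ms: float = 60.0) -> float:
    """Map a measured latency to a utility in [0, 1].

    Full utility at or below best_ms, zero utility at or above worst_ms,
    and a linear decrease in between (assumed shape).
    """
    if latency_ms <= best_ms:
        return 1.0
    if latency_ms >= worst_ms:
        return 0.0
    return (worst_ms - latency_ms) / (worst_ms - best_ms)
```

Any monotone mapping into [0, 1] would serve the same purpose; the linear ramp is simply the easiest one to read.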

  9. SLA Example for MongoDB
     Functional Requirements: Scan Queries, Transactions, Conditional Updates, Joins, Query by Example, Analytics
     Non-Functional Requirements: Scalability of Data Volume, Write Scalability, Read Scalability, Elasticity, Read-Availability, Consistency, Write-Availability, Durability, Read-Latency, Write-Throughput, Write-Latency

  10. Step I - Requirements: Expressing the application's needs
      Tenant annotates schema with his requirements:
      1. Define schema
      2. Annotate
      Annotations:
      ◦ Continuous non-functional, e.g. write latency < 15ms
      ◦ Binary functional, e.g. Atomic updates
      ◦ Binary non-functional, e.g. Read-your-writes
      [Diagram: a Database with two Tables and four Fields; one element is labelled "annotated", another "Inherits continuous annotations".]

  12. Step II - Resolution: Finding the best database
      The Provider resolves the requirements:
      ◦ RANK: scores available database systems
      ◦ Routing Model: defines the optimal mapping from schema elements to databases
      Provider workflow (input: capabilities for available DBs):
      1. Find optimal RANK(schema_root, DBs) through recursive descent using annotated schema and metrics
      2a. If unsatisfiable, either refuse or provision new DB
      2b. Generates routing model
      Routing Model:
      ◦ Route schema_element → db
      ◦ Transform db-independent to db-specific operations

  13. DBs = { MongoDB, Riak, DBs Step II II - Resolution Cassandra, CouchDB, Redis, MySQL, S3, Hbase } Ranking alg lgorithm by by example Annotations R ANK Algorithm Schema No annotation  Lineariza- ECommerceDB bility recursive descent to child database Availability Customers Table ShoppingBasket UserName List<String> String Read latency

  14. Step II - Resolution: Ranking algorithm by example
      RANK algorithm: Binary requirement →
      1. Exclude DBs that do not support it
      2. Recursive descent
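
The exclusion step for binary requirements amounts to a subset check over capability sets. The sketch below assumes that each database is described by a set of supported features; the function name `exclude_unsupported`, the database names `db1` to `db3`, and their capability sets are placeholders and say nothing about real systems.

```python
from typing import Dict, Set


def exclude_unsupported(capabilities: Dict[str, Set[str]], required: Set[str]) -> Dict[str, Set[str]]:
    """Keep only the databases whose capabilities cover every binary requirement."""
    return {db: caps for db, caps in capabilities.items() if required <= caps}


# Placeholder capability sets, invented for illustration only.
capabilities = {
    "db1": {"linearizability", "scans"},
    "db2": {"scans"},
    "db3": {"linearizability"},
}

candidates = exclude_unsupported(capabilities, {"linearizability"})
print(sorted(candidates))  # ['db1', 'db3']
```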

  15. Step II - Resolution: Ranking algorithm by example
      RANK algorithm: Continuous requirement (Availability) → for all databases calculate $db \to f_{utility}(db.availability)$
      Database   Availability   Utility
      MongoDB    99%            0.8
      Redis      95%            0.05
      MySQL      94%            0.04
      HBase      99.9%          0.9

  16. Step II - Resolution: Ranking algorithm by example
      RANK algorithm: Continuous requirement (Read latency) → for all databases calculate $db \to f_{utility}(db.latency)$
      Database   Availability   Utility   Latency   Utility
      MongoDB    99%            0.8       10ms      1
      Redis      95%            0.05      1ms       1
      MySQL      94%            0.04      40ms      0.2
      HBase      99.9%          0.9       50ms      0.1

  17. Step II - Resolution: Ranking algorithm by example
      RANK algorithm: Binary requirement →
      1. Exclude DBs that do not support it
      2. Recursive descent
      3. Pick DB with best total score and add it to routing model
      DB        Score
      MongoDB   0.9
      Redis     0.525
      MySQL     0.12
      HBase     0.5
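
The total scores on this slide are consistent with taking the unweighted mean of the two utility values listed for each database on the preceding slides. The sketch below makes that assumption explicit; the slides do not state the aggregation rule, and the variable names are illustrative.

```python
from statistics import mean

# Utility values as listed on the preceding slides: (availability, latency).
utilities = {
    "MongoDB": (0.8, 1.0),
    "Redis": (0.05, 1.0),
    "MySQL": (0.04, 0.2),
    "HBase": (0.9, 0.1),
}

# Assumption: total score = unweighted mean of the per-requirement utilities.
scores = {db: mean(values) for db, values in utilities.items()}

# Step 3 of the RANK algorithm: pick the database with the best total score.
best = max(scores, key=scores.get)

for db, score in scores.items():
    print(f"{db}: {score:.3f}")
print("best:", best)
```

Under this assumption the computed scores agree with the slide (0.9, 0.525, 0.12, 0.5) and MongoDB is selected.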

  18. Step II - Resolution: Ranking algorithm by example
      DB scores: MongoDB 0.9, Redis 0.525, MySQL 0.12, HBase 0.5
      Result of step 3 (pick DB with best total score and add it to routing model):
      Routing Model: Customers → MongoDB

  19. Step III - Mediation: Routing data and operations
      The PPM (Polyglot Persistence Mediator) routes data:
      ◦ Operation Rewriting: translates from abstract to database-specific operations
      ◦ Runtime Metrics: latency, availability, etc. are reported to the resolver
      ◦ Primary Database Option: all data periodically gets materialized to a designated database
      [Diagram: 1. the application sends CRUD operations, queries, transactions, etc. to the Polyglot Persistence Mediator, which uses the Routing Model, reports metrics, and triggers periodic materialization; 2. the mediator routes to db1, db2, db3.]
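
The routing part of mediation can be sketched as a lookup in the routing model followed by a call to a database-specific handler. This is a toy illustration, not the ORESTES prototype: the `Mediator` class, its `execute` method, the handler signature, and the in-memory fake backend are all hypothetical, and the call counter only stands in for real runtime metrics.

```python
from typing import Any, Callable, Dict

Handler = Callable[[str, str, Any], Any]


class Mediator:
    """Routes abstract operations to the database named in the routing model."""

    def __init__(self, routing_model: Dict[str, str], backends: Dict[str, Handler]) -> None:
        self.routing_model = routing_model       # schema element -> database name
        self.backends = backends                 # database name -> operation handler
        self.call_counts: Dict[str, int] = {}    # stand-in for runtime metrics

    def execute(self, element: str, operation: str, payload: Any = None) -> Any:
        db = self.routing_model[element]
        self.call_counts[db] = self.call_counts.get(db, 0) + 1
        # The handler is where an abstract operation would be rewritten
        # into a database-specific one.
        return self.backends[db](element, operation, payload)


def fake_backend(name: str) -> Handler:
    """In-memory placeholder for a real database driver."""
    def handle(element: str, operation: str, payload: Any) -> str:
        return f"{name} handled {operation} on {element}"
    return handle


# Routing model from the ranking example: Customers -> MongoDB.
mediator = Mediator({"Customers": "MongoDB"}, {"MongoDB": fake_backend("MongoDB")})
result = mediator.execute("Customers", "read")
```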

  20. Evaluation: News Article
      Prototype built on ORESTES
      Scenario: news articles with impression counts
      Objectives: low-latency top-k queries, high-throughput counts, article queries
      [Diagram labels: Article, Counter]

  21. Evaluation: News Article
      [Diagram label: Mediator]

  22. Evaluation: News Article
      [Diagram label: Mediator]
      Counter updates kill performance

  23. Evaluation: News Article
      [Diagram label: Mediator]

  24. Evaluation: News Article
      [Diagram label: Mediator]
      No powerful queries

  25. Evaluation: News Article
      Found Resolution
      [Diagram labels: Article (ID, Title, …) → Document; Imp. (ID, Imp.) → Sorted Set]

  26. Challenges & Future Work
      ◦ Workload Management: during mediation actively schedule requests based on requirements
      ◦ Ranking: Predict future metrics from historic ones (time-series analysis) or from performance models
      ◦ Database selection: minimize $P(SLA\ violation) \cdot penalty$ (e.g. through reinforcement learning)
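
The database-selection objective can be read as an expected-cost comparison. The sketch below is a plain argmin over fixed estimates, not the reinforcement-learning approach the slide mentions; the violation probabilities, the penalty value, and the names `db1` to `db3` are invented placeholders.

```python
def expected_penalty(p_violation: float, penalty: float) -> float:
    """Expected cost of an SLA violation: P(SLA violation) * penalty."""
    return p_violation * penalty


# Placeholder estimates: database -> estimated probability of violating the SLA.
violation_probability = {"db1": 0.10, "db2": 0.02, "db3": 0.25}
penalty = 100.0  # hypothetical cost per violation

# Choose the database with the smallest expected penalty.
choice = min(
    violation_probability,
    key=lambda db: expected_penalty(violation_probability[db], penalty),
)
print(choice)  # db2
```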
