Information Systems Group OctopusDB Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal Supervisor: Prof. Dr. Jens Dittrich May 31, 2010
Database Landscape Information Systems Group OLAP Streaming Archival System System Search OLTP Engine May 31, 2010 Alekh Jindal 2
Database Landscape Information Systems Group Information System OLAP Streaming Archival System System Search OLTP Engine May 31, 2010 Alekh Jindal 2
Example: Banking Information Systems Group Drawbacks Scenario Relational Table Integration Cost OLAP Column Layout Several Applications Licensing Cost Evolving Applications Maintenance Cost DBA Cost Business Analytics Engineering Cost Relational Table Archival OLTP System Row Layout Relational Table Zoo Row Layout Real-time Transactions Older Transactions Required Performance Eventual Integration Hard-coded optimizations ETL style data pipelines Hard-coded data layouts Streaming Search Web Pages Flat File System Engine Posting Lists Raw Text Stock Trading Customer Service Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 3
Hard-coded Data Layouts Information Systems Group Fractured Row Column Mirrors Workload Fraction of Tuple Type Attribute Selectivity Bad Good Good Query 0.2 0.001 Good Bad Good Query 1.0 0.1 Bad Bad Bad Query 0.75 1.0 Good Bad Bad Update 1.0 0.1 Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 4
Hard-coded Data Layouts Information Systems Group Fractured Inflexibility Row Column Mirrors Workload Fraction of Tuple Type Attribute Selectivity Bad Good Good Query 0.2 0.001 Good Bad Good Query 1.0 0.1 Bad Bad Bad Query 0.75 1.0 Good Bad Bad Update 1.0 0.1 Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 4
OctopusDB Core Idea Information Systems Group • No fixed store • Why not have a flexible storage depending on the workload • Pick the storage appropriate for the use-case • Emulate a variety of systems Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 5
OctopusDB Information Systems Group Mimic Several Systems Holistic Storage View Optimizer Storage Views Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 6
System Architecture Information Systems Group API Primary Log Store Storage View Store Result Log SV Purging & Checkpointing Recovery Manager Query Catalog Storage View Catalog Holistic SV Optimizer Transaction Manager Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 7
Storage Views Information Systems Group • No hard-coded store in OctopusDB • All operations recorded in a primary log on stable storage using WAL • Storage Views: arbitrary physical representations • Different storage layouts under a single umbrella Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 8
Storage View Examples Information Systems Group Primary Secondary • Partial Index SV • Log SV • Bag-partitioned SV • Row SV • Key-consolidated SV • Column SV • Vertically/Horizontally Partitioned SV • Index SV • ... any hybrid combination of the above Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 9
Use-case Scenario* Information Systems Group • Flight booking system • Tables: Tickets , Customers • Tickets : several attributes, frequently updated • Customers : fewer attributes • Queries: SELECT C.* FROM Tickets T, Customers C WHERE T.customer_id=C.id AND T.a 1 =x 1 AND T.a 2 =x 2 ... AND T.a n =x n * Inspired from Unterbrunner et al. in PVLDB, 2009. Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 10
Flight Booking System Information Systems Group customers, 01, <tom, 25, tom@abc.com, ...> SELECT C.* customers, 02, <marc, 23, marc@abc.com, ...> FROM Tickets T , Customers C customers 03, <felix, 20, felix@abc.com, ...> WHERE T.customer_id=C.id Customers customers, 03, <felix, 20, felix@xyz.com, ...> AND T.a 1 =x 1 .... AND T.a n =x n ..... tickets, 301, <paris, rome, E,...> Tickets tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... tickets.customer_id σ a 1 =x 1 .... a n =x n π customer.* ( ( )) customers.id Result Log SV customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... ..... Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 11
Bag-partitioning Information Systems Group customers, 01, <tom, 25, tom@abc.com, ...> SELECT C.* customers, 02, <marc, 23, marc@abc.com, ...> FROM Tickets T , Customers C customers 03, <felix, 20, felix@abc.com, ...> WHERE T.customer_id=C.id Customers customers, 03, <felix, 20, felix@xyz.com, ...> AND T.a 1 =x 1 .... AND T.a n =x n ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 301, <paris, rome, E,...> Tickets tickets, 303, <tokyo, beijing, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> tickets, 303, <tokyo, beijing, E,...> ..... tickets, 303, <tokyo, beijing, B,..> ..... tickets log σ bag=tickets Log SV Result Log SV tickets.customer_id ) ) ( σ σ ( π Log SV customer.id a 1 =x 1 ... a n =x n bag=customers * s . e r m o t u s c customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers log tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> customers, 01, <tom, 25, tom@abc.com, ...> tickets, 303, <tokyo, beijing, E,...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... ..... ..... Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 12
Key-consolidation Information Systems Group customers, 01, <tom, 25, tom@abc.com, ...> SELECT C.* customers, 02, <marc, 23, marc@abc.com, ...> FROM Tickets T , Customers C customers 03, <felix, 20, felix@abc.com, ...> WHERE T.customer_id=C.id Customers customers, 03, <felix, 20, felix@xyz.com, ...> AND T.a 1 =x 1 .... AND T.a n =x n ..... tickets, 301, <paris, rome, E,...> tickets, 301, <paris, rome, E,...> Tickets tickets, 302, <moscow, berlin, B,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> tickets, 303, <tokyo, beijing, E,...> ..... tickets, 303, <tokyo, beijing, B,..> ..... )) σ ( ( γ Γ bag=tickets Log SV bag,key recent Result Log SV tickets.customer_id ) ) γ ( ( σ ( σ Γ bag,key ( )) π Log SV recent customer.id a 1 =x 1 ... a n =x n bag=customers * s . e r m o t u s c customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers, 01, <tom, 25, tom@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... ..... ..... Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion May 31, 2010 Alekh Jindal 13
Recommend
More recommend