Database Learning
Building databases that become smarter over time
Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari
University of Michigan, Ann Arbor
Today’s Databases
[Diagram: Users send a query to the Database; the Database returns the answer to the query.]
After answering queries, THE WORK is almost completely WASTED.
Small exceptions:
• Caching
• Identical queries
• Indexing/Materialization hints
Our Goal: reuse the work.
A New Paradigm in AQP Setting
[Diagram: Users send queries Q 1 ... Q n to the Database and receive answers A 1 ... A n. Past pairs (Q i, A i) are kept in a Query Synopsis, which feeds Learning.]
Answering a query on the approximate database alone:
• Fast, Inaccurate: Q n+1 (10% err) returns A n+1 (10% err, 1 sec)
• Slow, Accurate: Q i (1% err) returns A i (1% err, 10 sec)
With Database Learning: Q n+1 (1% err) returns A n+1 (1% err, 1 sec)
1. User: enjoys 1% error bound in 1 second!
2. Formally, always more accurate
3. Popularity of analytic workloads ⇒ Approximate solutions
• BlinkDB, SnappyData, Yahoo Druid, Facebook Presto, Infobright, etc.
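The slides do not say how past answers are combined with a new approximate answer, so the following is only a minimal sketch of one standard technique, inverse-variance weighting, that illustrates why a combined answer can be "always more accurate". It assumes both estimates are unbiased and independent, which may not hold for answers computed on the same sample. The names `Estimate`, `combine`, and `combine_all`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Estimate:
    """An approximate answer together with the variance of its error."""
    value: float
    variance: float


def combine(raw: Estimate, learned: Estimate) -> Estimate:
    """Inverse-variance weighting of two independent, unbiased estimates.

    The combined variance is 1 / (1/v_raw + 1/v_learned), which is never
    larger than either input variance.
    """
    w_raw = 1.0 / raw.variance
    w_learned = 1.0 / learned.variance
    total = w_raw + w_learned
    value = (w_raw * raw.value + w_learned * learned.value) / total
    return Estimate(value, 1.0 / total)


def combine_all(raw: Estimate, past: Iterable[Estimate]) -> Estimate:
    """Fold any number of past-derived estimates into the raw answer."""
    result = raw
    for estimate in past:
        result = combine(result, estimate)
    return result


# Hypothetical numbers: a fast but noisy raw answer for the new query,
# and estimates for the same quantity inferred from past answers.
raw = Estimate(value=104.0, variance=25.0)
learned = Estimate(value=100.0, variance=4.0)

improved = combine(raw, learned)
more_history = combine_all(raw, [learned, Estimate(value=101.0, variance=4.0)])

print(improved.variance <= min(raw.variance, learned.variance))
print(more_history.variance < improved.variance)
```

Under these assumptions, each additional past estimate can only shrink the variance, which mirrors the claim that more past queries lead to more accurate answers.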
From Machine Learning To Database Learning
Machine Learning: Past Observations ⇒ Future Predictions
Database Learning: Past Answers ⇒ Future Answers
The more past queries, the more Accurate and Faster