Computation Reuse in Analytics Job Service at Microsoft Alekh Jindal, Shi Qiao, Hiren Patel, Zhicheng Yin, Jieming Di, Malay Bag, Marc Friedman, Yifung Lin, Konstantinos Karanasos, Sriram Rao Microsoft
A brief history of Views
[Timeline figure; labels include: First VLDB, First SIGMOD, Logical Database Design, Optimizing Queries, Knowledge Bases, View Maintenance, Incremental, Partial, Materialized Views, Dynamic, View Selection, XML View, XQuery]
Typical Materialized View Assumptions:
• Tuning few databases
• Relatively static data with some updates
• Views materialized a priori and offline
• Accurate estimates of utility/cost of view materialization
What’s new: Analytics-as-a-Service!
Also, Job Service or Serverless Analytics:
• Users are not required to manage h/w or s/w
• Users only provide SQL queries over stored data
• Service provider takes care of the execution
• Users only pay for the processing cost
Typical Materialized View Assumptions:
• Tuning few databases
• Relatively static data with some updates
• Views materialized a priori and offline
• Accurate estimates of utility/cost of view materialization
SCOPE Job Service at Microsoft:
• ~10^5 machines
• ~10^5 analytical jobs
• ~10^3 developers across Microsoft
• ~EBs data processed per day
Experience from SCOPE Job Service:
• Cluster-wide computation overlaps
• Recurring jobs with new inputs
• Always online with SLA requirements
• Cost estimations very challenging
Reassigning Passengers to Planes in Mid-Air: Boston -> Paris -> Tokyo; Boston -> Paris; Boston -> Tokyo
CloudViews Overview • Assumption: Recurring Workloads • Assumption: Exact Subexpression Matches
CloudViews Overview [Architecture diagram; components: User Interfaces & Tooling, Job Coordination, Synchronization, Recurring Workload, Feedback Loop, Metadata Service, View Sel., Phy. Design, Expiry, Online Materialization, Rewrite queries using Views]
Recurring Workloads • Periodic queries with different inputs and parameters • Structured/unstructured data; custom user code [Figure: queries Q1, Q2 recurring as Q1’, Q2’ and Q1’’, Q2’’ with signatures sig, sig’, sig’’; timeline labels 8:00 am, 9:00 am, 10:00 am and June 5, 6, 7, 2018; phases labeled Analysis and Reuse]
Reuse over Recurring Workloads • Problem: detect/reuse common subexpressions when new data arrives in each recurring interval • Solution: precise/normalized query signatures
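A minimal sketch of the two kinds of signatures, in Python. The slide only names them, so the definitions here are assumptions: a precise signature hashes the whole subexpression including per-instance inputs (such as a dated input path), while a normalized signature omits those per-instance parts so that the same subexpression matches across recurring intervals. The `Op` class and its fields are hypothetical, not SCOPE's actual plan representation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical logical operator node (not the SCOPE representation)."""
    name: str              # operator kind, e.g. "Scan" or "Filter"
    template: str = ""     # parts that stay the same in every recurrence
    instance: str = ""     # parts that change per recurrence (dates, paths)
    children: tuple = ()


def _digest(*parts: str) -> str:
    return hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]


def precise_sig(op: Op) -> str:
    """Assumed definition: hash everything, including per-instance inputs."""
    return _digest(op.name, op.template, op.instance,
                   *(precise_sig(c) for c in op.children))


def normalized_sig(op: Op) -> str:
    """Assumed definition: same hash, but ignoring per-instance inputs."""
    return _digest(op.name, op.template,
                   *(normalized_sig(c) for c in op.children))


def daily_query(day: str) -> Op:
    # The same filter-over-scan subexpression, run on a different day's input.
    return Op("Filter", "clicks > 10",
              children=(Op("Scan", "logs", instance=day),))


june5, june6 = daily_query("2018-06-05"), daily_query("2018-06-06")
print(normalized_sig(june5) == normalized_sig(june6))  # same recurring template
print(precise_sig(june5) == precise_sig(june6))        # different concrete inputs
```

Under these assumptions, the normalized signature identifies which subexpressions recur, and the precise signature identifies which concrete result can actually be reused within one interval.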
Metadata Service • Materialized view lookup • Consistent view materialization • Quick view discovery
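One way to read the three bullets above is as a small lookup-and-claim interface. The sketch below is an in-memory illustration only: the class name, method names, and the locking scheme are assumptions, and the real service is a separate online component whose API the slides do not show. Views are keyed by a signature string; at most one job may claim the right to build a given view, which is how the sketch approximates "consistent view materialization".

```python
import threading


class MetadataService:
    """Hypothetical in-memory stand-in for a view metadata service."""

    def __init__(self):
        self._lock = threading.Lock()
        self._views = {}     # signature -> path of a finished materialized view
        self._builders = {}  # signature -> id of the job currently building it

    def lookup(self, sig):
        """Return the view path for this signature, or None if not available."""
        with self._lock:
            return self._views.get(sig)

    def try_acquire_build(self, sig, job_id):
        """Claim the exclusive right to materialize the view; False if taken."""
        with self._lock:
            if sig in self._views or sig in self._builders:
                return False
            self._builders[sig] = job_id
            return True

    def publish(self, sig, job_id, path):
        """Make the view visible to other jobs once the builder has written it."""
        with self._lock:
            if self._builders.get(sig) != job_id:
                raise ValueError("job does not hold the build claim for this view")
            del self._builders[sig]
            self._views[sig] = path


svc = MetadataService()
first = svc.try_acquire_build("sig-1", "job-A")   # job-A wins the claim
second = svc.try_acquire_build("sig-1", "job-B")  # job-B must not build it too
svc.publish("sig-1", "job-A", "/views/sig-1")
```

After `publish`, any later job that computes the same signature gets the path back from `lookup` and can read the view instead of recomputing it.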
Query Rewriting / Online Materialization • Query Rewriting using Views • Online View Materialization
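The two steps can be sketched together as a single top-down pass over a query plan, assuming exact subexpression matches as stated in the overview. Everything here is illustrative: plans are plain nested tuples `(op, args, children)`, `sig` is a stand-in signature function, and `ViewScan` and `Materialize` are hypothetical operator names rather than the optimizer's own.

```python
import hashlib


def sig(plan):
    """Stand-in signature: a hash of the whole subexpression (exact match)."""
    return hashlib.sha256(repr(plan).encode()).hexdigest()[:12]


def rewrite(plan, available, selected):
    """Rewrite a plan of the form (op, args, children).

    available: dict mapping signature -> path of an existing view
    selected:  set of signatures chosen for materialization by this job
    """
    s = sig(plan)
    if s in available:
        # Reuse: replace the whole subexpression with a scan of the view.
        return ("ViewScan", available[s], ())
    op, args, children = plan
    rewritten = (op, args,
                 tuple(rewrite(c, available, selected) for c in children))
    if s in selected:
        # Online materialization: compute as usual, also write the result out.
        return ("Materialize", s, (rewritten,))
    return rewritten


scan = ("Scan", "logs", ())
filt = ("Filter", "clicks > 10", (scan,))
agg = ("Agg", "count by user", (filt,))

reused = rewrite(agg, {sig(filt): "/views/f1"}, set())
building = rewrite(agg, {}, {sig(filt)})
```

Matching from the root downward means the largest available subexpression is replaced first, so its children are never visited once a view covers them.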
Analyzing Production Workloads • Cluster-wide overlaps: 45% jobs, 65% users, 80% subgraphs • Operator-wise overlaps: up to 1000s of overlaps [Chart labels: Shuffle, Sort, Joins, Filters]
Performance Impact
• Workload: 32 queries
• Latency:
  • Improvements depend on the critical path
  • Some queries slower due to materialization
• Processing time:
  • Additional processing time for read/write
  • Savings in general
• Overheads:
  • Workload analysis in an hour
  • ~10ms metadata service lookup
  • Optimization time higher/lower when creating/using views
[Chart annotations, in slide order: Avg. Speedup: 43%; Avg. Speedup: 36%]
Lessons Learned • Discovering hidden redundancies, static computations • Important to get the view physical design right in big data systems • Interesting side effects: failure recovery, cost estimates • User expectations: automatic, debuggability, privacy regulations • Even classic database concepts take a lot of time to bake in industry • Challenge: some of the assumptions may not hold • Industrial research is fun! ☺
Thanks! See you at: Poster Session 1, Wednesday 16:00-18:00, Houston 567 Coming up: Selecting Subexpressions to Materialize at Datacenter Scale Alekh Jindal, Konstantinos Karanasos, Sriram Rao, Hiren Patel VLDB 2018/PVLDB, Rio de Janeiro, Brazil
Industry 1, Tue, 11-12:30
Computation Reuse in Analytics Job Service at Microsoft
Alekh Jindal (Microsoft), Shi Qiao (Microsoft), Hiren Patel (Microsoft), Zhicheng Yin (Microsoft), Jieming Di (Microsoft), Malay Bag (Microsoft), Marc Friedman (Microsoft), Yifung Lin (Microsoft), Konstantinos Karanasos (Microsoft), Sriram Rao (Microsoft)
Questions
• What do we mean by computation reuse?
• What is a “job service”? How is it different from “databases”?
• What does a job service look like at Microsoft?
• Why is computation reuse challenging in a job service?
• What is our solution, key insights, and takeaways?
Key Ingredients
✓ Materialized views over recurring workloads
✓ CloudViews Analyzer
✓ Feedback Loop
✓ View Selection
✓ Physical Design
✓ View Expiry
✓ CloudViews Runtime
✓ Metadata Service
✓ Online Materialization
✓ Query Rewriting
✓ Synchronization
✓ Job Coordination
Architecture [figure not recovered]