Releasing Cloud Databases from the Chains of Prediction Models Ryan Marcus and Olga Papaemmanouil Brandeis University
Cloud Databases Landscape Cloud Infrastructure as a Service (IaaS)
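Deployment Challenges
[Diagram: queries (Q) arrive at a data management application running on VMs rented from an IaaS provider. The application is responsible for cost management (resource provisioning) and performance management (workload scheduling). The diagram is labeled "NP-hard problem".]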
State-of-the-art
Tasks addressed: placement, provisioning, scheduling
Systems:
- PMAX (Liu et al.)
- Auto (Rogers et al.)
- SmartSLA (Xiong et al.)
- Shepherd (Chi et al.)
- SLATree (Chi et al.)
- Multi-tenant SLOs (Lang et al.)
- iCBS (Chi et al.)
- Delphi / Pythia (Elmore et al.)
- Hypergraph (Çatalyürek et al.)
- SCOPE (Chaiken et al.)
- Bazaar (Jalaparti et al.)
- many traditional methods ...
Performance goals supported: query deadline, workload deadline, average latency, percentile deadline, piecewise linear
Performance Prediction Models
- DBMS-related challenges
  - isolated vs. concurrent query execution
  - known vs. unseen query types ("templates")
  - extensive off-line training
  - state-of-the-art: 15-20% prediction error
- Cloud-related challenges
  - numerous resource configurations
  - dynamic environment: "noisy neighbors"
Wish List (with the challenge each item raises)
- End-to-end cost-aware service (resource provisioning, workload scheduling). Challenge: complex interactions
- Application-defined performance goals (per-query deadline, percentile, average latency, max latency). Challenge: arbitrary goals
- Agnostic to workload characteristics (templates, arrival rates, execution times). Challenge: arbitrary workloads
- Dynamic resource availability. Challenge: arbitrary resources
ML approach: model dynamic, complex decisions
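To make the four goal types on this slide concrete, here is a minimal sketch of how each could be checked against observed query latencies. The function names, the sample latencies, and the thresholds are all hypothetical; the slides do not specify how goals are encoded.

```python
from statistics import mean

def per_query_deadline_violations(latencies, deadline):
    """Count queries whose latency exceeds a per-query deadline."""
    return sum(1 for t in latencies if t > deadline)

def average_latency_met(latencies, target):
    """True if the mean latency is at or below the target."""
    return mean(latencies) <= target

def max_latency_met(latencies, limit):
    """True if no query took longer than the limit."""
    return max(latencies) <= limit

def percentile_met(latencies, percent, deadline):
    """True if at least `percent`% of queries finished within `deadline`."""
    within = sum(1 for t in latencies if t <= deadline)
    return within * 100 >= percent * len(latencies)

# Hypothetical latencies in seconds, for illustration only.
latencies = [2.0, 3.0, 4.0, 11.0]
print(per_query_deadline_violations(latencies, 10))
print(average_latency_met(latencies, 5.0))
print(max_latency_met(latencies, 10))
print(percentile_met(latencies, 75, 10))
```

The same workload can satisfy one goal and violate another, which is one reason a service tied to a single goal type does not carry over to applications with different goals.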
Bandit: ML-Based Cost Management
[Diagram: queries (Q) arrive at a data management application running on VMs rented from an IaaS provider. Bandit covers cost management (resource provisioning) and SLA management (workload scheduling).]
Reinforcement Learning
[Diagram: an agent with internal state (past experiences) sends an action to the environment (VMs at an IaaS provider) and receives a reward and an observation in return.]
CMABs (Contextual Multi-Armed Bandits)
[Diagram: an agent with internal state (past experiences) sends an action to the environment (VMs at an IaaS provider) and receives a reward and an observation in return.]
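The observation, action, and reward loop in the diagram can be sketched with a toy contextual bandit. This is a generic illustration, not the learner used in Bandit: the epsilon-greedy rule, the two contexts `x0` and `x1`, the two actions `a` and `b`, and the reward table `BEST` are all assumptions made for the example.

```python
import random
from collections import defaultdict

class EpsilonGreedyAgent:
    """Keeps a running average reward per (context, action) pair."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = defaultdict(float)  # internal state: summed rewards
        self.counts = defaultdict(int)    # internal state: number of tries

    def estimate(self, context, action):
        n = self.counts[(context, action)]
        return self.totals[(context, action)] / n if n else 0.0

    def greedy(self, context):
        # Action with the highest estimated reward for this context.
        return max(self.actions, key=lambda a: self.estimate(context, a))

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy(context)

    def update(self, context, action, reward):
        self.totals[(context, action)] += reward
        self.counts[(context, action)] += 1

# Toy environment (hypothetical): each context has one rewarding action.
BEST = {"x0": "a", "x1": "b"}

def reward_for(context, action):
    return 1.0 if BEST[context] == action else 0.0

def run(rounds=2000, seed=0):
    agent = EpsilonGreedyAgent(["a", "b"], epsilon=0.1, seed=seed)
    contexts = list(BEST)
    for i in range(rounds):
        context = contexts[i % 2]                # observation
        action = agent.choose(context)           # action
        reward = reward_for(context, action)     # reward
        agent.update(context, action, reward)    # update past experiences
    return agent

agent = run()
print({c: agent.greedy(c) for c in BEST})
```

Unlike full reinforcement learning, the chosen action here does not change which context arrives next; each round is an independent decision given the observed context.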
CMABs in Bandit (Contextual Multi-Armed Bandits)
[Diagram sequence, built up over several slides:]
- The agent is the data management application, which has an SLA; in place of a reward it observes a monetary cost ($$).
- The VMs at the IaaS provider are arranged in tiers (Tier 1, Tier 2).
- For each incoming query (Q), the possible actions are pass, down, and accept.
- The internal state (past experiences) is a set of tuples such as (pass, context, $$), (down, context, $$), (accept, context, $).
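A minimal sketch of how a query might move through the tiers under the three actions. The slides name the actions but not their exact semantics, so this sketch assumes that "pass" forwards the query to the next VM in the same tier, "down" forwards it to the first VM of the next tier, and "accept" runs it on the current VM. The VM names, the fixed `toy_policy`, and the rule that a query with nowhere left to go is run in place are all hypothetical.

```python
def route(query, tiers, policy):
    """Walk a query through tiers of VMs until some VM runs it.

    tiers:  list of tiers, each a list of VM names.
    policy: function (vm, query) -> "accept" | "pass" | "down".
    Returns the VM that runs the query and the decisions made on the way.
    """
    tier, pos = 0, 0
    path = []
    while True:
        vm = tiers[tier][pos]
        action = policy(vm, query)
        path.append((vm, action))
        if action == "pass" and pos + 1 < len(tiers[tier]):
            pos += 1                      # next VM in the same tier
        elif action == "down" and tier + 1 < len(tiers):
            tier, pos = tier + 1, 0       # first VM of the next tier
        else:
            # "accept", or (assumption) no VM left to forward to.
            return vm, path

# Hypothetical layout: two Tier 1 VMs and one Tier 2 VM.
tiers = [["t1-vm1", "t1-vm2"], ["t2-vm1"]]

# Hypothetical fixed decisions; in Bandit each decision would be learned.
decisions = {"t1-vm1": "pass", "t1-vm2": "down", "t2-vm1": "accept"}

def toy_policy(vm, query):
    return decisions[vm]

vm, path = route("Q1", tiers, toy_policy)
print(vm, path)
```

Every step either advances within a tier or drops to the next one, so the walk always ends after a bounded number of decisions.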
Feature Selection
[Diagram: queries (Q) arrive at a data management application running on VMs from an IaaS provider. Components shown: Context Collector, Experience Collector, Model Generator.]
Probabilistic Action Selection
[Diagram: queries (Q) arrive at a data management application running on VMs from an IaaS provider. Components shown: Context Collector, Experience Collector, and a Model Generator that yields an action.]
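One way to select actions probabilistically from past experience is to sample a plausible cost for each action and pick the cheapest sample. The sketch below uses a bootstrap resample of observed costs, in the spirit of Thompson sampling. This is an assumption for illustration: the slide names neither the sampling scheme nor the model. The experience tuples follow the (action, context, cost) order shown on the earlier slide, and matching contexts by exact equality is a simplification.

```python
import random

ACTIONS = ["accept", "pass", "down"]

def select_action(experiences, context, actions, rng):
    """Pick an action by sampling a mean cost for each one.

    experiences: list of (action, context, cost) tuples.
    An action with no experience for this context is tried first.
    """
    sampled = {}
    for action in actions:
        costs = [c for (a, ctx, c) in experiences
                 if a == action and ctx == context]
        if not costs:
            return action  # no evidence yet: try it once
        # Bootstrap: resample observed costs with replacement.
        resample = rng.choices(costs, k=len(costs))
        sampled[action] = sum(resample) / len(resample)
    # Lower sampled cost wins; noisy estimates still get explored.
    return min(sampled, key=sampled.get)

# Hypothetical experience for one context, costs in arbitrary units.
experiences = (
    [("accept", "short", 1.0)] * 3
    + [("pass", "short", 5.0)] * 3
    + [("down", "short", 5.0)] * 3
)

rng = random.Random(0)
print(select_action(experiences, "short", ACTIONS, rng))
```

When the observed costs of two actions overlap, the resampled means vary from call to call, so the action that currently looks worse is still chosen some of the time. That randomness is what lets the selection recover from an unlucky early estimate.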
Evaluation
[Four plots; axis tick values omitted.]
- Plot 1: average cost per query (1/10 cent) vs. queries processed. Series: Bandit and Clairvoyant, each with one query at a time, one query per vCPU, and two queries per vCPU. Takeaway: cost is within 4% of solutions with a perfect prediction model.
- Plot 2: average cost per query (1/10 cent) vs. queries processed. Series: 8 templates, 80 templates, 800 templates. Takeaway: converges after a few thousand queries with hundreds of templates.
- Plot 3: converged cost (1/10 cent) by segmentation type (value-based, hash-based). Series: Round-robin, Clairvoyant PO2, Bandit. Takeaway: learns the best execution site for partitioned data.
- Plot 4: average cost per query (1/10 cent) vs. queries processed. Series: all new templates at once, new templates over time. Takeaway: adapts quickly to new, unseen query templates.
Conclusions
- Cost vs. performance trade-offs are complex
  - human ability to derive insight is not improving
- Benefits of an ML-driven approach
  - discover customized solutions
  - automate decision making
  - adapt to dynamic environments
- Future steps
  - alternative learning techniques
  - more advanced tasks: scheduling, data movement
  - learning-based database as a service (DaaS) systems