A Cost-Sensitive Adaptation Engine for Server Consolidation of Multitier Applications
Gueyoung Jung, Calton Pu (Georgia Institute of Technology)
Kaustubh Joshi, Matti Hiltunen, Richard Schlichting (AT&T Labs Research)
Context
• Cloud infrastructures providing a resource pool shared by multiple applications
• Multi-tier applications (e.g., web, application, and database tier servers)
• Server consolidation through virtualization, allowing each physical machine to host tier servers in isolated containers (VMs)
• Performance optimization in the context of end-to-end response time through resource configuration
• Resource configuration: CPU capacity tuning of VMs, VM migration, and increasing/decreasing the replication level
Nov. 30, 2009 Middleware 2009
Motivations
• Dynamic resource configuration has become crucial in consolidated server environments.
• Adaptation actions such as VM migration and replication make it more feasible.
• Challenge: indiscriminate usage of adaptations can have significant impacts on the overall QoS of hosted applications.
[Figure: response time over time (15:00 to 22:28) under cost-oblivious adaptation]
Adaptation Benefit vs. Cost
[Figure: workload over time, with the workload stability interval marked between t_0 and t_1]
Adaptation Benefit vs. Cost
[Figure: response time over time between t_0 and t_1 for no adaptation, cheap adaptation, expensive adaptation, and optimal adaptation, compared with the desired response time; annotations mark the delta response time and the adaptation delay]
Adaptation Costs
• Vary with workload, adaptation type, and performance characteristics of applications and their tier servers
[Figure, two panels: "Change of adaptation delay" (adaptation delay in ms) and "Change of response time" (delta response time in ms), each plotted against the number of concurrent sessions (100 to 800)]
Adaptation Costs
• Affected by the workload of background applications sharing resources with the adapted application (virtualization overheads)
[Figure, two panels: "Change of adaptation delay" (adaptation delay in ms) and "Change of response time" (Δ response time in ms), each plotted against the number of users of the adapted application and of the background application (100 to 500)]
Contributions
• Develop a framework that finds an optimal balance between the benefits and costs of various adaptations.
• Demonstrate the framework in dynamic resource configuration for multiple multi-tier applications in a server consolidation environment.
• Specifically, generate a set of optimal adaptation actions to maximize overall utility.
Architecture
[Figure: block diagram of the Adaptation Engine. Labeled components: Application, Workload Monitor, Estimator, Controller, LQN Model, LQN Solver, Optimizer, Utility Function, Cost Mapping, Cost Model, Search Algorithm, ARMA Filter, and off-line experiments]
• Build cost and benefit models offline
• Estimate costs and benefits by solving the models online with the given workloads
• Unify cost and benefit
• Estimate workload stability
• Generate a set of optimal adaptation actions using estimates and utilities
Modeling
• Estimation of benefits: Given workload W and configuration c, estimate response time RT using layered queueing network models.
[Figure: layered queueing network model with Client, Web Server, App. Server, and DB Server, each with VMM, Net, CPU, and Disk resources; edges denote function calls and resource use; labels: network ping measurement, Servlet.jar instrumentation, LD_PRELOAD instrumentation]
Modeling (cont.)
• Estimation of costs: Given workload W, configuration c, and adaptation action a, experimentally and iteratively measure the adaptation delay d_a and the delta response time ΔRT_a. Then generate a mapping table to be used online.
• Estimation of workload stability CW: Given the workload history, estimate how long the current workload will stay within a range using an ARMA filter.
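A minimal sketch of how such an offline mapping table might be consulted online. The table layout, the sample values, the name `lookup_cost`, and the use of linear interpolation between measured workload levels are illustrative assumptions; the slides do not specify any of them.

```python
from bisect import bisect_left

# Hypothetical offline measurements (illustrative values only):
# action -> rows of (concurrent_sessions, delay_ms, delta_rt_ms), sorted by sessions.
COST_TABLE = {
    "migrate_db": [
        (100, 10000.0, 50.0),
        (400, 30000.0, 200.0),
        (800, 70000.0, 600.0),
    ],
}

def lookup_cost(action, sessions):
    """Return (delay_ms, delta_rt_ms) for an action at a given workload.

    Assumption: values between measured workload levels are linearly
    interpolated, and values outside the measured range are clamped.
    """
    rows = COST_TABLE[action]
    levels = [row[0] for row in rows]
    i = bisect_left(levels, sessions)
    if i == 0:
        return rows[0][1:]
    if i == len(rows):
        return rows[-1][1:]
    (x0, d0, r0), (x1, d1, r1) = rows[i - 1], rows[i]
    t = (sessions - x0) / (x1 - x0)
    return (d0 + t * (d1 - d0), r0 + t * (r1 - r0))

print(lookup_cost("migrate_db", 250))
```

A real table would also be keyed by configuration and by background workload, since the measured costs depend on both.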
Utility Function
• Each application s has its own SLA that provides a target response time TRT_s and a base reward R_s and penalty P_s.
• R_s and P_s are factored by the intensity of the workload W_s.
[Figure: penalty/reward as a function of request rate (per second), with one curve for reward and one for penalty]
• Maximize
$U = \left( CW - \sum_{a_k \in A} d_{a_k} \right) \sum_{s \in S} u_s^{c_{i+1}} + \sum_{a_k \in A} \left( d_{a_k} \sum_{s \in S} u_s^{c_{k-1}, a_k} \right)$
The first term is the benefit utility and the second term is the cost utility, where CW is the estimated length of the stability interval.
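The overall utility on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function name `total_utility`, the argument layout, and the sample numbers are hypothetical, and per-application utilities are treated as rates per unit time.

```python
def total_utility(cw, actions, steady_utils):
    """Overall utility U over one stability interval.

    cw           -- estimated length of the stability interval (CW)
    actions      -- list of (delay, per_app_utils_during_action) pairs,
                    one pair per adaptation action a_k
    steady_utils -- per-application utility rates in the new configuration
    """
    total_delay = sum(delay for delay, _ in actions)
    # Benefit utility: time left after adapting, times the steady-state rate.
    benefit = (cw - total_delay) * sum(steady_utils)
    # Cost utility: utility accrued (often negative) while each action runs.
    cost = sum(delay * sum(utils) for delay, utils in actions)
    return benefit + cost

# Hypothetical numbers: two applications, two adaptation actions.
u = total_utility(
    cw=600,
    actions=[(30, [-1.0, 0.5]), (10, [0.25, 0.25])],
    steady_utils=[2.0, 1.0],
)
print(u)
```

With these numbers the benefit term is 560 * 3.0 and the cost term is 30 * (-0.5) + 10 * 0.5, so a shorter stability interval makes the same set of actions less attractive.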
Optimization
• Search for the desired configuration using bin-packing and gradient-based search algorithms.
- This step does not consider adaptation costs.
• Generate optimal actions using a best-first search algorithm.
- A* graph search algorithm.
[Figure: two search trees. Top: starting from the maximum configuration c_max, candidate configurations c_new1 ... c_newn are explored down to the desired configuration; an annotation relates the change in ρ (new vs. old) to the change in U (new vs. old). Bottom: starting from the current configuration c_0 (no adaptation), adaptation actions a_1 ... a_n lead through intermediate configurations (c_1 ... c_n, c_31 ... c_3n) to the optimal configuration]
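A simplified best-first search over adaptation actions might look as follows. This is a sketch, not the A* search from the talk: the slides do not give the heuristic, so this version simply expands the highest-utility partial plan first and explores exhaustively up to a depth bound. The names `plan_actions`, `successors`, and `rate`, and the toy configurations, are hypothetical.

```python
import heapq
from itertools import count

def plan_actions(start, successors, rate, cw, max_depth=3):
    """Return (best_utility, action_names) for one stability interval.

    successors(cfg) yields (action_name, next_cfg, delay, rate_during_action).
    rate(cfg) is the steady-state utility rate of a configuration.
    """
    tie = count()  # tie-breaker so the heap never compares configurations
    start_u = cw * rate(start)
    # Heap entries: (-utility, tie, cfg, plan, total_delay, transient_utility)
    heap = [(-start_u, next(tie), start, (), 0.0, 0.0)]
    best_u, best_plan = start_u, ()
    while heap:
        neg_u, _, cfg, plan, delay, transient = heapq.heappop(heap)
        if -neg_u > best_u:
            best_u, best_plan = -neg_u, plan
        if len(plan) == max_depth:
            continue
        for name, nxt, d, rate_during in successors(cfg):
            new_delay = delay + d
            if new_delay > cw:  # the plan would outlast the stability interval
                continue
            new_transient = transient + d * rate_during
            u = (cw - new_delay) * rate(nxt) + new_transient
            heapq.heappush(
                heap, (-u, next(tie), nxt, plan + (name,), new_delay, new_transient)
            )
    return best_u, list(best_plan)

# Toy model (illustrative values): c0 is the current configuration.
RATES = {"c0": 1.0, "c1": 2.0, "c2": 3.0}
MOVES = {
    "c0": [("cpu_up", "c1", 10, 0.5), ("migrate", "c2", 200, -1.0)],
    "c1": [("migrate", "c2", 200, -1.0)],
    "c2": [],
}
short = plan_actions("c0", MOVES.__getitem__, RATES.__getitem__, cw=300)
long_ = plan_actions("c0", MOVES.__getitem__, RATES.__getitem__, cw=3000)
print(short, long_)
```

In the toy model, the cheap CPU change wins when the stability interval is short, while the expensive migration pays off only when the interval is long enough to amortize its delay.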
Test-bed Architecture
• Develop a small virtualized data center
• Deploy multiple 3-tier RUBiS applications
[Figure: test-bed diagram. The Adaptation Engine (Workload Monitor, Estimator, Controller) manages active hosts, each running a hypervisor with Domain-0 and VMs hosting web, application, and DB servers; a virtual machine pool holds dormant VMs; VM images are kept on shared storage]
Workloads
[Figure, two panels: request rate (per sec) over time for RUBiS-1 and RUBiS-2. Left: time-of-day trace (15:00 to 22:17). Right: flash crowd trace (15:00 to 16:25)]
Model Accuracy
• Estimation error of benefits and costs is around 15%
• Estimation error of workload stability is around 15%
[Figure, two panels: response time (msec) over time, experiment vs. model; stability interval (min) per prediction window, monitored vs. estimated]
Comparison Evaluation
• Compare 5 strategies
- NA: No adaptation. A static configuration is used.
- CO: Cost-oblivious. Adaptation costs are not considered.
- Oracle (Desired): Simulation with adaptation costs = 0.
- 1-hour: Adaptation every 1 hour.
- CS: Cost-sensitive.
End-to-End Response Times
[Figure, two panels: response time (ms) over time for NA, CS, and CO. Left: time-of-day trace (15:00 to 22:12). Right: flash crowd trace (15:00 to 16:24)]
Cumulative Utilities
[Figure, two panels: cumulative utility over time for NA, CS, CO, Oracle, and 1-hour. Left: time-of-day trace (15:00 to 22:00). Right: flash crowd trace (15:00 to 16:24)]
Adaptation Actions
The number of adaptation actions triggered in the flash crowd scenario:
Type of Action              CS   CO
CPU Increase and Decrease   14   36
Add MySQL replica            1    4
Remove MySQL replica         1    4
Migrate Apache               4   10
Migrate Tomcat               4   10
Migrate MySQL                0    2
Conclusion
• Adaptation actions such as VM replication and migration can impose significant performance costs.
• Our approach decides when and how to act, weighing adaptation costs against benefits, to improve the satisfaction of response time SLAs.
On-going Work
• Integrate power management into the problem formulation by considering power consumption as another cost.
• Handle large-scale setups by extending the framework to multi-level hierarchical control, where each level represents a different time scale and scope of control.