Hani Jamjoom

Hani Jamjoom, Chun-Ting Chou, Kang G. Shin - PowerPoint PPT Presentation

The Impact of Concurrency Gains on the Analysis and Control of Multi-threaded Internet Services. Hani Jamjoom, Chun-Ting Chou, Kang G. Shin.


  1. The Impact of Concurrency Gains on the Analysis and Control of Multi-threaded Internet Services
     Hani Jamjoom, Chun-Ting Chou, Kang G. Shin
     Electrical Engineering and Computer Science, The University of Michigan

  2. Environment of Interest
     A typical shared hosting environment with:
     • Multiple services time share a single system
     • All requests are treated equally
     Services use multi-threading or multi-processing to improve their efficiency:
     • Increase throughput
     • Reduce request waiting time
     [Figure: typical server, with services and requests]

  3. Interactions of Interest
     • Concurrency gains are workload dependent
     • Characterize impact of gains on server performance
     • Multiple services compete for server resources
     • Predict and control thread and service interactions
     [Figure: typical server, with services and requests]

  4. Why Should We Care?
     • Better predict how concurrency will impact perceived performance
     • Improved system configuration
     • Maximize the benefits from multi-threading/multi-processing abstractions
     • Better predict interactions between different threads and services
     • Design better QoS control mechanisms
     • Avoid possible pitfalls when designing ad hoc controls

  5. Challenges in Analyzing Multi-threading
     • Concurrency gains are workload-dependent
       • There is no fixed parameterization for all workloads
     • System time is not the only source of thread interactions
       • Bottleneck resources (e.g., disk) introduce indirect interactions
       • A small number of requests for one service can have a big impact on the performance of other services
     • Multi-threading or multi-processing is implemented in different ways (cooperative, preemptive, etc.)

  6. Goals of Paper & Talk
     • Characterize concurrency gains of different workloads
     • Predict performance of a single service
     • Look at how different services can interact with each other
     • Design a generic mechanism for configuring thread limits to provide worst-case QoS guarantees

  7. Characterizing Concurrency Gains
     • Express the increase in throughput as a function of thread allocation
     • Use SPECWeb’99-like workloads as examples of possible workloads

  8. Real Workload Measurements
     [Figure: measurement plots; no text recoverable]

  9. Simple Model…but Workload Dependent
     • REGION 1: Gain approximated by a linear function
     • REGION 2: Flat gain since threading yields no performance gains
     • REGION 3: Dramatic drop due to system thrashing
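
The three regions on this slide can be sketched as a piecewise function. This is only an illustration: the function name, the parameters (`slope`, `knee`, `thrash_at`, `drop_rate`), the normalization G(1) = 1, and the linear shape of the Region 3 drop are all assumptions, not values or forms taken from the talk.

```python
def concurrency_gain(threads, slope, knee, thrash_at, drop_rate):
    """Hypothetical three-region gain G(i); every parameter is illustrative."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    if threads <= knee:
        # Region 1: gain grows linearly, anchored at G(1) = 1 (assumption).
        return 1.0 + slope * (threads - 1)
    peak = 1.0 + slope * (knee - 1)
    if threads <= thrash_at:
        # Region 2: extra threads yield no further gain.
        return peak
    # Region 3: gain falls once the system thrashes; modeled here as a
    # linear decline floored at zero (assumption).
    return max(0.0, peak - drop_rate * (threads - thrash_at))


gains = [concurrency_gain(i, slope=0.5, knee=9, thrash_at=20, drop_rate=1.0)
         for i in (1, 9, 15, 22)]
```

Because the gains are workload dependent, the parameters would have to be fitted separately for each workload.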

  10. Service Model
      [Diagram labels: Requests; Service Class Queues; Service class 1, Service class 2, Service class 3; Application A, Application B; CPU schedule in a round-robin fashion]

  11. Analysis of Single Service Class
      • Continuous Time Markov Chain (CTMC): [state-transition diagram]
      • µ(i) = µ G(i) / i is the state-dependent service rate
      • The CTMC assumes the scheduling quantum is infinitesimally small, but the model is necessary to account for speedup
      • Use standard techniques to derive steady-state probabilities and estimate the processing delay
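
As a minimal sketch of the "standard techniques" mentioned on this slide, the chain can be treated as a birth-death process and solved in product form. The slide does not show the chain itself, so several things here are assumptions: Poisson arrivals at `arrival_rate`, a finite system `capacity`, a thread limit `max_threads`, and a total departure rate of µ G(i) when i threads are busy (each thread being served at µ G(i) / i). The function names are hypothetical.

```python
def steady_state(arrival_rate, mu, gain, max_threads, capacity):
    """Steady-state probabilities for states n = 0..capacity requests in system.

    Assumed model: with n requests present, i = min(n, max_threads) threads
    are busy and the total departure rate is mu * gain(i).
    """
    weights = [1.0]
    for n in range(1, capacity + 1):
        busy = min(n, max_threads)
        departure = mu * gain(busy)
        if departure <= 0:
            raise ValueError("gain must be positive for every reachable state")
        # Balance equation of a birth-death chain: pi[n] = pi[n-1] * lambda / d(n).
        weights.append(weights[-1] * arrival_rate / departure)
    total = sum(weights)
    return [w / total for w in weights]


def mean_delay(arrival_rate, probs):
    """Mean time in system via Little's law, counting only admitted requests."""
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    accepted_rate = arrival_rate * (1.0 - probs[-1])
    return mean_in_system / accepted_rate


# Example: no concurrency gain at all (G(i) = 1), small system.
probs = steady_state(arrival_rate=1.0, mu=2.0, gain=lambda i: 1.0,
                     max_threads=4, capacity=2)
delay = mean_delay(1.0, probs)
```

Passing a workload-specific gain function in place of the constant one is what makes the service rate state dependent.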

  12. Predicting Behavior of Real Systems
      [Figure: plots; no text recoverable]

  13. Potential Inaccuracies in CTMC Method
      • Transition point between overload and underload is affected by our simplified arrival model
      • Using infinitesimal time quanta for thread scheduling overestimates delay
      • Well-behaved service distribution does not account for few long-lived requests

  14. Extending Results to Multiple Service Classes
      • Assume that services are independent
        • Do not consider situations when requests are processed by multiple stages
      • Split independent services into two types:
        • Homogeneous: services with similar workload requirements (e.g., differentiating between different client groups)
        • Heterogeneous: services with different workload requirements (e.g., static vs. dynamic workloads)
      • Characterize the change in speedup

  15. Non-predictable Interactions between Heterogeneous Services
      • Existence of an artificial ceiling for static workload when hosted with dynamic workload
      • Shift in bottleneck from CPU to disk

  16. Proportional Resource Sharing between Homogeneous Services
      • Throughput is proportional to thread allocation
      • The bottleneck resource is proportionally shared
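
The proportional-sharing observation can be written as a one-line rule. The sketch below assumes that every service is saturated and that the aggregate throughput of the bottleneck resource is known. The function name and the example numbers are hypothetical.

```python
def proportional_throughput(total_throughput, thread_allocation):
    """Split aggregate throughput in proportion to each service's thread count.

    Assumes homogeneous, saturated services sharing one bottleneck resource.
    """
    total_threads = sum(thread_allocation.values())
    if total_threads <= 0:
        raise ValueError("at least one thread must be allocated")
    return {name: total_throughput * threads / total_threads
            for name, threads in thread_allocation.items()}


shares = proportional_throughput(300.0, {"low_priority": 8, "high_priority": 16})
```

With these illustrative numbers, the service holding twice as many threads receives twice the throughput.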

  17. Providing Worst-Case Delay Guarantees
      • How to configure thread limits for each running service
        • Confine performance degradation when one service gets overloaded
      • Algorithm based on analytical model of homogeneous services
        • Associate a cost function with each service class
        • Use dynamic programming to allow any cost function
        • Choose the thread allocation that minimizes the total cost assuming any service can become overloaded

  18. Overview of Dynamic Programming Algorithm
      [Diagram: one cost table per class (Class 1, Class 2, Class 3), each indexed by m from 1 to m_max; tables are combined pairwise into "Classes 1,2" and then "Classes 1,2,3"]
      1. Each table corresponds to the worst-case cost of class i if given m threads. The remaining (m_max – m) threads belong to the other service classes, which are assumed to be saturated.
      2. A new table combines the tables of classes 1 and 2. Each row holds the minimum cost if classes 1 and 2 are given m threads. This process is repeated, combining the resulting table from the previous step and the next service class.
      3. The optimal allocation is found by tracing back the allocation that produced the minimum worst-case cost.
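
The three steps on this slide can be sketched as a table-combining dynamic program. This is an illustration, not the authors' implementation. It assumes the total cost is the sum of the per-class worst-case costs (suggested by the "+" in the diagram), that each table is a list indexed by thread count from 0 to `m_max`, and that disallowed allocations are marked with an infinite cost. The name `allocate_threads` and the table layout are hypothetical.

```python
def allocate_threads(cost_tables, m_max):
    """Return (minimum total cost, threads per class) for m_max threads.

    cost_tables[c][m] is the worst-case cost of class c when given m threads
    (step 1 on the slide), for m = 0..m_max.
    """
    combined = list(cost_tables[0][: m_max + 1])
    choices = []  # choices[j][m]: threads given to class j+1 when classes 0..j+1 share m
    for table in cost_tables[1:]:
        new_combined, choice = [], []
        for m in range(m_max + 1):
            # Step 2: best split of m threads between the classes combined
            # so far (m - k threads) and the next class (k threads).
            best_k = min(range(m + 1), key=lambda k: combined[m - k] + table[k])
            new_combined.append(combined[m - best_k] + table[best_k])
            choice.append(best_k)
        combined = new_combined
        choices.append(choice)
    # Step 3: trace back the splits that produced the minimum cost.
    allocation, remaining = [], m_max
    for choice in reversed(choices):
        k = choice[remaining]
        allocation.append(k)
        remaining -= k
    allocation.append(remaining)
    allocation.reverse()
    return combined[m_max], allocation


INF = float("inf")  # zero threads is treated as disallowed in this example
cost, allocation = allocate_threads([[INF, 5, 3, 2], [INF, 4, 1, 0.5]], m_max=3)
```

Because the tables are arbitrary lists, any cost function can be used, which matches the slide's motivation for dynamic programming over a closed-form optimization.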

  19. Experimental Setup
      Configure one service class with a fixed number of threads that are always busy, and compare the predicted response time of the other service class with the measured values for different thread allocations.

  20. Predicted Allocation is Close to Measured
      [Figure, panel 1: Low priority service is allocated 8 threads]
      [Figure, panel 2: Low priority service is allocated 16 threads]

  21. Summary of Results
      • Accounting for concurrency gains
        • Improves accuracy of prediction
        • Crucial when analyzing multiple service classes
      • Bimodal behavior of service queues
        • Queues are almost empty or totally full
        • Queueing only occurs when the service becomes critically loaded or overloaded since a request is admitted quickly by a worker thread
        • Long queues are unnecessary to improve service performance
      • Analysis of Web workloads
        • Hard to (analytically) predict performance when services have different workload characteristics (such as I/O heavy vs. CPU heavy)
        • Analysis can be used to provide predictable results and worst-case delay guarantees with similar workloads

  22. Future Directions
      • Fine grain analysis of heterogeneous services
        • Key for providing effective and general thread-based controls
      • Multi-Staged Services
        • A request may pass through multiple stages (e.g., through HTTP, Application, and DB servers) before completion
        • Adds an extra level of possible interactions between threads

  23. Thank You…
      Hani Jamjoom
      Dept. of EECS
      The University of Michigan
      jamjoom@eecs.umich.edu
      www.eecs.umich.edu/~jamjoom
