
Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency. Jialin Li, Naveen Kr. Sharma, Dan R. K. Ports, and Steven D. Gribble. February 2, 2015.


  1. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency. Jialin Li, Naveen Kr. Sharma, Dan R. K. Ports, and Steven D. Gribble. February 2, 2015.

  6. Introduction: What is Tail Latency? [Figure: Fraction of Requests versus Request Processing Time.] In Facebook's Memcached deployment, the median latency is 100 µs, but the 95th percentile latency is ≥ 1 ms. In this talk, we explore: Why do some requests take longer than expected? What causes them to be delayed?

  8. Why is the tail important? Low latency is crucial for interactive services: a 500 ms delay can cause a 20% drop in user traffic [Google study], and latency is directly tied to traffic, hence revenue. Today's datacenter workloads make this challenging: interactive services are highly parallel, a single client request spawns thousands of sub-tasks, and overall latency depends on the slowest sub-task. Bad tail ⇒ the probability that at least one sub-task is delayed is high.
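
The fan-out argument on this slide can be made concrete with a little arithmetic. A minimal sketch, assuming sub-task latencies are independent (an assumption the slide does not state); `prob_any_slow` is a hypothetical helper name used only for illustration:

```python
def prob_any_slow(num_subtasks: int, slow_fraction: float) -> float:
    """Probability that at least one of num_subtasks lands in the slow tail.

    slow_fraction is the chance that a single sub-task is slow, e.g. 0.01
    for "slower than the 99th percentile". Independence between sub-tasks
    is assumed here purely for illustration.
    """
    return 1.0 - (1.0 - slow_fraction) ** num_subtasks


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, prob_any_slow(n, 0.01))
```

Under this assumption, a request that fans out to 1,000 sub-tasks is almost certain to include at least one sub-task slower than the 99th percentile, which is why the tail dominates the overall response time.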

  10. A real-life example: all requests have to finish within the SLA latency. (Nishtala et al., Scaling Memcache at Facebook, NSDI 2013.)

  12. What can we do? People in industry have worked hard on solutions. Hedged requests [Jeff Dean et al.]: effective sometimes, but they add application-specific complexity. Intelligently avoiding slow machines: keep track of server status and route requests around slow nodes. Both attempt to build a predictable response out of less predictable parts. We still don't know what is causing requests to get delayed.
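
A hedged request, as the idea is commonly described, sends a backup copy of a request when the first copy has not answered within some delay, then uses whichever reply arrives first. The following is a minimal asyncio sketch of that idea, not the mechanism of any particular system; `hedged_request`, `fake_backend`, and the delay values are hypothetical names and numbers chosen for illustration:

```python
import asyncio


async def hedged_request(make_request, hedge_delay):
    """Issue make_request(); if it has not finished after hedge_delay
    seconds, issue a second copy and return whichever finishes first."""
    primary = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()

    # The primary is slow: send the backup and race the two copies.
    backup = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the slower copy is no longer needed
    return next(iter(done)).result()


def fake_backend(delays):
    """Hypothetical stand-in for a replicated service: the i-th call
    sleeps for delays[i] seconds and then returns i."""
    calls = iter(enumerate(delays))

    async def request():
        index, delay = next(calls)
        await asyncio.sleep(delay)
        return index

    return request


if __name__ == "__main__":
    # First copy is slow (0.5 s), backup is fast (0.01 s).
    print(asyncio.run(hedged_request(fake_backend([0.5, 0.01]), 0.05)))
```

Even this small sketch shows the complexity the slide mentions: the caller has to choose a hedge delay, the request must be safe to issue twice, and the losing copy has to be cancelled.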

  13. Our approach: (1) Pick some real-life applications: RPC server, Memcached, Nginx. (2) Generate the ideal latency distribution. (3) Measure the actual distribution on a standard Linux server. (4) Identify a factor causing deviation from the ideal distribution. (5) Explain and mitigate it. (6) Iterate until we reach the ideal distribution.

  14. Rest of the talk: (1) Introduction. (2) Predicted Latency from Queuing Models. (3) Measurements: Sources of Tail Latencies. (4) Summary.

  22. Predicted Latency from Queuing Models: Ideal latency distribution. What is the ideal latency for a network server? It serves as the ideal baseline for comparing measured performance. Assume a simple model and apply queuing theory. [Diagram: clients sending requests to a server.] Given the arrival distribution and the request processing time, we can predict the time spent by a request in the server.
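
The prediction described on this slide can be sketched for the simplest case. The code below assumes a single worker that serves requests in arrival order (FIFO), which is an illustrative assumption; `sojourn_times` is a hypothetical helper name:

```python
def sojourn_times(arrivals, service_times):
    """Time each request spends in the server (queueing plus processing).

    arrivals: arrival timestamps in non-decreasing order.
    service_times: processing time of each request, same units.
    Assumes one worker that serves requests first-come, first-served.
    """
    free_at = 0.0  # time at which the worker next becomes idle
    result = []
    for arrival, service in zip(arrivals, service_times):
        start = max(arrival, free_at)  # wait if the worker is still busy
        free_at = start + service
        result.append(free_at - arrival)
    return result


if __name__ == "__main__":
    # Three requests arrive in a burst, a fourth arrives when the server is idle.
    print(sojourn_times([0, 10, 20, 200], [50, 50, 50, 50]))
```

Requests that arrive in a burst queue behind one another even though each needs the same processing time, while a request that arrives at an idle server pays only its own processing time.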

  25. Tail latency characteristics. [Figure: CCDF P[X >= x] (10^0 down to 10^-4) versus latency in microseconds (10^1 to 10^4), log-log axes; curve: Distribution 1.] 99th percentile ⇒ 60 µs.

  26. Same plot. 99.9th percentile ⇒ 200 µs.

  27. Same plot with a second curve added: Distribution 1 and Distribution 2.
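
Reading a percentile off a CCDF plot corresponds to a simple computation on latency samples. A minimal sketch, assuming the samples are a plain list of numbers and using the nearest-rank definition of a percentile (one of several common definitions); `ccdf` and `percentile` are hypothetical helper names:

```python
import math


def ccdf(samples, x):
    """Empirical CCDF: fraction of samples with value >= x."""
    return sum(1 for s in samples if s >= x) / len(samples)


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_us = list(range(1, 101))  # synthetic data: 1, 2, ..., 100
    print(percentile(latencies_us, 99), ccdf(latencies_us, 99))
```

On the plots, the 99th percentile is the latency at which the curve crosses 10^-2, and the 99.9th percentile is where it crosses 10^-3.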

  30. Tail latency characteristics. What is the ideal latency distribution? Assume a server with a single worker and a fixed 50 µs processing time. [Figure: CCDF P[X >= x] (10^0 down to 10^-4) versus latency in microseconds (10^1 to 10^4), log-log axes; curves: Uniform Request Arrival, Poisson at 70% Utilization.] Inherent tail latency due to request burstiness.

  31. Same plot with a third curve: Poisson at 90% Utilization. Tail latency depends on the average server utilization.

  32. Same plot with a fourth curve: Poisson at 70% with 4 workers. Additional workers can reduce tail latency, even at constant utilization.
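
The four curves can be approximated with a small simulation. This is a sketch under stated assumptions rather than the authors' model: requests are served first-come, first-served, each worker takes a fixed 50 µs per request, and the 4-worker case keeps the per-worker processing time at 50 µs while scaling the arrival rate so per-worker utilization stays at 70%. `simulate_latencies` and `p99` are hypothetical names.

```python
import heapq
import random
import statistics


def simulate_latencies(num_requests, service_us, utilization,
                       workers=1, poisson=True, seed=0):
    """Latency (queueing plus processing) of each request, in microseconds."""
    rng = random.Random(seed)
    # Mean inter-arrival gap that yields the requested per-worker utilization.
    mean_gap = service_us / (utilization * workers)
    free_at = [0.0] * workers  # min-heap: when each worker next becomes idle
    now = 0.0
    latencies = []
    for _ in range(num_requests):
        # Poisson arrivals have exponential gaps; uniform arrivals are evenly spaced.
        now += rng.expovariate(1.0 / mean_gap) if poisson else mean_gap
        start = max(now, heapq.heappop(free_at))  # earliest available worker
        finish = start + service_us
        heapq.heappush(free_at, finish)
        latencies.append(finish - now)
    return latencies


def p99(latencies):
    """99th percentile, via the standard library's quantile cut points."""
    return statistics.quantiles(latencies, n=100)[98]


if __name__ == "__main__":
    n = 100_000
    print("uniform, 70%      ", p99(simulate_latencies(n, 50.0, 0.7, poisson=False)))
    print("Poisson, 70%      ", p99(simulate_latencies(n, 50.0, 0.7)))
    print("Poisson, 90%      ", p99(simulate_latencies(n, 50.0, 0.9)))
    print("Poisson, 70%, 4 wk", p99(simulate_latencies(n, 50.0, 0.7, workers=4)))
```

With evenly spaced arrivals no request ever queues, so every latency equals the processing time. Poisson arrivals produce a tail from burstiness alone, and that tail grows with utilization and shrinks when more workers share the queue.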

  33. Measurements: Sources of Tail Latencies. Outline: (1) Introduction. (2) Predicted Latency from Queuing Models. (3) Measurements: Sources of Tail Latencies. (4) Summary.

  34. Testbed: a cluster of standard datacenter machines, each with 2 x Intel L5640 6-core CPUs, 24 GB of DRAM, a Mellanox 10 Gbps NIC, and Ubuntu 12.04 with Linux kernel 3.2.0. All servers are connected to a single 10 Gbps ToR switch. One server runs Memcached; the others run workload-generating clients. Results for the other applications are in the paper.

  35. Timestamping methodology: append a blank buffer of ≈ 32 bytes to each request, and overwrite the buffer with timestamps as the request goes through the server. Timestamp points: incoming server NIC; after TCP/UDP processing; Memcached thread scheduled on CPU; Memcached read() return; Memcached write(); outgoing server NIC. Very low overhead and no server-side logging.
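
The idea of carrying timestamps inside the request can be illustrated in user space. The sketch below is not the authors' implementation, which stamps the buffer at points inside the NIC, kernel, and Memcached paths. The slot layout (4-byte little-endian microsecond counters), the stage names, and the helper functions are all hypothetical:

```python
import struct

TRAILER_BYTES = 32          # size of the blank buffer appended to each request
SLOT = struct.Struct("<I")  # hypothetical layout: one 4-byte counter per stage
STAGES = (                  # hypothetical names for the six timestamp points
    "incoming_nic",
    "after_tcp_udp",
    "thread_scheduled",
    "read_return",
    "write_call",
    "outgoing_nic",
)


def append_trailer(payload: bytes) -> bytearray:
    """Return the request with a zeroed timestamp trailer appended."""
    return bytearray(payload) + bytearray(TRAILER_BYTES)


def stamp(packet: bytearray, stage: str, time_us: int) -> None:
    """Overwrite the slot for `stage` in place with a timestamp."""
    offset = len(packet) - TRAILER_BYTES + STAGES.index(stage) * SLOT.size
    SLOT.pack_into(packet, offset, time_us & 0xFFFFFFFF)


def read_stamps(packet) -> dict:
    """Decode all slots; a stage that was never stamped reads as 0."""
    base = len(packet) - TRAILER_BYTES
    return {
        stage: SLOT.unpack_from(packet, base + i * SLOT.size)[0]
        for i, stage in enumerate(STAGES)
    }


if __name__ == "__main__":
    packet = append_trailer(b"get key")
    stamp(packet, "incoming_nic", 100)
    stamp(packet, "outgoing_nic", 175)
    print(read_stamps(packet))
```

Because each stage only overwrites a fixed slot in a buffer that already travels with the request, the client can recover the per-stage breakdown from the reply, and the server does not need to keep a log.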

  37. How far are we from the ideal? [Figure: CCDF P[X >= x] (10^0 down to 10^-4) versus latency in microseconds (10^1 to 10^4), log-log axes; curve: Ideal Model.] Single CPU, single core, Memcached running at 80% utilization.
