  1. Serverless Performance on a Budget (Erwin van Eyk)

  2-4. The central trade-off in serverless computing
     - High Performance: "infinite" scaling, high availability, low latency.
     - Low Cost: no costs when idle, no operational cost, granular billing.
     - How can we optimize the performance-cost trade-off?

  5. Anatomy of a Functions-as-a-Service (FaaS) platform
     - Function configuration: environment variables, arguments, version, source pointer, ...
     - Pods and other resources

  6. Anatomy of a FaaS platform

  7. Anatomy of a FaaS platform: Fission (without optimizations)

  8-16. Anatomy of a FaaS platform: cold start, step by step:
     0. Incoming request
     1. Trigger function deployment
     2. Fetch function metadata
     3. kubectl create
     4. Wait for K8s to deploy the function
     5. Send request
     6. Response
     7. Return response to the caller

  17-21. Anatomy of a FaaS platform: warm execution (only steps 0 and 5-7 of the cold-start flow):
     0. Incoming request
     5. Send request
     6. Response
     7. Return response to the caller

  22-31. Cold start vs. warm execution
     - Cold Start: Trigger deployer → Fetch function metadata → Deploy Pod → Fetch function → Deploy function → Route request → Function Execution
     - Warm Execution: Route request → Function Execution
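The two paths above can be sketched as sums of per-stage latencies. The stage names follow the pipeline in the slide; the millisecond values are invented for illustration only:

```python
# Sketch comparing the cold and warm execution paths as sums of per-stage
# latencies. Stage names follow the pipeline above; the millisecond values
# are invented placeholders, not measurements.

COLD_PATH_MS = {
    "trigger deployer": 5,
    "fetch function metadata": 10,
    "deploy pod": 2000,       # dominated by container/pod scheduling
    "fetch function": 300,    # transferring function sources
    "deploy function": 100,   # specializing the runtime
    "route request": 5,
    "function execution": 50,
}
WARM_PATH_MS = {
    "route request": 5,
    "function execution": 50,
}

cold_latency = sum(COLD_PATH_MS.values())
warm_latency = sum(WARM_PATH_MS.values())
```

Whatever the real numbers, the structural point holds: the warm path only pays for routing and execution, while the cold path additionally pays for deployment, which typically dominates.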

  32. Cold starts matter! [Figure: cold-start latency (in ms) over 168 hours; callouts at 180 ms, 500 ms, and 3600 ms.]
      Wang, Liang, et al. "Peeking Behind the Curtains of Serverless Platforms." 2018 USENIX ATC, 2018.

  33. How do FaaS platforms improve their performance? And at what cost?
     1. Function resource reusing
     2. Function runtime pooling
     3. Function prefetching
     4. Function prewarming

  34. Optimization 1: Function Resource Reusing
      (Pipeline: Trigger deployer → Fetch function metadata → Deploy pod → Fetch function → Deploy function → Route request → Function Execution)

  35. Function isolation vs. function reuse: a spectrum from full isolation (a fresh function instance per request/response pair) to full resource reuse (one function instance serving many requests and responses).

  36. Function resource reusing in practice
     - Why performance isolation: performance variability.
     - In practice, all FaaS platforms reuse resources:
       - Per-user binpacking
       - Functions are isolated
       - Function executions share resources

  37. FaaS platform with function reusing

  38. Trade-off: how long to keep functions alive?
     - To reuse functions, we have to keep them alive.
     - Keep-alive in practice: AWS: ~6 hours; Google: ~6 hours; Azure: 1-4 days.
     - Long keep-alives: more warm executions. Short keep-alives: fewer idle resources.
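The keep-alive policy above can be sketched as a tiny instance cache with an idle-timeout eviction rule. Timestamps are passed explicitly so the example is deterministic; a real platform would use wall-clock time, and all names here are illustrative:

```python
# Sketch of the keep-alive trade-off: an instance cache that treats a
# request as warm only if the function was used within the keep-alive
# window, otherwise it pays a cold start.

class InstanceCache:
    def __init__(self, keep_alive_s):
        self.keep_alive_s = keep_alive_s
        self._last_used = {}  # function name -> last-used timestamp (s)

    def request(self, fn, now):
        """Return 'warm' if a live instance is reused, 'cold' otherwise."""
        last = self._last_used.get(fn)
        warm = last is not None and now - last <= self.keep_alive_s
        self._last_used[fn] = now
        return "warm" if warm else "cold"

cache = InstanceCache(keep_alive_s=6 * 3600)        # AWS-like ~6 h window
first = cache.request("resize", now=0)               # first hit: cold
second = cache.request("resize", now=3600)           # within window: warm
third = cache.request("resize", now=3600 + 7 * 3600) # idle too long: cold
```

A longer `keep_alive_s` turns more requests warm at the price of holding idle instances, which is exactly the performance-cost knob the slide describes.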

  39. Optimization 2: Function Runtime Pooling
      (Pipeline: Trigger deployer → Fetch function metadata → Deploy pod → Fetch function → Deploy function → Route request → Function Execution)

  40. Function Instance = Runtime + Function
     - Insight: function instances consist of two parts:
       - Function-specific code: user-provided business logic.
       - Runtime: operational logic, monitoring, health checks, ...
     - Divide the deployment process into two stages:
       - Deploy the runtime → unspecialized runtime or "stem cell".
       - Deploy the function to the runtime → specialized function instance.

  41. Resource pooling: common in many domains (e.g., thread pools). The platform maintains a pool of unspecialized function runtimes; on demand, runtimes are taken from the pool and turned into function instances, and the pool is rebalanced (refilled) afterwards.

  42. FaaS platform with function runtime pooling

  43. Trade-off: how big should the pool be?
     - Large pool: handles high concurrency (performance), but increases resource overhead.
     - Minimal pool: fewer idle resources (minimizes cost), but risks fast pool exhaustion.
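The pooling mechanism and its sizing trade-off can be sketched as follows. A request takes a pre-deployed generic runtime and specializes it, skipping pod deployment; once the pool is exhausted, requests fall back to a full cold start. All names are illustrative, not Fission's actual API:

```python
# Sketch of function runtime pooling: a pool of unspecialized ("stem
# cell") runtimes is kept ready. Specializing a pooled runtime is the
# fast path; an empty pool forces a full cold start.

class RuntimePool:
    def __init__(self, size):
        self.idle_runtimes = size  # pre-deployed generic runtimes

    def acquire(self, fn):
        """Specialize a pooled runtime, or fall back to a cold start."""
        if self.idle_runtimes > 0:
            self.idle_runtimes -= 1
            return f"{fn}: specialized pooled runtime (fast path)"
        return f"{fn}: full cold start (pool exhausted)"

pool = RuntimePool(size=2)
outcomes = [pool.acquire(f"fn-{i}") for i in range(3)]
# the third request arrives after the pool is drained
```

Choosing `size` is the trade-off on this slide: a larger pool absorbs more concurrent cold starts but burns idle resources; a minimal pool is cheap but exhausts quickly under bursts.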

  44. Optimization 3: Function Prefetching
      (Pipeline: Trigger deployer → Fetch function metadata → Deploy pod → Fetch function → Deploy function → Route request → Function Execution)

  45. Function prefetching: fetch function sources proactively and place them near resources to reduce function transfer latency.
     - Software flow has a big impact on cold-start durations.
     - Function sources (tens of MBs) have to be retrieved and transferred to the resources.
     - Especially important for geo-distributed and edge use cases: AWS Lambda@Edge, Cloudflare.
     Abad, Cristina L., et al. "Package-Aware Scheduling of FaaS Functions." Companion of the 2018 ACM/SPEC International Conference on Performance Engineering. ACM, 2018.

  46-47. Prefetching tiers: Remote Storage → Cluster-level → Rack/Machine-level → Function-level. Prefetching to a nearer tier lowers latency but increases storage costs; keeping sources in remote storage means higher latency but lower storage costs.
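The tiering above can be sketched as a lookup that walks from the nearest (lowest-latency) tier outward until it finds the function's sources. The tier latencies and placements below are invented for illustration:

```python
# Sketch of tiered prefetching: function sources may be cached at several
# levels between remote storage and the executing machine. A lookup walks
# from the nearest tier outward; prefetching a function to a nearer tier
# trades storage cost for transfer latency.

TIER_LATENCY_MS = {          # nearest to farthest (illustrative values)
    "function-level": 1,
    "machine-level": 5,
    "cluster-level": 50,
    "remote-storage": 500,
}

def fetch_latency(fn, placements):
    """Return (tier, latency) for the nearest tier holding fn's sources."""
    for tier, latency in TIER_LATENCY_MS.items():
        if fn in placements.get(tier, set()):
            return tier, latency
    raise KeyError(fn)

placements = {
    "machine-level": {"resize"},                  # prefetched to the machine
    "remote-storage": {"resize", "translate"},    # everything lives here too
}
hit = fetch_latency("resize", placements)       # served from machine level
miss = fetch_latency("translate", placements)   # must go to remote storage
```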

  48. FaaS platform with prefetching

  49. Optimization 4: Function Prewarming
      (Pipeline: Trigger deployer → Fetch function metadata → Deploy pod → Fetch function → Deploy function → Route request → Function Execution)

  50. Function prewarming: anticipate function executions by deploying functions predictively.
     - Prewarming or predictive scheduling in other domains: CPU branch predictors, proactive autoscalers, predictive caches.
     van Eyk, Erwin, et al. "A SPEC RG CLOUD Group's Vision on the Performance Challenges of FaaS Cloud Architectures." Companion of the 2018 ACM/SPEC International Conference on Performance Engineering. ACM, 2018.


  53. Predicting function executions is hard... It is an active field of research (autoscaling, predictive caches, ...). Common approaches:
     1. Runtime analysis: rule-based, pattern recognition and machine learning, artificial intelligence.
     2. Exploit additional information about functions: dependency knowledge in function compositions, interval triggers.
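Interval triggers are the easiest case to exploit: if a function is known to fire every N seconds, its next execution is predictable, so deployment can start one cold-start duration ahead of it. A minimal sketch, with all names and numbers invented:

```python
# Sketch of exploiting interval triggers for prewarming: for a cron-like
# function with a fixed trigger interval, start deploying early enough
# that the instance is warm exactly when the next trigger fires.

def prewarm_time(last_trigger_s, interval_s, cold_start_s):
    """When to start deploying so the instance is warm at the next trigger."""
    next_trigger = last_trigger_s + interval_s
    # Never schedule before the last trigger (cold start longer than interval).
    return max(last_trigger_s, next_trigger - cold_start_s)

# A function firing every 300 s with a 20 s cold start, last fired at t=1000:
t = prewarm_time(last_trigger_s=1000, interval_s=300, cold_start_s=20)
# deploy at t = 1280, so the instance is ready at the 1300 s trigger
```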

  54. ...and involves a trade-off.
     - Optimistic prewarming (low threshold): more performance due to prewarming, but a misprediction means resources are wasted.
     - Pessimistic prewarming (high threshold): lower costs due to fewer mispredicted prewarms, but a misprediction means no prewarm. (Users work around missed prewarms with the "ping hack": periodically pinging functions to keep them warm.)
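The threshold trade-off can be made concrete with a toy evaluation: a predictor emits a probability that a function will be invoked soon, and prewarming happens only above a threshold. The probabilities below are invented:

```python
# Sketch of the prewarming threshold trade-off: lowering the threshold
# (optimistic) trades wasted deployments for fewer cold starts; raising
# it (pessimistic) does the opposite.

def evaluate(predictions, actuals, threshold):
    """Count cold starts suffered and prewarms wasted at a given threshold."""
    cold_starts = wasted = 0
    for p, invoked in zip(predictions, actuals):
        prewarmed = p >= threshold
        if invoked and not prewarmed:
            cold_starts += 1    # missed prediction: user pays the cold start
        if prewarmed and not invoked:
            wasted += 1         # false alarm: platform pays idle resources
    return cold_starts, wasted

preds = [0.9, 0.6, 0.3, 0.2]            # predicted invocation probabilities
actual = [True, True, False, True]       # whether an invocation happened
optimistic = evaluate(preds, actual, threshold=0.25)   # prewarm eagerly
pessimistic = evaluate(preds, actual, threshold=0.8)   # prewarm rarely
```

In this toy data the optimistic policy suffers fewer cold starts but wastes a prewarm, while the pessimistic one wastes nothing but lets more cold starts through, mirroring the two columns on the slide.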

  55. Function composition...
     - Connect existing functions into complex function compositions.
     - A workflow engine takes care of the plumbing and provides fully monitorable, fault-tolerant function compositions with low overhead.
     - Example functions (composed sequentially or in parallel): validate-image, image-recognizer, translate-text, image-resizer, combine-image-text.

  56-57. ...with prewarming: Fission Workflows supports horizon-based prewarming. Task states in the workflow visualization: finished, started, prewarmed, not started.
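The idea behind horizon-based prewarming can be sketched on a toy workflow graph: while a task runs, the tasks within a fixed look-ahead horizon of it are prewarmed. This is a simplification of what an engine like Fission Workflows does; the DAG below uses the slide's example functions in an assumed (illustrative) order:

```python
# Sketch of horizon-based prewarming: from the currently running task,
# find all successor tasks reachable within `horizon` hops in the
# workflow DAG; those are the candidates to prewarm now.

def within_horizon(dag, running, horizon):
    """Successor tasks reachable from `running` in at most `horizon` hops."""
    frontier, reachable = {running}, set()
    for _ in range(horizon):
        frontier = {nxt for task in frontier for nxt in dag.get(task, [])}
        reachable |= frontier
    return reachable

# Sequential toy workflow (ordering of the slide's functions is assumed):
dag = {
    "validate-image": ["image-recognizer"],
    "image-recognizer": ["translate-text"],
    "translate-text": ["image-resizer"],
    "image-resizer": ["combine-image-text"],
}
# With validate-image running and a horizon of 2, prewarm the next two tasks:
prewarm = within_horizon(dag, "validate-image", horizon=2)
```

Dependency knowledge makes this predictor far more reliable than runtime analysis alone: once an upstream task starts, downstream tasks within the horizon are near-certain to execute.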
