Handling Flash Crowds From Your Garage, by Jeremy Elson and Jon Howell


  1. Handling Flash Crowds From Your Garage
     Jeremy Elson and Jon Howell, Microsoft Research
     USENIX ATC 2008
     Tuesday, August 3, 2010

  4. Scaling For Many Users... quickly... on a budget

  5. Enabled by Utility Computing
     • Pay proportional to use
     • Minimal starting cost (often free!)
     • Quickly scalable

  6. Examples of Utility Computing

  11. Storage Data Networks
      • SDNs store static content in large clusters
      • Expose HTTP interface
      • Different from CDNs:
        • No startup or recurring cost
        • Fee based on usage
      • Examples: S3, Nirvanix

  12. Compute Clouds
      • Allow developers to configure and save a VM image
      • Instantiate images as needed
      • Fee based on “compute units”
      • Examples: EC2, Flexiscale

  13. DNS Outsourcing
      • Used to host authoritative DNS servers
      • Many replicas provide reliability and performance
      • Examples: UltraDNS and DynDNS

  14. Outline
      • Introduction
      • Examples of Utility Computing
      • Scaling Architectures
      • Analyzing the Design Space
      • Application Experiences

  15. Scaling Architectures

  16. Using the bare SDN
      • Works only if content is predominantly static

  18. HTTP Redirection
      • Uses a dedicated redirection server
      • Redirector sends the client to server S on first access
      • All future interactions go directly to server S
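The redirection flow on this slide can be sketched with Python's standard `http.server` module. This is a minimal illustration, not the authors' implementation: the back-end URLs, the round-robin policy, and the names `BACKENDS`, `redirect_location`, `Redirector`, and `serve` are all assumptions made for the example.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical back-end pool; a real deployment would update this list
# as compute-cloud instances start and stop.
BACKENDS = ["http://s1.example.com", "http://s2.example.com"]
_next_backend = itertools.cycle(BACKENDS)


def redirect_location(path: str) -> str:
    """Pick the next back-end round-robin and build the redirect target."""
    return next(_next_backend) + path


class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        # First access: answer with a 302 pointing at the chosen server S.
        # The client then talks to S directly for the rest of its session.
        self.send_response(302)
        self.send_header("Location", redirect_location(self.path))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the redirector until interrupted (not called automatically)."""
    HTTPServer(("", port), Redirector).serve_forever()
```

Because the redirector only handles the first request of each session, its load stays small when sessions are long, which is the condition the later "Scale Limitations" slide attaches to this architecture.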

  19. L4/L7 Load Balancing
      • Spread incoming traffic across servers
      • Spreading based on source TCP port or HTTP session
      • Free/utility load balancers available:
        • HAProxy, Linux Virtual Server, Flexiscale, EC2 load balancing
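The two spreading keys named on this slide can be illustrated with a small sketch. The server addresses and the functions `pick_by_port` and `pick_by_session` are hypothetical; real balancers such as HAProxy use their own, more elaborate policies.

```python
import zlib

# Hypothetical back-end addresses.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_by_port(src_port: int) -> str:
    """L4-style choice: map the client's source TCP port onto a server."""
    return SERVERS[src_port % len(SERVERS)]


def pick_by_session(session_id: str) -> str:
    """L7-style choice: hash an HTTP session ID so that every request
    in the same session reaches the same server."""
    return SERVERS[zlib.crc32(session_id.encode()) % len(SERVERS)]
```

The L7 variant needs to parse the application protocol to find the session ID, which is why it only works for applications that share that protocol.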

  24. DNS Load Balancing
      • Advertise many IP addresses for single hostname
      • Use DNS to spread load across servers
      • Example: RightScale, or build your own using UltraDNS
      • Relies on good DNS resolver behavior
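One common way to spread load with DNS is to rotate the order of the address records in each answer. The sketch below simulates that rotation in memory; it is not a DNS server, and the class name `RoundRobinZone` and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration.

```python
from collections import deque


class RoundRobinZone:
    """Simulates an authoritative server that advertises several
    addresses for one hostname and rotates them on every query."""

    def __init__(self, addresses):
        self._addrs = deque(addresses)

    def answer(self):
        # Return the current ordering, then rotate so the next query
        # sees a different address first.
        result = list(self._addrs)
        self._addrs.rotate(-1)
        return result


zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first_choices = [zone.answer()[0] for _ in range(4)]
```

Load only spreads evenly if clients and resolvers actually use the ordering they are given, which is the "good DNS resolver behavior" the slide depends on.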

  25. Outline
      • Introduction
      • Examples of Utility Computing
      • Scaling Architectures
      • Analyzing the Design Space
      • Application Experiences

  26. Analyzing the Design Space
      • Application Scope
      • Scale Limitation
      • Scale-up Time
      • Fault Tolerance

  30. Application Scope
      • SDN: Only HTTP
      • HTTP Redirection: Only HTTP
      • L7 Load Balancing: All apps with the same L7 protocol
      • L4/DNS Load Balancing: All apps

  35. Scale Limitations
      • SDNs: Limited only by provider’s SLA
      • HTTP Redirection: Very scalable if client sessions are long
      • L4/L7 Load Balancing: Scales if app is not communication-bound
      • DNS Load Balancing: Scales to thousands of servers

  39. Scale-Up Times
      • SDN: Scales up very quickly
      • L4/L7 Load Balancing + HTTP Redirection
        • ~1-minute scale-up time
      • DNS Load Balancing
        • Scale-up time hampered by DNS caching
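Why DNS caching slows scale-up can be seen in a deliberately idealized model. Assume every resolver honors the record's TTL and that cache ages are spread uniformly at the moment a new server is added; under those assumptions (which are this example's, not measurements from the talk) the share of resolvers that have picked up the new record grows linearly until one TTL has passed. The function name `fraction_refreshed` is hypothetical.

```python
def fraction_refreshed(seconds_since_change: float, ttl: float) -> float:
    """Idealized share of resolvers whose cached record has expired,
    and therefore been refreshed, since the DNS change was published."""
    if ttl <= 0:
        return 1.0  # no caching: every lookup sees the new record
    return max(0.0, min(1.0, seconds_since_change / ttl))


# With a 60-second TTL, half the resolvers have refreshed after 30 seconds.
half_way = fraction_refreshed(30, 60)
```

Resolvers that ignore or extend TTLs make the real picture worse than this model suggests.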

  43. Front-End Failures
      • SDN: Failure unlikely
      • L4/L7 Load Balancers
        • Single load-balancer fails miserably
        • Need hot spares or DNS load-balanced front-ends
      • HTTP Redirectors
        • Existing sessions don’t fail
      • DNS Load Balancers
        • With replicated DNS, failover is very quick

  44. Latency when DNS Servers Fail

  47. Back-end Failures
      • SDN: Not our problem!
      • HTTP Redirector + L4/L7 Load Balancer
        • Transient failures at worst
      • DNS Load Balancer
        • Bad! Several minutes of delay...

  48. Application Experiences

  49. MapCruncher
      • Tool to author AJAX-style interactive maps
      • Server only retrieved from a 25 GB set of map images
      • S3 solved the flash crowd problem
      • ~$200 total cost

  50. Asirra CAPTCHA service
      • Images stored in S3
      • Based on EC2 and Python
      • DNS-based load balancing
      • Handled 75,000 requests on first day, including a DoS attack
      • Starting more servers solved the DoS problem

  51. InkblotPassword.com
      • Implemented in Python on EC2
      • No code optimization
      • Depends on starting new servers for scale
      • ~10,000 users registered in a day
      • ~$150 spent to handle Slashdotting

  52. Conclusion
      • Utility computing enables cheap handling of flash crowds
      • No “best” architecture
      • Depends on application workload
      • Enable scaling to many instances instead of optimizing one instance
