  1. Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant Datacenter Networks. Henrique Rodrigues, Yoshio Turner, Jose Renato Santos, Paolo Victor, Dorgival Guedes (HP Labs). WIOV 2011, Portland, OR

  2. The Problem: Network Performance Isolation

  4. Suppose that you have a datacenter …

  5. And you are an IaaS provider… [figure: one tenant with a 70% BW guarantee]

  6. And you are an IaaS provider… [figure: two tenants with 70% BW and 30% BW guarantees]

  7. ... and your network faces this traffic pattern: [figure: the 70% BW and 30% BW tenants both send TCP traffic over the shared link]

  9. ... and your network faces this traffic pattern: [figure: the 70% BW and 30% BW tenants both send TCP traffic over the shared link] TCP is flow-based, not tenant-aware: bandwidth is split per flow, so a tenant that opens more flows captures more than its guaranteed share.
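
A quick back-of-the-envelope sketch of why flow-level fairness breaks tenant-level guarantees; the 1-flow vs. 3-flow split mirrors the example used later in the talk, and the numbers are purely illustrative:

```python
# Illustrative only: per-flow fair sharing vs. per-tenant guarantees on a 1 Gb/s link.
LINK_MBPS = 1000
guarantees = {"A": 0.70, "B": 0.30}   # tenant bandwidth guarantees (fraction of link)
flows = {"A": 1, "B": 3}              # tenant B simply opens more flows

per_flow = LINK_MBPS / sum(flows.values())   # TCP converges to roughly equal per-flow shares
for tenant, n in flows.items():
    got = n * per_flow
    want = guarantees[tenant] * LINK_MBPS
    print(f"Tenant {tenant}: ~{got:.0f} Mb/s achieved vs. {want:.0f} Mb/s guaranteed")
# Tenant A ends up near 250 Mb/s despite a 700 Mb/s guarantee;
# Tenant B gets about 750 Mb/s with only a 300 Mb/s guarantee.
```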

  10. It becomes worse with these transport protocols: [figure: the 70% BW tenant sends TCP while the 30% BW tenant sends UDP]

  11. It becomes worse with these transport protocols: [figure: TCP tenant vs. UDP tenant] UDP has no congestion control, so it will consume most of the bandwidth.

  12. It becomes worse with these transport protocols: [figure: TCP tenant vs. UDP tenant] Using rate limiters at each server doesn't solve the problem…

  13. It becomes worse with these transport protocols: suppose you limit the TX rate of each UDP sender to 30% of the link. Using rate limiters at each server doesn't solve the problem…

  14. It becomes worse with these transport protocols: with the senders capped at 30% each, the aggregate RX at the receiving link still reaches 90% (e.g., three senders at 30%), so the TCP tenant's 70% guarantee is still violated. Using rate limiters at each server doesn't solve the problem…

  15. The Problem: Network Performance Isolation
      • How can we ensure that all tenants get at least the minimum amount of network resources they need to keep their services up?
      o In other words, how do we provide network performance isolation in multi-tenant datacenters?

  16. Practical requirements for a traffic isolation mechanism/system

  17. Requirements for a practical solution
      • Scalability: a datacenter supports thousands of physical servers hosting tens of thousands of tenants and tens to hundreds of thousands of VMs
      • Intuitive service model: straightforward for tenants to understand and specify their network performance needs
      • Robustness against untrusted tenants: the IaaS model lets tenants run arbitrary code, giving them total control over their network stack; malicious users could jeopardize the performance of other tenants
      • Flexibility / Predictability: what should we do with idle bandwidth? Work-conserving vs. non-work-conserving?

  18. Existing solutions don't meet all these requirements
      [comparison table: TCP, BW capping (policing), SecondNet, Seawall, and AF-QCN rated against Scalability, Intuitive Model, Robustness, and Flexibility/Predictability; each solution fails at least one of the four criteria]

  19. Our approach

  20. Assumption: bisection bandwidth should not be a problem
      • Emerging multi-path technologies will enable high-bandwidth networks with full bisection bandwidth
      • Smart tenant placement: tenant VMs are placed close to each other in the network topology
      • Results from DC traffic analysis show that most congestion happens within racks, not at the core

  21. Our approach
      • Assume the core is over-provisioned and manage bandwidth at the edge
      o Addresses the scalability challenge: only a limited number of tenants share each edge link

  22. Tenant Performance Model Abstraction
      [figure: each tenant's VMs appear connected to a single virtual switch, with a bandwidth value BW1 … BW10 on each vNIC]
      • Simple abstraction presented to the tenant
      o Model similar to physical servers connected to a switch
      • Guaranteed bandwidth for each VM (TX and RX)
      o Minimum and maximum rate per vNIC
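
A minimal sketch of how the per-vNIC guarantee in this model could be represented; the class and field names are hypothetical, not Gatekeeper's actual interface:

```python
# Hypothetical representation of the tenant performance model:
# each vNIC gets minimum (guaranteed) and maximum rates for TX and RX.
from dataclasses import dataclass

@dataclass
class VNicGuarantee:
    vm: str
    min_tx_mbps: float   # guaranteed transmit rate
    max_tx_mbps: float   # transmit cap (link rate if allowed to use idle bandwidth)
    min_rx_mbps: float   # guaranteed receive rate
    max_rx_mbps: float   # receive cap

# Two tenants sharing a 1 Gb/s access link, as in the example later in the talk.
tenant_a = [VNicGuarantee("A-vm1", 700, 1000, 700, 1000)]  # 70% guaranteed
tenant_b = [VNicGuarantee("B-vm1", 300, 1000, 300, 1000)]  # 30% guaranteed
```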

  23. Gatekeeper
      • Provides network isolation for multi-tenant datacenters using a distributed mechanism
      • Agents implemented at the virtualization layer coordinate bandwidth allocation dynamically, based on tenants' guarantees

  24. Gatekeeper
      • Agents in the VMM control transmission (TX) and coordinate reception (RX)

  25. Gatekeeper - Overview [figure: the 70% BW TCP tenant and the 30% BW UDP tenant share a link]

  26. Gatekeeper - Overview [figure: the receiving link becomes congested ("Congestion!")]

  27. Gatekeeper - Overview [figure: the receiver's agent signals the sending agents, and each sender acknowledges with "OK! Reducing TX"]
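
A rough sketch of the kind of receive-side control loop these slides imply: the receiving host's agent notices that a tenant below its guarantee is being crowded out and asks the remote sending agents to lower their TX limits. The thresholds, adjustment rule, and function names here are assumptions for illustration, not Gatekeeper's published algorithm:

```python
# Illustrative receive-side feedback loop (assumed details, not the actual algorithm).

def rx_agent_tick(measured_rx_mbps, min_rx_mbps, link_mbps, request_tx_limit):
    """Runs periodically in the receiving hypervisor's agent.

    measured_rx_mbps: observed receive rate per tenant on this host's link
    min_rx_mbps:      per-tenant minimum RX guarantee
    request_tx_limit: callback asking a tenant's remote senders to cap aggregate TX
    """
    congested = sum(measured_rx_mbps.values()) >= 0.95 * link_mbps
    if not congested:
        return  # link has headroom; let everyone use idle bandwidth
    for tenant, got in measured_rx_mbps.items():
        if got < min_rx_mbps[tenant]:
            # Someone is below its guarantee: throttle tenants exceeding theirs.
            for other, other_got in measured_rx_mbps.items():
                if other != tenant and other_got > min_rx_mbps[other]:
                    new_limit = max(min_rx_mbps[other], 0.9 * other_got)
                    request_tx_limit(other, new_limit)  # senders reply "OK! Reducing TX"
```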

  28. Gatekeeper Architecture

  29. Gatekeeper Prototype
      o Xen/Linux
      o Gatekeeper integrated into Open vSwitch in Linux
      o Leverages the Linux traffic control mechanism (HTB) for rate control
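
Since the prototype leans on Linux HTB for rate control, here is a generic sketch of how per-vNIC HTB classes can be installed from the host. The interface name, class IDs, and rates are assumptions, and this shows standard tc/HTB usage rather than the prototype's actual configuration:

```python
# Generic HTB setup for one VM's backend interface (illustrative values, requires root).
import subprocess

def tc(cmd: str) -> None:
    subprocess.run(["tc"] + cmd.split(), check=True)

DEV = "vif1.0"  # hypothetical Xen backend interface for the VM's vNIC

# Root HTB qdisc, then a class whose 'rate' is the guaranteed bandwidth and
# whose 'ceil' lets it borrow idle bandwidth (work-conserving behavior).
tc(f"qdisc add dev {DEV} root handle 1: htb default 10")
tc(f"class add dev {DEV} parent 1: classid 1:10 htb rate 700mbit ceil 1000mbit")
```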

  30. Example - RX
      Two tenants share a gigabit link:
      • Tenant A: 70% of the link, 1 TCP flow
      • Tenant B: 30% of the link, 3 flows (TCP or UDP)

  31. Example - TX
      Two tenants share a gigabit link:
      • Tenant A: 70% of the link, 1 TCP flow
      • Tenant B: 30% of the link, 3 flows (TCP or UDP)

  32. Example – Results without Gatekeeper
      [bar chart, Transmit (TX) scenario: throughput (0-1000 Mb/s) of Tenant A (TCP) and Tenant B under "no control" and "TX rate cap", as tenant B's traffic type varies over none, TCP, UDP]

  33. Example – Results without Gatekeeper
      [bar charts, Transmit (TX) and Receive (RX) scenarios: throughput (0-1000 Mb/s) of Tenant A (TCP) and Tenant B under "no control" and TX/RX rate caps, as tenant B's traffic type varies over none, TCP, UDP]

  34. Example – Results without Gatekeeper
      • Bandwidth capping doesn't reallocate unused bandwidth (non-work-conserving)
      • UDP consumes most of the switch resources
      [bar charts, TX and RX scenarios as in the previous slide]

  35. Example – Results with Gatekeeper
      [bar charts, Transmit (TX) and Receive (RX) scenarios: throughput (0-1000 Mb/s) of Tenant A (TCP) and Tenant B under "no control", rate cap, "Gatekeeper predictable", and "Gatekeeper flexible", as tenant B's traffic type varies over none, TCP, UDP]

  36. Summary
      • Gatekeeper provides network bandwidth guarantees at the server virtualization layer
      § Extends the hypervisor to control RX bandwidth
      • A prototype was implemented and used to demonstrate Gatekeeper in a simple scenario
      • Future work
      § Evaluate Gatekeeper at larger scales on the HP Labs Open Cirrus testbed (100+ nodes)
      § Further explore the design space (e.g., functions to decrease/increase rates)
      § Evaluate Gatekeeper with more realistic benchmarks and applications

  37. Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant Datacenter Networks. Contacts: {hsr,dorgival}@dcc.ufmg.br, {yoshio_turner,joserenato.santos}@hp.com. Acknowledgements: Brasil. WIOV 2011, Portland, OR
