

  1. Flexible Networking at Large Mega-Scale
     Exploring issues and solutions

  2. What is “Mega-Scale”?
     One or more of:
     ● > 10,000 compute nodes
     ● > 100,000 IP addresses
     ● > 1 Tb/s aggregate bandwidth
     ● Massive East/West traffic between tenants
     Yahoo is “Mega-Scale”

  3. What are our goals?
     ● Mega-Scale, with
       ○ Reliability
         ■ Yahoo supports ~200 million users/day -- it must be reliable
       ○ Flexibility
         ■ Yahoo has 100s of internal and user-facing services
       ○ Simplicity
         ■ Undue complexity is the enemy of scale!

  4. Our Strategy
     Leverage high-performance network design with:
     ➢ OpenStack
     ➢ Augmented with additional automation
     ➢ Hosting applications designed to be “disposable”
     Fortunately, we already had many of the needed pieces.

  5. Traditional network design
     ● Large layer 2 domains
     ● Cheap to build and manage
     ● Allows great flexibility of solutions
     ● Leverages pre-existing network design
     ● IP mobility across the entire domain
     It’s simple. But...

  6. L2 Networks Have Limits
     ● The L2 domain can only be extended so far
       ○ Hardware TCAM limitations (size and update rate)
       ○ STP scaling/stability issues
     ● But an L3 network can
       ○ scale larger
       ○ at less cost
       ○ though it limits flexibility

  7. Potential Solutions
     ● Why not use a Software-Defined Network?
       ○ Overlay allows IP mobility, but
         ■ Control plane limits scale and reliability
         ■ Overhead at on-ramp boundaries
       ○ OpenFlow-based solutions
         ■ Not ready for mega-scale yet with L3 support
         ■ Control plane complexities
     Not Ready for Mega-Scale

  8. Our Solution
     ● Use a Clos-design network backplane
     ● Each cabinet has a Top-of-Rack router
       ○ Each cabinet is a separate L2 domain
       ○ Cabinets “own” one or more subnets (CIDRs)
       ○ OpenStack is patched to “know” which subnet to use (see the sketch below)
     ● Network backplane supports East-West and North-South traffic equally well
     ● Structure is ideal if we decide to deploy an SDN overlay
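The following is a minimal sketch, in Python, of the kind of per-rack subnet selection this implies. The RACK_SUBNETS and HOST_RACK tables and the select_subnet helper are hypothetical stand-ins for illustration, not the actual patch described in the talk.

```python
# Hypothetical sketch (not the actual patch): choose the subnet a new VM must
# use based on which rack "owns" the hypervisor it was scheduled onto.
# RACK_SUBNETS, HOST_RACK and select_subnet are illustrative names.
import ipaddress

# Each cabinet owns one or more CIDRs, announced by its ToR router.
RACK_SUBNETS = {
    "rack-a01": [ipaddress.ip_network("10.20.1.0/24")],
    "rack-a02": [ipaddress.ip_network("10.20.2.0/24"),
                 ipaddress.ip_network("10.20.3.0/24")],
}

# Hypervisor -> rack mapping, normally sourced from inventory data.
HOST_RACK = {"hv-a01-01": "rack-a01", "hv-a02-07": "rack-a02"}


def select_subnet(hypervisor, used_ips):
    """Return the least-utilized subnet owned by the hypervisor's rack."""
    candidates = RACK_SUBNETS[HOST_RACK[hypervisor]]
    return min(candidates,
               key=lambda net: used_ips.get(net, 0) / net.num_addresses)


# Example: the scheduler placed an instance on hv-a02-07.
print(select_subnet("hv-a02-07", used_ips={}))
```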

  9. A solution for scale: Layer 3 to the rack
     [Diagram: Clos L3 backplane; each compute or admin rack is its own L2 domain behind a ToR router]
     • Clos-based L3 network
     • TOR (Top-of-Rack) routers
     Admin = API, DB, MQ, etc.

  10. Adding Robustness With Availability Zones

  11. Problems
      ● No IP Mobility Between Cabinets
        ○ Moving a VM between cabinets requires a re-IP
        ○ Many small subnets rather than one or more large ones
        ○ Scheduling complexities:
          ■ Availability zones, rack-awareness
      ● Other issues
        ○ Coordination between clusters
        ○ Integration with existing infrastructure
      You call that “flexible”?

  12. (re-)Adding Flexibility
      ● Leverage Load Balancing
        ○ Allows VMs to be added and removed (remember, our VMs are mostly “disposable”)
        ○ Conceals IP changes (such as rack-to-rack movement)
        ○ Facilitates high availability
        ○ Is the key to flexibility in what would otherwise be a constrained architecture

  13. (re-)Adding Flexibility (cont’d)
      ● Automate it (see the sketch below):
        ○ Load Balancer Management
          ■ Device selection based on capacity & quotas
          ■ Association between service groups and VIPs
          ■ Assignment of VMs to VIPs
        ○ Availability Zone selection & balancing
        ○ Multiple cluster integration
      ● Implement “Service Groups”
        ○ (external to OpenStack -- for now)
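As a rough illustration of that load-balancer automation (device selection by capacity and quota, VIP association, member assignment), here is a self-contained Python sketch. LBDevice, ServiceGroup, and place_vip are invented names with assumed capacity rules, not a real product API.

```python
# Illustrative-only sketch of the automation above: pick a load-balancer
# device with spare capacity, then associate a service group with a VIP on it.
from dataclasses import dataclass, field


@dataclass
class LBDevice:
    name: str
    max_vips: int          # quota for this device
    vips_in_use: int = 0


@dataclass
class ServiceGroup:
    name: str
    members: list = field(default_factory=list)   # VM addresses behind the VIP
    vip: str = ""


def place_vip(group, devices, vip_pool):
    """Bind a VIP for the group on the least-loaded device with free capacity."""
    candidates = [d for d in devices if d.vips_in_use < d.max_vips]
    if not candidates:
        raise RuntimeError("no load balancer has spare VIP capacity")
    device = min(candidates, key=lambda d: d.vips_in_use / d.max_vips)
    group.vip = vip_pool.pop(0)
    device.vips_in_use += 1
    return device


devices = [LBDevice("lb-1", max_vips=200, vips_in_use=180),
           LBDevice("lb-2", max_vips=200, vips_in_use=40)]
web = ServiceGroup("web-frontend", members=["10.20.2.14", "10.20.3.9"])
chosen = place_vip(web, devices, vip_pool=["198.51.100.10"])
print(f"{web.name} -> VIP {web.vip} on {chosen.name}")
```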

  14. Service Groups
      ● Consist of groups of VMs running the same application
      ● Can be a layer of an application stack, an implementation of an internal service, or a user-facing server
      ● Present an API that functions behind a VIP (see the sketch below)
        ○ Web services everywhere!
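To show what “an API behind a VIP” means to callers, here is a small hypothetical example; the registry contents and endpoint URLs are made up for illustration.

```python
# Hypothetical illustration of "web services everywhere": callers address a
# service group only by its stable VIP/DNS name, never by member VM IPs.
import json
import urllib.request

SERVICE_GROUP_ENDPOINTS = {
    "user-profile": "http://user-profile.vip.example.com/v1",
    "web-frontend": "http://web-frontend.vip.example.com",
}


def call_service(group, path):
    """Call a service group through its VIP; members can churn freely behind it."""
    url = SERVICE_GROUP_ENDPOINTS[group] + path
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


# VMs behind "user-profile" can be replaced or re-IPed without changing callers:
# profile = call_service("user-profile", "/users/42")
```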

  15. Service Group Creation

  16. Integrating With OpenStack

  17. Putting It Together
      ● Registration of hosts and services
        ○ A VM is associated with a service group at creation
        ○ A tag associated with the service group is accessible to resource allocation
      ● Control of load balancers (see the sketch below)
        ○ Allocates and controls hardware
        ○ Manages VMs for each service group
        ○ Provides elasticity and robustness
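A minimal sketch of that registration-and-membership flow, assuming the service-group tag travels in instance metadata and that the external automation keeps each VIP’s member list in sync; all names and dict shapes are illustrative.

```python
# Minimal sketch: the service-group tag rides in instance metadata, and
# external automation keeps each VIP's member list in sync as VMs come and go.
SERVICE_GROUP_MEMBERS = {"web-frontend": set()}   # member IPs per service group


def on_instance_created(instance):
    """Register a newly built VM with its service group's VIP pool."""
    group = instance["metadata"].get("service_group")
    if group in SERVICE_GROUP_MEMBERS:
        SERVICE_GROUP_MEMBERS[group].add(instance["fixed_ip"])


def on_instance_deleted(instance):
    """Remove a deleted VM so the load balancer stops sending it traffic."""
    group = instance["metadata"].get("service_group")
    if group in SERVICE_GROUP_MEMBERS:
        SERVICE_GROUP_MEMBERS[group].discard(instance["fixed_ip"])


# The boot request carries the tag; resource allocation and the LB automation
# can both read it from the instance record.
vm = {"metadata": {"service_group": "web-frontend"}, "fixed_ip": "10.20.2.14"}
on_instance_created(vm)
print(SERVICE_GROUP_MEMBERS)   # {'web-frontend': {'10.20.2.14'}}
```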

  18. Putting It Together (cont’d)
      ● OpenStack Extensions and Patches
        ○ Three points of integration (see the sketch below):
          1. Intercept request before issue
          2a. Select network based on hypervisor
          2b. Transmit new instance information to external automation
          3. Transmit deleted instance information to external automation
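Points 2b and 3 amount to forwarding instance lifecycle events to the external automation. The sketch below assumes the standard Nova notification event types (compute.instance.create.end / compute.instance.delete.end) are consumed off the message queue; the dispatcher and the forward_to_automation placeholder are hypothetical, not the actual integration code.

```python
# Sketch of integration points 2b and 3: forward instance lifecycle events to
# the external automation. The event types are Nova's standard notification
# names; everything else here is a placeholder.

def forward_to_automation(action, payload):
    # Placeholder: in practice this would call the service-group /
    # load-balancer automation described on the previous slides.
    print(f"{action}: instance {payload.get('instance_id')} on {payload.get('host')}")


def handle_notification(event_type, payload):
    """Dispatch Nova lifecycle notifications to external automation."""
    if event_type == "compute.instance.create.end":
        forward_to_automation("created", payload)
    elif event_type == "compute.instance.delete.end":
        forward_to_automation("deleted", payload)


# Abridged example payload as it might arrive off the queue:
handle_notification("compute.instance.create.end",
                    {"instance_id": "6f1c-...", "host": "hv-a02-07"})
```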

  19. Whither OpenStack?
      ● Our Goals:
        ○ Minimize patching of code
        ○ Minimize points of integration with external systems
        ○ Contribute back patches of general use
        ○ Replace custom code with community code:
          ■ Use Heat for automation
          ■ Use LBaaS to control load balancers
        ○ Share our experiences

  20. Complications
      ● OpenStack clusters don’t exist in a vacuum -- this makes scaling them harder
        ○ Existing physical infrastructure
        ○ Existing management infrastructure
        ○ Interaction with off-cluster resources
        ○ Security and organizational policies
        ○ Requirements of the existing software stack
        ○ Stateful applications introduce complexities

  21. Conclusion
      ● Mega-Scale has unique issues
        ○ Many potential solutions don’t scale sufficiently
        ○ Some flexibility must be sacrificed *BUT*
        ○ Mega-Scale also admits solutions that aren’t practical or cost-effective at smaller scale
        ○ Automation and integration with external infrastructure are key

  22. Questions?
      email: edhall@yahoo-inc.com
