CrossBow: From Hardware Virtualized NICs to Virtualized Networks
Sunay Tripathi, Nicolas Droux, Thirumalai Srinivasan, Kais Belgaied
Aug 17th, 2009
Sigcomm VISA 2009, Barcelona
Sunay Tripathi, Distinguished Engineer, Sun Microsystems Inc
Sunay.Tripathi@Sun.Com
Key Issues in Network Virtualization
• Fair or policy-based resource sharing in virtualized environments
> Bandwidth
> NIC hardware resources, including Rx/Tx descriptors
> Processing CPUs
• Overheads due to virtualization
> Latency
> Extra processing
> Throughput
• Security
> New threats to the L2 network
• Where to solve the problem?
> Switches
> L3/L4 devices
> Hosts
www.opensolaris.org/os/project/crossbow
Crossbow: Solaris Networking Stack
• 8 years of development work to achieve
> Scalability across multi-core CPUs and multi-10GigE bandwidth
> Virtualization, QoS, and High Availability designed in
> Exploitation of advanced NIC features
• Key enabler for
> Server and network consolidation
> Open networking
> Cloud computing
Crossbow “Hardware Lanes”
Ground-Up Design for multi-core and multi-10GigE
• Linear scalability using 'Hardware Lanes' with dedicated resources
• Network virtualization and QoS designed into the stack
• More efficiency due to 'Dynamic Polling and Packet Chaining'
[Figure: a VLAN-separated switch feeds the physical NIC of a physical machine. The NIC's classifier splits traffic into hardware lanes. Each lane consists of hardware rings/DMA plus kernel threads and queues, and ends at a virtual NIC (virtual machine/zone) or at an squeue (application).]
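The 'Dynamic Polling and Packet Chaining' named above can be pictured with a toy model. This is a sketch under assumptions, not the Solaris implementation: the `Lane` class, its `chain_limit` parameter, and the counters are all hypothetical names chosen for illustration. The lane takes one interrupt when a packet arrives on an idle ring, then masks interrupts and lets the stack pull packets in chains until the ring is drained.

```python
from collections import deque


class Lane:
    """Toy model of one receive lane (hypothetical, for illustration only)."""

    def __init__(self, chain_limit=50):
        self.rx_ring = deque()          # packets waiting in the NIC Rx ring
        self.polling = False            # True while interrupts are masked
        self.chain_limit = chain_limit  # assumed cap on packets per chain
        self.interrupts = 0             # interrupts taken so far
        self.chains = []                # length of each chain handed upward

    def packet_arrives(self, pkt):
        self.rx_ring.append(pkt)
        if not self.polling:
            # Idle lane: the NIC interrupts once, then the lane switches
            # to polling mode and further arrivals raise no interrupt.
            self.interrupts += 1
            self.polling = True

    def poll(self):
        """The stack pulls up to chain_limit packets as a single chain."""
        count = min(len(self.rx_ring), self.chain_limit)
        chain = [self.rx_ring.popleft() for _ in range(count)]
        if chain:
            self.chains.append(len(chain))
        if not self.rx_ring:
            # Ring drained: go back to interrupt mode.
            self.polling = False
        return chain


# Heavy load: 100 packets arrive before the stack gets to poll.
busy = Lane(chain_limit=50)
for seq in range(100):
    busy.packet_arrives(seq)
busy.poll()
busy.poll()
print(busy.interrupts, busy.chains)

# Light load: each packet is picked up before the next one arrives.
quiet = Lane(chain_limit=50)
for seq in range(3):
    quiet.packet_arrives(seq)
    quiet.poll()
print(quiet.interrupts, quiet.chains)
```

In this model the busy lane takes a single interrupt for 100 packets and delivers them as two chains, while the quiet lane takes one interrupt per packet. That is the trade the slide describes: the heavier the load, the more work is done by polling and chaining.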
Hardware Lanes and Dynamic Polling
● Partition the NIC hardware (Rx/Tx rings, DMA), kernel queues/threads, and CPUs to create “Hardware Lanes” that can be assigned to VNICs and flows
● Use dynamic polling on Rx/Tx rings to schedule the rate of packet arrival and transmission on a per-lane basis
● Effect of dynamic polling

Mpstat (older driver)
intr   ithr  csw   icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
10818  8607  4558  1547  161   1797  289  19112  17   69   0   12

Mpstat (GLDv3 based driver)
intr   ithr  csw   icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
2823   1489  875   151   93    261   1    19825  15   57   0   27

~85% fewer context switches, ~85% fewer mutexes, ~15% more CPU free, ~75% fewer interrupts

● Use dynamic polling for B/W partitioning and isolation without any support from switches and routers
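The rounded percentages on this slide can be checked against the two mpstat rows. The short calculation below is only a worked check of the slide's numbers; the helper name `percent_fewer` is an illustrative choice, and mapping the slide's "mutexes" figure to the `smtx` column is an assumption.

```python
COLUMNS = "intr ithr csw icsw migr smtx srw syscl usr sys wt idl".split()

# Values copied from the two mpstat rows on the slide.
older = dict(zip(COLUMNS, [10818, 8607, 4558, 1547, 161, 1797, 289, 19112, 17, 69, 0, 12]))
gldv3 = dict(zip(COLUMNS, [2823, 1489, 875, 151, 93, 261, 1, 19825, 15, 57, 0, 27]))


def percent_fewer(column):
    """Relative reduction from the older driver to the GLDv3 driver, in percent."""
    return 100.0 * (older[column] - gldv3[column]) / older[column]


for column, label in [("intr", "interrupts"),
                      ("csw", "context switches"),
                      ("smtx", "mutex spins")]:
    print(f"{label}: {percent_fewer(column):.1f}% fewer")

# Idle time is already a percentage, so compare it in percentage points.
idle_gain = gldv3["idl"] - older["idl"]
print(f"idle CPU: {idle_gain} percentage points more")
```

The computed reductions are about 74% for interrupts, 81% for context switches, and 85% for mutex spins, with 15 percentage points more idle CPU, which is in line with the slide's rounded figures.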
Virtual Network Containers
Virtualization
• Flows
• Virtual NICs & Virtual Switches
• Virtual Wire
Resource Control
• Bandwidth Partitioning
• NIC H/W Partitioning
• CPUs/pri assignment
Observability
• Real time usage for each Link/flow
• Finer grained stats per Link/flow
• History at no cost
[Figure: a Solaris host with a Global Zone and zones xb1-z1 and xb1-z2. Each zone has an exclusive IP instance and an SQUEUE. The zones use VNIC1 (100Mbps) and VNIC2 (200Mbps); bge0 is the third link. Each link has its own Rx/Tx DMA on a NIC with a flow classifier. Clients xb2 and xb3 sit on the network.]
Virtual NIC (VNIC) & Virtual Switches
Virtual NICs
> Function like physical NICs:
  > IP address assigned statically or via DHCP, and snooped individually
  > Appear in the MIB as a separate 'if', with the configured link speed shown as 'ifspeed'
  > VNICs can be created over Link Aggregation or can be assigned to IPMP groups for load balancing and failover support
> VNICs can have multiple hardware lanes assigned to them
> Can be created over a physical NIC (without needing a Vswitch) to provide external connectivity, with switching done in NIC H/W
> VNICs have configurable link speed, CPU, and priority assignment
> Standards based End to End Network Virtualization
  > VLAN tags and Priority Flow Control (PFC) assigned to a VNIC extend Hardware Lanes to the switch
  > No configuration changes needed on the switch to support virtualization
Virtual Switches
> Can be created to provide private connectivity between Virtual Machines
Virtual NIC & Virtual Switch Usage

# dladm create-vnic -l bge1 vnic1
# dladm create-vnic -l bge1 -m random -p maxbw=100M -p cpus=4,5,6 vnic2
# dladm create-etherstub vswitch1
# dladm show-etherstub
LINK
vswitch1
# dladm create-vnic -l vswitch1 -p maxbw=1000M vnic3
# dladm show-vnic
LINK   OVER      MACTYPE  MACVALUE     BANDWIDTH  CPUS
vnic1  bge1      factory  0:1:2:3:4:5  -          -
vnic2  bge1      random   2:5:6:7:8:9  max=100M   4,5,6
vnic3  vswitch1  random   4:3:4:7:0:1  max=1000M  -
# dladm create-vnic -l ixgbe0 -v 1055 -p maxbw=500M -p cpus=1,2 vnic9
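When scripting around a listing like the `dladm show-vnic` output above, it can help to turn the table into records and add up the configured bandwidth caps per underlying link. The sketch below works only on the text shown on this slide. The function names are hypothetical, and reading the `K`/`M`/`G` suffixes as decimal multiples of bits per second is an assumption.

```python
# Listing copied from the slide; column alignment is not significant.
SHOW_VNIC = """\
LINK   OVER      MACTYPE  MACVALUE     BANDWIDTH  CPUS
vnic1  bge1      factory  0:1:2:3:4:5  -          -
vnic2  bge1      random   2:5:6:7:8:9  max=100M   4,5,6
vnic3  vswitch1  random   4:3:4:7:0:1  max=1000M  -
"""

# Assumption: suffixes are decimal multiples of bits per second.
UNITS = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_bandwidth(field):
    """Convert 'max=100M' to 100_000_000; '-' means no cap and gives None."""
    if field == "-":
        return None
    value = field.split("=", 1)[1]
    return int(value[:-1]) * UNITS[value[-1]]


def parse_show_vnic(text):
    """Turn the whitespace-separated listing into a list of dicts."""
    lines = text.strip().splitlines()
    header = [name.lower() for name in lines[0].split()]
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    for row in rows:
        row["bandwidth"] = parse_bandwidth(row["bandwidth"])
    return rows


def capped_total_by_link(rows):
    """Sum the bandwidth caps of all VNICs created over each link."""
    totals = {}
    for row in rows:
        if row["bandwidth"] is not None:
            totals[row["over"]] = totals.get(row["over"], 0) + row["bandwidth"]
    return totals


rows = parse_show_vnic(SHOW_VNIC)
totals = capped_total_by_link(rows)
for link, total in sorted(totals.items()):
    print(f"{link}: {total // 10**6} Mbps capped in total")
```

A total like this can then be compared with the speed of the underlying link to see how much of it has been promised to VNICs; vnic1 has no cap and is left out of the sum.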
Physical Wire w/Physical Machines
Machines: Router, Host 1, Host 2, Client, connected through Switch 3 and Switch 1

Port    IP address  Link speed
Port 6  20.0.0.3    1 Gbps
Port 9  20.0.0.1    1 Gbps
Port 3  10.0.0.3    1 Gbps
Port 1  10.0.0.1    100 Mbps
Port 2  10.0.0.2    1 Gbps

Virtual Wire w/Virtual Network Machines
Machines: Router (Virtual Router), Host 1, Host 2, Client, connected through EtherStub 3 and EtherStub 1

VNIC   IP address  Link speed
VNIC6  20.0.0.3    1 Gbps
VNIC9  20.0.0.1    1 Gbps
VNIC3  10.0.0.3    1 Gbps
VNIC1  10.0.0.1    100 Mbps
VNIC2  10.0.0.2    1 Gbps
Related Work
• Commercial/Products
> VMware Hypervisor
> Linux/Xen Hypervisor
> Cisco UCS/VMware based solutions
• Research Community
> OpenFlow programmable switch
> Various Linux/BSD based efforts
BACKUP
Solaris Core Network Functionality
• Networking Services
> Routing Protocols using Quagga
> L3/L4 Load Balancer kernel modules
> IP Firewall (IPFilter)
> DNS, DHCP, NTP, SIP, VOIP, etc
• Scalable & Virtualized Network Stack
> Kernel Socket & Socket Filter
> Modernized TCP/IP Stack
> QoS: B/W limits, Priorities, CPU bindings
> IP Multi Pathing (IPMP)
> IP Tunneling
> Defense against DDoS attacks
• Crossbow: Virtual Networking
> VNICs, VSwitches, VWire
> Service Virtualization (Flows)
> L2 Services: Classification, Filtering
• Generic LAN Driver – GLDv3
> Aggr, SR-IOV, Vanity Names
> 1gigE/10gigE Drivers (Neptune, Niantic, etc)
[Figure: layered stack diagram. From top to bottom: Developer Tools and Management Interfaces; user-level services (Routing Protocols (Quagga), VRRP (Routing HA), Perf Diag Tools, IP Multi Pathing); Kernel Sockets API; Scalable Virtualized TCP/IP Stack with IPFilter (Firewall) Hooks API, IP Tunnels, L2 Bridge, and L3/L4 Load Balancer; MAC Client API; Crossbow: Network Virtualization (Virtual NICs, Virtual Switches, Virtual Wire, Flows, QoS, Observability; L2 Classification, Filtering); MAC Driver API; Generic LAN Driver v3 – GLDv3 (Aggregation, Vanity Names); Drivers (1GbE and 10GbE, FCoE, IPoIB).]
Crossbow Flows: Service Virtualization
[Figure, two panels: "Services and Protocols" and "Compute Resources". NIC 1 and NIC 2 each have a flow classifier that splits traffic into flows (VOIP, HTTPS, DEFAULT; TCP, UDP, DEFAULT). Each flow has its own squeue, kernel threads/queues, and memory partition. The squeues are grouped under virtual squeues bound to CPU 1, CPU 2, ... CPU 'n'.]
Crossbow Flows
Crossbow Flows based on:
> Services (protocol + remote/local ports)
> Transport (TCP, UDP, SCTP, iSCSI, etc)
> Remote and local IP addresses
> Remote IP subnets
> DSCP labels
The following attributes can be set on each flow:
> B/W limits
> Priorities
> CPUs

# flowadm create-flow -l bge0 protocol=tcp,local_port=443 -p maxbw=50M http-1
# flowadm set-flowprop -l bge0 -p maxbw=100M http-1
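The flow attributes listed above can be read as a match specification: a packet belongs to the first flow whose configured attributes all agree with it. The sketch below illustrates that idea only and is not how the Crossbow classifier is implemented. The `Flow` class, its field names, and the dict-based packet are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Flow:
    """Hypothetical flow record; unset (None) attributes match anything."""
    name: str
    transport: Optional[str] = None
    local_port: Optional[int] = None
    remote_subnet: Optional[str] = None
    maxbw_mbps: Optional[int] = None  # resource attribute, not a match key

    def matches(self, pkt: dict) -> bool:
        if self.transport is not None and pkt["transport"] != self.transport:
            return False
        if self.local_port is not None and pkt["local_port"] != self.local_port:
            return False
        if self.remote_subnet is not None:
            if ip_address(pkt["remote_ip"]) not in ip_network(self.remote_subnet):
                return False
        return True


def classify(flows, pkt):
    """Return the first matching flow, or None for the default path."""
    for flow in flows:
        if flow.matches(pkt):
            return flow
    return None


flows = [
    # Mirrors the slide's example: TCP, local port 443, 50M limit.
    Flow("http-1", transport="tcp", local_port=443, maxbw_mbps=50),
    # A subnet-based flow; the subnet is made up for the example.
    Flow("lab-net", remote_subnet="10.0.0.0/24", maxbw_mbps=100),
]

secure = classify(flows, {"transport": "tcp", "local_port": 443, "remote_ip": "20.0.0.3"})
lab = classify(flows, {"transport": "udp", "local_port": 53, "remote_ip": "10.0.0.2"})
other = classify(flows, {"transport": "udp", "local_port": 53, "remote_ip": "20.0.0.3"})
print(secure.name, lab.name, other)
```

Once a packet is mapped to a flow, the flow's resource attributes (bandwidth limit, priority, CPUs) decide how it is handled; packets that match no flow fall through to the default path.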
Virtual Machines
[Figure: a Solaris Host OS and two Solaris guests (Guest OS 1 and Guest OS 2), each with its own NIC Virtualization Engine and virtual squeue. The NIC's H/W flow classifier separates five streams: Host OS (all traffic), Guest OS 1 HTTP, Guest OS 1 HTTPS, Guest OS 1 DEFAULT, and Guest OS 2 (all traffic). Guest OS 1's three flows each get their own squeue and virtual NIC; the Host OS and Guest OS 2 each use a single VNIC.]
Dynamic Polling: Effect on Throughput
[Figure, three panels:
1. "Pkts Rcv'd via interrupt/poll": number of packets (0 to 5,000,000) by lane number (1 to 4); series: Pkts by Interrupt, Pkts by Poll, Total Pkts.
2. "High Load TCP Read/Write Test, 5 Clients (pktsz=1500; wrtsz=8k)": bi-directional throughput (Gbps, 0 to 13) for 5 Client Read/Write (3 reading/2 writing, 10 threads/client); series: Xbow2, Fedora 2.6.
3. "Chain Lengths": share of chains (0% to 100%) by lane number (1 to 4); series: Chains > 50 pkts, Chains 10 – 50 pkts, Chains < 10 pkts.]
Config Details:
> 5 clients; 1 server, 10GigE links
> 3 clients reading (10 threads each)
> 2 clients writing (10 threads each)
> All clients/server: x4150, dual socket, 8 x 2.8 GHz Intel CPU
> 10 GigE NIC: Intel Oplin (ixgbe)