Evolution of OpenStack Networking at CERN: Nova Network, Neutron and SDN
Belmiro Moreira, @belmiromoreira, belmiro.moreira@cern.ch
Ricardo Rocha, @ahcorporto, ricardo.rocha@cern.ch
Fundamental Science
Founded in 1954
What is 96% of the universe made of?
What was the state of matter just after the Big Bang?
Why isn’t there anti-matter in the universe?
[Diagram: Nova cells layout. A TOP CELL sits above CELL 1, CELL 2, ..., CELL N. Cells differ in capabilities (GPU, CPU pinning, huge pages, SMP), compute configuration, network backend (Neutron vs Nova Network) and allowed projects. Cells provide scalability and flexibility.]
Related talk: Moving from CellsV1 to CellsV2 at CERN, Mon 21 11:35
[Diagram: a CELL containing hypervisors NODE 1 and NODE 2, each hosting virtual machines V1, V2, V3, ..., VN]
● Order of ~10s of cells (currently 70), with ~200 hypervisors per cell
● Number of virtual machines per hypervisor varies per use case
  ○ From 4 to 30 VMs per hypervisor
[Diagram: a CELL containing hypervisors NODE 1 and NODE 2, each hosting virtual machines V1, V2, V3, ..., VN. Labels: S513-V-IP123 (Primary Service), S513-V-VM908 (Secondary Service); subnets 137.1XX.43.0/24 and 188.1XX.191.0/24]
● Flat but segmented network, with multiple broadcast domains
  ○ Scalability
  ○ Segmentation done on Primary Services
● Primary Services can have multiple Secondaries
● No route if Secondary is in a different Primary
  ○ VM IP allocation must belong to the hypervisor’s Primary
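The allocation rule above (a VM's IP must belong to its hypervisor's Primary) can be sketched as a membership check. This is a minimal illustration, not CERN's implementation: the service names, host names, and address ranges below are hypothetical, and the ranges are taken from the reserved documentation blocks.

```python
import ipaddress

# Hypothetical inventory: Primary Service -> subnets of its Secondary Services.
# Names and ranges are illustrative only (reserved documentation ranges).
PRIMARY_SUBNETS = {
    "PRIMARY-A": ["192.0.2.0/25", "192.0.2.128/25"],
    "PRIMARY-B": ["198.51.100.0/24"],
}

# Hypothetical mapping of hypervisors to the Primary Service they sit behind.
HYPERVISOR_PRIMARY = {"node1": "PRIMARY-A", "node2": "PRIMARY-B"}


def ip_allowed_on_host(ip, host):
    """Return True if `ip` falls in a subnet of the host's Primary Service."""
    primary = HYPERVISOR_PRIMARY[host]
    address = ipaddress.ip_address(ip)
    return any(
        address in ipaddress.ip_network(cidr)
        for cidr in PRIMARY_SUBNETS[primary]
    )
```

An address from PRIMARY-B would be rejected for a VM scheduled on node1, which is why the IP can only be chosen once the target hypervisor is known.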
LanDB: Source of Truth
[Diagram: LanDB holds Primary Services, Secondary Services, Virtual Machines, Hypervisors, IPv4, IPv6, DNS Aliases, IPv6 Readiness, Ownership, ...]
● All devices must be present
● Used for different purposes
  ○ Security checks
  ○ DNS/DHCP configuration
  ○ Switch/router configuration
  ○ Active Directory, …
Phase 1. Nova Network
Phase 2. Neutron
Phase 3. SDN
Phase 1. Nova Network
● Custom NetworkManager
● Late IP allocation, after scheduling to compute nodes
● Patching done directly in the Nova code
[Diagram: NOVA DB, NOVA COMPUTE, LanDB]
● Nova Network is being deprecated...
  ○ Quantum is the new thing… Neutron is the new thing...
Phase 2. Neutron
● Linuxbridge, Flat / Provider networks
● Better integration using ML2, mechanism driver and extensions
  ○ Quickly became possible to have it out of tree
  ○ Our extensions have a similar role to Neutron Segments
● Gradual enrollment, cell by cell
● Vanilla upstream packages for Neutron, much smaller patch on Nova
● More split pieces, more potential points of failure
  ○ Periodic consistency checks
[Diagram: numbered flow (1, 2, 3, 4a, 4b) between NOVA COMPUTE, Neutron and LanDB]
https://gitlab.cern.ch/cloud-infrastructure/openstack-neutron-cern
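A periodic consistency check between Neutron and LanDB could look roughly like the sketch below. It assumes both sides can be exported as simple name-to-IP mappings; the function name and the input shape are hypothetical, and no real Neutron or LanDB API is used.

```python
def find_inconsistencies(neutron_ports, landb_records):
    """Compare two {vm_name: ip} mappings and report where they disagree.

    Both inputs are hypothetical exports, one from Neutron and one from LanDB.
    """
    neutron_names = set(neutron_ports)
    landb_names = set(landb_records)
    return {
        # Known to Neutron but never registered in LanDB.
        "missing_in_landb": sorted(neutron_names - landb_names),
        # Still registered in LanDB but gone from Neutron.
        "missing_in_neutron": sorted(landb_names - neutron_names),
        # Present on both sides with different addresses.
        "ip_mismatch": sorted(
            name
            for name in neutron_names & landb_names
            if neutron_ports[name] != landb_records[name]
        ),
    }
```

Running such a comparison on a schedule catches drift caused by a failure in any one of the split pieces.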
Phase 2. Neutron
Subnet Cluster
Which subnets belong to this cluster?

neutron cluster-list
+--------+----------------------+-----------------------+
| id     | name                 | subnets               |
+--------+----------------------+-----------------------+
| ...    | VMPOOL SXXXX-C-IPZZZ | ... 188.xxx.yy.zz/22  |
| ...    | VMPOOL SBBBB-C-IPWWW | ... 137.aaa.bb.ccc/25 |
|        |                      | ... 137.bbb.cc.0/25   |
|        |                      | ... 137.bbb.dd.0/25   |
+--------+----------------------+-----------------------+
Phase 2. Neutron
Host Restrictions
Which subnets can I use for this hypervisor?

neutron host p06253927y321a1
+-------------------------+--------------------------------------+
| Field                   | Value                                |
+-------------------------+--------------------------------------+
| all_subnets             | 4ca09148-32b5-4da4-95f9-35e83e2e1984 |
| available_random_subnet | 4ca09148-32b5-4da4-95f9-35e83e2e1984 |
| available_subnets       | 4ca09148-32b5-4da4-95f9-35e83e2e1984 |
| least_available_subnet  | 4ca09148-32b5-4da4-95f9-35e83e2e1984 |
| most_available_subnet   | 4ca09148-32b5-4da4-95f9-35e83e2e1984 |
+-------------------------+--------------------------------------+
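The fields in the host output suggest several selection strategies over the subnets a hypervisor may use. The sketch below shows one plausible reading, assuming "available" means "has at least one free address" and that least/most refer to the number of free addresses. Both interpretations, and the function itself, are assumptions for illustration rather than the behaviour of the CERN extension.

```python
import random


def summarize_host_subnets(free_ips, rng=random):
    """Summarize subnet choices for one hypervisor.

    free_ips is a hypothetical {subnet_id: free address count} mapping.
    """
    available = sorted(s for s, free in free_ips.items() if free > 0)
    summary = {
        "all_subnets": sorted(free_ips),
        "available_subnets": available,
        "available_random_subnet": None,
        "least_available_subnet": None,
        "most_available_subnet": None,
    }
    if available:
        summary["available_random_subnet"] = rng.choice(available)
        # Fewest free addresses: fills nearly-full subnets first.
        summary["least_available_subnet"] = min(available, key=free_ips.get)
        # Most free addresses: spreads allocations out.
        summary["most_available_subnet"] = max(available, key=free_ips.get)
    return summary
```

In the output shown above all five fields carry the same ID, which is what this sketch would also produce for a hypervisor with a single usable subnet.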
Phase 2. Neutron
● Single control plane, no partitioning (as with Nova cells)
● Scaling RabbitMQ is a challenge

< 1000 Nodes
3 Virtual Machines
→ 5x 64GB Virtual Machines
~default rabbit configuration
~default neutron configuration
~looking ok(ish)
Phase 2. Neutron
1200 Nodes
Cluster crashes once, crashes constantly
Cannot allocate 1318267840 bytes of memory (of type "heap").
Statistics db issues
→ collect_statistics_interval = 60000
Agents (too) aggressively trying to reconnect
→ rabbit_retry_backoff = 60
Agents not reconnecting properly
→ restart neutron servers
Scale up Rabbit nodes, larger VMs
Phase 2. Neutron
2000 Nodes
Cluster crashes periodically
Lots of queued messages, until it goes
( neutron server )
→ rpc_thread_pool_size = 2048
→ rpc_conn_pool_size = 60
→ rpc_response_timeout = 120
→ rpc_workers = 4
( rabbit )
→ tcp_backlog: 4096
→ tcp_listen_options { reuseaddr: true, keepalive, true }
→ tcp_keepalive = true
→ rabbitmq_server_erl_args = '+K +A128 +P 1048576'
→ vm_memory_high_watermark = 0.8
→ ulimits (65536 for nofile/nproc soft and hard)
→ cluster_partition_handling = autoheal
Phase 2. Neutron
2400 Nodes
Cluster crashes less, but still happens
Lots of queued messages, until it goes
( rabbit virtual machines )
→ ip link set %k txqueuelength 10000
( neutron agent )
→ report_interval=43200
( neutron server )
→ agent_downtime=86500
Other Considerations ( not done, not helpful )
→ increase rpc_state_report_workers
→ heartbeat timeouts on the rabbit cluster
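The two heartbeat values above are related: Neutron's help text for agent_downtime recommends setting it to at least twice report_interval, so that one missed report does not mark an agent as down. A small worked check of the numbers from the slide (the helper name is hypothetical):

```python
def heartbeat_margin(report_interval, agent_downtime):
    """Seconds of slack beyond the recommended 2 x report_interval."""
    return agent_downtime - 2 * report_interval


REPORT_INTERVAL = 43200   # seconds, i.e. agents report every 12 hours
AGENT_DOWNTIME = 86500    # seconds, i.e. just over 24 hours

margin = heartbeat_margin(REPORT_INTERVAL, AGENT_DOWNTIME)
reports_per_agent_per_day = 86400 // REPORT_INTERVAL
```

With these settings each agent sends only two state reports per day, and the downtime threshold sits 100 seconds above the recommended minimum. The trade-off is that a dead agent can go unnoticed for about a day.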
Phase 2. Neutron
~5000 Nodes
Stable cluster
→ 5x 64GB Virtual Machines
Occasional network partitions
→ recovering most times, but not always
→ procedure for a quick cluster rebuild (~10min downtime)
Phase 2. Neutron
Migrating existing cells from Nova Network
https://gitlab.cern.ch/cloud-infrastructure/python-neutronclient-cern
● Puppet for reconfiguration
● Custom command for the live VM changes

$ openstack network cluster migrate --dry-run --host p06146676a327ab
$ openstack network cluster migrate --host p06146676a327ab
$ openstack network cluster migrate --cluster 'VMPOOL SXXXX-C-IPZZZ'

    # Excerpt 1: move each VM's tap device off the Nova bridge and rename it.
    for instance in instances:
        ip = instance.addresses['CERN_NETWORK'][0]
        mac = ip['OS-EXT-IPS-MAC:mac_addr']
        nova_tap = nova_interfaces[mac]
        neutron_tap = nova_interfaces[mac]
        commands.extend([
            "brctl delif %s %s " % (NOVA_BRIDGE, nova_tap),
            "ip link set %s name %s " % (nova_tap, neutron_tap),
        ])

    # Excerpt 2: rename the Nova bridge, reattach the raw device and
    # restore the default route.
    commands.extend([
        "brctl delif %s %s " % (NOVA_BRIDGE, raw_device),
        "ip link set %s down" % NOVA_BRIDGE,
        "ip link set %s name %s " % (NOVA_BRIDGE, CERN_NETWORK_BRIDGE),
        "brctl addif %s %s " % (CERN_NETWORK_BRIDGE, raw_device),
        "ip link set %s up" % CERN_NETWORK_BRIDGE,
        "ip route add default via %s dev %s " % (gw, CERN_NETWORK_BRIDGE),
    ])
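The --dry-run option in the migrate command hints at how a list of shell commands like the one built above can be previewed before it touches a live hypervisor. A minimal sketch of that pattern follows; the runner function is hypothetical and is not taken from python-neutronclient-cern, and the bridge name in the usage line is only an example.

```python
import shlex
import subprocess


def run_commands(commands, dry_run=True):
    """Print or execute each command string, returning the parsed argv lists."""
    handled = []
    for command in commands:
        argv = shlex.split(command)
        if dry_run:
            # Only show what would be executed.
            print("DRY-RUN: " + " ".join(argv))
        else:
            # Stop at the first failing command instead of continuing blindly.
            subprocess.run(argv, check=True)
        handled.append(argv)
    return handled


# Hypothetical bridge name; prints the command without changing the host.
preview = run_commands(["ip link set br100 down"], dry_run=True)
```

Defaulting dry_run to True keeps an accidental call from modifying the network configuration of a production host.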
Phase 3. SDN
● Current network deployment has significant limitations
● Limited IP Mobility
  ○ Segmented broadcast domains
  ○ Live migration limited to single cluster
  ○ Ad-hoc tunnels for hardware retirement campaigns
● Hardware Repurposing
  ○ Multiple network domains (General, Services, …)
  ○ Services dedicated to a single domain
● No Floating IPs
● No Tenant/Private Networks
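Phase 3. SDN
● Small prototype setups to evaluate functionality

+-------------------------+----------------------+----------------------+---------------------------+
|                         | Neutron/OpenVSwitch  | OpenDaylight         | OVN                       |
+-------------------------+----------------------+----------------------+---------------------------+
| DHCP                    | Neutron              | Neutron/Built-in     | Built-in                  |
| Floating IPs            | Yes                  | Yes                  | Yes                       |
| Distributed Routing     | Only with DVR        | Yes                  | Yes                       |
| Tunneling Protocols     | vxlan / GRE / geneve | vxlan / GRE / geneve | vxlan / geneve            |
| Security Groups         | IPTables             | OpenFlow Native      | OpenFlow Native + Logging |
| Load Balancing          | Octavia              | Octavia              | Octavia / OVN Native      |
| Acceleration            | Limited DPDK         | DPDK                 | DPDK                      |
| Tracing                 | tcpdump              | tcpdump              | ovn-trace                 |
| Physical Switch Integr. | L2 / L3              | L2 / L3              | L2 / L3                   |
+-------------------------+----------------------+----------------------+---------------------------+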
Phase 3. SDN
● In the end we picked OpenContrail / Tungsten
Phase 3. SDN
[Diagram: OpenContrail / Tungsten architecture. Components: OpenStack controller, SDN controller, vRouter on each hypervisor, physical devices, WAN gateway. Protocol labels: NETCONF/EVPN, OVSDB, XMPP, BGP, VXLAN, MPLSoUDP/GRE]