MULTI-SITE OPENSTACK DEPLOYMENT OPTIONS & CHALLENGES FOR TELCOS
Azhar Sayeed, Chief Architect
asayeed@redhat.com
DISCLAIMER
Important Information
The information described in this slide set does not provide any commitments to roadmaps or availability of products or features. Its intention is purely to provide clarity in describing the problem and to drive a discussion that can then be carried into the open source communities. Red Hat Product Management owns the roadmap and supportability conversation for any Red Hat product.
AGENDA
• Background: OpenStack Architecture
• Telco Deployment Use Case
• Distributed Deployment – Requirements
• Multi-Site Architecture
• Challenges
• Solution and Further Study
• Conclusions
OPENSTACK ARCHITECTURE
WHY MULTI-SITE FOR TELCO?
Compute requirements – not just at the data center:
• Multiple data centers
• Managed service offering
• Managed branch office – thick vCPE
• Mobile edge compute
• vRAN – vBBU locations
• Virtualized central offices – hundreds to thousands of locations
• Primary and backup data center – disaster recovery
• IoT gateways – fog computing
Centrally managed compute, closer to the user.
Multiple DCs or Central Offices – Independent OpenStack Deployments
• Hierarchical connectivity model of central offices
• Remote sites with compute requirements
• Extend OpenStack to these sites
(Diagram: an E2E orchestrator manages a main data center, a backup data center and remote data centers over an overlay tunnel across the Internet; the remote sites host security & firewall, quality of service (QoS), traffic shaping and device management functions.)
A typical service almost always spans multiple DCs.
Multiple DCs – NFV Deployment (Real Customer Requirements)
25 sites, each a fully redundant system (controllers, redundant storage nodes, compute nodes) with L2 or L3 extensions between DCs:
• 2-5 VNFs required at each site
• A maximum of 2 compute nodes per site needed for these VNFs
• Storage requirement = image storage only
• Total number of control nodes = 25 * 3 = 75
• Total number of storage nodes = 25 * 3 = 75
• Total number of compute nodes = 25 * 2 = 50
Configuration overhead: 75% – of the 200 nodes deployed, 150 are controllers and storage and only 50 run workloads.
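The node-count arithmetic above is simple enough to spell out; a minimal Python sketch using only the figures from this slide:

    # Node-count arithmetic for the 25-site NFV deployment described above.
    SITES = 25
    CONTROLLERS_PER_SITE = 3    # HA control plane at every site
    STORAGE_PER_SITE = 3        # redundant storage nodes at every site
    COMPUTE_PER_SITE = 2        # enough for the 2-5 VNFs per site

    controllers = SITES * CONTROLLERS_PER_SITE    # 75
    storage = SITES * STORAGE_PER_SITE            # 75
    compute = SITES * COMPUTE_PER_SITE            # 50

    overhead = (controllers + storage) / (controllers + storage + compute)
    print(f"control={controllers} storage={storage} compute={compute} "
          f"overhead={overhead:.0%}")             # overhead=75%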
Virtual Central Office – Real Customer Challenge
1000+ sites (central offices), each a fully redundant system (controllers, storage nodes, compute nodes) with L2 or L3 extensions between DCs:
• From a few 10s to 100s of VMs per site
• Fully redundant configurations
• Termination of residential, business and mobile services
• Managing 1000 OpenStack islands
• Tier 1 telcos already have >100 sites today
A management challenge.
DEPLOYMENT OPTIONS
OPTIONS
• Multiple independent islands model – seen this already
• Common authentication and management
  – External user policy management with LDAP integration
  – Common Keystone
• Stretched deployment model
  – Extend compute and storage nodes into other data centers
  – Keep central control of all remote resources
• Allow data centers to share workloads – Tricircle approach
• Proxy the APIs – master/slave or cascading model
• Agent-based model
• Something else??
Multiple DCs or Central Offices – Independent OpenStack Deployments
• Feed the load balancer
• Site capacity independent of the others
• User information separate or replicated offline
• Load balancer directs traffic where to go – good for load sharing
• DR – an external problem
(Diagram: a cloud management platform and load balancer sit in front of Regions 1 … N, each a fully redundant system of controllers, storage nodes and compute nodes, with a shared directory and L2 or L3 extensions between DCs.)
Good for a few 10s of sites – what about 100s or thousands of sites?
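A minimal sketch of the island model from the client side, using the openstacksdk: each site is reached through its own clouds.yaml entry and is authenticated and inventoried independently. The site names below are placeholders, not real entries.

    import openstack

    # One independent OpenStack island per site; the clouds.yaml entry names
    # are placeholders for this sketch.
    SITES = ["site1", "site2", "site3"]

    for site in SITES:
        conn = openstack.connect(cloud=site)       # separate auth per island
        servers = list(conn.compute.servers())
        print(f"{site}: {len(servers)} instances")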
Extended OpenStack Model – Shared Keystone Deployment
Common or shared Keystone:
• Single Keystone for authentication
• User information in one location
• Independent resources
• Modify the Keystone endpoint table – endpoint, service, region, IP
(Diagram: a cloud management platform and a single Keystone/directory in front of Regions 1 … N, each a fully redundant system of controllers, storage nodes and compute nodes, with L2 or L3 extensions between DCs.)
Identity: Keystone becomes a single point of control.
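A minimal sketch of what the shared Keystone looks like from a client, using the openstacksdk. The clouds.yaml entry name "central" is an assumption, and listing the endpoint table normally requires admin credentials.

    import openstack

    conn = openstack.connect(cloud="central")   # points at the shared Keystone

    # Regions registered in the shared Keystone (one per remote DC / CO).
    for region in conn.identity.regions():
        print("region:", region.id)

    # The endpoint table: each service endpoint is tagged with its region.
    for endpoint in conn.identity.endpoints():
        print(endpoint.region_id, endpoint.interface, endpoint.url)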
Extended OpenStack Model – Central Controller with Remote Compute & Storage (HCI) Nodes
Central controller:
• Single authentication
• Distributed compute resources
• Single availability zone per region
(Diagram: a central controller and cloud management platform, with a fully redundant set of controllers and a replicated Galera cluster, manage remote compute and storage nodes in Regions 1 … N over L2 or L3 extensions between DCs; Cinder, Glance and image data are held centrally, with manual restore.)
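A minimal sketch of how a workload lands on a remote site under this model, again with the openstacksdk: the single central control plane is asked to boot an instance into the availability zone that represents the remote region. Every name used here (cloud, image, flavor, network, AZ) is a placeholder assumption.

    import openstack

    conn = openstack.connect(cloud="central")

    # Ask the central Nova to place the instance in the remote region's AZ.
    server = conn.create_server(
        name="edge-vnf-1",
        image="cirros",                    # placeholder image name
        flavor="m1.small",                 # placeholder flavor name
        network="remote-net",              # placeholder network name
        availability_zone="region2-az",    # one AZ per remote region (assumption)
        wait=True,
    )
    print(server.status)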
Revisiting the Branch Office – Thick vCPE
Can we deploy compute nodes at all the branch sites and centrally control them?
(Diagram: an E2E network orchestrator and data center control enterprise vCPE sites; each branch runs an x86 server with VNFs for security & firewall, quality of service (QoS), traffic shaping and device management on an NFVI of OpenStack or OpenShift/Kubernetes, connected back over IPsec, MPLS, the Internet or another tunnel mechanism. Deploy nova-compute at the branch.)
How do I scale it to thousands of sites?
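Part of centrally controlling branch compute nodes is simply knowing which remote nova-compute services are still reachable. A minimal sketch with the openstacksdk, where the cloud name is a placeholder:

    import openstack

    conn = openstack.connect(cloud="central")

    # List the nova-compute services the central control plane knows about;
    # a branch site that has gone headless shows up with state "down".
    for svc in conn.compute.services():
        if svc.binary == "nova-compute":
            print(f"{svc.host}: status={svc.status} state={svc.state}")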
OSP 10 – Scale components independently
Most OpenStack HA services and VIPs must be launched and managed by Pacemaker or HAProxy; however, some can be managed via systemctl, thanks to the simplification of Pacemaker constraints introduced in versions 9 and 10.
COMPOSABLE SERVICES AND CUSTOM ROLES
(Diagram: a hardcoded Controller role bundling Keystone, Neutron, RabbitMQ, Glance, Ceilometer, ... is split into a custom Controller role plus custom Ceilometer and Networker roles.)
• Leverage the composable services model – e.g. to define a central Keystone
  – Place functionality where it is needed, i.e. disaggregate
  – Distribute the functionality depending on the DC locations
• Services are deployable standalone on separate nodes or combined with other services into custom role(s).
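A minimal sketch of what such disaggregation could look like, expressed as TripleO-style roles data built and dumped from Python. The role names, node counts and the exact OS::TripleO::Services::* service names are illustrative assumptions and vary between releases; this is not a deployable roles file.

    import yaml   # PyYAML

    # Two illustrative custom roles: a central identity/control role and a
    # slimmed-down edge controller role.
    roles = [
        {
            "name": "CentralKeystone",             # hypothetical role name
            "CountDefault": 3,
            "ServicesDefault": [
                "OS::TripleO::Services::Keystone",
                "OS::TripleO::Services::MySQL",
                "OS::TripleO::Services::RabbitMQ",
            ],
        },
        {
            "name": "EdgeController",              # hypothetical role name
            "CountDefault": 1,
            "ServicesDefault": [
                "OS::TripleO::Services::NovaConductor",
                "OS::TripleO::Services::NeutronApi",
                "OS::TripleO::Services::GlanceApi",
            ],
        },
    ]

    print(yaml.safe_dump(roles, default_flow_style=False))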
Re-visiting the Virtual Central Office Use Case – Real Customer Challenge
(Diagram: Regions 1-4, each a fully redundant system of controllers, storage nodes and compute nodes, connected by L2 or L3 extensions between DCs, with sub-regions 3a and 3b hanging off Region 3.)
Requires flexibility and some hierarchy.
CONSIDERATIONS
Scaling across a thousand sites? Some areas that we need to look at:
• Latency and outage times
  • Delays due to distance between DCs and link speeds – RTT
  • The remote site is lost – headless operation and subsequent recovery
  • Startup storms
• Scaling oslo.messaging
  • RabbitMQ
  • Scaling of nodes => scaling of RabbitMQ/messaging
  • Ceilometer (Gnocchi & Aodh) – heavy user of the MQ
LATENCY AND OUTAGE TIMES
Scaling across a thousand sites?
• Latency between sites – Nova API calls
  • 10, 50, 100 ms round trip time? => queue tuning for the bottleneck link/node speed
• Outage time – recovery time
  • 30 s or more? Nova compute services flapping
  • Confirmation – from provisioning to operation
  • Neutron timeouts – binding issues
  • Headless operation
  • Restart – causes storms
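A minimal sketch for getting a feel for the API round-trip cost per region, using the openstacksdk and wall-clock timing; the cloud and region names are placeholders.

    import time
    import openstack

    REGIONS = ["regionone", "regiontwo"]           # placeholder region names

    for region in REGIONS:
        conn = openstack.connect(cloud="central", region_name=region)
        start = time.monotonic()
        list(conn.compute.servers(limit=1))        # one small Nova API call
        elapsed_ms = (time.monotonic() - start) * 1000
        print(f"{region}: server list took {elapsed_ms:.0f} ms")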
RABBITMQ TUNING
• Tune the buffers – increase buffer size
• Take into account messages in flight – rates and round trip times
  • BDP = bottleneck speed * RTT
• Number of messages
  • Servers * backends * requests/sec = number of messages/sec
• Split into multiple instances of message queues for a distributed deployment
  • Ceilometer into its own MQ – heaviest user of the MQ
  • Nova into a single MQ
  • Neutron into its own MQ
(Diagram: separate MQ instances serving nova-conductor and the computes, Neutron, and the Ceilometer agents and collector.)
• Refer to an interesting presentation on this topic – "Tuning RabbitMQ at Large Scale Cloud", OpenStack Summit, Austin 2016
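The two rules of thumb above are easy to put numbers on; the link speed, RTT and per-service request rates in this sketch are illustrative assumptions, not measurements.

    # Bandwidth-delay product: how much data can be in flight between sites.
    LINK_SPEED_BPS = 100e6     # bottleneck link: 100 Mbit/s (assumption)
    RTT_S = 0.050              # 50 ms round trip between sites (assumption)
    bdp_bits = LINK_SPEED_BPS * RTT_S
    print(f"BDP = {bdp_bits / 8 / 1024:.0f} KiB in flight")    # ~610 KiB

    # Message rate: servers * backends * requests/sec.
    SERVERS = 25               # API servers (assumption)
    BACKENDS = 4               # workers per server (assumption)
    REQS_PER_SEC = 10          # requests/sec per worker (assumption)
    print(f"~{SERVERS * BACKENDS * REQS_PER_SEC} messages/sec to size queues for")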
RECENT AMQP ENHANCEMENTS
• Eliminates the broker-based model
• Enhances AMQP 1.0
  • Separates the messaging endpoint from the message routers
• Newton has an AMQP driver for oslo.messaging
• Ocata provides performance tuning and upstream support for TripleO
• If you must use RabbitMQ:
  • Use clustering and exchange configurations
  • Use the shovel plugin with exchange configurations and multiple instances
(Diagram: brokers arranged hierarchically as a tree vs. a routed mesh.)
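From the application side, moving from a brokered RabbitMQ deployment to the AMQP 1.0 driver is mostly a transport URL change in oslo.messaging. A minimal sketch with placeholder hosts and credentials; the AMQP 1.0 driver also needs its optional dependencies installed.

    from oslo_config import cfg
    import oslo_messaging

    conf = cfg.CONF

    # Brokered RabbitMQ transport (the default driver):
    rabbit_url = "rabbit://user:pass@rabbit-host:5672/"

    # AMQP 1.0 driver pointing at a message router instead of a broker:
    amqp_url = "amqp://user:pass@router-host:5672/"

    transport = oslo_messaging.get_transport(conf, url=amqp_url)
    target = oslo_messaging.Target(topic="demo-topic")
    client = oslo_messaging.RPCClient(transport, target)
    # client.call({}, "ping")   # the RPC would be routed through the router mesh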
OPENSTACK CASCADING PROJECT
• Proxies for the Nova, Cinder, Ceilometer & Neutron subsystems, one set per site
• At the parent – lots of proxies, one set per child
• The user communicates with the parent (master)
(Diagram: a parent OpenStack fronting several child OpenStack instances, each child exposed as an availability zone AZ1 … AZn of the parent.)
TRICIRCLE AND TRIO2O
The cascading solution was split into two projects:
• Tricircle – networking across OpenStack clouds
• Trio2o – single API gateway for Nova and Cinder
Characteristics:
• Make Neutron(s) work as a single cluster
• Expand workloads into other OpenStack instances
• Single region with multiple sub-regions (pods, AZ1 … AZn, AZx)
• Create networking extensions
• Shared or federated Keystone
• Isolation of east-west traffic
• Shared or distributed Glance
• Application HA
• UID = TenantID + PODID
(Diagram: users User1 … UserN reach the Trio2o API gateway in front of pods in different availability zones, with Tricircle providing the cross-pod networking.)
OPNFV Multi-Site Project – Euphrates release