

  1. RabbitMQ or Qpid Dispatch Router: Pushing OpenStack to the Edge. Ali Sanhaji, Javier Rojas Balderrama, Matthieu Simonin. OpenStack Summit | Berlin 2018

  2. Who’s here: Ali Sanhaji, research engineer at Orange, France; Javier Rojas Balderrama, research engineer at Inria, France; Matthieu Simonin, research engineer at Inria, France

  3. Agenda - Bringing OpenStack to the edge - RabbitMQ and Qpid Dispatch Router for OpenStack over a WAN - Performance evaluation - Conclusions and next steps

  4. Edge sites: challenges at the edge
     ● Scalability
     ● Locality
     ● Placement
     ● Resiliency
     ● ...
     [Diagram: edge sites attached to local and regional sites, connected through the core network to data centers DC1 and DC2]

  5. OpenStack to the edge: for a telco like Orange, pushing OpenStack to the edge is key
     ● How to deploy OpenStack in small edge sites (control plane + compute nodes)?
       ○ Costly, and too many control planes to manage and synchronize
       ○ ⇒ Have a centralized control plane (APIs) and remote compute nodes
     ● OpenStack scalability (stateless processes)
     ● OpenStack over a WAN

  6. Deployment under consideration
     - Centralized control services in the core: Keystone, Horizon, Nova (control), Glance, Neutron (control)
     - Remote edge compute nodes across the WAN: Nova (agent) and Neutron (agent) on each edge site
     - Communication between edge and core:
       - RPC traffic (periodic tasks, control traffic)
       - REST API calls (e.g., between Nova and Glance)

  7. The message bus in OpenStack
     ● One critical component in OpenStack is the message bus used for interprocess communication (Nova API, Nova conductor, Nova scheduler, Nova compute, Neutron server, Neutron agent all talk over it)
     ● Used by processes to send various RPCs:
       ○ call: request from client to server, client waits for the response
       ○ cast: request from client to server, no response (direct notification)
       ○ fanout: request from client to multiple servers, no response (grouped notification)
     [Diagram: call, cast and fanout message patterns]
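These three patterns map directly onto the oslo.messaging RPC client API. A minimal sketch of the client side, assuming a reachable transport; the transport URL, topic, server and method names below are placeholders:

```python
# Minimal sketch of the three RPC patterns with oslo.messaging.
# The transport URL, topic, server and method names are placeholders.
from oslo_config import cfg
import oslo_messaging

cfg.CONF([])  # parse an empty command line; defaults only
transport = oslo_messaging.get_rpc_transport(
    cfg.CONF, url='rabbit://guest:guest@bus-host:5672/')
target = oslo_messaging.Target(topic='demo_topic', server='edge-1')
client = oslo_messaging.RPCClient(transport, target)

# call: blocks until the server replies (or the RPC timeout expires)
result = client.call({}, 'ping', payload='hello')

# cast: fire and forget, no reply is expected
client.cast({}, 'update_state', state='active')

# fanout: the cast is delivered to every server listening on the topic
client.prepare(fanout=True).cast({}, 'refresh_cache')
```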

  8. The message bus in OpenStack
     ● Processes use the oslo.messaging library to send RPCs
     ● It supports multiple underlying messaging implementations:
       ○ RabbitMQ (AMQP 0.9.1), deployed as a broker cluster
       ○ Qpid Dispatch Router (AMQP 1.0), deployed as a router topology
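Which backend is used is controlled by oslo.messaging's transport URL: the rabbit:// scheme selects the RabbitMQ (AMQP 0.9.1) driver, while amqp:// selects the AMQP 1.0 driver used with Qpid Dispatch Router. A hedged sketch with placeholder hostnames and credentials:

```python
# Backend selection via the transport URL scheme (hosts/credentials are
# placeholders): rabbit:// -> RabbitMQ driver, amqp:// -> AMQP 1.0 driver (QDR).
# The AMQP 1.0 driver needs oslo.messaging's optional amqp1 dependencies.
from oslo_config import cfg
import oslo_messaging

cfg.CONF([])
rabbitmq_transport = oslo_messaging.get_rpc_transport(
    cfg.CONF, url='rabbit://openstack:secret@rabbit-core:5672/')
qdr_transport = oslo_messaging.get_rpc_transport(
    cfg.CONF, url='amqp://openstack:secret@qdr-core:5672/')
```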

  9. The message bus over a WAN (broker): the broker cluster sits in the central site, and clients and servers in the regional sites (1-2) and edge sites (1-6) connect to it across the WAN. [Diagram: centralized broker cluster topology]

  10. The message bus over a WAN (without a central broker cluster): clients and servers spread over the central, regional and edge sites. [Diagram: same topology as the previous slide, without the broker cluster in the central site]

  11. Goal: evaluate the performance of RabbitMQ and Qpid Dispatch Router over a WAN
     ○ How well do they withstand WAN constraints (packet loss, latency, dropouts)?
     ○ Does a router fit better in a decentralized environment?
     ○ Are OpenStack operations still robust without a broker retaining messages? Is a broker safer than a router?
     ○ How do RPC communications (RabbitMQ and QDR) behave over a WAN?

  12. What could go wrong in a WAN? Examples of two possible situations:
     ● Latency/loss between the RPC client and the bus (e.g., nova-conductor sends a VM boot request to nova-compute)
     ● Latency/loss between the RPC server and the bus (e.g., nova-compute sends a state update to nova-conductor)

  13. What could go wrong in a WAN? In case of latency:
     ● RPC calls: the sender blocks for 2× the one-way latency (e.g., with 100 ms of one-way latency, a call blocks for at least 200 ms waiting for the response)
     ● RPC casts (fire-and-forget semantics):
       ○ Correct semantics with the QDR driver
       ○ Incorrect semantics with the RabbitMQ driver: as in the call case, the sender waits 2× the latency for the broker's acks, but gets a higher guarantee of message delivery

  14. Experiments

  15. Context
     ○ Test plan of massively distributed RPCs: https://docs.openstack.org/performance-docs/latest/test_plans/massively_distribute_rpc/plan.html
     ○ Two categories of experiments:
       1. Synthetic (RabbitMQ/QDR, decentralized configuration)
       2. Operational (with OpenStack and a centralized bus)
          ● Network dropout
          ● Latency and loss

  16. Tools
     ○ EnOS for OpenStack deployment (virtualization, bare metal): https://github.com/BeyondTheClouds/enos
     ○ Grid’5000, a dedicated testbed for experiment-driven research

  17. Synthetic experiment recap (OpenStack Summit Vancouver 2018 presentation)
     ● Evaluation of the implemented patterns of RPC messages in OpenStack
     ● Broker and router scalability are similar, but the router is lightweight and achieves low-latency message delivery, especially under high load
     ● Routers offer locality of messages in decentralized deployments
     ● Decentralization needs to be applied to the APIs and the database as well
     ● “OpenStack internal messaging at the edge: in-depth evaluation”: www.openstack.org/summit/vancouver-2018/summit-schedule/events/21007/openstack-internal-messaging-at-the-edge-in-depth-evaluation

  18. Research report: https://hal.inria.fr/hal-01891567

  19. Operational experiments
     - Software: OpenStack stable/queens, optimised Kolla-based deployment, RabbitMQ v3.7.8, Qpid Dispatch Router v1.3.0
     - Infrastructure: Dell PowerEdge C6420 × 20 (32 cores, 193 GB RAM each); virtualized deployment with a core node (32 cores, 64 GB RAM) and edge nodes (2 cores, 4 GB RAM each)
     - Topology: the core node runs Keystone (× 3 + 1), Nova (control), Glance, Neutron (control) and the bus (RMQ/QDR); 100 or 400 edge nodes run Nova (agent) and Neutron (agent) and reach the core over the WAN

  20. Network dropout configuration (a sketch of the dropout mechanism is shown below)
     ● iptables on the core nodes (controller and network nodes)
     ● cron to schedule dropouts
       ○ Frequency: [5m, 10m]
       ○ Duration: [30s, 60s, 120s]
     ● Rally
       ○ constant_for_duration runner
         ■ Concurrency 5
         ■ Duration 30m
     ● OpenStack with 100 computes
     ● Full deployment for each combination (set of parameters, bus)
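The exact rules used are not shown in the slides; below is a minimal sketch of how a cron-triggered dropout could be emulated with iptables, assuming the bus listens on the default AMQP port 5672 (port, chain and duration are illustrative):

```python
# Hypothetical dropout script: block the AMQP port on the core node for a
# fixed duration, then restore connectivity. Meant to be triggered by cron.
import subprocess
import time

AMQP_PORT = "5672"   # default port for both RabbitMQ and Qpid Dispatch Router
DURATION = 60        # dropout duration in seconds (30/60/120 in the experiments)

drop_rule = ["INPUT", "-p", "tcp", "--dport", AMQP_PORT, "-j", "DROP"]

# Start the dropout: insert a DROP rule for incoming bus traffic.
subprocess.run(["iptables", "-I"] + drop_rule, check=True)
try:
    time.sleep(DURATION)
finally:
    # End the dropout: remove the rule so traffic flows again.
    subprocess.run(["iptables", "-D"] + drop_rule, check=True)
```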

  21. Network dropout: boot_and_delete_servers

  22. Network dropout: boot_and_delete_servers

  23. Latency and loss configuration (see the emulation sketch below)
     ● Parameters
       ○ Latency: [0, 5, 20, 40, 80, 120, 200] ms
       ○ Loss: [0, 0.1, 0.2, 0.4, 0.8, 1.0, 2.0] %
     ● Rally
       ○ constant runner
         ■ Concurrency: 5
         ■ Iterations: 100
     ● OpenStack
       ○ Computes: [100, 400]
     ● Full deployment for each combination (set of parameters, bus)
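The slides do not state how latency and loss were injected; a common way to emulate both on the edge-core link is tc netem, sketched below with a placeholder interface name and one parameter combination:

```python
# Hypothetical WAN emulation: add latency and packet loss on an interface
# with tc netem, then remove it. Interface and values are illustrative.
import subprocess

IFACE = "eth1"       # interface carrying edge<->core traffic (placeholder)
LATENCY_MS = 80      # one of [0, 5, 20, 40, 80, 120, 200]
LOSS_PCT = 0.4       # one of [0, 0.1, 0.2, 0.4, 0.8, 1.0, 2.0]

# Apply the netem qdisc with the chosen delay and loss.
subprocess.run(
    ["tc", "qdisc", "add", "dev", IFACE, "root", "netem",
     "delay", f"{LATENCY_MS}ms", "loss", f"{LOSS_PCT}%"],
    check=True,
)

# ... run the Rally scenario here ...

# Remove the emulation once the run is finished.
subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"], check=True)
```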

  24. 100 computes

  25. 400 computes

  26. 100 computes

  27. Timeline behind the scenes of the Rally benchmarks (multicast / 400 computes)
     ● boot_server_and_attach_interface
     ● create_and_delete_network
     ● create_and_delete_port
     ● create_and_delete_router
     ● create_and_delete_security_groups
     ● create_and_delete_subnet
     ● set_and_clear_gateway

  28. anycast queues

  29. fanout queues

  30. Conclusions

  31. Summary
     ● In the face of WAN latency and loss, the router (no message retention) is as effective at delivering messages as the broker (message retention)
     ● The router is less resilient in the case of network dropouts
     ● QDR consumes far fewer resources than RMQ

  32. What’s next
     ● Bring QDR closer to the edge sites and compute nodes in order to leverage its routing capabilities
     ● Scale to a larger number of compute nodes
     ● Make the OpenStack control plane even more decentralized, if possible (e.g., the database)

  33. ali.sanhaji@orange.com javier.rojas-balderrama@inria.fr matthieu.simonin@inria.fr https://beyondtheclouds.github.io
