Docker & Mesos/Marathon in production at OVH
Balthazar Rouberol
https://ovh.to/6bRrkAn
About Docker at OVH
● 2014-2015: Home-made container orchestrator, Sailabove, based on LXC
● 2016: Switch to Docker & Mesos/Marathon
● 6 (soon 7) Mesos clusters:
  ○ Internal production: 2 (soon 3)
  ○ External production: 2
  ○ External gamma: 2
● At our peak:
  ○ 800 hosts
  ○ 3000 cores
  ○ 12TB RAM
  ○ 200TB disk
● 60 teams, ~2500 production containers
Problems we faced
● Docker instabilities and crashes
● Traceability of all network accesses established by containers
● Enforcing security rules
● No baked-in multi-tenancy in Marathon
● Incoming connections dropped because of stuck marathon-lb/HAProxy reloads
● Partial network outages impacting production due to LB misconfiguration
● And many more, but I only have 30 minutes :)
Which UnionFS to choose? The land of BUTs.
● devicemapper in loop file (default): works fine on a dev machine, BUT catastrophic performance in production
● AUFS: abandoned
● overlay: faster than devicemapper BUT high inode consumption
● overlay2: lower inode consumption BUT requires kernel > 4.0
● ZoL: little production feedback that I know of. Good reputation BUT hard to install on Linux. Will test.
We currently run overlay2, on kernel 4.3.0, without noticeable issues, except regular image cleanup (which has an impact on Docker).
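For illustration (not taken from the talk), pinning the chosen storage driver is typically done in the Docker daemon configuration, e.g. /etc/docker/daemon.json:

    {
      "storage-driver": "overlay2"
    }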
Traceability of network accesses
● Each packet is marked by the kernel with a class id
● A class id defines a cluster / team / app
● iptables rules with classid filters can be written where appropriate (u32)
● Prototype: log all incoming/outgoing SYN packets with https://github.com/google/gopacket
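A minimal sketch of such a prototype (my own illustration, not the OVH code; the interface name and snaplen are assumptions), using gopacket with a BPF filter so that only SYN packets are captured:

    package main

    import (
    	"log"
    	"time"

    	"github.com/google/gopacket"
    	"github.com/google/gopacket/layers"
    	"github.com/google/gopacket/pcap"
    )

    func main() {
    	// Open the container-facing interface (assumed name) in promiscuous mode.
    	handle, err := pcap.OpenLive("eth0", 96, true, time.Second)
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer handle.Close()

    	// Only capture TCP packets with the SYN flag set (connection attempts).
    	if err := handle.SetBPFFilter("tcp[tcpflags] & tcp-syn != 0"); err != nil {
    		log.Fatal(err)
    	}

    	src := gopacket.NewPacketSource(handle, handle.LinkType())
    	for packet := range src.Packets() {
    		tcpLayer := packet.Layer(layers.LayerTypeTCP)
    		netLayer := packet.NetworkLayer()
    		if tcpLayer == nil || netLayer == nil {
    			continue
    		}
    		tcp := tcpLayer.(*layers.TCP)
    		// Log every incoming/outgoing connection attempt: src -> dst:port.
    		log.Printf("SYN %s -> %s:%d", netLayer.NetworkFlow().Src(), netLayer.NetworkFlow().Dst(), tcp.DstPort)
    	}
    }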
Enforcing security rules
Home-made mesos-docker-executor:
● No privileged mode
● Limited default CAPs
● Class ID injection
Of course, no SSH access on the hosts running the containers
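For illustration only (the executor's exact settings are not in the slides), restricting a container to a minimal capability set with plain Docker looks roughly like this; the capability and image name are assumed examples:

    # Start from an empty capability set and re-add only what the app needs.
    docker run --rm \
      --cap-drop=ALL \
      --cap-add=NET_BIND_SERVICE \
      --security-opt no-new-privileges \
      myapp:latest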
Marathon & Multi-tenancy
● No built-in support for multi-tenancy in Marathon
● Possible Scala plugin integration, but poorly documented
● 1 Marathon / team (or client) → extreme load on Mesos
Multi-tenancy by API Proxy
Multi-tenancy by API Proxy, in a nutshell
● Override ~all Marathon API calls to perform a virtual isolation
● VERB /marathon/<user>/v2/<path> + Basic Auth
● POST /marathon/<user>/v2/apps
  ○ /<app_id> → /<user>/<app_id>
  ○ Add label MARATHON_USERNAME=<user>
● GET /marathon/<user>/v2/apps
  ○ Add label selector MARATHON_USERNAME==<user>
  ○ /<user>/<app_id> → /<app_id>
  ○ Hide MARATHON_USERNAME label
● GET /marathon/<user>/v2/apps/<app_id>
  ○ /<user>/<app_id> → /<app_id>
  ○ Hide MARATHON_USERNAME label
● ...
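A rough sketch of the idea (my own illustration, not OVH's proxy; the upstream address, listen port and header name are assumptions): a reverse proxy that strips the per-user prefix and forwards the call to a single Marathon, with the app-id namespacing and label handling left as a TODO:

    package main

    import (
    	"log"
    	"net/http"
    	"net/http/httputil"
    	"net/url"
    	"strings"
    )

    func main() {
    	// Single upstream Marathon (address is an assumption).
    	marathon, err := url.Parse("http://marathon.internal:8080")
    	if err != nil {
    		log.Fatal(err)
    	}
    	proxy := httputil.NewSingleHostReverseProxy(marathon)

    	// /marathon/<user>/v2/<path> -> /v2/<path>, with <user> kept around for
    	// app-id namespacing and label injection (body rewriting omitted here).
    	http.HandleFunc("/marathon/", func(w http.ResponseWriter, r *http.Request) {
    		parts := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/marathon/"), "/", 2)
    		if len(parts) != 2 {
    			http.NotFound(w, r)
    			return
    		}
    		user, rest := parts[0], parts[1] // rest = "v2/apps", "v2/apps/<app_id>", ...
    		// TODO: check Basic Auth for <user>, rewrite app IDs to /<user>/<app_id>
    		// in the path and JSON body, and add/filter the MARATHON_USERNAME label.
    		r.URL.Path = "/" + rest
    		r.Header.Set("X-Marathon-Tenant", user) // illustrative only
    		proxy.ServeHTTP(w, r)
    	})

    	log.Fatal(http.ListenAndServe(":8000", nil))
    }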
Multi-tenancy by API Proxy, limitations
● All apps are deployed, scaled, checked, etc. by a single Marathon cluster
● Global & progressive performance degradation
● Horizontal scaling to the rescue!
  ○ Deploy multiple Marathon clusters
  ○ Limit the number of different teams/users per cluster
  ○ We’ve yet to measure our limit
Load Balancer reload: marathon-LB
Load Balancer reload: marathon-LB’s approach
1. Block SYN for all bound ports (80, 443, 9000, service ports), one by one
2. Reload
3. Wait
4. Remove the SYN drop rules
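The per-port blocking boils down to iptables rules roughly of this shape (a simplified illustration of the pattern, not marathon-lb's exact rules):

    # Drop new connection attempts to one bound port during the reload...
    iptables -w -I INPUT -p tcp --dport 80 --syn -j DROP
    # ...reload HAProxy, wait...
    # ...then remove the rule again (repeated for every bound port).
    iptables -w -D INPUT -p tcp --dport 80 --syn -j DROP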
Load Balancer reload: marathon-LB’s approach
Problems:
● Incoming connections are dropped for a while
● Reload is not atomic (2 iptables rules/port/reload)
● SYN DROP/ACCEPT is blocking, for each port → can lead to catastrophic situations
Load Balancer reload: enter sprint-LB
Same architecture as marathon-LB, but:
● Supports multiple orchestrators
● Supports multiple LBs (nginx & UDP, wink wink)
● Atomic and non-locking reload
● Soon to be open-sourced
Load Balancer reload: sprint-LB’s approach
● Start 2 HAProxy instances side by side
● Transactional NAT of each port (or range)
● The old HAProxy only handles previously open connections (conntrack), then dies (SIGTTOU)
● The new HAProxy handles new connections
Benefits:
● No connection drop
● No locking
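One way to make the port swap transactional (an illustration of the idea under my own assumptions about ports, not sprint-LB's actual implementation) is to redirect each public port to the new HAProxy's internal port and apply the whole nat table in a single atomic iptables-restore; established flows keep their conntrack entries and still reach the old instance, while new SYNs follow the new rule:

    # redirect.rules: public port 80 now goes to the new HAProxy instance,
    # assumed to listen on port 10080.
    *nat
    -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 10080
    COMMIT

    # Applied atomically for the whole nat table:
    iptables-restore --noflush < redirect.rules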
Load balancing configuration
Goals of a load balancer:
● Balance traffic between multiple healthy applications
● Perform health checks to detect unhealthy applications
● Remove unhealthy applications from the backend
● Bring healthy applications back into the backend
Your SLI depends on a good load balancer configuration!
Guaranteeing a good SLI
● Quickly detect unhealthy applications: minimize errors
● Quickly detect healthy applications: spread load across applications
Health checks: regular checks performed on each application
● L4 (TCP): connection attempt
● L7 (HTTP/...): request and response analysis
Guaranteeing a good SLI
HAProxy configuration values:
● redispatch=1: try a new application at each retry
● rise=1: one OK is enough for an app to be seen as healthy
● fall=1: one KO is enough for an app to be seen as unhealthy
● observe layer4: each L4 connection is considered as a health check
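Put together, those values map to an HAProxy backend roughly like this (a hand-written illustration with assumed names, addresses and check intervals, not the generated production config):

    defaults
      option redispatch        # retry on another server instead of the same one
      retries 3

    backend my_app
      balance roundrobin
      option httpchk GET /health
      # rise 1 / fall 1: a single OK or KO flips the server state;
      # observe layer4: every TCP connection also counts as a health check.
      server app1 10.0.0.1:8080 check inter 2s rise 1 fall 1 observe layer4 error-limit 1 on-error mark-down
      server app2 10.0.0.2:8080 check inter 2s rise 1 fall 1 observe layer4 error-limit 1 on-error mark-down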
Thanks! Questions?