Fully Fault Tolerant Real Time Data Pipeline with Docker and Mesos
Rahul Kumar, Technical Lead
LinuxCon / ContainerCon - Berlin, Germany
Agenda
● Data Pipeline
● Mesos + Docker
● Reactive Data Pipeline
Goal
Analyzing data always has great benefits, and it is one of the greatest challenges for an organization.
Today’s businesses generate massive amounts of digital data, which is cumbersome to store, transport, and analyze.
Building distributed systems and off-loading workloads to commodity clusters is one of the better approaches to solving the data problem.
Characteristics of a Distributed System
❏ Resource Sharing
❏ Openness
❏ Concurrency
❏ Scalability
❏ Fault Tolerance
❏ Transparency
Collect → Store → Process → Analyze
Data Center
Manually scale frameworks & install services
● Complex
● Very limited
● Inefficient
● Low utilization
Static Partitioning: a blocker for a fault-tolerant data pipeline.
Failures make it even more complex to manage.
Apache Mesos “Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.”
Mesos Features
● Scalability: scales up to 10,000s of nodes
● Fault tolerance: replicated masters and slaves using ZooKeeper
● Docker support: support for Docker containers
● Native containers: native isolation between tasks with Linux containers
● Scheduling: multi-resource scheduling (memory, CPU, disk, and ports)
● API support: Java, Python, and C++ APIs for developing new parallel applications
● Monitoring: Web UI for viewing cluster state
Resource Isolation
Docker Containerizer
Mesos adds support for launching tasks that contain Docker images. Users can launch a Docker image either as a Task or as an Executor.
To enable the Docker Containerizer, run the mesos-agent with “docker” set as one of the containerizers options:
mesos-agent --containerizers=docker,mesos
Mesos Frameworks
● Aurora: developed at Twitter and later migrated to the Apache Project. Aurora is a framework that keeps services running across a shared pool of machines and is responsible for keeping them running forever.
● Marathon: a container-orchestration framework for Mesos. Marathon helps run other frameworks on Mesos, and it also runs application containers such as Jetty, JBoss Server, and Play Server.
● Chronos: a fault-tolerant job scheduler for Mesos, developed at Airbnb as a replacement for cron.
Spark Stack
Resilient Distributed Datasets (RDDs): a big collection of data which is:
- Immutable
- Distributed
- Lazily evaluated
- Type inferred
- Cacheable
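A minimal sketch of these RDD properties in Scala (the local master and the numeric dataset are illustrative assumptions, not from the slides):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("RDDExample").setMaster("local[*]")
val sc = new SparkContext(conf)

// parallelize builds an immutable, distributed collection
val numbers = sc.parallelize(1 to 100)

// Transformations are lazy: nothing executes yet
val evens = numbers.filter(_ % 2 == 0).cache() // cacheable for reuse

// count is an action; it triggers the evaluation
println(evens.count()) // prints 50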
Why Spark Streaming?
Many big-data applications need to process large data streams in near-real time:
● Monitoring systems
● Alert systems
● Computing systems
What is Spark Streaming?
(Architecture diagram taken from Apache Spark.)
What is Spark Streaming?
➔ Framework for large-scale stream processing
➔ Created at UC Berkeley
➔ Scales to 100s of nodes
➔ Can achieve second-scale latencies
➔ Provides a simple batch-like API for implementing complex algorithms
➔ Can absorb live data streams from Kafka, Flume, ZeroMQ, Kinesis, etc.
Spark Streaming
Run a streaming computation as a series of very small, deterministic batch jobs (see the sketch below):
- Chop up the live stream into batches of X seconds
- Spark treats each batch of data as RDDs and processes them using RDD operations
- Finally, the processed results of the RDD operations are returned in batches
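A minimal sketch of this micro-batch model in Scala, assuming a socket text source on localhost:9999 (a hypothetical endpoint for illustration):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("MicroBatchWordCount").setMaster("local[2]")
// Chop the live stream into 1-second batches
val ssc = new StreamingContext(conf, Seconds(1))

val lines = ssc.socketTextStream("localhost", 9999)
// Each batch is an RDD; ordinary RDD operations apply
val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
counts.print() // processed results are returned batch by batch

ssc.start()
ssc.awaitTermination()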
Simple Streaming Pipeline
(Diagram highlighting the points of failure.)
Spark Streaming over a HA Mesos Cluster
● To use Mesos from Spark, you need a Spark binary package available in a place accessible by Mesos (http/s3/hdfs), and a Spark driver program configured to connect to Mesos.
● Configuring the driver program to connect to Mesos:

val sconf = new SparkConf()
  .setMaster("mesos://zk://10.121.93.241:2181,10.181.2.12:2181,10.107.48.112:2181/mesos")
  .setAppName("MyStreamingApp")
  .set("spark.executor.uri", "hdfs://Sigmoid/executors/spark-1.3.0-bin-hadoop2.4.tgz")
  .set("spark.mesos.coarse", "true")
  .set("spark.cores.max", "30")
  .set("spark.executor.memory", "10g")

val sc = new SparkContext(sconf)
val ssc = new StreamingContext(sc, Seconds(1))
...
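A note on the configuration above: setting spark.mesos.coarse to true runs Spark in coarse-grained mode, where the acquired Mesos resources are held for the lifetime of the application. This gives lower task-launch latency for streaming workloads, at the cost of reserving the cores (capped here by spark.cores.max) even when they are idle.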
Spark Streaming Fault-tolerance
Real-time stream processing systems must be operational 24/7, which requires them to recover from all kinds of failures in the system.
● Spark and its RDD abstraction are designed to seamlessly handle failures of any worker nodes in the cluster.
● In Streaming, a driver failure can be recovered from by checkpointing the application state.
● Write Ahead Logs (WAL) & acknowledgements can ensure zero data loss.
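A minimal sketch of driver recovery through checkpointing and the write-ahead log, assuming an HDFS checkpoint directory (the path below is hypothetical):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs://namenode/streaming/checkpoints" // hypothetical path

def createContext(): StreamingContext = {
  val conf = new SparkConf()
    .setAppName("FaultTolerantApp")
    // Enable the write-ahead log so received data survives failures
    .set("spark.streaming.receiver.writeAheadLog.enable", "true")
  val ssc = new StreamingContext(conf, Seconds(1))
  ssc.checkpoint(checkpointDir) // persist metadata and application state
  // ... define sources and transformations here ...
  ssc
}

// On restart, rebuild the context from the checkpoint if one exists
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()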
Simple Fault-tolerant Streaming Infra
Creating a Scalable Pipeline
● Figure out the bottleneck: CPU, memory, IO, network
● If parsing is involved, use a parser that gives high performance
● Proper data modeling
● Compression, serialization (see the sketch below)
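A minimal sketch of the serialization and compression knobs in Scala (the choice of Kryo is an assumption; the slides do not name a specific serializer):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("TunedPipeline")
  // Kryo is typically faster and more compact than Java serialization
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Compress serialized RDD partitions, trading CPU for IO and network
  .set("spark.rdd.compress", "true")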
Thank You @rahul_kumar_aws LinuxCon / ContainerCon - Berlin, Germany