Systems for Resource Management
Course: Systems and Architectures for Big Data (Sistemi e Architetture per Big Data), academic year 2019/2020
Valeria Cardellini
Master's degree (Laurea Magistrale) in Computer Engineering
Macroarea di Ingegneria, Dipartimento di Ingegneria Civile e Ingegneria Informatica

The reference Big Data stack
• High-level Interfaces
• Support / Integration
• Data Processing
• Data Storage
• Resource Management

Valeria Cardellini - SABD 2019/2020

Outline
• Cluster management system
– Mesos
• Resource management policy
– DRF

Motivations
• Rapid innovation
• No single framework is optimal for all Big Data applications
• Running each framework on its own dedicated cluster is:
– Expensive
– Hard to share data

A possible solution
• Run multiple frameworks on a single cluster
• How can the (virtual) cluster resources be shared among multiple, non-homogeneous frameworks executed in virtual machines or containers?
• The classical solution: static partitioning
• Is it efficient?

What we need
• “The datacenter is the computer” (D. Patterson)
– Share resources to maximize their utilization
– Share data among frameworks
– Provide a unified API to the outside
– Hide the internal complexity of the infrastructure from applications
• The solution: a cluster-scale resource manager that employs dynamic partitioning
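To make the efficiency question concrete, the following minimal sketch compares cluster utilization under static and dynamic partitioning. It is only an illustration: the function names, the demand figures, and the assumption that unmet demand in one static partition cannot spill into another are all hypothetical.

```python
def static_utilization(demands, partition_size):
    """Each framework owns a fixed partition of `partition_size` nodes.

    Demand beyond the partition is not served, and idle nodes in one
    partition cannot be lent to another framework (assumption).
    """
    used = sum(min(d, partition_size) for d in demands)
    return used / (partition_size * len(demands))


def dynamic_utilization(demands, cluster_size):
    """All frameworks draw nodes from one shared pool."""
    return min(sum(demands), cluster_size) / cluster_size


# Hypothetical demand (in nodes) of three frameworks at one point in time.
demands = [50, 10, 0]

# 3 partitions of 30 nodes each versus one shared pool of 90 nodes.
print("static :", static_utilization(demands, partition_size=30))
print("dynamic:", dynamic_utilization(demands, cluster_size=90))
```

With these numbers, static partitioning keeps 40 of the 90 nodes busy, while the shared pool keeps 60 busy, because the first framework can borrow nodes that the others leave idle.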

Apache Mesos
• Cluster manager that provides a common resource-sharing layer over which diverse frameworks can run
• “Program against your datacenter like it’s a single pool of resources”
– Abstracts the entire datacenter into a single pool of computing resources, which simplifies running distributed systems at scale
– Is itself a distributed system, on top of which fault-tolerant and elastic distributed systems can be built and run
• Based on dynamic partitioning

Apache Mesos (2)
• Designed and developed at the University of California, Berkeley
– Now a top-level open-source Apache project: mesos.apache.org
• Used by Twitter, Uber, and Apple (Siri), among others
• Cluster: a dynamically shared pool of resources
[Figure: static partitioning vs. dynamic partitioning]

Mesos goals
• High utilization of resources
• Support for diverse frameworks (current and future)
• Scalability to tens of thousands of nodes
• Reliability in the face of failures

Mesos in the data center
• Where does Mesos fit as an abstraction layer in the datacenter?

Computation model
• A framework (e.g., Hadoop, Spark) manages and runs one or more jobs
• A job consists of one or more tasks
• A task (e.g., map, filter) consists of one or more processes running on the same machine

What Mesos does
• Enables fine-grained sharing of resources (CPU, RAM, …) across frameworks, at the level of tasks within a job
• Provides common functionalities:
– Failure detection
– Task distribution
– Task starting
– Task monitoring
– Task killing
– Task cleanup
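The framework / job / task hierarchy of the computation model can be written down as a small data structure. This is only a sketch: the class and field names (`Framework`, `Job`, `Task`, `cpus`, `mem_mb`) and the example values are illustrative assumptions, not part of the Mesos API.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Unit of resource allocation: runs on a single machine."""
    name: str
    cpus: float
    mem_mb: int


@dataclass
class Job:
    """A job consists of one or more tasks."""
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Framework:
    """A framework manages and runs one or more jobs."""
    name: str
    jobs: list = field(default_factory=list)

    def all_tasks(self):
        # Fine-grained sharing works at this level: individual tasks.
        return [task for job in self.jobs for task in job.tasks]


# Hypothetical example: one Spark job made of three tasks.
spark = Framework("Spark", [
    Job("wordcount", [
        Task("map-0", cpus=1.0, mem_mb=512),
        Task("map-1", cpus=1.0, mem_mb=512),
        Task("filter-0", cpus=0.5, mem_mb=256),
    ]),
])
total_cpus = sum(task.cpus for task in spark.all_tasks())
print(len(spark.all_tasks()), "tasks,", total_cpus, "CPUs requested")
```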

Fine-grained sharing
• Allocation at the level of tasks within a job
• Improves utilization, latency, and data locality
[Figure: coarse-grained sharing vs. fine-grained sharing]

Frameworks on Mesos
• Frameworks must be aware that they are running on Mesos
– DevOps tooling: Vamp, a deployment and workflow tool for container orchestration
– Long-running services: Aurora (service scheduler), …
– Big Data processing: Hadoop, Flink, Spark, Storm, …
– Batch scheduling: Chronos, …
– Data storage: Alluxio, Cassandra, ElasticSearch, …
– Machine learning: TFMesos, a framework that helps run distributed TensorFlow ML tasks on Apache Mesos with GPU support
• Full list at mesos.apache.org/documentation/latest/frameworks/

Mesos: architecture
• Master-worker architecture
• Workers publish their available resources to the master
• Master sends resource offers to frameworks
• Master election and service discovery via ZooKeeper
Source: Mesos: a platform for fine-grained resource sharing in the data center, NSDI'11

Mesos component: Apache ZooKeeper
• Coordination service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
• Used in many distributed systems, including Mesos, Storm, and Kafka
• Allows distributed processes to coordinate with each other through a shared hierarchical name space of data nodes (znodes)
– File-system-like API
– Name space similar to that of a standard file system
– Limited amount of data in each znode
– Despite the similarities, it is not really a file system, a database, a key-value store, or a lock service
• Provides high-throughput, low-latency, highly available, strictly ordered access to the znodes
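The hierarchical name space of znodes can be pictured with a toy in-memory tree. This is a minimal sketch and not the ZooKeeper client API: the class `ZNodeTree`, its methods, the size cap, and the example paths are all assumptions chosen for illustration.

```python
class ZNodeTree:
    """Toy in-memory model of a znode name space (not the real ZooKeeper API)."""

    MAX_DATA_BYTES = 1024 * 1024  # assumed cap: znodes hold small amounts of data

    def __init__(self):
        self._nodes = {"/": b""}  # path -> data, root always exists

    def create(self, path, data=b""):
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent does not exist: {parent}")
        if len(data) > self.MAX_DATA_BYTES:
            raise ValueError("znode data too large")
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]

    def children(self, path):
        # Direct children only: names that contain no further '/'.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ZNodeTree()
tree.create("/mesos")
tree.create("/mesos/leader", b"master-1:5050")  # hypothetical path and value
print(tree.children("/"), tree.children("/mesos"), tree.get("/mesos/leader"))
```

Paths look like file-system paths, but every node carries a small payload and may also have children, which is why the slide notes that ZooKeeper is not really a file system.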

Mesos component: ZooKeeper
• Replicated over a set of machines that maintain an in-memory image of the data tree
– Read requests are processed locally by the ZooKeeper server that receives them
– Write requests are forwarded to the leader, which propagates them to the other ZooKeeper servers; the servers must reach agreement before a response is generated (primary-backup system)
– Uses the Zab (ZooKeeper Atomic Broadcast) protocol, which includes leader election, to determine which server is the leader
• Implements atomic broadcast
– Processes deliver the same messages (agreement) and deliver them in the same order (total order)
– Message = state update

Mesos and framework components
• Mesos components
– Master
– Workers or agents
• Framework components
– Scheduler: registers with the master to be offered resources
– Executors: launched on agents to run the framework’s tasks
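Going back to the ZooKeeper replication slide, the two atomic broadcast properties (agreement and total order) can be illustrated with a deliberately simplified simulation. It assumes a single, never-failing leader and skips quorum acknowledgements and recovery entirely; the classes are hypothetical, and `zxid` merely borrows ZooKeeper's name for its transaction ids.

```python
class Replica:
    """A server holding an in-memory copy of the state."""

    def __init__(self):
        self.state = {}
        self.log = []  # (zxid, key, value) in delivery order

    def deliver(self, zxid, key, value):
        self.log.append((zxid, key, value))
        self.state[key] = value

    def read(self, key):
        # Reads are served from the local copy, without contacting others.
        return self.state.get(key)


class Leader(Replica):
    """Orders all writes by stamping them with increasing ids."""

    def __init__(self, followers):
        super().__init__()
        self.followers = followers
        self.next_zxid = 1

    def write(self, key, value):
        zxid = self.next_zxid
        self.next_zxid += 1
        # Simplification: every replica receives every update immediately.
        for replica in [self] + self.followers:
            replica.deliver(zxid, key, value)
        return zxid


followers = [Replica(), Replica()]
leader = Leader(followers)
leader.write("/mesos/leader", "master-1")
leader.write("/mesos/leader", "master-2")

# Agreement and total order: all logs are identical.
print(all(f.log == leader.log for f in followers), followers[0].read("/mesos/leader"))
```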

Scheduling in Mesos
• Scheduling mechanism based on resource offers
– Mesos offers available resources to frameworks
– Each resource offer contains a list of <agent ID, resource1: amount1, resource2: amount2, ...>
– Each framework chooses which resources to use and which tasks to launch
• Two-level scheduler architecture
– Mesos delegates the actual scheduling of tasks to the frameworks
– Why? To improve scalability: the master does not have to know the scheduling intricacies of every type of supported application

Mesos: resource offers
• Resource allocation is based on the Dominant Resource Fairness (DRF) algorithm
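DRF repeatedly gives the next task to the user whose dominant share (the largest fraction it holds of any single resource) is currently lowest. The sketch below is a simplified version of that idea, not the Mesos allocator: the function name `drf`, the tie-breaking by user name, and the rule of dropping a user whose next task no longer fits are assumptions. The input is a commonly used two-user illustration with 9 CPUs and 18 GB of memory.

```python
def drf(capacity, demands):
    """Simplified Dominant Resource Fairness by progressive filling.

    capacity: resource -> total amount in the cluster
    demands:  user -> (resource -> amount needed by ONE task)
    Returns:  user -> number of tasks allocated
    """
    used = {r: 0 for r in capacity}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(user):
        # Largest fraction of any resource currently held by the user.
        return max(tasks[user] * demands[user][r] / capacity[r] for r in capacity)

    while active:
        # Serve the user with the lowest dominant share (ties: by name).
        user = min(sorted(active), key=dominant_share)
        need = demands[user]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            tasks[user] += 1
        else:
            active.discard(user)  # assumption: stop serving users that no longer fit
    return tasks


capacity = {"cpu": 9, "mem_gb": 18}
demands = {
    "A": {"cpu": 1, "mem_gb": 4},  # memory is A's dominant resource
    "B": {"cpu": 3, "mem_gb": 1},  # CPU is B's dominant resource
}
print(drf(capacity, demands))  # {'A': 3, 'B': 2}
```

In the final allocation A holds 12 of 18 GB and B holds 6 of 9 CPUs, so both users end with a dominant share of 2/3.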

Mesos: resource offers in detail
• Workers continuously send status updates about their resources to the master

Mesos: resource offers in detail (2)

Mesos: resource offers in detail (3)
• The framework scheduler can reject offers

Mesos: resource offers in detail (4)
• The framework scheduler selects resources and provides the tasks to run on them
• The master sends the tasks to the workers
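The offer cycle described in these slides (the master offers, the scheduler accepts or rejects, the master forwards tasks to workers) can be sketched as a single simulated round. Everything here is hypothetical: `Offer`, `GreedyScheduler`, `offer_round`, and the resource figures are illustrative names and values, not the Mesos scheduler API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent_id: str
    cpus: float
    mem_mb: int


class GreedyScheduler:
    """Hypothetical framework scheduler: accepts an offer if its next task fits."""

    def __init__(self, pending):
        self.pending = list(pending)  # (task name, cpus, mem_mb)

    def resource_offer(self, offer):
        if not self.pending:
            return None  # nothing to run: reject the offer
        _, cpus, mem_mb = self.pending[0]
        if cpus <= offer.cpus and mem_mb <= offer.mem_mb:
            return self.pending.pop(0)  # accept: hand a task back to the master
        return None  # offer too small: reject it


def offer_round(free, scheduler):
    """Master side: offer each agent's free resources, launch accepted tasks."""
    launched = []
    for agent_id in sorted(free):
        cpus, mem_mb = free[agent_id]
        task = scheduler.resource_offer(Offer(agent_id, cpus, mem_mb))
        if task is not None:
            name, t_cpus, t_mem = task
            free[agent_id] = [cpus - t_cpus, mem_mb - t_mem]
            launched.append((agent_id, name))  # master sends the task to the worker
    return launched


free = {"agent-1": [1.0, 1024], "agent-2": [4.0, 8192]}
launched = offer_round(free, GreedyScheduler([("t1", 2.0, 2048)]))
print(launched, free)
```

Here the offer from `agent-1` is rejected because it is too small, and the task lands on `agent-2`. The master never needs to know why the scheduler made that choice, which is the point of the two-level design.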

Mesos: resource offers in detail (5)
• Framework executors launch the tasks

Mesos: resource offers in detail (6)

Mesos: resource offers in detail (7)

Mesos fault tolerance
• Task failure
• Worker failure
• Host or network failure
• Master failure
• Framework scheduler failure

Fault tolerance: task failure

Fault tolerance: task failure (2)

Fault tolerance: worker failure

Fault tolerance: worker failure (2)

Fault tolerance: host or network failure

Fault tolerance: host or network failure (2)

Fault tolerance: host or network failure (3)

Fault tolerance: master failure

Fault tolerance: master failure (2)
• When the leading master fails, the surviving masters use ZooKeeper to elect a new leader

Fault tolerance: master failure (3)
• The workers and frameworks use ZooKeeper to detect the new leader and re-register with it
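Master failover can be illustrated with a toy version of a common ZooKeeper election recipe: each candidate joins a group with an increasing sequence number, the lowest live number is the leader, and a failed member simply disappears from the group. This is an assumption made for illustration, not a description of Mesos's actual implementation; `ElectionGroup` and the master names are hypothetical.

```python
class ElectionGroup:
    """Toy stand-in for a ZooKeeper-backed election group."""

    def __init__(self):
        self._next_seq = 0
        self._members = {}  # sequence number -> master name

    def join(self, name):
        self._next_seq += 1
        self._members[self._next_seq] = name
        return self._next_seq

    def fail(self, seq):
        # A failed master's entry vanishes (as an ephemeral node would).
        self._members.pop(seq, None)

    def leader(self):
        if not self._members:
            return None
        return self._members[min(self._members)]


group = ElectionGroup()
seq1 = group.join("master-1")
group.join("master-2")
group.join("master-3")
before = group.leader()

group.fail(seq1)        # the leading master crashes
after = group.leader()  # survivors agree on the next-lowest member

print(before, "->", after)
```

Workers and framework schedulers would then look up the current leader in the same place and re-register with it.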

Fault tolerance: framework scheduler failure

Fault tolerance: framework scheduler failure (2)
• When a framework scheduler fails, another instance can re-register with the master without interrupting any of the running tasks
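The key point of scheduler failover is that running tasks are tracked per framework rather than per scheduler instance, so a replacement scheduler that re-registers under the same framework identity finds the tasks untouched. The sketch below models only this bookkeeping; the `Master` class, its methods, and the identifiers are hypothetical and do not reflect the Mesos API.

```python
class Master:
    """Toy master that tracks tasks per framework, not per scheduler instance."""

    def __init__(self):
        self.running = {}     # framework id -> set of running task names
        self.schedulers = {}  # framework id -> connected scheduler instance (or None)

    def register(self, framework_id, scheduler_name):
        # First registration creates the entry; re-registration keeps the tasks.
        self.running.setdefault(framework_id, set())
        self.schedulers[framework_id] = scheduler_name

    def launch(self, framework_id, task):
        self.running[framework_id].add(task)

    def scheduler_lost(self, framework_id):
        # Only the connection is dropped; executors keep running the tasks.
        self.schedulers[framework_id] = None


master = Master()
master.register("fw-42", "scheduler-a")
master.launch("fw-42", "task-1")
master.launch("fw-42", "task-2")

master.scheduler_lost("fw-42")            # scheduler-a crashes
master.register("fw-42", "scheduler-b")   # a standby instance takes over

print(master.schedulers["fw-42"], sorted(master.running["fw-42"]))
```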

Fault tolerance: framework scheduler failure (3)

Fault tolerance: framework scheduler failure (4)
