
Pilot-Streaming: Design Considerations for a Stream Processing Framework for High-Performance Computing

Andre Luckow, Peter M. Kasson, Shantenu Jha. STREAMING 2016, 03/23/2016. RADICAL, Rutgers, http://radical.rutgers.edu


  1. Pilot-Streaming: Design Considerations for a Stream Processing Framework for High-Performance Computing. Andre Luckow, Peter M. Kasson, Shantenu Jha. STREAMING 2016, 03/23/2016. RADICAL, Rutgers, http://radical.rutgers.edu

  2. Motivation
     There is a need to couple data sources, HPC, and analytics! 20+ applications identified at STREAM16.
     Challenges:
     • Data applications and pipelines are complex
     • Scalability and elasticity: dynamic changes in resource demands
     • Scheduling and provisioning of resources: the right amount of resources at the right time
     • Programming models: HPC (MPI, OpenMP, GPU) vs. Big Data (Java, Python, R)
     • Interoperability: data sources and sinks are often in different environments (IoT, cloud, HPC, HPDC) than compute
     Current State:
     • Streaming (in the sciences) is often implemented at the application level (with limited re-use)
     • Manifold landscape of streaming tools (Apache open source tools, cloud tools)

  3. Workload Characteristics
     [Figure labels: HPC Resource, HPC Resource 1, HPC Resource 2, Simulation, Analysis]

  4. Workload Characteristics
     [Figure labels: HPC Resource 1, HPC Resource 2, HPC Resource 3, Simulation, Message Broker, Analysis 1, Analysis 2]
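The labels on slide 4 suggest a message broker sitting between one simulation and two analysis components. The following is a minimal sketch of that decoupling, assuming a fan-out broker in which every subscriber receives every message. `InMemoryBroker`, `simulation`, `analysis`, and `run_pipeline` are hypothetical names for illustration only; this is not the Pilot-Streaming API, and a real deployment would use a broker such as Kafka rather than in-process queues.

```python
import queue
import threading


class InMemoryBroker:
    """Toy stand-in for a message broker: each message goes to every subscriber."""

    def __init__(self):
        self._subscribers = []
        self._lock = threading.Lock()

    def subscribe(self):
        # Each subscriber gets its own inbox, so consumers do not compete.
        inbox = queue.Queue()
        with self._lock:
            self._subscribers.append(inbox)
        return inbox

    def publish(self, message):
        with self._lock:
            targets = list(self._subscribers)
        for inbox in targets:
            inbox.put(message)


STOP = object()  # sentinel that marks the end of the stream


def simulation(broker, steps):
    # The producer only knows the broker, not who consumes the data.
    for step in range(steps):
        broker.publish({"step": step, "value": step * step})
    broker.publish(STOP)


def analysis(inbox, reduce_fn, results, name):
    # The consumer only knows its inbox, not who produced the data.
    values = []
    while True:
        message = inbox.get()
        if message is STOP:
            break
        values.append(message["value"])
    results[name] = reduce_fn(values)


def run_pipeline(steps=5):
    broker = InMemoryBroker()
    results = {}
    # Subscriptions are created here, before the producer starts publishing.
    consumers = [
        threading.Thread(target=analysis,
                         args=(broker.subscribe(), sum, results, "analysis_1")),
        threading.Thread(target=analysis,
                         args=(broker.subscribe(), max, results, "analysis_2")),
    ]
    for thread in consumers:
        thread.start()
    producer = threading.Thread(target=simulation, args=(broker, steps))
    producer.start()
    producer.join()
    for thread in consumers:
        thread.join()
    return results
```

Because producer and consumers share only the broker, either side could in principle be moved to a different resource or restarted without changing the other, which is the de-coupling the slide points at.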

  5. Introduction: Pilot Abstraction
     [Figure labels: User Space (User Application, Pilot-Job System, Policies); Pilot-Job, Pilot-Job; Resource Manager; System Space (Resource A, Resource B, Resource C, Resource D)]
     http://arxiv.org/abs/1207.6644
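The idea behind the pilot abstraction on slide 5 can be sketched as two steps: a pilot acquires a block of cores on a resource once, and the application then binds tasks to whichever pilot has free capacity. The code below is a toy model under that assumption. `Pilot`, `PilotJobSystem`, and the "most free cores first" policy are hypothetical and chosen only for illustration; they do not reproduce any real pilot-job system.

```python
from dataclasses import dataclass, field


@dataclass
class Pilot:
    """Placeholder job holding a fixed number of cores on one resource."""
    resource: str
    cores: int
    running: list = field(default_factory=list)  # (task_name, cores) pairs

    @property
    def free_cores(self):
        return self.cores - sum(cores for _, cores in self.running)


class PilotJobSystem:
    """Application-level scheduler: binds tasks to pilots as capacity appears."""

    def __init__(self):
        self.pilots = []
        self.waiting = []

    def add_pilot(self, resource, cores):
        # In a real system this would go through the resource manager queue.
        pilot = Pilot(resource, cores)
        self.pilots.append(pilot)
        self._schedule()
        return pilot

    def submit(self, task_name, cores):
        self.waiting.append((task_name, cores))
        self._schedule()

    def _schedule(self):
        # Assumed policy: place each waiting task on the pilot with most free cores.
        still_waiting = []
        for task_name, cores in self.waiting:
            pilot = max(self.pilots, key=lambda p: p.free_cores, default=None)
            if pilot is not None and pilot.free_cores >= cores:
                pilot.running.append((task_name, cores))
            else:
                still_waiting.append((task_name, cores))
        self.waiting = still_waiting
```

A task submitted before any pilot is active simply waits; it is bound only once a pilot with enough free cores exists, so the application never talks to the resource manager per task.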

  6. The Convergence of HPC and “Data Intensive” Computing
     [Figure: side-by-side software stacks for High-Performance Computing and Apache Hadoop Big Data, with layers for applications, orchestration, advanced analytics and machine learning, frameworks (MPI, MapReduce, Spark, Twister), workload management and schedulers, communication (MPI, RDMA; Hadoop shuffle/reduction, HARP collectives), data access and storage management, cluster resource managers (Slurm, Torque, SGE; YARN, Mesos), and resources (nodes, cores, VMs; Lustre, GPFS; HDFS)]
     A Tale of Two Data-Intensive Paradigms: Data Intensive Applications, Abstractions and Architectures. In collaboration with Geoffrey Fox (Indiana), http://arxiv.org/abs/1403.1528

  7. Pilot-Abstraction for HPC and Hadoop Interoperability
     [Figure: two modes, Mode I: Hadoop on HPC and Mode II: HPC on Hadoop. Labels: applications (MapReduce app, Spark app, other YARN app, Hadoop/Spark app, HPC app, e.g. MPI); application-level scheduling (Hadoop YARN, Spark, application scheduler such as Spark, Tez, LLama, Pilot-Job); system-level scheduling (HPC scheduler: Slurm, Torque, SGE; YARN/HDFS)]
     http://arxiv.org/abs/1602.00345

  8. Streaming and Batch Computing
     Questions:
     - How to manage batch and streaming frameworks side-by-side?
     - How to enable interoperability between different programming systems/models/middleware/schedulers?
     - How to enable elasticity?
     [Figure labels: Data, Broker; Compute (e.g. YARN, SLURM, Torque, PBS); Streaming, Hadoop, Machine Learning, ETL, SQL, Framework; Storage and Format (e.g. Lustre, HDFS,…); Raw Text, HDF5, Columnar, Mutable/Random Access, Other; Message Broker, Storage, Stream Processing]
     http://dx.doi.org/10.5281/zenodo.47946

  9. Pilot-Streaming
     [Figure labels: Distributed Application; User-Space; Pilot API (Pilot Compute, Pilot Data); SAGA; adaptors (Cloud, YARN, SSH, iRODS, HDFS, Kafka, Globus Online); infrastructure (HPC, HTC (OSG/EGI), Cloud, Hadoop, Local); storage (Local/Parallel FS, SRM, S3, HDFS, EBS, GFFS, iRODS); access (HTTP, WebHDFS, SSH/GO, SSH); nodes, EC2 VM, and YARN nodes running Pilot Agents, reached via SSH]

  10. Conclusion
      1. Pilot-Jobs enable the co-location of HPC/simulations and Big Data tools (Hadoop, Spark, higher-level tools)
      2. Pilot-Streaming will support message brokers as data source/sink, which enables the de-coupling of applications
      3. Dynamic resource management provided by the Pilot-Abstraction is critical for stream environments
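To make the third conclusion concrete, here is a minimal sketch of one possible elasticity rule: size the number of stream-processing workers so that the current broker backlog can be drained within a target lag. The function name `target_workers`, its parameters, and the policy itself are assumptions for illustration; the slides do not specify how Pilot-Streaming decides when to scale.

```python
import math


def target_workers(backlog, rate_per_worker, max_lag_seconds,
                   min_workers=1, max_workers=16):
    """Return how many workers are needed to drain `backlog` messages
    within `max_lag_seconds`, given each worker handles
    `rate_per_worker` messages per second (hypothetical policy)."""
    capacity_per_worker = rate_per_worker * max_lag_seconds
    needed = math.ceil(backlog / capacity_per_worker)
    # Clamp to the range the resource allocation allows.
    return max(min_workers, min(max_workers, needed))
```

In a pilot-based setup, the difference between this target and the current worker count would translate into requesting additional pilots or releasing idle ones, rather than resubmitting the whole application.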

  11. Thank you!
