Mixing Hadoop and HPC Workloads on Parallel Filesystems

Esteban Molina-Estolano*, Maya Gokhale†, Carlos Maltzahn*, John May†, John Bent‡, Scott Brandt*
* UC Santa Cruz, ISSDM, PDSI
† Lawrence Livermore National Laboratory
‡ Los Alamos National Laboratory

Sunday, November 15, 2009
Motivation

• Strong interest in running both HPC and large-scale data mining workloads on the same infrastructure
• Hadoop-tailored filesystems (e.g. CloudStore) and high-performance computing filesystems (e.g. PVFS) are tailored to considerably different workloads
• Existing investments in HPC systems and Hadoop systems should be usable for both workloads
• Goal: examine the performance of both types of workloads running concurrently on the same filesystem
• Goal: collect I/O traces from concurrent workload runs, for parallel filesystem simulator work
MapReduce-oriented filesystems

• Large-scale batch data processing and analysis
• Single cluster of unreliable commodity machines for both storage and computation
• Data locality is important for performance
• Examples: Google FS, Hadoop DFS, CloudStore
Hadoop DFS architecture

[Figure: Hadoop DFS architecture diagram, from http://hadoop.apache.org]
High-Performance Computing filesystems

• High-throughput, low-latency workloads
• Architecture: separate compute and storage clusters, with a high-speed bridge between them
• Typical workload: simulation checkpointing
• Examples: PVFS, Lustre, PanFS, Ceph

[Figure: compute cluster and storage cluster connected by a high-speed bridge]
Running each workload on the non-native filesystem

• Two-sided problem: running HPC workloads on a Hadoop filesystem, and Hadoop workloads on an HPC filesystem
• Different interfaces:
  • HPC workloads need a POSIX-like interface and shared writes
  • Hadoop is write-once-read-many
• Different data layout policies
Running HPC workloads on a Hadoop filesystem

• Chosen filesystem: CloudStore
• Downside of Hadoop's HDFS: no support for shared writes (needed for HPC N-1 workloads)
• CloudStore has an HDFS-like architecture and supports shared writes
Running Hadoop workloads on an HPC filesystem

• Chosen HPC filesystem: PVFS
• PVFS is open-source and easy to configure
• Tantisiriroj et al. at CMU have created a shim to run Hadoop on PVFS
• The shim also adds prefetching and buffering, and exposes the data layout (see the configuration sketch below)
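A minimal sketch (not from the talk) of how a Hadoop job can be pointed at a non-HDFS filesystem through configuration. The KFS/CloudStore binding (fs.kfs.impl and KosmosFileSystem) shipped with Hadoop of this era; the PVFS property name, class name, and metaserver address below are hypothetical placeholders for the CMU shim, whose actual names may differ.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class NonHdfsFileSystemExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // CloudStore (KFS) binding bundled with Hadoop at the time:
            conf.set("fs.kfs.impl", "org.apache.hadoop.fs.kfs.KosmosFileSystem");
            // Hypothetical scheme and class for the CMU PVFS shim:
            conf.set("fs.pvfs2.impl", "org.apache.hadoop.fs.pvfs2.PVFS2FileSystem");
            // Make the alternative filesystem the default for MapReduce jobs:
            conf.set("fs.default.name", "kfs://metaserver.example.com:20000");

            FileSystem fs = FileSystem.get(
                    URI.create("kfs://metaserver.example.com:20000/"), conf);
            System.out.println("Default block size: " + fs.getDefaultBlockSize());
            fs.close();
        }
    }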
The two concurrent workloads

• IOR checkpointing workload
  • writes large amounts of data to disk from many clients
  • N-1 and N-N write patterns (sketched below)
• Hadoop MapReduce HTTP attack classifier (TFIDF)
  • using a pre-generated attack model, classify HTTP headers as normal traffic or attack traffic
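A conceptual sketch of the two checkpoint write patterns. The actual benchmark is IOR (C with MPI-IO); the class, file names, and rank bookkeeping here are illustrative only.

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class CheckpointPatterns {
        static final long CHUNK = 64L << 20;   // 64 MB per write, as in the experiments

        // N-1 strided: all processes write interleaved chunks into one shared file.
        // Chunk i of rank r lands at offset (i * numRanks + r) * CHUNK.
        static void writeN1(int rank, int numRanks, int chunks, byte[] buf) throws IOException {
            try (RandomAccessFile shared = new RandomAccessFile("checkpoint.shared", "rw")) {
                for (int i = 0; i < chunks; i++) {
                    shared.seek(((long) i * numRanks + rank) * CHUNK);
                    shared.write(buf);
                }
            }
        }

        // N-N: each process writes its own file sequentially.
        static void writeNN(int rank, int chunks, byte[] buf) throws IOException {
            try (RandomAccessFile own = new RandomAccessFile("checkpoint." + rank, "rw")) {
                for (int i = 0; i < chunks; i++) {
                    own.write(buf);
                }
            }
        }
    }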
Experimental Setup

• System: 19 nodes, 2-core 2.4 GHz Xeon, 120 GB disks
• IOR baseline: N-1 strided workload, 64 MB chunks
• IOR baseline: N-N workload, 64 MB chunks
• TFIDF baseline: classify 7.2 GB of HTTP headers
• Mixed workloads:
  • IOR N-1 with TFIDF, and IOR N-N with TFIDF
  • checkpoint size adjusted so that IOR and TFIDF take the same amount of time
Performance metrics

• Throughputs are not directly comparable between the two workloads
• Per-workload throughput: measure how much each job is slowed down by running in the mixed workload
• Runtime: compare the runtime of the mixed workload with the runtime of the same jobs run sequentially (worked example below)
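A worked example of the two metrics with made-up numbers (not results from these experiments): per-workload slowdown compares a job's throughput in the mix to its standalone throughput, and the runtime comparison pits one mixed run against the same two jobs run back to back.

    public class MixedWorkloadMetrics {
        public static void main(String[] args) {
            // Hypothetical throughputs for one job, standalone vs. in the mix:
            double standaloneMBps = 80.0, mixedMBps = 50.0;
            System.out.printf("per-workload slowdown: %.2fx%n", standaloneMBps / mixedMBps);

            // Hypothetical runtimes: the same two jobs run serially vs. concurrently.
            double iorSec = 900.0, tfidfSec = 900.0, mixedSec = 1400.0;
            System.out.printf("serial total: %.0f s, mixed: %.0f s, time saved: %.0f%%%n",
                    iorSec + tfidfSec, mixedSec,
                    100.0 * (1.0 - mixedSec / (iorSec + tfidfSec)));
        }
    }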
Hadoop performance results

[Chart: TFIDF classification throughput (MB/s) on CloudStore and PVFS: baseline, with IOR N-1, and with IOR N-N]
IOR performance results

[Chart: IOR checkpointing write throughput (MB/s) on CloudStore and PVFS, standalone vs. mixed, for N-1 and N-N patterns]
Runtime results

[Chart: runtime (seconds) of mixed vs. serial workloads for PVFS N-1, CloudStore N-1, PVFS N-N, and CloudStore N-N]
Tracing infrastructure

• We gather traces to use for our parallel filesystem simulator
• Existing tracing mechanisms (e.g. strace, Pianola, Darshan) don't work well with Java or CloudStore
• Solution: our own tracing mechanisms for IOR and Hadoop
Tracing IOR workloads

• Trace shim intercepts I/O calls, sends trace records to stdio

[Figure: IOR_Xfer calls pass through a tracing backend (IOR_Xfer_Trace) that forwards to the real backend (IOR_Xfer_POSIX, IOR_Xfer_MPIIO, IOR_Xfer_HDF5, IOR_Xfer_NCMPI, IOR_Xfer_KFS) and emits one record per call: start time, process (read/write), offset, size, end time]
Tracing Hadoop

• Tracing shim wraps the filesystem interfaces and sends I/O calls to the Hadoop logs (see the sketch below)

[Figure: tracer wrappers around the Hadoop FileSystem and FSData input/output stream interfaces; each call is logged with the filename, pid, start time, end time, operation and parameters, result, and elapsed time]
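A minimal sketch of the wrapping idea, not the authors' actual shim: a helper that wraps an FSDataInputStream and writes one log record per positioned read through Hadoop's commons-logging. Class and field names are illustrative.

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.fs.FSDataInputStream;

    class TracedInputStream {
        private static final Log LOG = LogFactory.getLog(TracedInputStream.class);
        private final FSDataInputStream in;
        private final String path;

        TracedInputStream(FSDataInputStream in, String path) {
            this.in = in;
            this.path = path;
        }

        // Positioned read, with one trace record per call: file, offset, bytes, elapsed time.
        int read(long position, byte[] buf, int off, int len) throws IOException {
            long start = System.nanoTime();
            int n = in.read(position, buf, off, len);
            LOG.info("read " + path + " offset=" + position
                    + " bytes=" + n + " ns=" + (System.nanoTime() - start));
            return n;
        }
    }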
Tracing overhead

• Trace data goes to an NFS-mounted share (no local disk overhead)
• Small Hadoop reads caused huge tracing overhead
• Solution: record traces behind read-ahead buffers (see the sketch below)
• Overhead (throughput slowdown):
  • IOR checkpointing: 1%
  • TFIDF Hadoop: 5%
  • Mixed workloads: 10%
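One way to record traces behind a read-ahead buffer (a sketch of the idea, not the exact mechanism from the talk): place the tracing wrapper underneath a BufferedInputStream, so the application's many small reads are served from the buffer and only the occasional large refill generates a trace record. The 4 MB buffer size is an arbitrary example.

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    // Traces only the reads that reach the underlying stream.
    class TracingInputStream extends InputStream {
        private final InputStream in;
        TracingInputStream(InputStream in) { this.in = in; }

        @Override public int read() throws IOException {
            return in.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            long start = System.nanoTime();
            int n = in.read(b, off, len);
            System.err.println("trace: read bytes=" + n + " ns=" + (System.nanoTime() - start));
            return n;
        }
    }

    // Usage: small application reads hit the 4 MB buffer; trace records are
    // emitted only when the buffer refills from the traced stream.
    // InputStream traced = new BufferedInputStream(
    //         new TracingInputStream(new FileInputStream("data.bin")), 4 << 20);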
Conclusions

• Each mixed-workload component is noticeably slowed, but...
• If only total runtime matters, the mixed workloads are faster than running the same jobs serially
• PVFS shows different slowdowns for N-N vs. N-1 workloads
• Tracing infrastructure: buffering is required to trace small I/O cheaply
• Future work:
  • run experiments at a larger scale
  • use experimental results to improve our parallel filesystem simulator
  • investigate scheduling strategies for mixed workloads
Questions?

• Esteban Molina-Estolano: eestolan@soe.ucsc.edu