
Initial Characterization of I/O in Large-Scale Deep Learning Applications



  1. Initial Characterization of I/O in Large-Scale Deep Learning Applications
     Fahim Chowdhury, Jialin Liu, Quincey Koziol, Thorsten Kurth, Steven Farrell, Suren Byna, Prabhat, and Weikuan Yu
     November 12, 2018

  2. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  3. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  4. Objectives
     - Deep Learning (DL) applications demand large-scale computing facilities.
     - DL applications require efficient I/O support in the data processing pipeline to accelerate the training phase.
     - The goals of this project are:
       - Exploring the I/O patterns invoked by multiple DL applications running on HPC systems
       - Addressing possible I/O bottlenecks in the training phase
       - Developing optimization strategies to overcome these possible I/O bottlenecks

  5. Objectives
     - Deep Learning (DL) applications demand large-scale computing facilities.
     - DL applications require efficient I/O support in the data processing pipeline to accelerate the training phase.
     - The goals of this project are:
       - Exploring the I/O patterns invoked by multiple DL applications running on HPC systems
       - Addressing possible I/O bottlenecks in the training phase
       - Developing optimization strategies to overcome these possible I/O bottlenecks

  6. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  7. HEPCNNB Overview
     - High Energy Physics Deep Learning Convolutional Neural Network Benchmark (HEPCNNB)
     - Runs on distributed TensorFlow using Horovod
     - Can generate particle events described by Standard Model physics as well as events with R-parity-violating supersymmetry
     - Uses a 496 GB dataset of 2048 HDF5 files representing particle collisions generated by the fast Monte-Carlo generator Delphes at CERN
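
As a rough illustration of how a Horovod-distributed HDF5 input stage like HEPCNNB's can be organized, the sketch below shards the event files across ranks and does one bulk read per file. The directory name and the dataset keys "data" and "label" are hypothetical, not taken from the benchmark's source.

```python
# A minimal sketch, assuming a Horovod + h5py input stage similar to HEPCNNB's;
# the directory "hepcnn_data/" and HDF5 keys "data"/"label" are hypothetical.
import glob

import h5py
import horovod.tensorflow as hvd

hvd.init()

# Round-robin shard of the 2048 event files across Horovod ranks.
files = sorted(glob.glob("hepcnn_data/*.h5"))
my_files = files[hvd.rank()::hvd.size()]

def read_shard(paths):
    # Yield (image, label) pairs from this rank's shard of HDF5 files,
    # doing one bulk read per file rather than per-sample reads.
    for path in paths:
        with h5py.File(path, "r") as f:
            images = f["data"][:]
            labels = f["label"][:]
        for x, y in zip(images, labels):
            yield x, y
```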

  8. CDB Overview
     - Climate Data Benchmark (CDB)
     - Runs on distributed TensorFlow using Horovod
     - Acts as an image recognition model that detects extreme-weather patterns
     - Uses a 3.5 TB dataset of 62,738 HDF5 images representing climate data
     - Leverages the TensorFlow Dataset API and Python's multiprocessing package for input pipelining
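
The last bullet can be made concrete with a small sketch of a tf.data pipeline fed by a multiprocessing pool; the HDF5 key "climate", the directory name, and the batch size are assumptions rather than the benchmark's actual settings.

```python
# A minimal sketch (not the benchmark's actual code) of pairing the TensorFlow
# Dataset API with Python's multiprocessing package for input pipelining;
# the HDF5 key "climate" and directory "climate_data/" are assumptions.
import glob
from multiprocessing import Pool

import h5py
import numpy as np
import tensorflow as tf

def load_image(path):
    # Read one climate image from an HDF5 file inside a worker process.
    with h5py.File(path, "r") as f:
        return f["climate"][:].astype(np.float32)

def image_generator(paths, workers=4):
    # Decode files in a process pool so HDF5 reads overlap with training.
    with Pool(workers) as pool:
        for image in pool.imap(load_image, paths):
            yield image

paths = sorted(glob.glob("climate_data/*.h5"))
dataset = (tf.data.Dataset
           .from_generator(lambda: image_generator(paths),
                           output_types=tf.float32,
                           output_shapes=tf.TensorShape([None, None, None]))
           .batch(8)
           .prefetch(1))
```

Prefetching the next batch while the pool reads ahead is what lets the I/O time hide behind compute, which is the pipelining effect the slide refers to.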

  9. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  10. Profiling Approaches
      - Develop a Python-based TimeLogger tool to profile the application layer
      - Determine the total latency of each training component from a merged interval list
      - Explore the TensorFlow Runtime Tracing Metadata Visualization (TRTMV) tool developed at Google and extract I/O-specific metadata
      - Work on integrating runtime metadata from the application and framework layers
      - Work available at: https://github.com/NERSC/DL-Parallel-IO
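
The merged-interval computation can be illustrated with a short sketch: overlapping (start, end) timestamps recorded for parallel I/O calls are merged before summing, so concurrent reads are not double-counted. The class and method names below are illustrative, not the actual TimeLogger API.

```python
# A sketch of the merged-interval idea behind a TimeLogger-style profiler;
# class and method names are illustrative, not the tool's actual API.
import time
from collections import defaultdict

class TimeLogger:
    def __init__(self):
        # component name -> list of (start, end) timestamps
        self.intervals = defaultdict(list)

    def log(self, component, start, end):
        self.intervals[component].append((start, end))

    def total_latency(self, component):
        # Merge overlapping intervals first so concurrent (parallel) I/O
        # calls are not double-counted, then sum the merged lengths.
        merged = []
        for start, end in sorted(self.intervals[component]):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return sum(e - s for s, e in merged)

# Usage: wrap each read (or preprocessing step) with timestamps.
logger = TimeLogger()
start = time.time()
time.sleep(0.01)                       # stand-in for an HDF5 read
logger.log("read", start, time.time())
print(logger.total_latency("read"))
```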

  11. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  12. HEPCNNB Latency Breakdown
      [Chart: latency breakdown of HEPCNNB training, comparing Local Shuffle and Global Shuffle]
      - I/O takes more time when Global Shuffling is introduced
      - Global Shuffling affects I/O even for a small dataset and only 5 training epochs
      - The I/O bottleneck can become more severe with increasing epochs

  13. HEPCNNB Read Bandwidth
      [Chart: HEPCNNB read bandwidth, comparing Local Shuffle and Global Shuffle]
      - I/O takes more time when Global Shuffling is introduced
      - Global Shuffling affects I/O even for a small dataset and only 5 training epochs
      - The I/O bottleneck can become more severe with increasing epochs
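
A hedged sketch of the two shuffling modes being compared may clarify why global shuffling costs more I/O: local shuffling permutes samples within each HDF5 file after one sequential read, while global shuffling permutes (file, sample) pairs across the whole dataset and therefore issues many small random reads. The key "data" and the directory name are assumptions, not the benchmark's implementation.

```python
# An illustrative contrast between local and global shuffling of an HDF5
# dataset; the key "data" and directory "hepcnn_data/" are assumptions.
import glob
import random

import h5py

def local_shuffle(paths):
    # One sequential bulk read per file, then shuffle only within that file.
    for path in paths:
        with h5py.File(path, "r") as f:
            samples = list(f["data"][:])
        random.shuffle(samples)
        yield from samples

def global_shuffle(paths):
    # Build a global (file, sample) index, permute it across all files,
    # then issue one small random read per sample.
    index = []
    for path in paths:
        with h5py.File(path, "r") as f:
            index.extend((path, i) for i in range(f["data"].shape[0]))
    random.shuffle(index)
    for path, i in index:
        with h5py.File(path, "r") as f:
            yield f["data"][i]

# Usage: iterate either generator over the dataset's HDF5 files.
paths = sorted(glob.glob("hepcnn_data/*.h5"))
```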

  14. CDB Latency and Read Bandwidth
      [Charts: I/O percentage of training time and read bandwidth for CDB across node counts]
      - The percentage of I/O in the training process is higher when the dataset is larger
      - The I/O percentage increases with the number of nodes
      - Training benefits more from scaling than I/O does

  15. Outline: Objectives, DL Benchmarks at NERSC, Profiling Approaches, Experimental Results, Future Work

  16. Future Work
      - Integrate TRTMV results with TimeLogger data for better profiling of the highly parallelized I/O pipeline
      - Explore the I/O patterns and determine possible I/O bottlenecks in distributed TensorFlow
      - Develop an optimized cross-framework I/O strategy to overcome the possible I/O bottlenecks

  17. Thank You
