NERSC User Group SIG: Experimental Facilities
Bryce Foster
2020-07-15
Agenda
● JGI Data Factory
● JGI's Pipelines
● NERSC Usage
● Challenges
JGI Overview
● JGI's mission: provide the scientific community at large with access to high-throughput, high-quality sequencing, DNA synthesis, metabolomics, and analysis capabilities
  – projects involve many important multicellular organisms, microbes, and communities of microbes called metagenomes, related to the DOE mission areas of bioenergy, global cycles such as the carbon cycle, and biogeochemistry
● JGI is a data factory responsible for delivering project data and analysis files to users
  – each product has a cycle-time requirement to keep data flowing
  – many projects involve multiple samples and take multiple years to process the data and provide all of the analyses
JGI - Data Factory
● Employees: ~280
● FY2020 budget: $77M
● For FY2019:
  – Users: 1,940
  – Active user proposals: 600
  – Active projects: 16,000
  – Samples: 24,000
  – Pipeline runs: 95,000 (RQC only)
● 30 active pipelines (the analysis packages for our users)
● Sequencers: 3 PacBio, 8 Illumina
● Partnerships: Lawrence Livermore National Laboratory, Oak Ridge National Laboratory
JGI Growth & Cycle Time
[Chart: sequencing output growth, reaching 225 Tbases]
● Projects and sequencing have grown steadily due to expanded capabilities and new sequencing technologies
● Products have expected cycle times
  – some users have priority projects that require a quick turnaround from sequencing to analysis
  – the synthetic biology group needs to know quickly whether the DNA sequence created is what the customer ordered
● JGI analysts run analyses daily to keep up with the demand
  – on average 200-300 pipeline runs per day
  – sequencers can produce 2 to 3 TB per run
Big Picture - Data Factory
[Diagram: data flow from pre-sequencing to post-sequencing - a researcher's proposal (PMOS) and metadata (GOLD) feed sequencing; SDM converts the sequencers' physical output to digital files; RQC filters, assembles, and QCs the sequencing reads and produces reports; IMG provides microbial, plant, and fungal annotations; researchers download files from the Genome Portals (Plant, Fungal) and Data Warehouse websites]
Pipeline Characteristics
● Sequence data is strings (ATGCGC…)
  – input sequence files are large (10 MB to 100 GB+); NovaSeq DNA sequencers run twice a week, produce 2 TB of sequence data per run, and can create 1000s of sequencing files as inputs to pipelines
  – output analysis folders can be 10 GB+
  – input and output data are archived to the tape system using HPSS
● Pipelines run from 5 minutes to more than 7 days
  – dependent on the pipeline, sample sizes, and product types
  – sequence data runs through 2 to 5 different pipelines
● Heavy disk I/O and high memory - use ProjectB and CScratch
  ● both are used to work around software bugs
  – loading input files or large databases into memory
  – difficult to predict memory requirements ahead of time
● Wide variety of node usage (see the submission sketch below)
  – many pipelines run on one node per analysis
  – some pipelines run on several nodes for one analysis
  – some run on a workflow node and manage parallel analyses on the cluster
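As a rough illustration of the node-usage pattern above, here is a minimal sketch of submitting one memory-heavy pipeline job to Cori's Slurm scheduler with a Shifter image. The image name, QOS, and resource values are hypothetical placeholders, not JGI's actual configuration:

```python
import subprocess

# Hypothetical batch script: QOS, image, memory, and walltime are
# illustrative, not JGI's real settings.
SBATCH_SCRIPT = """#!/bin/bash
#SBATCH --qos=genepool
#SBATCH --nodes=1
#SBATCH --mem=118G          # memory needs are hard to predict, so request high
#SBATCH --time=7-00:00:00   # long-running third-party tools
#SBATCH --image=docker:example/assembler:1.0

# Shifter runs the Docker image on the compute node
shifter assemble --in /global/cscratch1/sd/user/reads.fq --out out_dir
"""

def submit(script: str) -> str:
    """Pipe the batch script to sbatch on stdin and return its reply."""
    result = subprocess.run(
        ["sbatch"], input=script, text=True,
        capture_output=True, check=True,
    )
    return result.stdout.strip()  # e.g. "Submitted batch job 123456"

if __name__ == "__main__":
    print(submit(SBATCH_SCRIPT))
```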
Example Pipeline
● Input: sequencer file, 200 MB (short reads: 150 bases), e.g. ATTCGCCATGCAT ...
● Stages:
  – GC Analysis
  – Diamond Aligner
  – Identification (Blast+; runs the input against several other data sources, 1 MB to 300 GB)
  – Subsample (BBTools)
  – Assembly (SPAdes; runs in a Docker container, SPAdes/3.14.1)
  – Trim (BBTools)
  – GC vs Coverage (Blast+, Minimap2)
  – Quality Rating (CheckM, barrnap)
  – Tetramer Analysis (R, BBTools)
  – Reports (LaTeX)
● Each stage runs from 5 minutes to hours, with varying memory and CPU requirements
● Each stage is reading from or writing to the file system
● Wrapper code runs 2+ different Docker containers
● Entire pipeline is submitted to Cori as one job (Python with a conda environment); a minimal wrapper sketch follows
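A minimal sketch of what such a wrapper might look like: one job that runs each stage in its own Shifter container, in sequence. The stage names, images, and commands are hypothetical stand-ins for the real pipeline code:

```python
import subprocess
from pathlib import Path

# Hypothetical stage list: each stage is a shell command run inside a
# Shifter container, mirroring the "wrapper runs 2+ containers" pattern.
STAGES = [
    ("subsample", "docker:example/bbtools:38.90",
     "reformat.sh in=reads.fq out=sub.fq samplerate=0.1"),
    ("assembly", "docker:example/spades:3.14.1",
     "spades.py -s sub.fq -o asm"),
    ("report", "docker:example/latex:latest",
     "pdflatex report.tex"),
]

def run_pipeline(workdir: Path) -> None:
    """Run each stage in order, logging its output to the file system."""
    for name, image, cmd in STAGES:
        log = workdir / f"{name}.log"
        with open(log, "w") as fh:
            # shifter --image=... <cmd> runs the stage in its container
            subprocess.run(
                f"shifter --image={image} {cmd}",
                shell=True, cwd=workdir, stdout=fh,
                stderr=subprocess.STDOUT, check=True,
            )

if __name__ == "__main__":
    run_pipeline(Path("."))
```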
NERSC Usage
● Since 2011, JGI has depended on NERSC resources to run pipelines
● Almost all of JGI's analyses run on the Cori cluster
  – JGI has its own partition on Cori because JGI requires short queue wait times
  – JGI bought a high-memory partition (19 nodes, 1.5 TB) because some jobs need more than 128 GB of RAM
  – heavy usage of Shifter to run Docker images and conda environments
  – uses Cori's general partition for overflow capacity, with some KNL usage
    ● KNL is 3x to 5x slower than Haswell nodes
  – 80% of JGI's usage of Cori is annotation of DNA sequences (what genes are in the DNA)
● Disk usage
  – 5 PB of spinning disk (ProjectB, DNA, sandbox)
  – 20 PB of analysis files on tape (NERSC tape system, accessed via HSI; see the archiving sketch below)
● Consultants
  – JGI has 2 "full-time" consultant positions split among 3 NERSC staff
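For context on the tape usage above, a minimal sketch of archiving an analysis folder to the NERSC tape system with htar (a companion to hsi for bundling directories); the paths here are hypothetical:

```python
import subprocess

def archive_to_hpss(local_dir: str, tape_path: str) -> None:
    """Bundle an analysis folder and write it to HPSS as one tar file.

    htar creates the archive directly on the tape system, which avoids
    staging a large intermediate tarball on spinning disk.
    """
    subprocess.run(["htar", "-cvf", tape_path, local_dir], check=True)

# Hypothetical paths for illustration only.
archive_to_hpss("analysis/proj_12345", "/home/j/jgiuser/proj_12345.tar")
```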
NERSC & JGI Cluster Migration History
[Timeline: 2012 → 2014 → 2016 → 2018]
● "Clusters": each JGI group had a small cluster
● Genepool (UGE/SGE, 360 nodes, 128 GB to 1 TB memory): racks dedicated to JGI, with fairshare for each JGI group
● Denovo: the Genepool replacement; Genepool nodes were repurposed for Denovo, and it wasn't ready for several months
● Cori (Slurm, Shifter): new HPC for all of LBL; needed custom scheduler rules giving production users priority
● Cori Genepool: 192 nodes only for JGI
● Cori Ex Vivo: 19 high-memory nodes (1.5 TB, with local disk) for JGI's computing needs
Challenges working on NERSC Clusters
● "Weather" on Cori
  – regular problems with reading or writing files on the network file system (DVS)
  – slower pipeline throughput because there is no local disk
  – monthly maintenance can hold up analysis (see the sketch below)
    ● e.g., a job needs 5 days to run but maintenance is in 4 days, so it won't run until next week
  – wasted resources spent debugging and rerunning failures
[Chart: failure events over time, annotated with NERSC CLE upgrades and fixes]
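A small sketch of the scheduling arithmetic behind that maintenance example; the maintenance calendar and dates are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical maintenance calendar for illustration.
NEXT_MAINTENANCE = datetime(2020, 7, 19)
MAINTENANCE_DAYS = 1

def earliest_start(now: datetime, runtime: timedelta) -> datetime:
    """Return when a job can start so it finishes before maintenance.

    If the job cannot finish before the maintenance window opens, it
    has to wait until the window closes, adding days of cycle time.
    """
    if now + runtime <= NEXT_MAINTENANCE:
        return now  # fits before the outage
    return NEXT_MAINTENANCE + timedelta(days=MAINTENANCE_DAYS)

# A 5-day job submitted 4 days before maintenance is pushed past the outage.
start = earliest_start(datetime(2020, 7, 15), timedelta(days=5))
print(start)  # 2020-07-20: the job waits until after maintenance
```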
Run Time Experiment - Different Disk Systems
[Chart: pipeline run times compared across disk systems]
Haswell, KNL and Skylake CPU Comparison
[Chart: run times of commonly used pipelines on Haswell, KNL, and Skylake, using ProjectB (B) and CScratch (C); some run times are 0 because of a bug between third-party software and CScratch]
● Conclusions
  – Haswell and Skylake perform much better for JGI's pipelines
  – using ProjectB or CScratch has little effect on run time
Challenges with NERSC
● Retooling every few years for a new cluster
  – interruptions for installation (power work last weekend)
  – changed from using modules to using conda and Docker for software packages; our team uses more than 30 different third-party software packages
  – changed the cluster scheduler from SGE/UGE to Slurm
● Trying to use a cluster not ready for production
  – file system not mounted, or mounted read-only
  – scheduler not configured correctly, adding additional cycle time to analysis
● NERSC chasing high-performance systems isn't beneficial for us
  – what changes to the file system and nodes will require us to retool our pipelines again?
  – bioinformatic pipelines are not GPU-friendly, or require a lot of retooling
JGI's Wishlist for NERSC
● a stable mid-range compute environment dedicated to JGI's computing needs
● local disk on nodes, because I/O is much faster (25% faster run time)
● quarterly (or less frequent) maintenance, because interruptions affect product cycle time
● longer windows for computing (10+ days), because it is difficult to break up long-running third-party software
● nodes that can be used to create Docker containers at NERSC
● Benefits
  – spend less time retooling code for new clusters
  – spend more time doing analysis and creating new products for our customers
Discussion
● What do we need to do to make our code work at NERSC this time?

Acknowledgements:
● Alicia Clum
● Alex Copeland
● Christa Pennacchio
Hidden Slide
[Diagram: a user proposal contains projects (Project 1, Project 2, ...); each project is processed through stages 1, 2, 3, ..., n, using memory and the file system, producing products (Product 1, Product 2) and their output files]