3 babblings from bent
1. grind-crunch
2. from 2 to 4 and back again
3. to share or not to share
John Bent, Dagstuhl 2017, Seagate Gov
grind-crunch
John Bent
bent’s super simplistic understanding of hpc sims
[Figure: repeating timeline of phases: load, grind (compute cycles plus comm), crunch, ckpt]
Key observation: a grind (for strong-scaling apps) traverses all of memory.
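Because a strong-scaling grind touches all of memory, the following checkpoint must capture essentially the full memory footprint. A back-of-the-envelope sketch of what that costs (all figures here are hypothetical illustrations, not numbers from the talk):

```python
# Back-of-the-envelope checkpoint cost for a strong-scaling app.
# All numbers are hypothetical illustrations, not measurements.

mem_per_node_gb = 128        # memory the grind traverses (ckpt must save it all)
nodes = 1000
ckpt_bw_gbs = 500            # aggregate bandwidth of the checkpoint tier

ckpt_size_gb = mem_per_node_gb * nodes   # grind touched all of it
ckpt_time_s = ckpt_size_gb / ckpt_bw_gbs

print(f"checkpoint size: {ckpt_size_gb} GB")
print(f"checkpoint time: {ckpt_time_s:.0f} s")
```

The point of the sketch: checkpoint size scales with total memory, so checkpoint time is governed by the ratio of machine memory to storage bandwidth.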
one more dollar test
from 2 to 4 and back again
Bent, Settlemyer, Grider
HPC Storage Stack, 2002-2015:
  Tightly Coupled Parallel Application
    ↓ concurrent, unaligned, interspersed IO
  Parallel File System
    ↓ concurrent, aligned, interleaved IO
  Tape Archive
whence burst buffer? (Graph and analysis courtesy of Gary Grider)
HPC Storage Stack, 2015-2016:
  Tightly Coupled Parallel Application
  Burst Buffer
  Parallel File System
  Tape Archive
whence object? [Chart: post-2016 HPC cold data requirements, MarFS]
HPC Storage Stack, 2016-2020:
  Tightly Coupled Parallel Application
  Burst Buffer
  Parallel File System
  Object Store
  Tape Archive
four is too many
HPC Storage Stack, 2020-:
  Tightly Coupled Parallel Application
  Burst Buffer
  Object Store
why not one?
the number is 2
Two physical tiers, that is; there may be many logical tiers, and there should be a human in the loop to make the difficult placement decisions. One storage system is focused on performance, one on durability: the durable one should be site-wide, while the performance one can be machine-local.
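A minimal sketch of the two-tier split the slide argues for. The tier names and the placement rule are hypothetical illustrations, not the talk's actual systems:

```python
# Hypothetical two-tier placement policy: a fast machine-local tier for
# performance, a site-wide durable tier for data that must survive.
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    site_wide: bool
    durable: bool

PERF = Tier("burst-buffer", site_wide=False, durable=False)      # assumed name
DURABLE = Tier("object-store", site_wide=True, durable=True)     # assumed name

def place(is_checkpoint: bool, keep_days: float) -> Tier:
    """Route short-lived checkpoint traffic to the fast tier;
    anything that must outlive the job goes to the durable tier."""
    if is_checkpoint and keep_days < 1:
        return PERF
    return DURABLE

print(place(True, 0.5).name)   # short-lived checkpoint -> fast tier
print(place(False, 365).name)  # long-lived dataset -> durable tier
```

In practice this routing is exactly the "difficult decision" the slide wants a human in the loop for; a one-line threshold only stands in for that judgment.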
from 2 to 4 and back again
2002-2015: Tightly Coupled Parallel Application → Parallel File System → Tape Archive
2015-2016: Tightly Coupled Parallel Application → Burst Buffer → Parallel File System → Tape Archive
2016-2020: Tightly Coupled Parallel Application → Burst Buffer → Parallel File System → Object Store → Tape Archive
2020-: Tightly Coupled Parallel Application → Burst Buffer → Object Store
(concurrent, unaligned, interspersed IO into the top storage tier; concurrent, aligned, interleaved IO into the bottom)
to share or not to share: a comparison of burst buffer architectures
Bent, Settlemyer, Cao
three places to add burst buffers
[Figure: compute-node (CN) array with node-private burst buffers]
private, e.g. Cray/Intel Aurora @ Argonne
three places to add burst buffers
[Figure: compute-node (CN) array with dedicated shared burst buffer nodes]
shared, e.g. Cray Trinity @ LANL
three places to add burst buffers
[Figure: compute-node (CN) array with burst buffers embedded in the storage system]
embedded, e.g. Seagate Nytro NXD
private
+ no contention
+ linear scaling
+ low cost (no network bandwidth needed)
- coupled failure domain
- single shared file is difficult
- small jobs cannot use them all
shared
+ N-1 easy
+ data can outlive job
+ temporary storage even if PFS is offline
+ small jobs can use it all
+ decoupled failure domain
+ most flexible ratio between compute, burst, and PFS
- most expensive
- interference possible
embedded
+ N-1 easy
+ data outlives job
+ small jobs can use it all
+ failure domain decoupled from app
+ low cost
+ most transparent
- SAN must be provisioned for burst
- interference possible
the value of decoupled failure domains
[Figure: protection schemes, e.g. 2-way replica, RAID6 (10+2)]
Observations: a shared burst buffer doesn’t need parity; a private one does ... but then the high private performance is lost.
Bent, Settlemyer, et al. On the non-suitability of non-volatility. HotStorage ’15.
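To make the protection cost concrete, a quick comparison of capacity overheads for the two schemes the slide names (a sketch; the slide itself gives only the scheme names, not these percentages):

```python
# Capacity overhead of the protection schemes mentioned on the slide.
def overhead(data_units: int, redundant_units: int) -> float:
    """Fraction of raw capacity spent on redundancy."""
    return redundant_units / (data_units + redundant_units)

# 2-way replica: every data unit has one full copy.
replica2 = overhead(1, 1)
# RAID6 (10+2): ten data units protected by two parity units.
raid6 = overhead(10, 2)

print(f"2-way replica overhead: {replica2:.0%}")
print(f"RAID6 (10+2) overhead:  {raid6:.1%}")
```

Half of a replicated device's raw capacity goes to copies, versus one sixth for RAID6 (10+2); the catch, per the slide, is that parity math on the write path is what costs the private burst buffer its performance edge.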
the value of shared for bandwidth
Mean checkpoint bandwidth (simulation of APEX workflows running on Trinity):
  Local, unreliable:  206.8 GB/s
  Local, 20% parity:  165.6 GB/s
  Shared, unreliable: 614.54 GB/s
Observation: capacity machines need shared burst buffers.
Lei Cao, Bradley Settlemyer, and John Bent. To share or not to share: Comparing burst buffer architectures. SpringSim 2017.
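One intuition behind the observation, as a toy model (this is not the paper's APEX simulation, and all numbers are hypothetical): a capacity-style job running on a fraction of the machine's nodes can still drive the whole shared burst buffer array, whereas with node-local burst buffers it gets only the bandwidth of its own nodes.

```python
# Toy model: checkpoint bandwidth available to one job, local vs shared
# burst buffers. Hypothetical numbers; not the paper's simulation.

total_nodes = 9500               # machine-scale node count (illustrative)
local_bw_per_node_gbs = 0.2      # per-node local burst-buffer bandwidth (illustrative)
shared_bb_bw_gbs = 1400.0        # aggregate shared burst-buffer bandwidth (illustrative)

def job_bw(job_nodes: int, shared: bool) -> float:
    """Bandwidth a job of `job_nodes` nodes can drive into the burst buffer."""
    if shared:
        # the shared array is sized for the machine; a small job can use it all
        return shared_bb_bw_gbs
    # local: only the job's own node-local devices contribute
    return job_nodes * local_bw_per_node_gbs

small_job = 500                  # a capacity-style job on ~5% of the machine
print(f"local:  {job_bw(small_job, shared=False):.0f} GB/s")
print(f"shared: {job_bw(small_job, shared=True):.0f} GB/s")
```

The model ignores contention: under concurrent jobs the shared tier's bandwidth is divided among them (the "interference possible" con from the comparison), so the advantage shrinks as the machine fills with checkpointing jobs.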
babblings from bent @ Dagstuhl (blame the jet lag)
john.bent@seagategov.com