Parallel IO


  1. Parallel IO These slides are possible thanks to these sources – Jonathan Dursi, SciNet Toronto Parallel I/O Tutorial; Argonne National Labs, HPC I/O for Computational Scientists; TACC/Cornell MPI/IO Tutorial; NERSC Lustre Notes; Quincey Koziol, HDF Group nci.org.au @NCInews

  2. References • eBook: High Performance Parallel I/O – Chapter 8: Lustre – Chapter 13: MPI/IO – Chapter 15: HDF5 • HPC I/O for Computational Scientists (YouTube); Slides • Parallel I/O Basics – paper • eBook: Memory Systems: Cache, DRAM, Disk – Bruce Jacob nci.org.au

  3. The Advent of Big Data • Big Data refers to datasets and data flows large enough to have outpaced our capability to store, process, analyze and understand them – Increases in computing power make simulations larger and more frequent – Increases in sensor resolution produce larger observational datasets • Data sizes that were once measured in MBs or GBs are now measured in TBs or PBs • It is easier to generate the data than to store it nci.org.au

  4. The Four V’s nci.org.au http://www.ibmbigdatahub.com/infographic/four-vs-big-data

  5. BIG DATA PROJECTS AT THE NCI nci.org.au

  6. ESA’s Sentinel Constellation – Sentinel-1 observation scenario and impact on data volumes • Sentinel-1’s systematic observation scenario, in one or two main high-rate modes of operation, results in significantly large acquisition segments (data takes of a few minutes) • 25 min in high-rate modes leads to about 2.4 TB of compressed raw IW data per day for the 2 satellites (16 GB for SLC, 4 GB for GRD-HR) • Wave Mode is operated continuously over ocean where the high-rate modes are not used (46 GB for SLC, 12 GB for GRD-HR) nci.org.au

  7. ESA’s Sentinel Constellation – Sentinel-1 observation scenario and impact on data volumes • As on the previous slide, the high-rate modes yield about 2.4 TB/day of compressed raw data for the 2 Sentinel-1 satellites • For comparison, the Sentinel-2s provide 1.6 TB/day (high-resolution optical land monitoring) and the Sentinel-3s provide 0.6 TB/day (land + marine observation) nci.org.au

  8. Nepal Earthquake Interferogram using Sentinel SAR Data nci.org.au

  9. Data Storage at NCI nci.org.au @NCInews

  10. Data Storage Subsystems at the NCI • The NCI compute and data environments allow researchers to work seamlessly with HPC and cloud-based compute cycles while having unified data storage • How is this done? nci.org.au

  11. NCI’s integrated high-performance environment [Diagram: Internet-facing Raijin login + data mover nodes, NCI data movers and a VMware cloud connect over 10 GigE (to the Huxley DC) and 56 Gb FDR IB fabrics to the storage systems: the persistent global parallel filesystems /g/data1 (~6.3 PB) and /g/data2 (~3.1 PB), the Raijin high-speed filesystem /short (7.6 PB) plus /home, /system, /images and /apps, and Massdata tape storage (1.0 PB cache, 12.3 PB tape)] nci.org.au

  12. HARDWARE TRENDS nci.org.au

  13. Disk and CPU Performance [Chart: disk bandwidth (MB/s) versus CPU performance (MIPS) over time] nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

  14. Disk and CPU Performance [Same chart, with a callout highlighting the roughly 1000x gap that has opened between CPU and disk performance] nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

  15. Memory and Storage Latency nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

  16. Assessing Storage Performance • Data Rate – MB/sec – Peak or sustained – Writes are faster than reads • IOPS – IO Operations Per Second – open(), close(), seek(), read(), write() nci.org.au

  17. Assessing Storage Performance • Data Rate – MB/sec – Peak or sustained – Writes are faster than reads • IOPS – IO Operations Per Second – open(), close(), seek(), read(), write() • Lab – measuring MB/s and IOPS nci.org.au

  18. Storage Performance • Data Rate – MB/sec – Peak or sustained – Writes are faster than reads • IOPS – IO Operations Per Second – open(), close(), seek(), read(), write()
  Device: SATA HDD – bandwidth 100 MB/s, 100 IOPS; SSD – bandwidth 250 MB/s, 10,000 IOPS
  HDD: Open, Write, Close 1000 x 1 kB files: 30.01 s (eff: 0.033 MB/s); Open, Write, Close 1 x 1 MB file: 40 ms (eff: 25 MB/s)
  nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf
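
The gap between those two effective rates is what the Lab on the previous slide measures. Below is a minimal sketch (not code from the slides) of one way to time it in C: the 1000 small files are dominated by per-operation cost (IOPS), the single large file by sustained data rate. File names and sizes are illustrative, and the OS page cache will inflate both numbers unless the writes are forced to disk (e.g. with fsync or O_DIRECT).

/*
 * iops_vs_bw.c -- time open/write/close of 1000 x 1 kB files versus
 * one 1 MB file, and report the effective MB/s of each case.
 * Build: cc -O2 iops_vs_bw.c -o iops_vs_bw
 */
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <string.h>
#include <time.h>

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
    char buf[1024];                           /* 1 kB of payload */
    memset(buf, 'x', sizeof(buf));

    /* Case 1: 1000 files of 1 kB each -- one open/write/close per file,
     * so the run is dominated by metadata operations and seeks (IOPS). */
    double t0 = now();
    for (int i = 0; i < 1000; i++) {
        char name[64];
        snprintf(name, sizeof(name), "small_%04d.dat", i);
        FILE *f = fopen(name, "w");
        if (!f) { perror("fopen"); return 1; }
        fwrite(buf, 1, sizeof(buf), f);
        fclose(f);
    }
    double t_small = now() - t0;

    /* Case 2: roughly the same amount of data written as one 1 MB
     * stream -- the run is dominated by the sustained data rate. */
    t0 = now();
    FILE *f = fopen("big.dat", "w");
    if (!f) { perror("fopen"); return 1; }
    for (int i = 0; i < 1024; i++)
        fwrite(buf, 1, sizeof(buf), f);
    fclose(f);
    double t_big = now() - t0;

    printf("1000 x 1kB files: %.3f s (eff: %.3f MB/s)\n",
           t_small, 1000.0 * sizeof(buf) / 1e6 / t_small);
    printf("1 x 1MB file:     %.3f s (eff: %.3f MB/s)\n",
           t_big, 1024.0 * sizeof(buf) / 1e6 / t_big);
    return 0;
}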

  19. Storage Performance • Data Rate – MB/sec – Peak or sustained – Writes are faster than reads • IOPS – IO Operations Per Second – open(), close(), seek(), read(), write() • SSDs are better at IOPS – no moving parts – latency is at the controller, system calls, etc. • SSDs are still very expensive; disk is here to stay!
  Device: SATA HDD – bandwidth 100 MB/s, 100 IOPS; SSD – bandwidth 250 MB/s, 10,000 IOPS
  SSD: Open, Write, Close 1000 x 1 kB files: 300 ms (eff: 3.3 MB/s); Open, Write, Close 1 x 1 MB file: 4 ms (eff: 232 MB/s)
  nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

  20. Storage Performance • Data Rate – MB/sec – Peak or sustained – Writes are faster than reads • IOPS – IO Operations Per Second – open(), close(), seek(), read(), write() • SSDs are better at IOPS – no moving parts – latency is at the controller, system calls, etc. • SSDs are still very expensive; disk is here to stay!
  Raijin /short – aggregate 150 GB/sec (writes), 120 GB/sec (reads). 5 DDN SFA12K arrays for /short, each capable of 1.3M read IOPS and 700,000 write IOPS, yielding a total of 6.5M read IOPS and 3.5M write IOPS.
  Device: SATA HDD – bandwidth 100 MB/s, 100 IOPS; SSD – bandwidth 250 MB/s, 10,000 IOPS
  SSD: Open, Write, Close 1000 x 1 kB files: 300 ms (eff: 3.3 MB/s); Open, Write, Close 1 x 1 MB file: 4 ms (eff: 232 MB/s)
  nci.org.au Jonathan Dursi https://support.scinet.utoronto.ca/wiki/images/3/3f/ParIO-HPCS2012.pdf

  21. The Linux Storage Stack Diagram [Diagram, version 4.10 (2017-03-10), outlining the Linux storage stack as of kernel 4.10: applications issue open(2)/read(2)/write(2)/stat(2)/mmap calls into the VFS; below it sit block-based filesystems (ext2/3/4, xfs, btrfs, ...), network filesystems (NFS, ceph, smbfs, ...), pseudo and special-purpose filesystems, stackable filesystems and FUSE, plus Direct I/O (O_DIRECT) bypassing the page cache; BIOs then pass through optional stacked devices (LVM/device mapper, mdraid, bcache, dm-*), the block layer with its I/O schedulers (noop, cfq, deadline) and the blk-mq multi-queue path, the SCSI mid layer and transport classes, and finally the low-level drivers for HDD, SSD, NVMe, SD/MMC and para-virtualized devices] nci.org.au
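
One path in that diagram worth calling out for parallel I/O is Direct I/O (O_DIRECT), which lets an application bypass the page cache and hand its buffers straight to the block layer. A minimal sketch follows (an illustration, not from the slides; it assumes Linux and a filesystem that honours O_DIRECT, and uses 4096-byte alignment as a typical requirement):

/*
 * direct_write.c -- open a file with O_DIRECT so the 1 MiB write below
 * skips the page cache shown in the diagram. O_DIRECT requires the
 * buffer, offset and length to be suitably aligned (4096 B assumed here).
 * Build: cc -O2 direct_write.c -o direct_write
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t align = 4096;
    const size_t len   = 1 << 20;            /* 1 MiB, a multiple of align */
    void *buf;

    if (posix_memalign(&buf, align, len) != 0) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0, len);

    int fd = open("direct.dat", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {
        perror("open(O_DIRECT)");            /* some filesystems reject it */
        free(buf);
        return 1;
    }

    /* One aligned 1 MiB write: the VFS passes it down as BIOs without
     * staging the data in the page cache. */
    if (write(fd, buf, len) != (ssize_t)len)
        perror("write");

    close(fd);
    free(buf);
    return 0;
}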
