Benchmarking HDF5 Compression Filters in R Mike L. Smith - PowerPoint PPT Presentation

Oct 25, 2022

Benchmarking HDF5 Compression Filters in R Mike L. Smith @grimbough
HDF5 is a file format for storing large, heterogeneous data
● Used in a variety of software, e.g.:
  ○ DelayedArray
  ○ Kallisto
  ○ ONT sequencing
  ○ mz5 mass spec file
● Interfaces in many languages
  ○ C, Python, …
  ○ rhdf5 & Rhdf5lib
● Key features:
  ○ Hierarchical
  ○ Self describing
  ○ Efficient subsetting
  ○ Compressed
http://neondataskills.org/HDF5/About
HDF5 datasets are not contiguous, but stored in chunks
Chunks are stored separately on disk
Only read the chunks needed for a subset
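As a rough illustration of why chunking makes subsetting cheap, the sketch below works out which chunks overlap a rectangular selection. It is not HDF5 or rhdf5 code: the function name `chunks_for_subset`, the 1000 x 1000 dataset and the 100 x 100 chunk shape are all assumptions made up for this example.

```python
from itertools import product


def chunks_for_subset(chunk_shape, start, stop):
    """Return grid indices of the chunks that overlap the half-open box [start, stop).

    Hypothetical helper for illustration only; a real HDF5 library does this internally.
    """
    ranges = []
    for size, lo, hi in zip(chunk_shape, start, stop):
        if hi <= lo:
            return []  # an empty selection touches no chunks
        first = lo // size        # chunk holding the first selected element
        last = (hi - 1) // size   # chunk holding the last selected element
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# Assumed layout: a 1000 x 1000 dataset in 100 x 100 chunks, i.e. 100 chunks in total.
needed = chunks_for_subset((100, 100), start=(0, 250), stop=(150, 320))
print(needed)                      # [(0, 2), (0, 3), (1, 2), (1, 3)]
print(len(needed), "of 100 chunks")
```

Under these assumptions, reading rows 0 to 149 and columns 250 to 319 touches only 4 of the 100 chunks; the remaining chunks never need to be read or decompressed.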
Chunks can be processed by filters - usually for compression
[Figure: filter pipeline for a data chunk. Writing: in memory → Shuffle → GZIP compress → storage on disk. Reading: on disk → GZIP decompress → Unshuffle → in memory.]
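The write and read pipelines can be sketched in a few lines. This is a minimal illustration using Python's standard library rather than HDF5 itself: `shuffle`, `unshuffle`, `write_chunk` and `read_chunk` are hypothetical names, and `zlib` stands in for the GZIP filter because it implements the same DEFLATE algorithm.

```python
import struct
import zlib


def shuffle(buf: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(buf[i::itemsize] for i in range(itemsize))


def unshuffle(buf: bytes, itemsize: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into elements."""
    n = len(buf) // itemsize
    out = bytearray(len(buf))
    for i in range(itemsize):
        out[i::itemsize] = buf[i * n:(i + 1) * n]
    return bytes(out)


def write_chunk(chunk: bytes, itemsize: int, level: int = 6) -> bytes:
    """Writing: shuffle first, then compress."""
    return zlib.compress(shuffle(chunk, itemsize), level)


def read_chunk(stored: bytes, itemsize: int) -> bytes:
    """Reading: decompress first, then unshuffle."""
    return unshuffle(zlib.decompress(stored), itemsize)


# Assumed example chunk: 1000 little-endian 32-bit integers.
chunk = struct.pack("<1000i", *range(1000))
stored = write_chunk(chunk, itemsize=4)
assert read_chunk(stored, itemsize=4) == chunk  # the round trip is lossless
print("raw bytes:", len(chunk), "stored bytes:", len(stored))
```

Shuffling tends to help with numeric data because the high-order bytes of neighbouring values are often identical, which gives the compressor long runs to work with.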
There are a number of compression filters available
● Internal filters
  ○ HDF5 ships with support for GZIP and SZIP
● Dynamic filters
  ○ Third party tools can be made available at runtime
  ○ Wrap existing compression tool in small amount of C code
  ○ Provide location to HDF5 and they are loaded when required
  ○ Independent of the application(s) using them
rhdf5filters provides additional filters in R
● BLOSC meta compressor
● BZIP2
● Compiles C code on all platforms, including Windows
● Integrated with rhdf5
  ○ Writing: Supply argument to function
  ○ Reading: Used automatically if needed
● msmith.de/rhdf5filters/
Filters & parameters have been benchmarked
[Figure panels: Reading Time, Writing Time, File Size]
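To make the three measurements concrete, here is a small sketch that times compression, times decompression and records the output size for a few settings. It is not the benchmark code behind the talk: it uses Python's standard `zlib` and `bz2` modules on in-memory bytes as stand-ins for the GZIP and BZIP2 filters, and the `benchmark` helper and the synthetic data are assumptions for illustration.

```python
import bz2
import time
import zlib


def benchmark(name, compress, decompress, data):
    """Measure write time, read time and stored size for one filter setting."""
    t0 = time.perf_counter()
    stored = compress(data)
    t1 = time.perf_counter()
    restored = decompress(stored)
    t2 = time.perf_counter()
    assert restored == data  # a lossless filter must return the input unchanged
    return {
        "filter": name,
        "write_s": t1 - t0,
        "read_s": t2 - t1,
        "size_bytes": len(stored),
    }


# Assumed test data: 1 MiB of synthetic, highly repetitive bytes.
data = bytes(range(256)) * 4096

candidates = [
    ("gzip level 1", lambda b: zlib.compress(b, 1), zlib.decompress),
    ("gzip level 9", lambda b: zlib.compress(b, 9), zlib.decompress),
    ("bzip2 level 9", lambda b: bz2.compress(b, 9), bz2.decompress),
]

results = [benchmark(n, c, d, data) for n, c, d in candidates]
for r in results:
    print(f'{r["filter"]:<14} write {r["write_s"]:.4f}s  '
          f'read {r["read_s"]:.4f}s  size {r["size_bytes"]} bytes')
```

Timings from a sketch like this depend heavily on the data and the machine, so real comparisons are better made on representative datasets, with repeated runs, and through the actual HDF5 read and write path.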
You can explore the results with a shiny app
● msmith.de/rhdf5filters-benchmarks
● Scripts to run benchmarks also available
● Grateful for any contributions on both style and substance!
Thanks to EMBL Huber Lab & BioC community! msmith.de/rhdf5filters-benchmarks @grimbough
