Accelerating Tandem MS Protein Database Searches Using OpenCL

Jan 13, 2024


Rick Weber, David D. Jenkins, Nicholas Lineback, Robert Hettich, Gregory D. Peterson
Accelerating Tandem MS Protein Database Searches Using OpenCL
Programming devices the intractable way
Programming devices with OpenCL

Tandem MS/MS experiment
- Collect a sample
- Clean it
  - Try to remove things that aren't proteins
- Digest proteins into peptides
  - Trypsin
- Shoot mixture through mass spectrometer
- Mass spectrometer gives ~100k scans containing m/z values and intensities
Peptide searching with database

Search algorithms
- Mostly differ in the scoring algorithm
  - Consequently, different execution rates
- Sequest
  - Cross correlation
  - Most widely used
- X! Tandem
  - Dot product
- Myrimatch
  - Multi-Variate Hypergeometric (MVH) distribution
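
The multivariate hypergeometric distribution named above gives the probability of drawing a given count from each of several classes when sampling without replacement. The following is a minimal sketch of that probability and a negative-log score built on it. It is not Myrimatch's or Specmaster's actual scoring code; the function names and the example class sizes are illustrative assumptions.

```python
from math import comb, log

def mvh_probability(class_sizes, class_draws):
    """Probability of drawing class_draws[i] items from class i, without replacement."""
    if len(class_sizes) != len(class_draws):
        raise ValueError("one draw count per class is required")
    total = sum(class_sizes)
    drawn = sum(class_draws)
    numerator = 1
    for size, k in zip(class_sizes, class_draws):
        numerator *= comb(size, k)  # comb() is 0 when k > size
    return numerator / comb(total, drawn)

def mvh_score(class_sizes, class_draws):
    """Negative log probability: less likely outcomes give larger scores."""
    p = mvh_probability(class_sizes, class_draws)
    return float("inf") if p == 0 else -log(p)

# Hypothetical example: three classes of sizes 2, 3 and 5, one draw from each.
print(mvh_probability([2, 3, 5], [1, 1, 1]))
```

With classes of size 2, 3 and 5 and one item drawn from each, the numerator is 2 * 3 * 5 = 30 and the denominator is C(10, 3) = 120, so the probability is 0.25.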

Specmaster
- OpenCL Myrimatch implementation
- Runs correctly on AMD and Nvidia GPUs and on AMD and Intel CPUs
  - Not tested on anything else
- Designed from the ground up for speed
- Myrimatch is already multi-threaded
  - No 400x speedup using GPU
  - 10x is more reasonable

Algorithm design
- Make peptides from proteins sequentially on CPU
  - Needs to be done in OpenCL (future work)
  - Amdahl's law
- Perform search using OpenCL devices
  - Each workgroup processes a different MS2+ scan
  - Each work item searches a different candidate
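
Amdahl's law explains why the sequential peptide generation step matters: whatever fraction of the run stays serial bounds the overall speedup, no matter how fast the devices are. A small sketch of the formula follows; the 90% parallel fraction is an illustrative assumption, not a measurement from the slides.

```python
def amdahl_speedup(parallel_fraction, parallel_speedup):
    """Overall speedup when only parallel_fraction of the runtime is accelerated."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if parallel_speedup <= 0:
        raise ValueError("parallel_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)

# Assumed split: 90% of the time is in the search, 10% in peptide generation.
for device_speedup in (10, 100, 1000):
    print(device_speedup, round(amdahl_speedup(0.9, device_speedup), 2))
```

Under that assumed split, the overall speedup can never reach 10x, however fast the search kernel becomes.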

Search
- Binary search for candidates
  - Precursor masses within tolerance for the assumed charge state
- Binary search for ions
  - Look for peaks theoretically predicted for the peptide's amino acids in multiple charge states
- Compute MVH as a function of the number of found peaks by intensity class
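
The two binary searches above can be sketched on the host side as follows. This is a minimal illustration, assuming the candidate peptide masses and the scan's peak m/z values are each held in a sorted list; the function names and the absolute (Dalton) tolerance are assumptions, and the real kernels run on the device rather than in Python.

```python
from bisect import bisect_left, bisect_right

PROTON_MASS = 1.007276  # Da, approximate

def neutral_mass(precursor_mz, charge):
    """Neutral mass implied by a precursor m/z at an assumed charge state."""
    return (precursor_mz - PROTON_MASS) * charge

def candidate_range(sorted_peptide_masses, precursor_mz, charge, tolerance):
    """Half-open index range [lo, hi) of peptides within tolerance of the precursor."""
    target = neutral_mass(precursor_mz, charge)
    lo = bisect_left(sorted_peptide_masses, target - tolerance)
    hi = bisect_right(sorted_peptide_masses, target + tolerance)
    return lo, hi

def has_peak(sorted_peak_mzs, predicted_mz, tolerance):
    """True if any observed peak lies within tolerance of a predicted ion m/z."""
    i = bisect_left(sorted_peak_mzs, predicted_mz - tolerance)
    return i < len(sorted_peak_mzs) and sorted_peak_mzs[i] <= predicted_mz + tolerance

# Hypothetical data: a doubly charged precursor whose neutral mass is about 1000 Da.
masses = [500.0, 999.7, 1000.0, 1000.4, 1500.0]
print(candidate_range(masses, 501.007276, 2, 0.5))
```

Both lookups cost O(log n) per query, which is what makes it practical to give each work item its own candidate.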

OpenCL and the lack of free lunch
- Little performance portability
- Different devices have:
  - Different memories
  - Different SIMD sizes
  - Different branch penalties
  - Different execution models

Memory speeds

Device        __constant   __local    __global (cached)   __global (raw)
E5-2680       518 GB/s     425 GB/s   469 GB/s            51 GB/s
GTX 480       1.29 TB/s    1.3 TB/s   588 GB/s            152 GB/s
Radeon 7970   7 TB/s       3.6 TB/s   1.7 TB/s            213 GB/s

Preferred work group sizes
- CPU: 1
- AMD GPU: multiple of 32 or 64
- Nvidia GPU: multiple of 32 or 64

Performance (as of time of publication)

"Future work" already completed
- Portable device-specific tuning
  - Still running with the same kernel code on all devices!
  - Preprocessor abuse
  - Kernel apathetic to work group size
- Heterogeneous scan scoring
  - Use every device in the system to score
  - Up to 90% of peak strong-scaled throughput using 32 cores and 3 Radeon 7970s
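
The slides do not show how the tuning is done, so the following is only one plausible reading of "preprocessor abuse" and a kernel that is "apathetic to work group size": the host passes per-device constants as `-D` build options, and each work item strides over the candidates by the group size. The macro names WG_SIZE and SIMD_WIDTH and the helper functions are hypothetical.

```python
def build_options(work_group_size, simd_width):
    """Compiler options string for one device; macro names are hypothetical."""
    return f"-D WG_SIZE={work_group_size} -D SIMD_WIDTH={simd_width}"

def candidates_for_item(local_id, work_group_size, num_candidates):
    """Indices one work item would visit in a loop of the form
    for (i = local_id; i < num_candidates; i += WG_SIZE)."""
    return list(range(local_id, num_candidates, work_group_size))

def covered(work_group_size, num_candidates):
    """All candidate indices visited by one work group, sorted."""
    seen = []
    for local_id in range(work_group_size):
        seen.extend(candidates_for_item(local_id, work_group_size, num_candidates))
    return sorted(seen)

# The same loop covers every candidate exactly once for any group size.
for wg in (1, 32, 64):
    assert covered(wg, 100) == list(range(100))
print(build_options(64, 4))
```

Because the stride is a compile-time constant, the same kernel source can be rebuilt for a CPU with a group size of 1 or a GPU with a group size of 64 without editing the kernel.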

Actual future work
- Post-translational modifications
  - When generating peptides, create each modified variant of the peptides on the CPU
    - Easy (don't need to modify kernels)
    - Probably slow
  - Take the existing unmodified list and modify on the fly on the device
    - Hard due to lack of recursion in OpenCL
    - Amortizes sequential execution and PCIe transfers

Acknowledgements
- The University of Tennessee
- NSF and SCALE-IT
- Intel
  - For donating the research machine

Questions?
