
Analysis of High-Throughput Biological Data Part I: Scalable High Performance Algorithms and Implementations


  1. Analysis of High-Throughput Biological Data Part I: Scalable High Performance Algorithms and Implementations. Mike Langston, Professor, Department of Electrical Engineering and Computer Science, University of Tennessee, and Collaborating Scientist, Biological Sciences Division, Oak Ridge National Laboratory, USA. NZIMA, Napier, 21 February 2008.

  2. Outline of Talk: Sample Application; Tools and Technologies; Complexity Theory; Graph Algorithms; High Performance Computation; Reconfigurable Computation; Compute Engine; Problem Variants.

  4. Technology Mapping. Biological Knowledge: Protein Structure, Cis-Regulatory Elements, Sequence Homology, Protein Function, Cell Physiology, and so on. Analysis Tools: Ontology, Gene Regulatory Networks, Quantitative Trait Loci, Combinatorial Algorithms, Bayesian Networks, and so on.

  7. Gene Regulatory Networks. Central dogma: one gene, one protein. Cis regulation: regulation via cis-regulatory elements (CREs) such as promoter, TATA box, motifs, modules; 8-15 bp in length, action often at the ends. [Diagram: Gene1 to Gene4, each with its CREs]

  8. Gene Regulatory Networks. Trans regulation (direct) via gene products, which up- or down-regulate mRNA expression. [Diagram: Gene1 to Gene4; transcription, mRNA, translation, protein, transcription factor]

  9. Gene Regulatory Networks. Trans regulation (indirect) via post-translational modification, which up- or down-regulates mRNA expression. [Diagram: Gene1 to Gene4; transcription, mRNA, translation, protein, kinase, phosphorylation, transcription factor]

  10. Gene Regulatory Networks. Many other network actions: post-transcriptional regulation (e.g., alternative splicing); μRNA (e.g., functional RNA, RNAi and gene silencing). But all are forms of co-regulation. [Diagram: Gene1 to Gene4; transcription, mRNA, translation, protein, kinase, phosphorylation, transcription factor; up- or down-regulation of mRNA expression]

  11. Currently Awash in a Sea of Transcriptomic Data. An organism’s mRNA transcripts: • link between the genome, the proteome and the cellular phenotype • data quality and richness increasing (noise reduction; more conditions; correlation, putative coregulation, regulatory networks) • cannot see post-translational modifications (e.g., phosphorylation) • huge range of prokaryotic and eukaryotic data coming online • timely confluence of technologies • proteomics, metabolomics data not far behind

  12. A Major Computational Bottleneck: Clique. Data transformation: • representing biological networks with graphs is well understood • genes (via transcripts, probesets) are denoted by vertices • edges denote significant gene-gene correlations • we seek genesets with common regulatory mechanisms • thus we want to identify dense subgraphs, in particular cliques: complete subgraphs (e.g., $K_4$); a special case of subgraph isomorphism; NP-complete to decide; NP-hard even to approximate
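
The data transformation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the function names, the use of absolute Pearson correlation, the threshold value, and the toy expression values are all assumptions made for the example. Clique enumeration uses the textbook Bron-Kerbosch procedure with pivoting, which takes exponential time in the worst case, consistent with the hardness noted above.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_graph(expr, threshold):
    """Vertices are genes; an edge joins two genes whose |correlation|
    meets the threshold (the threshold is an assumed cutoff)."""
    adj = {g: set() for g in expr}
    for g, h in combinations(expr, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy data: four genes measured under five conditions.
toy = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 4, 3, 2, 1],
    "g4": [3, 1, 4, 1, 5],
}
cliques = maximal_cliques(correlation_graph(toy, threshold=0.9))
print(cliques)
```

In this toy input, g1, g2 and g3 are pairwise strongly (anti-)correlated and form a clique, while g4 stays isolated. Whether anti-correlation should count as an edge is a modelling choice the slide does not settle.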

  13. Outline of Talk: Sample Application; Tools and Technologies; Complexity Theory; Graph Algorithms; High Performance Computation; Reconfigurable Computation; Compute Engine; Problem Variants.

  14. Tools and Technologies. [Diagram: four areas surrounding Clique, flanked by Intellectual Property and Available Technologies. COMPLEXITY THEORY: Problem Classification, Algorithm Selection. PARALLELISM AND GRIDS: Speedup, Collaboration. GRAPH ALGORITHMS: Modeling, Optimization. RECONFIGURATION: Hardware Acceleration, Fast Prototyping.]

  15. Tools and Technologies. [Diagram: COMPLEXITY THEORY, PARALLELISM AND GRIDS, GRAPH ALGORITHMS and RECONFIGURATION surrounding Clique, flanked by Intellectual Property and Available Technologies; FIXED-PARAMETER TRACTABILITY added]

  16. A Little Complexity Theory. The Classic View. [Diagram: complexity classes P, NP, Σ₂P, …, PSPACE; label “easy”]

  17. A Little Complexity Theory. The Classic View. [Diagram: complexity classes P, NP, Σ₂P, …, PSPACE; labels “easy” and “hard”]

  18. A Little Complexity Theory. The Classic View. [Diagram: complexity classes P, NP, Σ₂P, …, PSPACE; labels “easy”, “hard” and “fuggettaboutit”]

  19. Fixed-Parameter Tractability. Pioneering approach going back twenty years: Well-Quasi-Order theory; nonuniform measure of complexity.

  20. Fixed-Parameter Tractability. Pioneering approach going back twenty years: Well-Quasi-Order theory; nonuniform measure of complexity. Exploit knowledge of the solution space: consider an algorithm with a time bound such as $O(2^{kn})$, and now one with a time bound more like $O(2^k n)$. Both are exponential in the parameter value(s), but what happens when $k$ is fixed? A problem is Fixed-Parameter Tractable (FPT) iff it can be solved in $O(f(k)\, n^c)$ time, where $c$ is a constant independent of $k$. This confines superpolynomial behavior to the parameter.
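
The question "what happens when k is fixed?" can be made concrete with a little arithmetic. The sketch below assumes the two bounds on the slide read $O(2^{kn})$ and $O(2^k n)$; the function names and the chosen values of k and n are illustrative only, and constant factors are ignored.

```python
def steps_exponent_kn(k, n):
    """Step count for a bound of the form 2^(k*n): n sits in the exponent."""
    return 2 ** (k * n)


def steps_fpt(k, n):
    """Step count for a bound of the form 2^k * n: only k sits in the exponent."""
    return (2 ** k) * n


k = 10  # fixed parameter value (assumed for illustration)
for n in (10, 100, 1000):
    fpt = steps_fpt(k, n)
    digits = len(str(steps_exponent_kn(k, n)))
    # With k fixed, the FPT-style count grows linearly in n, while the other
    # count gains digits in proportion to n.
    print(f"n={n}: 2^k * n = {fpt}, 2^(k*n) has {digits} decimal digits")
```

With k held at 10, doubling n doubles the second count but squares the first, which is the sense in which the FPT form confines the superpolynomial behavior to the parameter.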

  21. Complexity Theory, Refined. Hence, the Parameterized View. [Diagram: parameterized classes FPT, W[1], W[2], …, XP; label “solvable” (even if NP-hard!)]

  22. Complexity Theory, Refined. The Parameterized View. [Diagram: parameterized classes FPT, W[1], W[2], …, XP; labels “solvable” (even if NP-hard!) and “heuristics only”]

  23. Complexity Theory, Refined. The Parameterized View. [Diagram: parameterized classes FPT, W[1], W[2], …, XP; labels “solvable” (even if NP-hard!), “heuristics only” and “fuggettaboutit”]

  24. Tools and Technologies. [Diagram: COMPLEXITY THEORY, PARALLELISM AND GRIDS, GRAPH ALGORITHMS and RECONFIGURATION surrounding Clique, flanked by Intellectual Property and Available Technologies; FIXED-PARAMETER TRACTABILITY shown, VERTEX COVER added]

  25. The Vertex Cover Project. Pioneering approach going back twenty years: Well-Quasi-Order theory; nonuniform measure of complexity. Exploit knowledge of the solution space: consider an algorithm with a time bound such as $O(2^{kn})$, and now one with a time bound more like $O(2^k n)$. Both are exponential in the parameter value(s), but what happens when $k$ is fixed? A problem is Fixed-Parameter Tractable (FPT) iff it can be solved in $O(f(k)\, n^c)$ time, which confines superpolynomial behavior to the parameter. Duality: we solve vertex cover, clique’s complementary dual (a clique in $G$ corresponds to the vertices left outside a vertex cover of the complement $\bar{G}$), in $O(1.2759^k k^{1.5} + kn)$ time. Key features: kernelization, branching and interleaving.
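
The duality and the key features named on this slide can be illustrated with a small Python sketch. It is deliberately a textbook version, assuming a simple undirected graph given as an edge list: a high-degree kernelization rule (often attributed to Buss) combined with the basic two-way branching on an edge, giving roughly $O(2^k)$ branches rather than the $O(1.2759^k k^{1.5} + kn)$ bound quoted above. All function names are hypothetical, and re-applying the kernel rule inside every recursive call is only a simple stand-in for interleaving.

```python
from itertools import combinations


def complement(vertices, edges):
    """Edges of the complement graph on the same vertex set."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(sorted(vertices), 2)
            if frozenset((u, v)) not in present]


def vertex_cover(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists."""
    edges = [tuple(e) for e in edges]
    if not edges:
        return set()
    if k == 0:
        return None
    # Kernelization: a vertex of degree > k must belong to every cover of
    # size <= k, otherwise all of its neighbours would be needed.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    high = next((v for v, d in degree.items() if d > k), None)
    if high is not None:
        sub = vertex_cover([e for e in edges if high not in e], k - 1)
        return None if sub is None else sub | {high}
    # Every degree is now <= k, so k vertices cover at most k*k edges.
    if len(edges) > k * k:
        return None
    # Branching: one endpoint of any remaining edge must be in the cover.
    u, v = edges[0]
    for pick in (u, v):
        sub = vertex_cover([e for e in edges if pick not in e], k - 1)
        if sub is not None:
            return sub | {pick}
    return None


def has_clique(vertices, edges, size):
    """G has a clique of the given size iff the complement of G has a
    vertex cover with at most |V| - size vertices."""
    k = len(vertices) - size
    if k < 0:
        return False
    return vertex_cover(complement(vertices, edges), k) is not None


# Hypothetical example: a triangle 1-2-3 with a tail 3-4-5.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
print(has_clique(V, E, 3), has_clique(V, E, 4))
```

Working on the complement is attractive when the cliques sought are large relative to the graph, because the parameter of the vertex cover instance, $|V|$ minus the clique size, is then small.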
