Bayesian Decomposition Michael Ochs Fox Chase Cancer Center Bioinformatics Fox Chase Cancer Center
Making Proteins
A Closer Look at Translation
[Figure labels: Post-Translational Modification, RNA Splicing, miRNA]
Identifying Pathways
[Figure: pathway diagram with components A, B, C, D and elements numbered 1 to 3; image source: www.promega.com]
Goal of Analysis
Take measurements of thousands of genes, some of which are responding to stimuli of interest, and find the correct set of basis vectors that link to pathways; then identify the pathways.
BD: Matrix Decomposition
Data (genes 1..N × conditions 1..M) = Distribution of Patterns (genes 1..N × patterns 1..k) × Patterns of Behavior (patterns 1..k × conditions 1..M)
Data with different behaviors: the behavior of one gene can be explained as a mixture of patterns.
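The shapes in this decomposition can be illustrated with a small NumPy sketch. The sizes and the random values are hypothetical and chosen only for illustration; the names `A`, `P`, and `D` are assumptions, not notation taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N genes, M conditions, k patterns.
n_genes, n_conditions, n_patterns = 8, 5, 2

# Distribution of patterns: how strongly each gene uses each pattern (N x k).
A = rng.uniform(0.0, 1.0, size=(n_genes, n_patterns))
# Patterns of behavior: each pattern's profile across conditions (k x M).
P = rng.uniform(0.0, 1.0, size=(n_patterns, n_conditions))

# Data matrix (N x M): each gene's profile is a mixture of the k patterns.
D = A @ P

# The first gene's row is the sum of the pattern rows weighted by its row of A.
gene0 = A[0, 0] * P[0] + A[0, 1] * P[1]
```

Here `gene0` reproduces `D[0]`, which is the sense in which the patterns act as basis vectors for the gene profiles.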
Patterns as Basis Vectors
BD with Knowledge of Classes
Data (genes 1..N × conditions 1..M) = Distribution of Patterns (genes 1..N × patterns 1..k) × Patterns of Behavior (patterns 1..k × conditions 1..M)
[Figure: the same decomposition, with blocks of entries in the Patterns of Behavior matrix fixed at 0 to encode known classes.]
BD Structure
Atomic Domains Allow Encoding of Biological Information
Markov Chain Monte Carlo is used to explore possible sets of distributions and patterns
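To make the idea of exploring distributions and patterns by Markov chain Monte Carlo concrete, here is a minimal sketch using a plain random-walk Metropolis sampler. It is a generic stand-in, not the atomic-domain sampler used by Bayesian Decomposition: the Gaussian likelihood, the flat prior on non-negative entries, and every name and parameter (`metropolis_decompose`, `sigma`, `step`) are assumptions made for illustration.

```python
import numpy as np

def log_posterior(D, A, P, sigma):
    """Gaussian log-likelihood with a flat prior on non-negative A and P."""
    if (A < 0).any() or (P < 0).any():
        return -np.inf  # negative entries are outside the assumed prior support
    resid = D - A @ P
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis_decompose(D, k, sigma=0.1, n_steps=20000, step=0.05, seed=0):
    """Random-walk Metropolis over the entries of A (N x k) and P (k x M)."""
    rng = np.random.default_rng(seed)
    n, m = D.shape
    A = rng.uniform(0.0, 1.0, (n, k))
    P = rng.uniform(0.0, 1.0, (k, m))
    current = log_posterior(D, A, P, sigma)
    start = current
    for _ in range(n_steps):
        # Perturb one randomly chosen entry of A or P (symmetric proposal).
        target = A if rng.random() < 0.5 else P
        i = rng.integers(target.shape[0])
        j = rng.integers(target.shape[1])
        old = target[i, j]
        target[i, j] = old + rng.normal(0.0, step)
        proposed = log_posterior(D, A, P, sigma)
        # Metropolis rule: accept with probability min(1, exp(proposed - current)).
        if np.log(rng.random()) < proposed - current:
            current = proposed
        else:
            target[i, j] = old  # reject and restore the previous value
    return A, P, start, current
```

The function returns only the final state of the chain and its log-posterior; a real analysis would average over many samples to obtain means and uncertainties for each matrix entry.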
Project Normal Data
• Download Data from CAMDA Site
• Adjust for Background Measurement
• Take Ratios
• Calculate Mean and SDOM (Standard Deviation of the Mean) for Each Ratio
• Eliminate M3T and M4T Data
• Eliminate 24 Points with Only 1 Data Pt
– 99% 4 Pts, 1% 3 Pts, 0.1% 2 Pts
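The preprocessing steps above can be sketched for a single gene in a single condition. The slides do not give the exact formulas, so this assumes the ratio is background-subtracted signal over background-subtracted reference and that SDOM is the sample standard deviation divided by the square root of the number of replicates; the function name and argument layout are hypothetical.

```python
import numpy as np

def summarize_ratios(signal, background, reference, ref_background):
    """Return (mean, SDOM, n) of background-adjusted ratios, or None.

    Each argument is a 1-D float array of replicate measurements for one gene
    in one condition; NaN marks a missing replicate.
    """
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = (signal - background) / (reference - ref_background)
    ratios = ratios[np.isfinite(ratios)]  # drop missing or undefined replicates
    n = ratios.size
    if n < 2:
        return None  # a single data point gives no SDOM estimate, so drop it
    mean = ratios.mean()
    sdom = ratios.std(ddof=1) / np.sqrt(n)  # standard deviation of the mean
    return mean, sdom, n
```

Returning `None` for fewer than two usable replicates mirrors the slide's step of eliminating points with only one data point.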
Filtering of Genes
• Eliminated all ESTs
– Annotated Remaining Genes from Gene Ontology on Unigene Name
• Annotated all Genes on Clone ID
– 24% Changed Unigene Cluster
– 948 Clones Had GO Process Information
Updating Annotations: ASAP http://bioinformatics.fccc.edu/
Bayesian Decomposition
• Encoded 3 Known Patterns
– Kidney, 6 Conditions
– Liver, 6 Conditions
– Testis, 4 Conditions
• Allowed 1 - 3 Additional Patterns
– Account for Behavior Unrelated to Tissue Specific Expression
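One way to picture the encoded patterns is as a boolean mask over the pattern matrix, with each known tissue pattern allowed to be nonzero only in its own conditions. This is a sketch under assumptions: the slides do not state the order of the 16 conditions, so the contiguous kidney, liver, testis ordering and the helper `build_pattern_mask` are hypothetical.

```python
import numpy as np

def build_pattern_mask(class_sizes, n_free):
    """Boolean mask (patterns x conditions); True where a pattern may be nonzero."""
    n_known = len(class_sizes)
    n_conditions = sum(class_sizes)
    mask = np.zeros((n_known + n_free, n_conditions), dtype=bool)
    start = 0
    for row, size in enumerate(class_sizes):
        mask[row, start:start + size] = True  # known pattern: its own class only
        start += size
    mask[n_known:, :] = True  # additional patterns are left unconstrained
    return mask

# Kidney (6 conditions), liver (6), testis (4), plus one additional pattern.
mask = build_pattern_mask([6, 6, 4], n_free=1)

# Applying the mask forces the disallowed entries of a pattern matrix to zero.
P = np.random.default_rng(0).uniform(size=mask.shape) * mask
```

Changing `n_free` from 1 to 3 gives the four-, five-, and six-pattern models that the slide describes.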
Fitting the Data
Four Patterns
[Figure: plot of the Kidney, Liver, Testis, and Background patterns; x-axis 1 to 16, y-axis 0 to 0.3.]
Five Patterns
[Figure: plot of the Kidney, Liver, Testis, Background 1, and Background 2 patterns; x-axis 1 to 16, y-axis 0 to 0.3.]
Four vs Five Patterns
Gene Ontology
• Identify Genes “Only” in One Pattern
– See if Pattern Enhanced in GO
• Identify Genes in a Pattern
– 3σ above Zero in Distribution
– Look at GO Assignments
• Identify Genes Lacking in Pattern
– Eliminate Background (Genes > 70%)
– Look for Genes Not in Pattern (3σ)
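The 3σ selection and the GO enhancement check can be sketched as follows. The slides do not define "enhancement", so this assumes it is the fraction of selected genes carrying a GO term divided by the fraction of all genes carrying it; both function names and the array layout are hypothetical.

```python
import numpy as np

def genes_in_pattern(mean, sd, n_sigma=3.0):
    """True where a gene's amplitude in a pattern is n_sigma above zero."""
    return mean - n_sigma * sd > 0

def go_enhancement(selected, annotated):
    """Fold enhancement of a GO term among selected genes versus all genes.

    `selected` and `annotated` are boolean arrays with one entry per gene.
    """
    if not selected.any() or not annotated.any():
        return 0.0  # nothing selected, or the term annotates no gene
    frac_selected = annotated[selected].mean()
    frac_all = annotated.mean()
    return frac_selected / frac_all

# Hypothetical amplitudes and uncertainties for four genes in one pattern.
mean = np.array([0.50, 0.10, 0.40, 0.05])
sd = np.array([0.10, 0.10, 0.10, 0.01])
selected = genes_in_pattern(mean, sd)

# Hypothetical membership of the same four genes in one GO term.
annotated = np.array([True, False, True, False])
fold = go_enhancement(selected, annotated)
```

A term would then be flagged in the style of the following slides when `fold` exceeds 10.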
Genes Only in Kidney by GO
Legend: * > 10x Enhancement
• neurotransmitter transport *
• chloride transport *
• receptor mediated endocytosis
• enzyme linked receptor protein signaling pathway *
• transmembrane receptor protein tyrosine kinase signaling pathway *
• vitamin/cofactor transport *
• vitamin B12 transport
• inorganic anion transport *
• metal ion transport
• anion transport *
• neuropeptide signaling pathway
• endocytosis *
From Old Annotations: sodium transport, vesicle-mediated transport, amino acid transport, folate transport, homophilic cell adhesion, cell-cell adhesion, monovalent inorganic cation transport
Genes Only in Liver by GO
Legend: * > 10x Enhancement
• antigen processing
• antigen processing, endogenous antigen via MHC class I
• cellular defense response
• response to drug
• drug susceptibility/resistance *
• cell-cell adhesion *
• homophilic cell adhesion *
• response to abiotic stimulus
• response to chemical substance
• response to pest/pathogen/parasite
• protein targeting
From Old Annotations: small molecule transport, histogenesis and organogenesis, embryogenesis and morphogenesis, lipid metabolism
Genes Only in Testis by GO
Legend: * > 10x Enhancement
• DNA recombination
• meiotic recombination
• reproduction *
• gametogenesis *
• spermatogenesis *
• regulation of transcription from Pol II promoter
• microtubule-based movement
• microtubule-based process
• development *
From Old Annotations: nuclear organization and biogenesis, chromosome organization and biogenesis, cell organization and biosynthesis
Kidney Genes, 3σ, > 2 fold
• amino acid metabolism
• inflammatory response
• mitotic cell cycle
• amine metabolism
• anion transport
• nitrogen metabolism
• perception of abiotic stimulus
• perception of light
• cell-cell adhesion
• homophilic cell adhesion
• S phase of mitotic cell cycle
• endocytosis
• G-protein coupled receptor protein signaling pathway
Testis Genes, 3σ, > 4 fold
• reproduction
• gametogenesis
• spermatogenesis
• regulation of cell shape and cell size
• mitotic cell cycle
• microtubule-based movement
• protein folding
• S phase of mitotic cell cycle
Liver Genes, 3σ, > 3 fold
• amino acid metabolism
• response to drug
• drug susceptibility/resistance
• energy pathways
• energy derivation by oxidation of organic compounds
• main pathways of carbohydrate metabolism
• catabolic carbohydrate metabolism
• response to abiotic stimulus
• response to chemical substance
• sensory perception
• morphogenesis
• organogenesis
• tricarboxylic acid cycle
Genes Absent in Patterns
• Absent in Kidney
– reproduction
– gametogenesis
– spermatogenesis
– regulation of cell shape and cell size
– biological_process unknown
– obsolete
– microtubule-based movement
– microtubule-based process
• Absent in Liver
– monosaccharide metabolism
– regulation of transcription from Pol II promoter
– cell differentiation
– actin filament-based process
– actin cytoskeleton organization and biogenesis
– reproduction
– gametogenesis
– spermatogenesis
Genes Absent in Background 1
• biological_process unknown
• obsolete
• protein modification
• protein targeting
• actin filament-based process
• actin cytoskeleton organization and biogenesis
• endocytosis
• regulation of transcription from Pol II promoter
• reproduction
• gametogenesis
• spermatogenesis
• mitotic cell cycle
Genes Present in Two Tissues
• Kidney/Liver not Testis
– cell-cell adhesion
– homophilic cell adhesion
– defense response
– immune response
– amino acid metabolism
– amine metabolism
– perception of abiotic stimulus
– perception of light
• Kidney/Testis not Liver
– mitotic cell cycle
Acknowledgements
• Programming
– Jeffrey Grant
– Elizabeth Goralczyk
– Luke Somers
– Bill Speier (JHU)
• Colleagues
– J. Robert Beck
– Frank Manion
• This Work
– Tom Moloshok
– DJ Datta (Cambridge)
– Andrew Kossenkov
• Others
– G. Parmigiani (JHU)
– T. Brown (Columbia)
– E. Korotkov (RAS)