Detecting “Network Motifs” in Gene Co-expression Networks Xinxia Peng Genome Science & Technology Program The University of Tennessee – Oak Ridge Natl. Lab
Motivation Modularity of Biological Networks
Co-expression Network • Correlation Matrix (Pearson’s R) → Adjacency Matrix • Cutoff: 0.8
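The thresholding step on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `pearson_r` and `coexpression_edges` and the dictionary layout of `profiles` are assumptions, and the sketch assumes the cutoff is applied to the signed correlation (R >= cutoff), not to its absolute value.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return cov / (sx * sy)

def coexpression_edges(profiles, cutoff=0.8):
    """Return the edge set of the co-expression network.

    profiles: dict mapping gene id -> list of expression values.
    Two genes are connected when their signed Pearson R >= cutoff
    (assumption: the slide does not say whether |R| is used).
    """
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson_r(profiles[g], profiles[h]) >= cutoff:
                edges.add((g, h))
    return edges

# Toy data: g1 and g2 rise together, g3 falls.
toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8.5], "g3": [4, 3, 2, 1]}
print(coexpression_edges(toy, cutoff=0.8))
```

With the toy profiles, only the g1/g2 pair passes the cutoff; g3 is anti-correlated with both and stays unconnected.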
Genes of Similar Function Cluster Together • Densely connected subgraphs: protein complexes, pathways, … • Clique: maximally connected subgraph
Gene Duplication • Paralogs • “Paralogous pathways”: pathways with duplicated proteins and interactions [Figure labels: duplication, co-expression]
Protein Domain (or Motif) • Evolutionary unit • Functional unit • Reiterated use of domains [Figure label: 1dxy]
“Network Motifs” • II and III are overlapping; I and II are non-overlapping
Materials and Methods
Protein Domain Annotation • HMM Library: Pfam (http://www.sanger.ac.uk/Software/Pfam) • Protein Sequences: PlasmoDB (http://plasmodb.org) • HMMER (http://hmmer.wustl.edu) run on the OIT Cluster (http://icl.cs.utk.edu/sinrg/index.html) → Domain Annotations
Network Motif Discovery (1) • Input: <G, k, f> • Enumeration of k-vertex cliques • Protein domain → groups of cliques • f: # of non-overlapping cliques → network motifs
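The pipeline on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: each gene carries exactly one domain label, cliques are grouped by their sorted tuple of labels, and the number of non-overlapping cliques is counted greedily (a lower bound on the true maximum). The names `k_cliques`, `count_disjoint`, and `find_motifs` are hypothetical, and the brute-force enumeration is only practical for small graphs.

```python
from itertools import combinations
from collections import defaultdict

def k_cliques(adj, k):
    """Yield every k-vertex clique of the graph as a sorted tuple (brute force)."""
    for combo in combinations(sorted(adj), k):
        if all(v in adj[u] for u, v in combinations(combo, 2)):
            yield combo

def count_disjoint(cliques):
    """Greedy count of vertex-disjoint cliques (a lower bound, not the optimum)."""
    used, n = set(), 0
    for c in cliques:
        if used.isdisjoint(c):
            used.update(c)
            n += 1
    return n

def find_motifs(adj, domains, k, f):
    """Group k-cliques by domain signature; keep groups with >= f disjoint cliques.

    adj: dict gene -> set of neighbouring genes (undirected).
    domains: dict gene -> single domain label (simplifying assumption).
    """
    groups = defaultdict(list)
    for c in k_cliques(adj, k):
        signature = tuple(sorted(domains[v] for v in c))
        groups[signature].append(c)
    return {sig: cl for sig, cl in groups.items() if count_disjoint(cl) >= f}

# Toy network: two disjoint triangles with the same domain composition.
adj = {
    "a1": {"b1", "c1"}, "b1": {"a1", "c1"}, "c1": {"a1", "b1"},
    "a2": {"b2", "c2"}, "b2": {"a2", "c2"}, "c2": {"a2", "b2"},
}
domains = {"a1": "A", "b1": "B", "c1": "C", "a2": "A", "b2": "B", "c2": "C"}
print(find_motifs(adj, domains, k=3, f=2))
```

In the toy network the signature (A, B, C) occurs in two vertex-disjoint triangles, so it is reported for f = 2 and dropped for f = 3.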
Network Motif Discovery (2) • p-value: fraction of randomized networks in which the putative network motif is found • Randomize the real network by randomly permuting the protein domain labels • Repeat 1,000 times
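The randomization test described here can be sketched as a label-permutation loop. The function name `permutation_p_value` and the `motif_found` callback are assumptions made for illustration; the callback stands in for whatever motif search is rerun on each randomized network.

```python
import random

def permutation_p_value(domains, motif_found, n_iter=1000, seed=0):
    """Estimate a p-value by permuting domain labels across genes.

    domains: dict gene -> domain label (the real annotation).
    motif_found: hypothetical callback taking a shuffled gene -> label dict
                 and returning True if the putative motif is still found.
    Returns the fraction of randomized networks in which the motif is found.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    genes = list(domains)
    labels = [domains[g] for g in genes]
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(labels)  # network topology is untouched, only labels move
        if motif_found(dict(zip(genes, labels))):
            hits += 1
    return hits / n_iter

# Toy check: is gene g1 labelled "A" after shuffling?
toy_domains = {"g1": "A", "g2": "B", "g3": "C", "g4": "D"}
p = permutation_p_value(toy_domains, lambda d: d["g1"] == "A")
print(p)
```

Because only the labels are shuffled, every randomized network keeps the same edges and the same number of genes per domain as the real one.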
Network Motif Discovery (3) [Figure: Domain Matching Level 2, cliques 1 and 1’ with domain labels A, B, C; Domain Matching Level 4, cliques 2 and 2’ with domain labels D, A]
Protein Interaction Network and Data Visualization • Protein Interaction Network (PPI): BIND (http://www.blueprint.org/bind/bind.php); vertices: genes/proteins; edges: binary protein interactions; protein complex: “matrix” model • Visualization: ALIVE (http://mouse.ornl.gov/alive); R (http://www.r-project.org)
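Under the “matrix” model, every pair of proteins in a complex is treated as interacting. A minimal sketch of that expansion follows; the function name `matrix_model_edges` and the list-of-lists input format are assumptions, not the BIND data format.

```python
from itertools import combinations

def matrix_model_edges(complexes):
    """Expand protein complexes into pairwise edges (matrix model).

    complexes: iterable of member lists, one list per complex.
    Every unordered pair of distinct members becomes one edge.
    """
    edges = set()
    for members in complexes:
        for u, v in combinations(sorted(set(members)), 2):
            edges.add((u, v))
    return edges

# A three-protein complex yields all three pairwise edges.
print(sorted(matrix_model_edges([["p1", "p2", "p3"]])))
```

A complex with n members contributes n(n-1)/2 edges, so the matrix model produces cliques in the PPI graph, which makes it directly comparable with clique-shaped motifs.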
Results
Co-expression Network • Complete Dataset • R >= 0.95 • 2,292 ORFs • 93% (2,124) with strong periodic behavior • Covers 78% (2,124/2,714) of the Overview Dataset
Prediction of Network Motifs • # of network motifs found • # of network motifs having at least one instance in yeast PPI • k: size of network motif • f: min. number of non-overlapping instances
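↑ k or f, ↑ % in yeast PPI • Percentage of network motifs having an instance in the yeast PPI network, by frequency (f) × size (k); domain matching level 2 • [Table residue, values in extracted order, cell positions not recoverable: 1/1, 5/6 (↑ f); 2/3, 11/18, 25/88 (↑ k); 0/0, 0/0, 0/0]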
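↑ k or f, ↑ % in yeast PPI • Percentage of network motifs having an instance in the yeast PPI network, by frequency (f) × size (k); domain matching level 4 • [Table residue, values in extracted order, cell positions not recoverable: (↑ f) 5/5, 6/6, 13/17, 6/9, 17/32, 29/87, 53/197; (↑ k) 0/0, 0/0, 0/0, 0/0, 0/0]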
Example 1
Functional Annotations • DEAD/DEAH box helicase (PF00270) and Helicase conserved C-terminal domain (PF00271) • WD domains, G-beta repeats (PF00400) • Brix domain (PF04427) • GTPase of unknown function (PF01926)
Supported by Yeast Protein Interactions
Prediction of Complementary Functional Units
Example 2
Functional Annotations • Protein kinase domain (PF00069) • Calcineurin-like phosphoesterase (PF00149) • AhpC/TSA family (PF00578), which contains the peroxiredoxins (Prxs), a ubiquitous family of antioxidant enzymes; Prxs can be regulated by phosphorylation
Differential Temporal Expression
More Results http://mouse.ornl.gov/~xpv/camda04/index.html
Conclusion • New strategy for microarray data analysis • Data integration: gene expression, sequence, protein interaction, … • Easier experimental verification: small clusters; implications about relationships among members • Biological hypothesis: modularity of biological networks
Acknowledgements • Dr. Jay Snoddy (Genome Science & Technology, UT-ORNL): Adam Tebbe, Suzanne Baktash, … • Dr. Michael Langston (Computer Science, UT): Nicole Baldwin • Dr. Arnold Saxton (Animal Science, UT)