Identifying Stage-Specific Genes by Combining Information from Two Types of Oligonucleotide Arrays

  1. Identifying Stage-Specific Genes by Combining Information from Two Types of Oligonucleotide Arrays
     Yin Liu, Ning Sun, Junfeng Liu, Liang Chen, Michael McIntosh, Liangbiao Zheng and Hongyu Zhao, Yale University
     CAMDA 2004, Yin Liu, Nov. 12, 2004

  2. Objective
     - Identify genes differentially expressed in the sporozoite and gametocyte stages: potential candidates for transmission-blocking vaccine development

  3. Outline
     - Description of the datasets
     - Our approaches
       - Genes specifically expressed in the sporozoite/gametocyte stages (not expressed in blood stages)
       - Genes up-regulated in the sporozoite/gametocyte stages (constantly expressed in blood stages)
     - Results
       - Gene Ontology analysis
       - Protein-protein interactions
     - Conclusions

  4. Two Microarray Datasets
     - The DeRisi data
       - 7,462 long (70mer) oligonucleotides
       - 4,488 genes
       - 46 time points across the complete asexual blood stages at 1-hour time scale resolution
     - The Winzeler data
       - 260,596 25mer probes
       - 5,159 genes
       - Six blood stages synchronized by two methods, as well as merozoite, gametocyte and sporozoite stages

  5. Negative Controls
     - 281 “EMPTY” spots on the DeRisi array
     - The intensities of these spots were standardized
     - The standardized intensities were summarized across all time points for each “EMPTY” spot
     - The red and green channel intensities have very similar distributions

  6. Genes Not Expressed in Blood Stages
     - Standardize all the spots at each time point t using empMean_t and empVar_t, the mean and variance of the intensities of the true “EMPTY” spots, respectively
     - Expression cutoff: 95th percentile of the summarized standardized intensities of the “EMPTY” spots
     - 721 genes are not expressed across the complete blood stages
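
The negative-control cutoff described on slides 5 and 6 can be sketched as follows. The slide's formula did not survive extraction, so this assumes a per-time-point z-score, (intensity - empMean_t) / sqrt(empVar_t), and assumes that "summarize" means averaging the standardized intensities over time points; the function names and the toy data layout are hypothetical.

```python
from statistics import mean, pvariance

def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ranked = sorted(values)
    pos = (len(ranked) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ranked) - 1)
    return ranked[lo] + (ranked[hi] - ranked[lo]) * (pos - lo)

def standardize(intensities, empty_ids):
    """intensities: {spot_id: [value at each time point]}.
    Assumed form: z-score each spot against the EMPTY spots of the same
    time point. Requires the EMPTY spots to vary (non-zero variance)."""
    n_time = len(next(iter(intensities.values())))
    z = {spot: [] for spot in intensities}
    for t in range(n_time):
        empty_vals = [intensities[e][t] for e in empty_ids]
        emp_mean = mean(empty_vals)
        emp_sd = pvariance(empty_vals) ** 0.5
        for spot, series in intensities.items():
            z[spot].append((series[t] - emp_mean) / emp_sd)
    return z

def not_expressed(intensities, empty_ids, q=95.0):
    """Return (sorted non-EMPTY spots at or below the cutoff, cutoff)."""
    z = standardize(intensities, empty_ids)
    # Assumption: the per-spot summary is the mean over time points.
    summary = {spot: mean(vals) for spot, vals in z.items()}
    cutoff = percentile([summary[e] for e in empty_ids], q)
    genes = [s for s in summary if s not in empty_ids and summary[s] <= cutoff]
    return sorted(genes), cutoff
```

A spot whose summarized intensity stays at or below the cutoff is treated as indistinguishable from background across the blood stages.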

  7. Identify Genes Specifically Expressed in Sporozoite/Gametocyte Stage
     - Determine an expression cutoff in the Winzeler data
       - Assume the number of genes identified as not expressed in blood stages from the Winzeler data is the same as that identified from the DeRisi data
       - 17% (721/4250) of the genes are identified as not expressed in blood stages based on the DeRisi data
     - Genes specifically expressed in the sporozoite/gametocyte stage
       - Intensity values in the sporozoite/gametocyte stages above the cutoff
       - Not expressed in blood stages
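
One way to read the cutoff transfer on this slide is as quantile matching: pick the Winzeler intensity below which the same fraction of genes (721/4250) falls. The sketch below assumes one summarized blood-stage intensity per gene and assumes that "not expressed in blood stages" is judged against that same cutoff; both choices, and the function names, are illustrative rather than taken from the slides.

```python
def matched_cutoff(blood_intensity, fraction):
    """Return the intensity such that roughly `fraction` of genes are at or below it.
    blood_intensity: {gene: summarized blood-stage intensity} (assumed layout)."""
    ranked = sorted(blood_intensity.values())
    k = max(1, min(len(ranked), round(fraction * len(ranked))))
    return ranked[k - 1]

def stage_specific(blood_intensity, stage_intensity, cutoff):
    """Genes at or below the cutoff in blood stages but above it in the
    sporozoite or gametocyte stage."""
    return sorted(
        g for g in blood_intensity
        if blood_intensity[g] <= cutoff and stage_intensity.get(g, 0.0) > cutoff
    )

# Usage with the fraction reported on the slide (toy intensities).
blood = {"g1": 10, "g2": 50, "g3": 200, "g4": 400, "g5": 800, "g6": 1000}
sporozoite = {"g1": 300, "g2": 900, "g3": 5, "g4": 450, "g5": 700, "g6": 1200}
cutoff = matched_cutoff(blood, 721 / 4250)
specific = stage_specific(blood, sporozoite, cutoff)
```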

  8. Representative Genes Specifically Expressed in Sporozoite/Gametocyte Stage
     - Sporozoite
       - Sporozoite surface protein 2
       - Pbs36-related protein
       - rifin-encoded proteins
     - Gametocyte
       - 25 kDa ookinete surface antigen
       - Gametocyte antigen 377

  9. Identify Genes Upregulated in Sporozoite/Gametocyte Stages
     - Goal
       - Predict the gene expression values at the sporozoite and gametocyte stages in the DeRisi dataset using the Winzeler dataset
     - Nonparametric regression
       - Local linear regression, with a kernel function and a smoothing parameter
     - Problem
       - Gene expression values are measured at different time scale resolutions in the two datasets
       - Choose an “invariant” gene set: genes constantly expressed in blood stages
       - Average the expression values of the “invariant” genes across blood stages
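
A minimal sketch of local linear regression, the estimator named on this slide. The slide's kernel and smoothing-parameter formulas were lost in extraction, so a Gaussian kernel with bandwidth `h` is assumed here. The intended use is also an interpretation: fit on the "invariant" genes with x as the averaged Winzeler blood-stage value and y as the averaged DeRisi blood-stage value, then evaluate at a gene's Winzeler sporozoite or gametocyte value to predict it on the DeRisi scale.

```python
import math

def local_linear(xs, ys, x0, h):
    """Local linear estimate of E[y | x = x0].
    Gaussian kernel and bandwidth h are assumptions, not the slide's choices."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)  # kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    if s0 == 0.0:
        raise ValueError("x0 is too far from the data for this bandwidth")
    denom = s0 * s2 - s1 * s1
    if abs(denom) < 1e-12:
        # Degenerate design (for example a single distinct x): weighted mean.
        return t0 / s0
    # Intercept of the weighted least-squares line centred at x0.
    return (s2 * t0 - s1 * t1) / denom
```

A larger `h` gives a smoother curve; when the underlying relationship is exactly linear the estimator reproduces that line for any bandwidth.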

  10. Identify Genes Upregulated in Sporozoite/Gametocyte Stages
     - Predict the gene expression values at the sporozoite and gametocyte stages
     - Compare the gene expression values at blood stages with the values at the sporozoite or gametocyte stage
     - Upregulated genes
       - Expression values increase at least 1.5-fold
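
The 1.5-fold rule can be sketched as a simple ratio test. This assumes expression values on a linear, positive scale (on a log scale the test would be a difference instead); the dictionary layout and function name are hypothetical.

```python
def upregulated(blood_mean, stage_pred, min_fold=1.5):
    """Return {gene: fold change} for genes whose predicted stage value is
    at least min_fold times their average blood-stage value."""
    hits = {}
    for gene, base in blood_mean.items():
        if gene in stage_pred and base > 0:
            fold = stage_pred[gene] / base
            if fold >= min_fold:
                hits[gene] = fold
    return hits
```

Genes with a non-positive baseline are skipped here because a ratio is not meaningful for them.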

  11. Results Summary
     - Comparison with the results of Winzeler’s study
       - We obtain a larger number of sporozoite- or gametocyte-stage-specific genes
       - Concordance rate: 78% and 69%
     - Novel genes
       - MAL13P1.304 (malaria surface antigen)
       - MAL13P1.148 (P. falciparum myosin)

  12. Gene Ontology Analysis
     - Gene Ontology database
       - Describes the roles of genes and gene products in organisms
       - 40% of gene products in P. falciparum were assigned GO terms
       - Molecular function
       - Biological process
       - Cellular component
     - Investigate the enrichment of GO categories among the stage-specific genes
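
The slides do not say which statistic was used to assess GO enrichment. A common choice is the one-sided hypergeometric test, sketched below as an assumption rather than the authors' method; the parameter names are illustrative.

```python
from math import comb

def enrichment_p(n_annotated, n_in_category, n_selected, n_hits):
    """P(X >= n_hits) when n_selected genes are drawn without replacement
    from n_annotated genes, of which n_in_category carry the GO term."""
    upper = min(n_in_category, n_selected)
    total = comb(n_annotated, n_selected)
    favourable = sum(
        comb(n_in_category, i) * comb(n_annotated - n_in_category, n_selected - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / total
```

Only genes with a GO annotation should enter the counts, since the slide notes that 40% of P. falciparum gene products were assigned GO terms.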

  13. Molecular Function
     - Cell adhesion
     - Defense/immunity protein

  14. Biological Process
     - Cell communication
     - Metabolism

  15. Cellular Component
     - Extracellular

  16. Gametocyte Stage-Specific Genes
     - Note: the genes identified in Winzeler’s study do not show GO term enrichment different from that of the genes overall

  17. Relate Protein Interaction with Gene Expression Pattern
     - Purpose
       - Protein interactions are an important component of functional annotation
       - Investigate the relationship between gene expression patterns and protein interactions
     - It is reasonable to expect a relationship between gene expression patterns and protein interactions

  18. Identify Potential Protein Interaction Pairs
     - “All-against-all” BLASTP comparisons of the sequences of the S. cerevisiae and P. falciparum proteomes
     - Apply the program INPARANOID to identify ortholog groups (orthologs and in-paralogs)
     - Concept of “interolog”: an interacting pair (A,B) in one species and the pair of their orthologs (A’,B’) in the other
     - Transfer the protein interaction information between species to predict protein interaction pairs in P. falciparum
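
The interolog transfer step can be sketched as a lookup over ortholog groups: every known yeast interaction (A,B) yields a predicted P. falciparum pair for each combination of orthologs (A’,B’). The input layout, the function name and the identifiers in the usage example are hypothetical, and dropping self-pairs is a design choice of this sketch, not something stated on the slide.

```python
from itertools import product

def predict_interologs(yeast_pairs, orthologs):
    """yeast_pairs: iterable of (A, B) interacting yeast proteins.
    orthologs: {yeast protein: [P. falciparum orthologs / in-paralogs]}.
    Returns a set of unordered predicted pairs as sorted tuples."""
    predicted = set()
    for a, b in yeast_pairs:
        for a2, b2 in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if a2 != b2:  # skip self-pairs (choice made for this sketch)
                predicted.add(tuple(sorted((a2, b2))))
    return predicted

# Usage with made-up identifiers.
pairs = [("YA", "YB"), ("YB", "YC")]
groups = {"YA": ["PF1"], "YB": ["PF2", "PF3"]}
pf_pairs = predict_interologs(pairs, groups)
```

Interactions whose partners lack an ortholog in the other species (here "YC") contribute no prediction.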

  19. Protein Interaction Pairs in Sporozoites and Gametocytes (slide body not recoverable from the extracted text)

  20. Conclusions
     - Identification of sporozoite stage- and gametocyte stage-specific genes
       - Well-known stage-specific genes
       - High degree of overlap between our identified genes and those of Winzeler’s study
       - Significant enrichment for certain GO categories
       - Related to the number of predicted protein interaction pairs
     - Combining information from different sources
       - A dataset with higher time scale resolution
       - Discovery of novel stage-specific genes
       - Depends on the data quality of the datasets used

  21. References
     - Bozdech Z, et al. The Transcriptome of the Intraerythrocytic Developmental Cycle of Plasmodium falciparum. PLoS Biol. 1(1):E5, 2003.
     - Le Roch KG, et al. Discovery of gene function by expression profiling of the malaria parasite life cycle. Science 301:1503-8, 2003.
     - http://biosun01.biostat.jhsph.edu/Eririzarr/Raffy/
     - http://bioinf.wehi.edu.au/limma
     - Bowman A. and Azzalini A. Applied Smoothing Techniques for Data Analysis. Clarendon Press, Oxford, 1997.
     - Remm M, Storm CE, Sonnhammer EL. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol. 314(5):1041-52, 2001.
     - Gardner M, et al. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419(6906):498-511, 2002.

  22. Acknowledgements
     - Dr. Hongyu Zhao’s Group
     - Dr. Ning Sun
     - Dr. Junfeng Liu
     - Liang Chen
     - Dr. Liangbiao Zheng’s Group
     - Dr. Michael McIntosh
     This work was supported by NSF grant DMS-0241160 and NIH Institutional Training Grants for Informatics Research.

  23. Thank You!
