
Between Analysis of Microarray Data - Aedín Culhane, Des Higgins - PowerPoint PPT Presentation



  1. Between Analysis of Microarray Data. Aedín Culhane and Des Higgins, Biochemistry Dept., University College Cork, Ireland. Guy Perrière, Laboratoire BBE, Université Claude Bernard Lyon 1.

  2. Specify Groups in Advance?
     • Neighbourhood analysis (Golub et al., 1999)
     • Neural network (Khan et al., 2001)
     • Support vector machine (Brown et al., 2000)
     • Discriminant analysis: linear combinations of genes which
       – maximise between-group variance
       – minimise within-group variance
     However, discriminant analysis requires J (samples) >> I (genes).
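The sample-size requirement on this slide can be illustrated numerically. The following is a minimal sketch, assuming synthetic random data and a hypothetical helper `within_group_scatter` (neither comes from the slides). It shows that the pooled within-group scatter matrix that discriminant analysis must invert has rank at most J minus the number of groups, so it is singular whenever there are more genes than samples. Note that the matrix here is stored as samples by genes.

```python
import numpy as np

def within_group_scatter(X, labels):
    """Pooled within-group scatter (genes x genes) for X given as samples x genes."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    W = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(labels):
        Xg = X[labels == g]
        D = Xg - Xg.mean(axis=0)   # deviations from the group centroid
        W += D.T @ D
    return W

# Synthetic illustration (an assumption, not data from the slides):
# J = 10 samples in 2 groups, I = 50 genes.
rng = np.random.default_rng(0)
labels = np.array([0] * 5 + [1] * 5)
X_wide = rng.normal(size=(10, 50))

W = within_group_scatter(X_wide, labels)
rank_wide = np.linalg.matrix_rank(W)
# rank_wide cannot exceed J - (number of groups) = 8, far below I = 50,
# so W has no inverse and classical discriminant analysis cannot be applied.
```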

  3. Between-Group Eigenanalysis
     • Dolédec, S. & Chessel, D. (1987) Rythmes saisonniers et composantes stationnelles en milieu aquatique. I- Description d’un plan d’observations complet par projection de variables. Acta Oecologica, Oecologia Generalis, 8 (3), 403-426.
     • Discriminates when samples < variables
     • Can be combined with PCA, CA, etc.

  4. Between-Group Eigenanalysis [Diagram labels: J samples, I genes, GSVD]

  5. ADE-4. Thioulouse J., Chessel D., Dolédec S. & Olivier J.M. (1997) ADE-4: a multivariate analysis and graphical display software. Statistics and Computing, 7 (1), 75-83. http://pbil.univ-lyon1.fr/ADE-4/

  6. Golub Leukaemia Data
     • Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Golub, T.R. … E.S. Lander. Science, 286: 531-537 (1999). http://www.genome.wi.mit.edu/MPR
     • 47 Acute Lymphoblastic Leukaemia (ALL)
       – 38 B-cell
       – 9 T-cell
     • 25 Acute Myeloid Leukaemia (AML)
     • Affymetrix oligonucleotide array (6817 genes)
     • 38 training samples; 34 test samples

  7. BGA of Golub Data
     • Define groups
     • Ordinate group centroids (using PCA or COA)
     • Add individual samples as supplemental data points
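The three steps on this slide can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the ADE-4 implementation: it uses the PCA variant only (not COA), weights each centroid by its group size, and runs on synthetic data. The names `bga_fit` and `bga_project` are hypothetical.

```python
import numpy as np

def bga_fit(X, labels):
    """PCA-based between-group analysis sketch; X is samples x genes."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)                       # step 1: define groups
    grand_mean = X.mean(axis=0)
    Xc = X - grand_mean
    centroids = np.vstack([Xc[labels == g].mean(axis=0) for g in groups])
    weights = np.array([(labels == g).mean() for g in groups])
    # Step 2: ordinate the group centroids (weighted PCA via SVD).
    _, _, Vt = np.linalg.svd(np.sqrt(weights)[:, None] * centroids,
                             full_matrices=False)
    axes = Vt[: len(groups) - 1].T                   # genes x (groups - 1)
    return {"groups": groups, "mean": grand_mean, "axes": axes,
            "centroid_scores": centroids @ axes}

def bga_project(model, X):
    """Step 3: place individual samples on the centroid axes."""
    return (np.asarray(X, dtype=float) - model["mean"]) @ model["axes"]

# Synthetic data (an assumption): 3 groups of 6 samples, 40 genes.
rng = np.random.default_rng(1)
labels = np.repeat(["A", "B", "C"], 6)
X = rng.normal(size=(18, 40))
X[labels == "B", :5] += 4.0
X[labels == "C", 5:10] += 4.0

model = bga_fit(X, labels)
scores = bga_project(model, X)   # one row of axis coordinates per sample
```

With g groups there are at most g - 1 discriminating axes, because the size-weighted centroids sum to zero after centring. This is why the number of axes stays small however many genes are measured.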

  8. BGA of Golub Data
     • Determine threshold of discriminating axes
     • Project and classify new T data points
     • Test model: jackknifing, blind test data
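A rough sketch of the classification and jackknife steps follows. The slides do not say how the threshold was set, so this example substitutes a nearest-centroid rule on the discriminating axes. For two equal-sized groups that rule is the same as a threshold midway between the two centroids. The data are synthetic, the group labels are borrowed only for readability, and `fit_axes`, `predict` and `jackknife_accuracy` are hypothetical names.

```python
import numpy as np

def fit_axes(X, labels):
    """Fit PCA-based between-group axes; X is samples x genes."""
    groups = np.unique(labels)
    mean = X.mean(axis=0)
    cents = np.vstack([X[labels == g].mean(axis=0) - mean for g in groups])
    w = np.array([(labels == g).mean() for g in groups])
    _, _, Vt = np.linalg.svd(np.sqrt(w)[:, None] * cents, full_matrices=False)
    axes = Vt[: len(groups) - 1].T
    return groups, mean, axes, cents @ axes

def predict(model, X_new):
    """Project new samples and assign each to the nearest group centroid."""
    groups, mean, axes, cent_scores = model
    scores = (X_new - mean) @ axes
    dist = np.linalg.norm(scores[:, None, :] - cent_scores[None, :, :], axis=2)
    return groups[dist.argmin(axis=1)]

def jackknife_accuracy(X, labels):
    """Leave-one-out: refit without sample i, then classify sample i."""
    hits = 0
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i
        model = fit_axes(X[keep], labels[keep])
        hits += int(predict(model, X[i:i + 1])[0] == labels[i])
    return hits / len(labels)

# Synthetic two-group data (an assumption, not the Golub measurements).
rng = np.random.default_rng(2)
labels = np.repeat(["ALL", "AML"], 10)
X = rng.normal(size=(20, 50))
X[labels == "AML", :5] += 6.0

accuracy = jackknife_accuracy(X, labels)
```

A blind test works the same way: fit on the training samples only, then call `predict` on the held-out samples.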

  9. Identification of Genes
     • Genes and samples can be plotted on a “biplot”
     • Simultaneous visual analysis of the entire set of genes
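On a biplot, the genes lying furthest along a discriminating axis are the ones that contribute most to separating the groups at each end of that axis. The sketch below assumes a loading matrix (genes by axes) is already available from an ordination. The helper `top_genes` and the toy values are hypothetical, and the scaling conventions used by ADE-4 for gene and sample coordinates are not reproduced here.

```python
import numpy as np

def top_genes(axes, gene_names, axis=0, n=10):
    """Return the n genes at each extreme of one discriminating axis."""
    axes = np.asarray(axes, dtype=float)
    names = np.asarray(gene_names)
    order = np.argsort(axes[:, axis])            # ascending gene coordinate
    return {
        "negative_end": names[order[:n]].tolist(),
        "positive_end": names[order[::-1][:n]].tolist(),
    }

# Toy loading matrix (an assumption): 4 genes on 2 axes.
toy_axes = np.array([[0.9, 0.0],
                     [-0.8, 0.0],
                     [0.1, 0.0],
                     [0.0, 1.0]])
toy_names = ["g0", "g1", "g2", "g3"]
extremes = top_genes(toy_axes, toy_names, axis=0, n=1)
```

Genes at the positive end of an axis are associated with the group whose centroid lies at the positive end, and likewise for the negative end.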

  10. Small Round Blue Cell Tumours of Childhood
     • Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Javed Khan, Jun S. Wei, … and Paul S. Meltzer. Nature Medicine, Volume 7, Number 6, June 2001
     • cDNA microarray, 6567 genes, 4 classes of cancer
       – EWS: Ewing family of tumours
       – RMS: Rhabdomyosarcoma
       – NB: Neuroblastoma
       – BL: Burkitt lymphoma
     • Training and test samples

  11. BGA of Khan Data [Figure: axes 1 and 2]

  12. Accuracy
     • 19/20 EWS, BL, NB and RMS test samples were correctly predicted
     • One NB test sample, a biopsy sample (Test 23), was not classified
     • 2 normal skeletal muscle samples clustered closest to the RMS cluster
     • 3 unrelated cancer cell lines clustered in the centre of the figures

  13. BGA of Khan Data [Figure: biplot of genes and arrays, axes 1 and 2]

  14. Discriminating Genes
     • Similar to those reported by Khan
     • Rank of the top 10 EWS genes identical to Khan’s
     • 9 of the top 12 RMS discriminating genes matched Khan’s top 10 RMS genes
     • 4 of the top 5 NB genes matched Khan’s top 5
     • Khan reported only 17 BL genes; 14 were detected by BGA
     • RMS discriminating genes
       – IMAGE clone at locus 8p22-23
       – IMAGE clone MEST, an imprinted gene on chr 7q12

  15. Conclusions
     • Ordination of grouped data
     • Works when the number of variables >> number of samples
     • Fast and simple but accurate class assignment
     • Detailed and simultaneous visualisation of variables
     • Discrimination of any number of subgroups can be easily explored

  16. Dr. Des Higgins, Department of Biochemistry, University College Cork, Cork, Ireland. Guy Perrière, Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS n° 5558, Université Claude Bernard – Lyon 1, France.
