
Improving gene signatures by the identification of differentially expressed modules in molecular networks: a local-score approach. Marine Jeanmougin, JOBIM 2012, Rennes, July 4th, 2012.


  1. Improving gene signatures by the identification of differentially expressed modules in molecular networks: a local-score approach. Marine Jeanmougin, JOBIM 2012, Rennes, July 4th, 2012

  2. Outline
     1. Introduction: Microarray experiments; Identification of molecular signatures; Motivations
     2. DIsease Associated Module Selection (DiAMS): Global approach; Local-score statistic for module ranking; Evaluation process
     3. Results and application: Quantitative results; Application to Estrogen Receptor status in breast cancer

  3. Microarray experiments
     Objectives of microarray experiments: expression level of thousands of transcripts → differential analysis → signature of genes.
     Biological purpose:
     - Signature: genes involved in a phenotype of interest
     - Medical applications: diagnosis, prognosis, treatment efficacy

  4. Identification of molecular signatures: differential analysis
     Model: $X_{ig}^{(c)}$ is the expression level of the $i$-th sample for gene $g$ under condition $c$, such that
     $$E\left(X_{ig}^{(c)}\right) = \mu_g^{(c)}.$$
     Under the assumption of homoscedasticity between conditions:
     $$V\left(X_{ig}^{(c)}\right) = \sigma_g^2.$$
     Hypothesis testing strategy: for two conditions, the hypotheses to test come down to
     $$H_{0,g}: \mu_g^{(1)} = \mu_g^{(2)} \quad \text{versus} \quad H_{1,g}: \mu_g^{(1)} \neq \mu_g^{(2)}.$$
     Classical approach: $t$-statistic. Issue: gene-specific variance estimation.
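The classical approach named on this slide can be illustrated with a small sketch. It assumes the usual two-sample $t$-statistic with a pooled variance, which is the form that matches the homoscedasticity assumption; the function name and the expression values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(x1, x2):
    """Two-sample t-statistic with a pooled (homoscedastic) variance estimate."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance: weighted average of the two unbiased sample variances.
    s2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2)) / sqrt(s2 * (1 / n1 + 1 / n2))


# Hypothetical expression values for one gene under two conditions.
cond1 = [5.0, 6.0, 7.0]
cond2 = [2.0, 3.0, 4.0]
t_g = pooled_t_statistic(cond1, cond2)
print(round(t_g, 4))
```

With only three samples per condition, the variance estimate rests on four degrees of freedom, which is the estimation issue the slide points to.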


  7. Identification of molecular signatures: differential analysis
     Limma: a shrinkage approach (Smyth, 2004; Jeanmougin et al. 2010, PLoS ONE)
     Empirical Bayes variance estimate:
     $$\left(S_g^{\text{limma}}\right)^2 = \frac{d_0 S_0^2 + d_g S_g^2}{d_0 + d_g}$$
     - $S_0^2$: prior variance from the scaled inverse chi-square distribution, fixed with an empirical Bayes approach
     - $S_g^2$: usual unbiased estimator of the variance $\sigma_g^2$
     - $d_0$, $d_g$: residual degrees of freedom for $S_0^2$ and for the linear model for gene $g$
     Test statistic:
     $$t_g^{\text{limma}} = \frac{\bar{x}_{\cdot g}^{(1)} - \bar{x}_{\cdot g}^{(2)}}{S_g^{\text{limma}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
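The shrinkage formula can be traced numerically with a minimal sketch. It is not the limma implementation: limma estimates $d_0$ and $S_0^2$ from all genes, whereas here they are passed in as hypothetical fixed values, and `moderated_t` is an illustrative name. For a two-group design, $S_g^2$ is taken to be the pooled variance with $d_g = n_1 + n_2 - 2$.

```python
from math import sqrt
from statistics import mean, variance


def moderated_t(x1, x2, d0, s0_sq):
    """Moderated t-statistic for one gene, given prior parameters d0 and s0_sq."""
    n1, n2 = len(x1), len(x2)
    d_g = n1 + n2 - 2  # residual degrees of freedom of the two-group model
    # Usual unbiased (pooled) variance estimate for gene g.
    s_g_sq = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / d_g
    # Shrink the gene-wise variance towards the prior variance.
    s_limma_sq = (d0 * s0_sq + d_g * s_g_sq) / (d0 + d_g)
    t = (mean(x1) - mean(x2)) / (sqrt(s_limma_sq) * sqrt(1 / n1 + 1 / n2))
    return t, s_limma_sq


# Hypothetical data and prior values, chosen only to make the arithmetic easy.
cond1 = [5.0, 6.0, 7.0]
cond2 = [2.0, 3.0, 4.0]
t_mod, s2_mod = moderated_t(cond1, cond2, d0=4.0, s0_sq=2.0)
print(t_mod, s2_mod)
```

Here the gene-wise variance is 1 and the prior variance is 2 with equal weights, so the shrunken variance lands halfway at 1.5 and the statistic is smaller in magnitude than the unmoderated one.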

  8. Motivations
     Limitations of classical approaches:
     - Low reproducibility (Ein-Dor et al. 2005, "Outcome signature genes in breast cancer: is there a unique set?", Bioinformatics)
     - Difficulty in achieving a clear biological interpretation
     Improving gene signatures:
     - Genes causing the same phenotype are likely to interact with each other (Gandhi, T.K. et al. 2006, Nature Genetics)
     - Identification of genes that are functionally related (i.e. modules)
     [Figure labels: functional relationship, expression data, network]


  10. Global approach
      Goal: select functional modules presenting an unexpected accumulation of high-scoring genes.
      Input parameters:
      - PPI (protein-protein interaction) network (strong manifestation of functional relations)
      - Gene scores from the limma statistic
      DiAMS: a 3-step process
      1. Preprocessing
      2. Local-score approach for module ranking
      3. Selection of significant modules

  11. Global approach, Step 1: Preprocessing
      High-dimensional network:
      - Exploring the huge space of possible gene subnetworks is infeasible
      Hierarchical clustering:
      - Captures much information about network topology
      - Makes the structure easy to traverse
      - Screens the entire network without constraints on module sizes
      "Walktrap" approach (Pons and Latapy 2006, JGAA):
      - Random-walk strategy
      - Distance (similarity measure between vertices)
      - Ward's criterion
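The distance ingredient of the Walktrap approach can be sketched in plain Python. The formula below is the random-walk distance as I understand it from Pons and Latapy, $r_{ij}(t) = \sqrt{\sum_k (P^t_{ik} - P^t_{jk})^2 / d(k)}$; the toy graph, the walk length `t=3`, and all function names are assumptions for illustration, and the Ward-style merging that builds the hierarchy is not shown.

```python
from math import sqrt


def transition_matrix(adj):
    """Row-normalise a 0/1 adjacency matrix into random-walk probabilities."""
    return [[a / sum(row) for a in row] for row in adj]


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_distance(adj, i, j, t=3):
    """Distance between vertices i and j based on t-step random walks."""
    p = transition_matrix(adj)
    pt = p
    for _ in range(t - 1):
        pt = mat_mul(pt, p)  # pt becomes P^t
    degree = [sum(row) for row in adj]
    return sqrt(sum((pt[i][k] - pt[j][k]) ** 2 / degree[k]
                    for k in range(len(adj))))


# Hypothetical toy network: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
d_same = walk_distance(adj, 0, 1)   # two vertices of the same triangle
d_cross = walk_distance(adj, 0, 5)  # vertices of different triangles
print(d_same, d_cross)
```

Vertices in the same densely connected group see the rest of the graph in a similar way after a short walk, so their distance is small; this is what lets a hierarchical clustering recover modules.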

  12. Global approach, Step 2: Local-score approach for module ranking
      [Figure: hierarchy over genes g1 to g6 with modules N1 to N5, built up progressively over slides 12 to 19]
      Iterative module ranking:
      1. Score each module $N_k$ (by summing gene scores)
      2. Identify the highest-scoring module (local-score statistic)
      3. Remove it
      4. Repeat steps 1) to 3) until all disjoint modules have been enumerated
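The iterative ranking of Step 2 can be sketched as follows. This is one reading of the slide, not the authors' code: the tree shape and gene scores are hypothetical, every subtree (single genes included) counts as a candidate module, and "remove it" is interpreted as discarding every candidate that shares a gene with the selected module.

```python
def subtrees(tree):
    """Yield the gene set of every subtree; a tree is a gene name or a tuple of trees."""
    if isinstance(tree, str):
        yield frozenset([tree])
        return
    genes = frozenset()
    for child in tree:
        child_sets = list(subtrees(child))
        yield from child_sets
        genes |= child_sets[-1]  # the last set yielded is the child's full gene set
    yield genes


def module_score(module, scores):
    """Score a module by summing the scores of its genes."""
    return sum(scores[g] for g in module)


def rank_modules(tree, scores):
    """Repeatedly pick the highest-scoring module, then drop all overlapping ones."""
    candidates = set(subtrees(tree))
    ranking = []
    while candidates:
        best = max(candidates, key=lambda m: module_score(m, scores))
        ranking.append((sorted(best), module_score(best, scores)))
        candidates = {m for m in candidates if not (m & best)}
    return ranking


# Hypothetical hierarchy over six genes and made-up gene scores.
tree = ((("g1", "g2"), "g3"), (("g4", "g5"), "g6"))
scores = {"g1": 2.0, "g2": 1.5, "g3": -2.0, "g4": -1.0, "g5": 3.0, "g6": -1.5}
ranking = rank_modules(tree, scores)
for genes, score in ranking:
    print(genes, score)
```

In this toy case the first module returned is {g1, g2}, whose score of 3.5 is the local-score statistic; the modules found afterwards are disjoint from it by construction.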

  20. Global approach, Step 3: Selection of significant modules
      Goal: assess the global significance of each module.
      Monte-Carlo approach:
      1. Permutation of sample labels
      2. Distribution under $H_0$
      3. $p$-value computation
      Selection of modules at the 5% FDR level.
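The three Monte-Carlo steps can be sketched generically. In DiAMS the permuted statistic would be the module score recomputed from permuted data; the sketch below substitutes a simple stand-in statistic (absolute difference of group means) so that it stays self-contained. The function names, data, number of permutations, and seed are all assumptions, and the FDR correction is not shown.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, statistic, n_perm=999, seed=0):
    """Monte-Carlo p-value: share of label permutations at least as extreme as observed."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # step 1: permute the sample labels
        if statistic(values, shuffled) >= observed:  # step 2: null distribution
            exceed += 1
    # Step 3: the +1 terms count the observed labelling itself.
    return (1 + exceed) / (1 + n_perm)


def abs_mean_difference(values, labels):
    """Stand-in statistic: absolute difference between the two group means."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g2 = [v for v, lab in zip(values, labels) if lab == 2]
    return abs(mean(g1) - mean(g2))


# Hypothetical expression values for eight samples in two conditions.
values = [5.1, 5.9, 6.3, 6.8, 2.2, 2.9, 3.1, 3.8]
labels = [1, 1, 1, 1, 2, 2, 2, 2]
p_value = permutation_p_value(values, labels, abs_mean_difference)
print(p_value)
```

Permuting sample labels rather than gene labels keeps the correlation between genes intact, so the null distribution reflects the dependence structure of the data.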


  22. Module scoring
      Individual gene scoring: the gene score is given by
      $$\nu_g = -\log(p_g) - \delta,$$
      - $p_g$: gene $p$-value from limma
      - $\delta$: a constant such that $E(\nu_g) \leq 0$
      [Figure: distribution of scores as a function of p-values; x-axis: p-values (0 to 1), y-axis: scores (-2 to 10), with $\delta$ marked]
      Local-score statistic. Definition: value of the highest-scoring module. Given $\mathcal{H}$, a hierarchical community structure, the local-score statistic is defined as
      $$L = \max_{H \subseteq \mathcal{H}} \sum_{g \in H} \nu_g,$$
      such that $H$ is a subtree of $\mathcal{H}$.
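The constraint on $\delta$ can be made concrete with a short derivation. It rests on two assumptions that the slide does not state: the logarithm is natural, and the expectation is taken under the null hypothesis, where $p_g$ is uniform on $(0, 1)$.

```latex
\begin{align*}
E\left(-\log p_g\right) &= -\int_0^1 \log p \,\mathrm{d}p
  = -\Big[\, p \log p - p \,\Big]_0^1 = 1, \\
E(\nu_g) &= E\left(-\log p_g\right) - \delta = 1 - \delta \leq 0
  \iff \delta \geq 1 .
\end{align*}
```

Under these assumptions any $\delta \geq 1$ gives a non-positive expected score, so only modules that accumulate unusually small $p$-values can reach a large positive sum.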



  25. Evaluation process
      Power and false-positive rate study
      Tree structure
