
CSci 8980: Advanced Topics in Graphical Models. Application: Gene Expression Analysis

Instructor: Arindam Banerjee. November 20, 2007.


  1. CSci 8980: Advanced Topics in Graphical Models. Application: Gene Expression Analysis. Instructor: Arindam Banerjee. November 20, 2007. Outline: Background; Gene Expression Analysis; Replicated Microarray Analysis

  2. Microarray Technology

  3. Microarray Data

  12. Microarray Data
      - $T$ gene expression profiles under $M$ conditions
      - $x_i$ is the profile for the $i$-th gene
      - $x_{im}$ is the value for gene $i$, condition $m$
      - Each profile is assumed to be generated by one cluster
      - Total $Q$ clusters, later $Q \to \infty$
      - Clustering is an assignment $(c_1, \ldots, c_T)$
      - Each $c_i \in \{1, \ldots, Q\}$
      - Want posterior probability over assignments
      - Idea: Use a Bayesian infinite mixture model
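
The notation on this slide maps directly onto arrays. The following is a minimal sketch with made-up toy values (the numbers, the 0-based labels, and the variable names are assumptions for illustration, not data from the course):

```python
import numpy as np

# Hypothetical toy data: T = 4 genes measured under M = 3 conditions.
X = np.array([
    [0.1, 0.2, 0.0],
    [0.0, 0.3, 0.1],
    [2.1, 1.9, 2.0],
    [2.0, 2.2, 1.8],
])
T, M = X.shape

# One cluster label per gene. Labels are 0-based here, so c_i is in {0, ..., Q-1}
# rather than {1, ..., Q} as on the slide.
c = np.array([0, 0, 1, 1])
Q = 2

x_2 = X[2]       # profile x_i of gene i = 2 (a length-M vector)
x_21 = X[2, 1]   # value x_im for gene i = 2 under condition m = 1

# Number of genes currently assigned to each cluster.
cluster_sizes = np.bincount(c, minlength=Q)
```

A clustering is then nothing more than the integer vector `c`; the posterior the slide asks for is a distribution over such vectors.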

  15. Infinite Mixture Model for Gene Expression
      - Level 1: Data generation
        $$p(x_i \mid c_i = j, \{\mu_h, \sigma_h^2\}_{h=1}^{Q}) = \mathcal{N}(x_i; \mu_j, \sigma_j^2 I)$$
      - Level 2: Priors for parameters
        $$p(\mu_j \mid \lambda, r) = \mathcal{N}(\mu_j; \lambda, r^{-1} I)$$
        $$p(\sigma_j^{-2} \mid \beta, w) = f_G(\sigma_j^{-2}; \beta/2, \beta w/2)$$
      - Level 2: Prior for clustering
        $$c_i \sim \mathrm{Discrete}(\pi)$$
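
To make the two levels concrete, here is a sketch of forward sampling from the finite ($Q$-component) version of the model. It assumes that $f_G(\cdot; a, b)$ denotes a Gamma density with shape $a$ and rate $b$ (the slides do not state the parameterization), and the function name and the hyperparameter values in the usage lines are hypothetical:

```python
import numpy as np

def sample_finite_mixture(T, M, Q, lam, r, beta, w, pi, rng):
    """Draw T profiles of length M from the Q-component mixture model."""
    # Level 2: cluster means mu_j ~ N(lam, (1/r) I).
    mu = lam + rng.standard_normal((Q, M)) / np.sqrt(r)
    # Level 2: precisions sigma_j^{-2} ~ Gamma(shape=beta/2, rate=beta*w/2).
    # ASSUMPTION: shape/rate reading of f_G. NumPy takes a scale, so scale = 1/rate.
    precision = rng.gamma(shape=beta / 2.0, scale=2.0 / (beta * w), size=Q)
    sigma2 = 1.0 / precision
    # Prior for clustering: c_i ~ Discrete(pi).
    c = rng.choice(Q, size=T, p=pi)
    # Level 1: x_i ~ N(mu_{c_i}, sigma2_{c_i} I).
    X = mu[c] + rng.standard_normal((T, M)) * np.sqrt(sigma2[c])[:, None]
    return X, c, mu, sigma2

rng = np.random.default_rng(0)
X, c, mu, sigma2 = sample_finite_mixture(
    T=50, M=6, Q=3, lam=np.zeros(6), r=0.25, beta=4.0, w=1.0,
    pi=np.array([0.5, 0.3, 0.2]), rng=rng,
)
```

Under the shape/rate reading the prior mean of each precision is $1/w$, so `w` sets the typical within-cluster variance and `beta` controls how tightly the cluster variances concentrate around it.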

  18. Infinite Mixture Model for Gene Expression (Contd.)
      - Prior for hyper-parameters
        $$p(w \mid \sigma_x^2) = f_G(w; 1/2, 1/(2\sigma_x^2))$$
        $$p(\beta) = f_G(\beta; 1/2, 1/2)$$
        $$p(r \mid \sigma_x^2) = f_G(r; 1/2, 1/(2\sigma_x^2))$$
        $$p(\lambda \mid \mu_x, \sigma_x^2) = f_N(\lambda; \mu_x, \sigma_x^2 I)$$
      - $(\mu_x, \sigma_x^2)$ are empirical mean, variance
      - Priors on cluster-prior $\pi$
        $$\pi \sim \mathrm{Dirichlet}(\alpha/Q)$$
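
The hyper-parameter priors can be sampled the same way. This sketch makes several assumptions that the slide leaves open: $f_G(\cdot; a, b)$ is read as Gamma with shape $a$ and rate $b$, $\mu_x$ is taken as the per-condition mean vector, $\sigma_x^2$ as a single pooled variance, and $\mathrm{Dirichlet}(\alpha/Q)$ as the symmetric Dirichlet with all $Q$ concentration parameters equal to $\alpha/Q$. The function name is hypothetical.

```python
import numpy as np

def sample_hyperparameters(X, alpha, Q, rng):
    """Draw (w, beta, r, lam, pi) from the priors, given data X of shape (T, M)."""
    mu_x = X.mean(axis=0)   # ASSUMPTION: empirical mean per condition
    sigma2_x = X.var()      # ASSUMPTION: one pooled empirical variance
    # Gamma(shape=a, rate=b) corresponds to NumPy's gamma(shape=a, scale=1/b).
    w = rng.gamma(shape=0.5, scale=2.0 * sigma2_x)      # rate 1 / (2 sigma2_x)
    beta = rng.gamma(shape=0.5, scale=2.0)              # rate 1 / 2
    r = rng.gamma(shape=0.5, scale=2.0 * sigma2_x)      # rate 1 / (2 sigma2_x)
    lam = mu_x + np.sqrt(sigma2_x) * rng.standard_normal(X.shape[1])
    # Symmetric Dirichlet over the Q mixing proportions.
    pi = rng.dirichlet(np.full(Q, alpha / Q))
    return {"w": w, "beta": beta, "r": r, "lam": lam, "pi": pi}

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
h = sample_hyperparameters(X, alpha=1.0, Q=5, rng=rng)
```

Tying the priors to the empirical mean and variance keeps them on the scale of the data without hand-tuning.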

  22. Gibbs Sampler
      - As $Q \to \infty$, we get a DPM
      - Gibbs sampler for the model
        $$p(c_i = j \mid c_{-i}, x_i, \mu_j, \sigma_j^2) \propto \frac{n_{-i,j}}{T - 1 + \alpha}\, \mathcal{N}(x_i; \mu_j, \sigma_j^2 I)$$
        $$p(c_i \neq c_j,\ j \neq i \mid c_{-i}, x_i, \mu_x, \sigma_x^2) \propto \frac{\alpha}{T - 1 + \alpha} \int \mathcal{N}(x_i; \mu, \sigma^2 I)\, p(\mu, \sigma, \ldots)$$
        (the prior term inside the integral is cut off in the source slide)
      - Pairwise probability of being generated by the same pattern
        $$P_{ij} = \frac{\#\ \text{samples after `burn-in' with}\ c_i = c_j}{S - B}$$
      - Distance $D_{ij} = 1 - P_{ij}$
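
Two pieces of this slide translate directly into code: the weight of each existing cluster in the Gibbs update, and the pairwise co-clustering summary computed from the sampler output. The sketch below assumes $n_{-i,j}$ is the number of genes other than $i$ currently in cluster $j$, and that $S$ and $B$ are the total number of Gibbs samples and the number discarded as burn-in (the slide does not define them). The function names are hypothetical, and the new-cluster term is left out because its integrand is truncated in the source.

```python
import numpy as np

def existing_cluster_log_weights(x_i, c_minus_i, mu, sigma2, alpha):
    """Unnormalised log p(c_i = j | ...) for every existing cluster j.

    x_i: (M,) profile; c_minus_i: labels of the other T-1 genes;
    mu: (Q, M) cluster means; sigma2: (Q,) cluster variances.
    """
    Q, M = mu.shape
    T = c_minus_i.size + 1
    n = np.bincount(c_minus_i, minlength=Q)              # n_{-i,j}
    sq_dist = np.sum((x_i - mu) ** 2, axis=1)
    log_lik = -0.5 * (M * np.log(2 * np.pi * sigma2) + sq_dist / sigma2)
    with np.errstate(divide="ignore"):                   # empty clusters give -inf
        log_prior = np.log(n) - np.log(T - 1 + alpha)
    return log_prior + log_lik

def coclustering_distance(samples, burn_in):
    """samples: (S, T) array of sampled assignments. Returns (P, D)."""
    kept = samples[burn_in:]                             # the S - B retained samples
    same = kept[:, :, None] == kept[:, None, :]          # (S - B, T, T) indicators
    P = same.mean(axis=0)                                # fraction with c_i == c_j
    return P, 1.0 - P

samples = np.array([
    [0, 0, 1],   # discarded as burn-in
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
])
P, D = coclustering_distance(samples, burn_in=1)
# P[0, 1] is 0.75: genes 0 and 1 share a label in 3 of the 4 retained samples.
```

Because $P_{ij}$ only asks whether two labels are equal, it is unaffected by label switching between samples, which is why the last row (labels swapped) still counts genes 0 and 1 as co-clustered.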

  27. Model for Replicated Data
      - Generate replicates of each expression
      - Account for variability in gene expression data
      - For $G$ replicates
        $$p(x_{ik} \mid y_i, \psi_i) = \mathcal{N}(x_{ik}; y_i, \psi_i^2)$$
        $$p(y_i \mid c_i = j, \mu_j, \sigma_j) = \mathcal{N}(y_i \mid \mu_j, \sigma_j^2 I)$$
      - Integrating out the mean expression profile
        $$p(x_{i1}, \ldots, x_{ik} \mid c_i = j, \mu_j, \sigma_j^2, \psi_i) = \mathcal{N}(\bar{x}_i; \mu_j, (\sigma_j^2 + \psi_i^2 / G) I)$$
      - Gibbs sampler used for inference
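
The variance $\sigma_j^2 + \psi_i^2 / G$ in the integrated likelihood can be checked numerically. The sketch below assumes $\bar{x}_i$ is the average of the $G$ replicate profiles of gene $i$ (the slide does not define it) and works with a single condition in the simulation for simplicity; the function name and all numeric values are hypothetical.

```python
import numpy as np

def replicate_mean_logpdf(x_reps, mu_j, sigma2_j, psi2_i):
    """log N(x_bar_i; mu_j, (sigma2_j + psi2_i / G) I) for one gene.

    x_reps: (G, M) array holding the G replicate profiles of the gene.
    """
    G, M = x_reps.shape
    x_bar = x_reps.mean(axis=0)          # ASSUMPTION: x_bar_i is the replicate mean
    var = sigma2_j + psi2_i / G
    return -0.5 * (M * np.log(2 * np.pi * var) + np.sum((x_bar - mu_j) ** 2) / var)

# Simulation check: draw y ~ N(0, sigma2), then G replicates x_k ~ N(y, psi2),
# and compare the variance of the replicate mean with sigma2 + psi2 / G.
rng = np.random.default_rng(1)
G, sigma2, psi2, n_draws = 4, 1.5, 2.0, 200_000
y = rng.normal(0.0, np.sqrt(sigma2), size=n_draws)
x = y[:, None] + rng.normal(0.0, np.sqrt(psi2), size=(n_draws, G))
empirical_var = x.mean(axis=1).var()
expected_var = sigma2 + psi2 / G
```

Averaging shrinks only the replicate noise $\psi_i^2$ by a factor of $G$; the between-gene spread $\sigma_j^2$ within a cluster is untouched, so more replicates sharpen the likelihood but never below the cluster's own variance.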

  28. Experimental Results
      - Move to paper for results

  29. Results

  31. Gibbs Sampler
      - For the replicated model
        $$p(c_i = q \mid C_{-i}, x_{i1}, \ldots, x_{ik}, \mu_q, \sigma_q^2, \psi_i, \alpha) \propto \frac{n_{-i,q}}{T - 1 + \alpha}\, \mathcal{N}(\bar{x}_i; \mu_q, (\sigma_q^2 + \psi_i^2 / G) I)$$
        $$p(c_i \neq c_j,\ j \neq i \mid C_{-i}, x_{ik}, \psi_i, \alpha) \propto \frac{\alpha}{T - 1 + \alpha} \int \mathcal{N}(\bar{x}_i; \mu, (\sigma^2 + \psi_i^2 / G) I)\, p(\mu, \sigma^2 \mid \lambda, \tau, \ldots)$$
        (the prior term inside the integral is cut off in the source slide)
      - The integral is implemented as follows
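
The transcript ends before showing how the integral is implemented, so the following is not the course's method. It is a generic sketch of one common option, a Monte Carlo average of the likelihood over draws of $(\mu, \sigma^2)$ from their prior, combined with the existing-cluster weights from the slide. The function names, the `prior_draws` argument, and the toy prior used to generate it are all assumptions.

```python
import numpy as np

def log_normal_iso(x, mu, var):
    """log N(x; mu, var * I), broadcasting over leading axes of mu and var."""
    M = x.shape[-1]
    return -0.5 * (M * np.log(2 * np.pi * var) + np.sum((x - mu) ** 2, axis=-1) / var)

def assignment_probs(x_reps, c_minus_i, mu, sigma2, psi2_i, alpha, prior_draws):
    """Probabilities over the Q existing clusters plus one new cluster (last entry)."""
    G = x_reps.shape[0]
    x_bar = x_reps.mean(axis=0)
    Q = mu.shape[0]
    T = c_minus_i.size + 1
    n = np.bincount(c_minus_i, minlength=Q)                       # n_{-i,q}
    with np.errstate(divide="ignore"):
        log_w_old = (np.log(n) - np.log(T - 1 + alpha)
                     + log_normal_iso(x_bar, mu, sigma2 + psi2_i / G))
    # ASSUMPTION: approximate the integral by averaging over prior draws.
    mu_s, sigma2_s = prior_draws                                  # (K, M) and (K,)
    log_terms = log_normal_iso(x_bar, mu_s, sigma2_s + psi2_i / G)
    log_integral = np.logaddexp.reduce(log_terms) - np.log(log_terms.size)
    log_w_new = np.log(alpha) - np.log(T - 1 + alpha) + log_integral
    log_w = np.append(log_w_old, log_w_new)
    p = np.exp(log_w - log_w.max())                               # stable normalisation
    return p / p.sum()

rng = np.random.default_rng(2)
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
sigma2 = np.array([1.0, 1.0])
# Hypothetical prior draws, standing in for samples of (mu, sigma^2) from the prior.
prior_draws = (rng.normal(0.0, 3.0, size=(500, 2)), 1.0 / rng.gamma(2.0, 1.0, size=500))
x_reps = np.array([[4.8, 5.1], [5.2, 4.9], [5.0, 5.0]])
p = assignment_probs(x_reps, np.array([0, 0, 1, 1]), mu, sigma2,
                     psi2_i=0.3, alpha=1.0, prior_draws=prior_draws)
```

Working in log space and subtracting the maximum before exponentiating avoids underflow when a gene sits far from most clusters, which is the usual situation in high-dimensional profiles.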
