
MGT-SM: A Method for Constructing Cellular Signal Transduction Networks



  1. GIW 2016: International Conference on Genome Informatics
     MGT-SM: A Method for Constructing Cellular Signal Transduction Networks
     Min Li, Ruiqing Zheng, Yaohang Li, Fang-Xiang Wu, and Jianxin Wang
     School of Information Science and Engineering, Central South University
     Homepage: http://bioinformatics.csu.edu.cn/

  2. Outline ■ Background ■ Traditional Granger ■ Method ■ Result & Conclusion

  3. Background ■ A signal transduction network is a directed network composed of the molecules and genes involved in signal transduction pathways.

  4. Background
     Gene expression data is meaningful data for constructing the network. Existing approaches:
     ■ Bayesian Network
     ■ Mutual Information
     ■ Ensemble method
     ■ Regression: Granger causality is an effective method
     [Figure: gene expression time series for v1, v2, v3, ..., vn-2, vn-1, vn → Granger test → directed network]

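  5. Traditional Granger
     ■ Pairwise
     ■ Multivariate
     General form
     ■ Step 1. Linear regression for the coefficient matrix
       - Pairwise: $y_t = \sum_{i=1}^{q} \alpha_i y_{t-i} + \sum_{j=1}^{q} \beta_j x_{t-j} + e_t, \quad t = q+1, \dots, T$, with null hypothesis $\beta_1 = \dots = \beta_q = 0$
       - Multivariate: $x_{i,t} = \sum_{1 \le j \le n} \sum_{k=1}^{q} r_{i,j,k}\, x_{j,t-k} + e_{i,t}, \quad t = q+1, \dots, T, \; i = 1, \dots, n$
     ■ Step 2. F-test for the p-value: $F = \dfrac{(RSS_{i,j} - RSS_i)/q}{RSS_i/(m - nq)} \sim F(q,\, m - nq)$, where $RSS_i$ is the residual sum of squares of the full regression for gene $i$ and $RSS_{i,j}$ is that of the regression with gene $j$ excluded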
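The pairwise regression and F statistic on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `lagged` and `pairwise_granger_f` are hypothetical, the series are assumed to be normalized (so no intercept is fitted, as in the slide's model), and only the F statistic is returned. With two series, n = 2, so the denominator degrees of freedom m - nq become m - 2q.

```python
import numpy as np


def lagged(series, q):
    """Return an (T - q) x q matrix whose columns are the series at lags 1..q."""
    T = len(series)
    return np.column_stack([series[q - lag:T - lag] for lag in range(1, q + 1)])


def pairwise_granger_f(x, y, q=1):
    """F statistic for the null hypothesis that x does not Granger-cause y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[q:]                       # y_t for t = q+1, ..., T
    m = len(target)                      # m = T - q usable observations
    own = lagged(y, q)                   # restricted model: own lags only
    full = np.hstack([own, lagged(x, q)])  # full model: own lags plus lags of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_restricted = rss(own)
    rss_full = rss(full)
    # Compare against an F distribution with (q, m - 2q) degrees of freedom.
    return ((rss_restricted - rss_full) / q) / (rss_full / (m - 2 * q))
```

A large value of the statistic for (x, y) together with a small value for (y, x) suggests a directed edge from x to y.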

  6. Traditional Granger: Problems
     ■ Pairwise Granger tests introduce indirect edges
       [Figure: networks on v1, v2, v3 from (a) bivariate Granger and (b) multivariate Granger, marking direct and indirect edges]
     ■ For real gene expression data, sometimes m ≪ nq (n is the number of genes, m = T - q), in which case the traditional Granger test is not applicable

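  7. Method
     We propose an extended Granger test combining SVD and Monte Carlo simulation.
     Step 1. Calculate the coefficient matrix
     ■ Build $Y = RX + E$, where
       $$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,T-1} \\ \vdots & \ddots & \vdots \\ x_{n,1} & \cdots & x_{n,T-1} \end{bmatrix}, \qquad Y = \begin{bmatrix} x_{1,2} & \cdots & x_{1,T} \\ \vdots & \ddots & \vdots \\ x_{n,2} & \cdots & x_{n,T} \end{bmatrix}$$
     ■ Apply SVD to obtain the coefficient matrix $R$: $X = U S V^{\top}$, $\hat{R} = Y V S^{-1} U^{\top}$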
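The SVD step on this slide amounts to multiplying Y by the pseudo-inverse of X. The sketch below shows one way to do that with NumPy. It is an illustration under assumptions, not the authors' implementation: the function name `estimate_coefficients` and the cutoff `rcond` for discarding near-zero singular values are hypothetical choices, and a lag of one time step is assumed, matching the matrices X and Y on the slide.

```python
import numpy as np


def estimate_coefficients(data, rcond=1e-10):
    """Estimate R in Y = R X + E from an (n genes) x (T time points) array.

    X holds time points 1..T-1 and Y holds time points 2..T, so R is n x n.
    """
    X = data[:, :-1]
    Y = data[:, 1:]
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Invert only the singular values that are clearly non-zero.
    s_inv = np.zeros_like(s)
    keep = s > rcond * s.max()
    s_inv[keep] = 1.0 / s[keep]
    # R = Y V S^{-1} U^T
    return Y @ Vt.T @ np.diag(s_inv) @ U.T
```

Because the pseudo-inverse exists for any matrix shape, the same call works when there are more genes than time points, which is the m ≪ nq situation where the ordinary regression cannot be solved.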

  8. Method
     Step 2. p-value for pairwise <i, j>: Monte Carlo simulation instead of the F-test

     MGT-SM ALGORITHM
     Input: time-series gene expression data matrix
     Output: the directed edges with a significance level
     STEP 1. Normalize the time-series gene expression data.
     STEP 2. Use the lmc test to analyze the stationarity of the expression; if the expression is nonstationary, use the first-order difference.
     STEP 3. Employ SVD to calculate the coefficient matrix.
     STEP 4. For gene i, use Monte Carlo simulation to calculate the p-value of edge(i, j).
     STEP 5. Repeat step 4 for all genes, and save the significant edges in a file.

     Monte Carlo Simulation
     Input: time-series expression data matrix and gene index i
     Output: the p-value
     STEP 1. Calculate $RSS_i$ based on the estimation of $R$.
     STEP 2. Shuffle the order of the expression of gene j in the regression and calculate $RSS_{i,j}$.
     STEP 3. Repeat step 2 k times to get the distribution of $RSS_{i,j}$.
     STEP 4. Rank $RSS_i$ among the $RSS_{i,j}$ values in ascending order and calculate the p-value as $p = \mathrm{rank}(RSS_i)/(k + 1)$.
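The Monte Carlo procedure on this slide can be sketched as a permutation test. The details below are assumptions rather than the authors' code: the function name `monte_carlo_p_value` is hypothetical, a lag of one time step is used, the estimated coefficient matrix R is kept fixed while gene j is shuffled (the slide does not say whether R is refitted), edge(i, j) is read as "gene j helps predict gene i", and ties are counted against the edge.

```python
import numpy as np


def monte_carlo_p_value(data, R, i, j, k=999, seed=0):
    """Permutation p-value for the influence of gene j on gene i.

    data: (n genes) x (T time points) array; R: n x n coefficient matrix.
    """
    rng = np.random.default_rng(seed)
    X = data[:, :-1]            # predictors: time points 1..T-1
    y_i = data[i, 1:]           # target: gene i at time points 2..T
    # STEP 1: residual sum of squares with the original data.
    rss_i = np.sum((y_i - R[i] @ X) ** 2)
    # STEPS 2-3: shuffle gene j over time, k times, and recompute the RSS.
    rss_shuffled = np.empty(k)
    for b in range(k):
        X_perm = X.copy()
        X_perm[j] = rng.permutation(X[j])
        rss_shuffled[b] = np.sum((y_i - R[i] @ X_perm) ** 2)
    # STEP 4: rank of rss_i in ascending order among the k shuffled values.
    rank = 1 + np.sum(rss_shuffled <= rss_i)
    return rank / (k + 1)
```

If gene j carries real predictive information for gene i, shuffling it should inflate the residuals, so the original RSS ranks near the bottom and the p-value approaches its minimum of 1/(k + 1).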

  9. Results & Conclusion
     The datasets of the experiment:

     Dataset                  Genes  Samples  Time points  Real edges
     Simulation                   5       10           10           7
     Yeast Synthetic Network      5        4           17           7
     MDA-MB-468                  20        4            8          48

  10. ■ Recall and AUROC for evaluation: $\mathrm{Recall} = \dfrac{TP}{TP + FN}$
      ■ Methods compared: CGC2SPR, PGC, DBN (Dynamic Bayesian Network)
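As a small illustration of the two evaluation measures, the sketch below computes recall from a set of predicted edges and AUROC from per-edge scores. The function names, the representation of edges as (source, target) tuples, and the toy scores are all made up for the example; AUROC is computed here as the fraction of (real edge, non-edge) pairs that the scores order correctly, with ties counted as half.

```python
def recall(predicted_edges, real_edges):
    """Recall = TP / (TP + FN) over sets of directed edges."""
    tp = len(predicted_edges & real_edges)
    fn = len(real_edges - predicted_edges)
    return tp / (tp + fn)


def auroc(edge_scores, real_edges):
    """Area under the ROC curve from a dict {edge: score}; higher score = more confident."""
    pos = [s for e, s in edge_scores.items() if e in real_edges]
    neg = [s for e, s in edge_scores.items() if e not in real_edges]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up edges and scores.
real = {(0, 1), (1, 2)}
predicted = {(0, 1), (2, 0)}
scores = {(0, 1): 0.9, (1, 2): 0.4, (2, 0): 0.6, (0, 2): 0.1}
print(recall(predicted, real))  # 0.5
print(auroc(scores, real))      # 0.75
```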

  11. Result ■ Simulation data, 10 samples: all of the top 7 edges predicted by MGT-SM are real edges

  12. Result ■ Recall and ROC in Yeast Synthetic Network
      [Figure: four ROC panels, true positive rate vs. false positive rate. Legend values by panel:]

      Panel             MGT-SM  DBN    CGC2SPR  PGC
      (a) Switch Off 1  0.729   0.563  0.635    0.625
      (b) Switch Off 2  0.677   0.594  0.354    0.542
      (c) Switch Off 3  0.740   0.625  0.396    0.25
      (d) Switch Off 4  0.563   0.625  0.396    0.25

  13. Result ■ Recall and ROC in MDA-MB-468
      [Figure: four ROC panels, true positive rate vs. false positive rate. Legend values by panel:]

      Panel         MGT-SM  DBN    CGC2SPR  PGC
      (a) EGF 0ng   0.612   0.493  0.561    0.554
      (b) EGF 5ng   0.539   0.533  0.423    0.522
      (c) EGF 10ng  0.590   0.494  0.450    0.512
      (d) EGF 20ng  0.623   0.493  0.437    0.502

  14. Discussion
      ■ MGT-SM, which combines SVD and Monte Carlo simulation, is widely applicable regardless of the size of the gene set
      ■ MGT-SM performs better than the compared methods (CGC2SPR, PGC, DBN) in constructing signal transduction networks
      ■ Granger methods that incorporate prior knowledge are a meaningful direction for further study

  15. Acknowledgment
      ■ This work was supported in part by the National Natural Science Foundation of China, No. 61622213, No. 61370024, No. 61232001, and No. 61428209.
      ■ Co-Authors:

  16. Thanks for your attention!
