Random matrix analysis for gene co-expression experiments in cancer cells
OIST-iTHES-CTSR 2016, July 9th, 2016
Ayumi KIKKAWA (MTPU, OIST)
Introduction: What is co-expression of genes?
• There are 20~30k genes in human DNA.
• They include both coding and non-coding genes.
• Complex networks of various transcripts.
• Gene interaction network (regulatory network)
• Protein-protein interactions.
• mRNA, non-coding RNA, micro RNA, etc.
• Transcriptomes
• Systems biology
Ref: Jonsson, P.F. and Bates, P.A. (2006) Global topological features of cancer proteins in the human interactome. Bioinformatics, 22, 2291–2297.
From microarray experiments to gene interaction networks
• NCBI GEO https://www.ncbi.nlm.nih.gov/gds
  GEO is an international public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomics data submitted by the research community.
  • Series: 70,997
  • Platforms: 16,042
  • Samples: 1,858,012
• The expression of more than 10k genes is measured in a single assay.
• Meta-analysis over many experiments is possible.
• The gene interaction network should change its topology across cellular states, including disease.
• Gene interaction networks are inferred with a Bayesian network algorithm (SiGN-NNSR).
The cancer gene interaction database: TCNG
The Cancer Network Galaxy (TCNG) http://tcng.hgc.jp
[Figure: experimental data (Gene1, Gene2, … by Sample 1, 2, 3, …, n), Bayesian network learning algorithm (SiGN) with a nonparametric Bayesian network model, resulting Bayesian network 1, Bayesian network 2, …]
• K (京): RIKEN supercomputer
• Based on 256 GEO datasets.
• Total nodes = 22820
• Total edges ~ 16M
Ref: Tamada, Y. et al. Estimating genome-wide gene networks using nonparametric Bayesian network models on massively parallel computers. IEEE/ACM Trans. Comput. Biol. Bioinforma. 8, 683–697 (2011).
RMT analysis for gene interaction
• Random matrix theory (RMT) can be applied to various biological networks; we have previously studied protein-protein interaction (PPI) networks.
• In many organisms, the PPI network shows universal behavior: the nearest neighbor level (NNL) spacing distribution P(s) follows the Wigner distribution.
• The important feature of these level statistics is that the eigenvalues (levels) of the adjacency matrix repel each other.
• In the opposite case, the levels are mutually uncorrelated and the spacing distribution follows the Poisson distribution.
• How gene networks differ between normal and diseased cells is an important question.
• We apply RMT to cancer gene networks to study whether cancer cells show distinctive topological behavior.
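For reference, the two limiting forms usually compared in this kind of analysis can be written as below. This is a sketch under two assumptions not stated on the slide: the spacings s are normalised to unit mean, and "Wigner distribution" means the Wigner surmise for real symmetric matrices (GOE).

```latex
P_{\mathrm{Poisson}}(s) = e^{-s},
\qquad
P_{\mathrm{Wigner}}(s) = \frac{\pi}{2}\, s \, \exp\!\left(-\frac{\pi s^{2}}{4}\right)
```

The factor s in the Wigner form makes P(s) vanish as s goes to 0, which is the level repulsion mentioned above, whereas the Poisson form is largest at s = 0.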
The workflow
The statistics of the TCNG data
[Figure panels: Number of inferred edges; Number of samples]
Frequency (edge attribute): Edge attribute calculated by SiGN-BN NNSR. It represents the frequency with which the edge was estimated during the iterations of the NNSR algorithm. The value ranges from 0 to 1. By default, an edge with Freq greater than 0.2 is regarded as being estimated. You can consider this value as the confidence of the estimated edge. It does not represent the accuracy or the strength of the edge.
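The Freq cut described above can be sketched as follows. This is only an illustration: the edge record layout, the gene names, the helper names `filter_edges` and `undirected_adjacency`, and the choice to ignore edge direction are all assumptions, not the TCNG file format or the SiGN implementation.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical edge record: (source gene, target gene, Freq in [0, 1]).
Edge = Tuple[str, str, float]


def filter_edges(edges: Iterable[Edge], min_freq: float = 0.2) -> List[Edge]:
    """Keep edges whose Freq is strictly greater than min_freq."""
    return [(u, v, f) for (u, v, f) in edges if f > min_freq]


def undirected_adjacency(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map (assumption: direction is ignored)."""
    adj: Dict[str, Set[str]] = {}
    for u, v, _ in edges:
        if u == v:
            continue  # skip self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Made-up example data, not taken from TCNG.
edges = [
    ("geneA", "geneB", 0.85),
    ("geneB", "geneC", 0.25),
    ("geneA", "geneC", 0.20),  # exactly 0.2 is not "greater than 0.2"
    ("geneC", "geneD", 0.05),
]

default_net = undirected_adjacency(filter_edges(edges))        # Freq > 0.2
strict_net = undirected_adjacency(filter_edges(edges, 0.5))    # stricter cut

print(sorted(default_net))
print(sorted(strict_net))
```

Raising `min_freq` keeps only higher-confidence edges, so the same dataset yields a sparser network; this is the knob varied when edges are selected by confidence.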
Change from Poisson to Wigner-Dyson (WD) distribution with network size
[Figure panels]
• #236 (GSE7904): 51 samples, 8000 nodes, 32,124 edges
• #165 (GSE29013): 50 samples, 8000 nodes, 51,702 edges
Change from Poisson to Wigner-Dyson (WD) distribution with the confidence factor of the edges
[Figure panels]
• #18 (GSE11135): 204 samples, 21,001 edges
• #26 (GSE12276): 204 samples, 51,994 edges
#92: 111 samples, 26,717 edges
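A spacing analysis of the kind summarised in these panels could be sketched as below. The slides do not say how the spectra were unfolded, so the polynomial fit to the eigenvalue staircase, the trimming of the spectrum edges, and the helper names are assumptions; two synthetic spectra stand in for real TCNG networks.

```python
import numpy as np


def unfolded_spacings(eigenvalues, poly_degree=9, trim=0.1):
    """Nearest-neighbour spacings normalised to unit mean.

    Unfolding (assumed method): fit a smooth polynomial to the staircase
    N(E) and map each level E_i to the fitted value, so that the mean
    level density becomes roughly constant.
    """
    levels = np.sort(np.asarray(eigenvalues, dtype=float))
    staircase = np.arange(1, levels.size + 1)
    fit = np.polynomial.Polynomial.fit(levels, staircase, poly_degree)
    unfolded = fit(levels)
    k = int(trim * levels.size)          # drop the spectrum edges
    spacings = np.diff(unfolded[k:levels.size - k])
    spacings = spacings[spacings > 0]    # degenerate levels / non-monotonic fit
    return spacings / spacings.mean()


def wigner_surmise(s):
    """GOE Wigner surmise for unit mean spacing."""
    return (np.pi / 2) * s * np.exp(-np.pi * s**2 / 4)


def poisson_law(s):
    """Spacing distribution of uncorrelated levels with unit mean spacing."""
    return np.exp(-s)


def distance_to(reference, spacings, bins=20, s_max=3.0):
    """Mean squared deviation between the spacing histogram and a curve."""
    hist, edges = np.histogram(spacings, bins=bins, range=(0.0, s_max), density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return float(np.mean((hist - reference(centres)) ** 2))


rng = np.random.default_rng(0)
n = 1000

# Stand-in 1: a random real symmetric matrix (level repulsion expected).
a = rng.normal(size=(n, n))
s_goe = unfolded_spacings(np.linalg.eigvalsh((a + a.T) / 2))

# Stand-in 2: independent random levels (no correlation expected).
s_poi = unfolded_spacings(rng.uniform(size=n))

for name, s in [("symmetric random matrix", s_goe), ("independent levels", s_poi)]:
    print(name,
          "distance to Wigner:", round(distance_to(wigner_surmise, s), 4),
          "distance to Poisson:", round(distance_to(poisson_law, s), 4))
```

For a real network, the eigenvalues passed to `unfolded_spacings` would come from the symmetric adjacency matrix of the thresholded edge list; the spectrum that lies closer to one reference curve than the other is then labelled Wigner-like or Poisson-like.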
Summary
i. From the viewpoint of RMT, we have observed universal behavior in gene interaction networks of cancer cells, using data from the TCNG database.
ii. The nearest-neighbor spacing (NNS) distribution of the gene interaction matrix changes from the Poisson distribution to the Wigner distribution as the network size is enlarged.
iii. The same change of the NNS distribution from Poisson to Wigner is observed when a strict confidence factor is required of the inferred edges.
iv. As far as our study goes, the Poisson distribution has so far been observed only in cancer-related molecular networks (PPI or gene interaction networks).