Gene position scoring within transcription regulation networks Ivan Junier, Joan Hérisson, Mohamed Elati, François Képès Programme d’Épigénomique, Évry, France
Outline Why positions? How to score? Which outcome?
E. coli: conserved paired genes and their relative position. Evolutionarily conserved gene pairs (Wright et al., PNAS, 104, 10559–10564, 2007). [Figure labels: Ori, Ter.] Gene position <--> Transcription regulation
Co-regulation and gene position. 1D co-regulation is frequent in prokaryotes. [Diagram labels: TF, TF gene, binding site, regulated TU.] Periodic positioning of genes regulated by the same TFs. 3D co-localization --> rapid search hypothesis. Grid interval: 92.8 kbp. Képès, JMB, 2004
Co-regulation and gene position. Periodic positioning of genes regulated by the same TFs in yeast. Grid intervals (three panels): 7.75 kbp, 15.5 kbp, 15.5 kbp. Képès, JMB, 2003
Why are positions important? Spatial co-localization. Conceptual framework: Képès, Vaillant, ComplexUS, 2003. Experimental facts in bacterial cells and in the eukaryotic nucleus: Jackson et al., Mol. Biol. Cell, 1998; Cook et al., Nature, 2002; Cabrera, Jin, JMB, 2004. [Image scale bars: 2 µm, 1 µm.]
Why are positions important? Polymer theory. [Figure panels: Periodic, Random.] I. Junier, O. Martin, F. Képès, submitted to Biophys. J.
How to detect periodicity? 1) What is periodic? 2) Noise: genes out of the periodicity (false positives); blank sites (false negatives); fluctuations around the original sites.
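The three noise sources listed above can be mimicked on synthetic data, which is handy for testing any periodicity detector. The sketch below is only an illustration: the function name, its parameters, and the choice of Gaussian jitter and uniformly placed false positives are assumptions, not taken from the slides.

```python
import random

def noisy_periodic_positions(period, n_sites, genome_length, keep_prob=0.8,
                             jitter_sd=0.0, n_false=0, seed=0):
    """Return sorted positions on a circular genome of length genome_length.

    Hypothetical generator: sites sit at multiples of `period`, then noise is added.
    """
    rng = random.Random(seed)
    positions = []
    for k in range(n_sites):
        # Blank site (false negative): the periodic slot is left empty.
        if rng.random() > keep_prob:
            continue
        # Fluctuation around the original site (Gaussian jitter is an assumption).
        x = k * period + rng.gauss(0.0, jitter_sd)
        positions.append(x % genome_length)
    # Genes out of the periodicity (false positives), placed uniformly at random.
    for _ in range(n_false):
        positions.append(rng.uniform(0.0, genome_length))
    return sorted(positions)

if __name__ == "__main__":
    print(noisy_periodic_positions(100, 10, 1000, keep_prob=0.8,
                                   jitter_sd=3.0, n_false=2))
```

Setting `keep_prob=1.0`, `jitter_sd=0.0` and `n_false=0` gives back the noise-free periodic grid, which makes a convenient baseline.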
Solenoidal framework
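One way to read the solenoidal framework, assumed here since the slide gives only the title, is that the chromosome is wound around a solenoid of a trial period, so that each site is described by its fractional position along one turn. Sites separated by a multiple of the period then share the same phase. The helper names below are hypothetical.

```python
def solenoid_phase(position, period):
    """Fractional position along one solenoid turn, in [0, 1)."""
    return (position % period) / period

def circular_distance(phase_a, phase_b):
    """Forward distance from phase_a to phase_b on a circle of unit circumference."""
    return (phase_b - phase_a) % 1.0

if __name__ == "__main__":
    # Sites spaced by exactly one period fall on the same phase.
    print([solenoid_phase(x, 100) for x in (10, 110, 210)])
```

With this mapping, a periodic arrangement along the chromosome turns into a cluster of phases on the circle, which is what the clustering detection then looks for.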
Periodicity detection = clustering detection. Principle: better score than [comparison shown as a figure]. Play with the period: spectrum (score vs. period).
Statistics of circular distributions
Binomial (plotted example: $\rho_{11,3}(x)$ for $x$ from 0 to 1; sites labelled $i$ and $j$):
$$\rho_{N,|j-i|}(X = x) = C_{N-1}^{|j-i|}\, x^{|j-i|-1}\, (1-x)^{N-|j-i|-2}$$
Pair score:
$$s(x_{ij}) = -\log\left[ p_v\left(x_{ij} \mid \rho_{N,|j-i|}\right) \right]$$
Gene score (sum over the $n$ first neighbors):
$$S_i(\{x_{ij}\}) = \frac{1}{n} \sum_{j=(i+1)\%N}^{(i+n)\%N} s(x_{ij})$$
Final score:
$$S(\{x\}) = \frac{1}{N} \sum_i S_i(\{x_{ij}\})$$
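The pair, gene and final scores can be sketched in a few lines of Python. Several points are assumptions rather than the authors' implementation: the p-value $p_v$ is taken as the probability that at least $|j-i|$ of the other $N-1$ sites fall within an arc $x_{ij}$ of site $i$ under uniform placement (a binomial tail), positions are folded modulo a trial period before scoring, and a small `floor` avoids taking the logarithm of zero when two phases coincide. All function names are hypothetical.

```python
import math

def pair_p_value(x, k, n_sites):
    """P(at least k of the other n_sites - 1 sites lie within arc x), uniform null."""
    m_other = n_sites - 1
    return sum(math.comb(m_other, m) * x**m * (1.0 - x)**(m_other - m)
               for m in range(k, m_other + 1))

def clustering_score(positions, period, n_neighbors=1, floor=1e-12):
    """Final score S: average over sites of the mean pair score to the
    n_neighbors next sites on the circle (requires n_neighbors < number of sites)."""
    phases = sorted((p % period) / period for p in positions)
    n_sites = len(phases)
    total = 0.0
    for i in range(n_sites):
        s_i = 0.0
        for k in range(1, n_neighbors + 1):
            j = (i + k) % n_sites                     # k-th neighbor, wrapping around
            x_ij = (phases[j] - phases[i]) % 1.0      # forward circular distance
            p_v = max(pair_p_value(x_ij, k, n_sites), floor)
            s_i += -math.log(p_v)                     # pair score s(x_ij)
        total += s_i / n_neighbors                    # gene score S_i
    return total / n_sites

def clustering_spectrum(positions, periods, n_neighbors=1):
    """Score vs. period, as a list of (period, score) pairs."""
    return [(p, clustering_score(positions, p, n_neighbors)) for p in periods]

if __name__ == "__main__":
    sites = [3, 98, 205, 301, 397, 502, 606, 699, 804, 908]
    for period, score in clustering_spectrum(sites, [80, 100, 137]):
        print(period, round(score, 3))
```

Note that integer positions combined with integer trial periods can produce exactly coincident phases, in which case the score is dominated by the `floor` value; this is an artifact of the sketch, not a property of the method.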
Example. [Figure panels: positions; interdistances; clustering spectrum (score vs. period); discrete Fourier amplitude spectrum (amplitude vs. period).] I. Junier, J. Hérisson, F. Képès, to be submitted
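For comparison with the clustering spectrum, a Fourier-type amplitude can be computed directly from the site positions. The sketch below uses the normalized modulus of the sum of unit phasors at a trial period; whether the slide's discrete Fourier spectrum was computed this way or from binned interdistances is not stated, so this form is an assumption, and the function name is hypothetical.

```python
import cmath

def fourier_amplitude(positions, period):
    """Normalized amplitude |sum_k exp(2*pi*i*x_k/period)| / n, between 0 and 1."""
    n = len(positions)
    total = sum(cmath.exp(2j * cmath.pi * x / period) for x in positions)
    return abs(total) / n

if __name__ == "__main__":
    sites = [0, 100, 200, 300]
    for period in (50, 100, 150):
        print(period, round(fourier_amplitude(sites, period), 3))
```

An amplitude close to 1 means all sites share nearly the same phase at that period; sites spread evenly around the circle give an amplitude close to 0.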
Some results in E. coli. CRP binding sites (RegulonDB 2008): 160 targets (∼450 genes) --> 90 with strong evidence. [Figure: score vs. period for CRP. Annotated periods: 9510, 19020, 28000; annotated p-values: 10^-3, 10^-3, 10^-4.] Unveiling chromosomal structures. I. Junier, J. Hérisson, F. Képès, in preparation
Some results in E. coli. CRP binding sites (RegulonDB 2008): 160 targets (∼450 genes) --> 90 with strong evidence. [Figure, three panels: score vs. period for CRP, CRP-PhoB, CRP-OxyR.] Unveiling functional relation between TFs: inference of transcription regulation. J. Hérisson, I. Junier, F. Képès, in preparation
Positional score of a site (binding sites, genes, ...). Needs to be specific with respect to which TF. [Figure: score vs. period for CRP, with scores $S^*$ and $S$ marked.] $S_{pos} = f(|S^* - S|) \times g(\max(S^*, S))$. Combining scores with a machine learning technique: $S_{pos}$, $S_{seq}$ and biological data --> $S_{global}$. With J. Hérisson, M. Elati, F. Képès
Conclusion. Why positions? Biological data show regular patterns --> spatial co-localization, transcription regulation. How to score? Method based on a solenoidal framework + clustering detection => valuable information for finding repetitive patterns. Which outcome? Structural information about the spatial organization of chromosomes; predicting functional relations between genes.