Compstat
Integration of biological knowledge related to gene co-expression
Marie VERBANCK (Agrocampus Ouest / CNRS-UMR6625, France)
Sébastien LÊ (Agrocampus Ouest / CNRS-UMR6625, France)
The data
<Experiment>
Chickens (x27), described by:
- physiological state
  - N: fed (ad libitum access to food) (x6)
  - J16: 16-hour fasting (x5)
  - J16R5: 16-hour fasting + 5-hour refeeding phase (x7)
  - J16R16: 16-hour fasting + 16-hour refeeding phase (x9)
- gene expressions (selection)
- fatty acid concentrations (hepatic and plasma)
Which mechanisms are set in motion in response to fasting?
The data, the expectations
<Merged data tables>
[Diagram: '-omics' data. The individuals (rows i = 1, ..., I) are described by two merged blocks of columns, indexed j1 = 1, ..., J1 and j2 = 1, ..., J2: gene expressions and fatty acid concentrations]
<Expectations>
To help with the functional interpretation within an exploratory multivariate analysis framework
Exploratory multivariate analysis framework
[Figure: plot with both axes graduated from 0.0 to 1.0]
The multitude of gene expressions projected onto the correlation circle is uninterpretable
- Assembly of genes into modules
- Interpretation at the level of the groups
Modules
<MODULES of GENES>
Modular approach
[Diagram: '-omics' data. The individuals (rows i = 1, ..., I) are described by the column blocks j1 = 1, ..., J1 and j2 = 1, ..., J2, to which the modules of genes M1, M2, M3, ... are added]
Simultaneous interpretation of several genes through the supplementary groups projected onto the group representation
[Diagram: group representation in $\mathbb{R}^{I^2}$, the space of the scalar products matrices between chickens, with axes $\langle z_1, z_1 \rangle$ and $\langle z_2, z_2 \rangle$; the module M1 has coordinate $L_g(z_1, M1)$ on the first axis]
- $z_1$ denotes the first main axis of variability among the individuals
- $L_g(z, K_j) = 1$ when $z$ is the 1st principal component of $K_j$
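The slides do not spell out how $L_g$ is computed. The following is a minimal sketch, assuming the usual multiple factor analysis definition: the inertia of the standardized variables of a group projected onto $z$, divided by the first eigenvalue of that group. The names `standardize`, `lg` and `module` are hypothetical, and the data are simulated rather than taken from the experiment.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance (weights 1/I)."""
    Xc = X - X.mean(axis=0)
    return Xc / Xc.std(axis=0)

def lg(z, K):
    """Lg(z, K): inertia of the variables of K projected onto z,
    divided by the first eigenvalue of K (assumed MFA weighting)."""
    I = K.shape[0]
    K = standardize(K)
    z = standardize(z.reshape(-1, 1)).ravel()
    # Largest eigenvalue of the correlation matrix of the group
    lambda1 = np.linalg.eigvalsh(K.T @ K / I)[-1]
    # Covariance between z and each standardized variable of the group
    cov = K.T @ z / I
    return float(np.sum(cov ** 2) / lambda1)

rng = np.random.default_rng(0)
I = 27  # number of chickens in the experiment
# Simulated module of 6 co-expressed genes (hypothetical data)
module = rng.normal(size=(I, 1)) + 0.5 * rng.normal(size=(I, 6))

# First principal component of the module
Ks = standardize(module)
_, _, Vt = np.linalg.svd(Ks, full_matrices=False)
pc1 = Ks @ Vt[0]

lg_pc1 = lg(pc1, module)                  # reaches the maximum value
lg_noise = lg(rng.normal(size=I), module) # unrelated variable
print(lg_pc1, lg_noise)
```

Under this definition the coefficient lies between 0 and 1, and it equals 1 for the first principal component of the group, which is the property stated on the slide.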
Biological knowledge
< “a priori” information >
Description of genes and gene products: Gene Ontology (GO)
- Cellular Component
- Molecular Function
- Biological Process (BP)
Genes could be grouped by GO BP terms
Our Approach
< “a posteriori” information >
[Diagram: two tables sharing the same rows, the genes i = 1, ..., n]
- G, columns j = 1, ..., q (terms), with entries $g_{ij}$
- M, columns j' = 1, ..., p (expression profiles, i.e. microarrays), with entries $m_{ij'}$
M: quantitative data frame. Transpose of the table microarrays x genes, the data being centered by row
G: contingency table, with $g_{ij} = 1$ if the gene i belongs to the process j, and 0 if not
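As an illustration of the two tables, here is a minimal sketch that builds G and M from toy inputs. The gene names, term lists and expression values are all hypothetical, and "centered by row" is read here as centering each row of M, i.e. each gene across the microarrays.

```python
import numpy as np

# Hypothetical annotations: gene identifier -> GO Biological Process terms
annotations = {
    "geneA": ["lipid metabolic process", "response to starvation"],
    "geneB": ["lipid metabolic process"],
    "geneC": ["response to starvation", "gluconeogenesis"],
}
genes = sorted(annotations)
terms = sorted({t for ts in annotations.values() for t in ts})

# G: n genes x q terms, g_ij = 1 if gene i belongs to process j, 0 if not
G = np.array([[1 if t in annotations[g] else 0 for t in terms] for g in genes])

# Hypothetical expression table: p microarrays x n genes
rng = np.random.default_rng(1)
expr = rng.normal(size=(5, len(genes)))

# M: transpose (n genes x p microarrays), each row centered (assumed reading)
M = expr.T
M = M - M.mean(axis=1, keepdims=True)

print(terms)
print(G)
```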
Our Approach
< “a posteriori” information >
Construction of a space with a new distance between the genes. Two genes are close in this space if:
1- They are involved in the same biological processes: the two genes must be associated with the same terms (matrix of the terms)
2- They are co-expressed: the two gene expressions must induce the same structure on the individuals (gene expressions data frame)
3- They are situated at a similar level of the regulatory network: the number of processes a gene is involved in could determine its level in the network (weighting)
Method: Canonical Correspondence Analysis
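The slides name Canonical Correspondence Analysis without detailing the computation. The sketch below follows the textbook formulation (chi-square residuals of the contingency table, weighted regression on the constraining variables, then a singular value decomposition); it is not necessarily the authors' implementation. It assumes the genes are the rows of both tables, the terms are the columns of the contingency table G, and the expression profiles M act as constraining variables. The function name `cca_gene_coordinates` and the simulated data are hypothetical. The gene weights are the row margins of G, so a gene involved in many processes weighs more, which is one possible reading of the "weighting" point above.

```python
import numpy as np

def cca_gene_coordinates(G, M, n_axes=2):
    """CCA of G (genes x terms) constrained by M (genes x microarrays).
    Returns gene coordinates that are linear combinations of the
    columns of M, together with the corresponding eigenvalues."""
    G = np.asarray(G, dtype=float)
    M = np.asarray(M, dtype=float)
    P = G / G.sum()
    r = P.sum(axis=1)  # gene weights (row margins)
    c = P.sum(axis=0)  # term weights (column margins)
    if np.any(r == 0) or np.any(c == 0):
        raise ValueError("each gene needs a term and each term needs a gene")
    # Standardized residuals of correspondence analysis (chi-square metric)
    Q = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    # Weighted centering of M, then row weighting by sqrt(r)
    Mc = M - r @ M
    Xw = np.sqrt(r)[:, None] * Mc
    # Weighted regression: project Q onto the column space of Xw
    basis, s, _ = np.linalg.svd(Xw, full_matrices=False)
    basis = basis[:, s > 1e-10 * s.max()]
    Q_hat = basis @ (basis.T @ Q)
    # The SVD of the fitted values gives the canonical axes
    U, sv, _ = np.linalg.svd(Q_hat, full_matrices=False)
    coords = (U[:, :n_axes] * sv[:n_axes]) / np.sqrt(r)[:, None]
    return coords, sv[:n_axes] ** 2

# Simulated data: 40 genes, 6 terms, 4 microarrays (hypothetical)
rng = np.random.default_rng(2)
n, q, p = 40, 6, 4
G = (rng.random((n, q)) < 0.3).astype(int)
G[np.arange(n), np.arange(n) % q] = 1  # no empty row or column
M = rng.normal(size=(n, p))
M = M - M.mean(axis=1, keepdims=True)

coords, eig = cca_gene_coordinates(G, M)
print(coords.shape, eig)
```

Two genes end up close in `coords` when they share terms and when that shared annotation is reproduced by their expression profiles, since the axes are constrained to be combinations of the columns of M.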
Our Approach
Representation of the genes onto the canonical variables
[Figure: genes, labelled by their RIGG identifiers, plotted on a plane of two canonical variables]
- Genes brought together on the plane coming from CCA induce the same structure onto the expression profiles (correlation of 0.94)
- Two genes moved apart by the CCA induce different structures onto the expression profiles (correlation of 0.11)
Our Approach
Objective: to form groups of genes.
Clustering of the genes according to their coordinates on the canonical variables (150 groups).
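The slides do not say which clustering method produced the 150 groups. As a stand-in, the sketch below uses a plain k-means on simulated two-dimensional coordinates with 3 groups; the function `kmeans`, the farthest-point initialization and the data are all assumptions made for illustration.

```python
import numpy as np

def kmeans(coords, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-point start."""
    coords = np.asarray(coords, dtype=float)
    # Start from the first point, then repeatedly add the farthest point
    centers = [coords[0]]
    for _ in range(1, k):
        d = np.min([np.sum((coords - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(coords[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each gene to its nearest center
        dist = ((coords[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute the centers, keeping the old one if a group is empty
        new_centers = np.array([
            coords[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Simulated canonical coordinates: 3 well-separated groups of 20 genes
rng = np.random.default_rng(3)
true_centers = np.array([[-2.0, -1.0], [0.0, 3.0], [2.0, 0.0]])
coords = np.vstack([c + 0.2 * rng.normal(size=(20, 2)) for c in true_centers])

labels, centers = kmeans(coords, k=3)
print(np.bincount(labels))
```

With real canonical coordinates, `k` would be set to the number of groups wanted (150 on the slide), and a hierarchical method could be substituted for k-means without changing the rest of the approach.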