Using Network Flow to Bridge the Gap between Genotype and Phenotype. Teresa Przytycka, NIH / NLM / NCBI
Journal “Wisla” (1902): picture from a local fair in Lublin, Poland
Genotypes Phenotypes
Association studies [Figure: Genome 1, Genome 2, Genome 3, …, Genome n]
Genotype: effects of genotypic variation:
- change in amino acid
- change in gene structure
- copy number variations
- …
Genotype: effects of genotypic variation:
- change in amino acid
- change in gene structure
- copy number variations
- …
Phenotype (e.g. disease)
Goals:
• A method for system-level analysis of the propagation of such perturbations in the network
• Prediction of “causal” mutations
• Prediction of master regulators (network hubs) involved in disease
• Prediction of pathways dysregulated in disease
Propagation of the effects of copy number aberrations in glioma [Figure: CNV data (cancer cases × chromosomes) and gene expression data (cancer cases × Gene 1 … Gene n), integrated with a protein-protein, protein-DNA, phosphorylation network]
Copy number aberrations and/or mutations; gene expression
Copy number aberrations and/or mutations; gene expression; signature genes
Copy number aberrations and/or mutations; signature genes
Method outline
1. Selecting marker genes to be used as “phenotype”
2. Genotype-phenotype association
3. Uncovering information flow between genotype and phenotype
4. Inferring dysregulated genes, pathways, and causal mutations
Selecting “phenotype” genes [Figure: gene expression data, cancer cases × genes (Gene 1 … Gene n), with target genes marked]
Selecting “phenotype” genes
Selecting “phenotype” genes: choose the smallest set of genes such that each case is “covered” at least a specified number of times
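The selection criterion above is a set multicover problem. The slide does not say how the smallest set is found, so the following is only a minimal sketch of one common heuristic (greedy selection), not the method used in the talk. The input `covers` (a mapping from each gene to the set of cases it covers, for example cases where the gene is differentially expressed) and the function name are assumptions made for illustration.

```python
def greedy_multicover(covers, k):
    """Greedy heuristic for set multicover (an assumed stand-in, not the talk's method).

    covers: dict mapping gene -> set of cases that gene "covers" (hypothetical input)
    k:      number of times each case must be covered
    Returns the list of chosen genes in selection order.
    """
    cases = set().union(*covers.values()) if covers else set()
    need = {c: k for c in cases}      # coverage still required per case
    remaining = dict(covers)          # genes not yet chosen
    chosen = []

    def gain(gene):
        # Number of still-undercovered cases this gene would help with.
        return sum(1 for c in remaining[gene] if need[c] > 0)

    while any(n > 0 for n in need.values()):
        if not remaining:
            raise ValueError("some case cannot be covered k times")
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            raise ValueError("some case cannot be covered k times")
        for c in remaining.pop(best):
            if need[c] > 0:
                need[c] -= 1
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    toy = {"A": {1, 2, 3}, "B": {1, 2}, "C": {3, 4}, "D": {4}}
    print(greedy_multicover(toy, 1))
```

Greedy selection does not guarantee the smallest possible set; it is shown here only because it is short and easy to check by hand.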
Associations between copy number variations and gene expression of selected target genes [Figure: gene expression data and CNV data across cancer cases]
Significant correlation between CNV and expression [Figure: gene expression data, cancer cases × genes (Gene 1 … Gene n)]
Significant correlation between CNV and expression [Figure: gene expression data across cancer cases; target gene and locus marked]
Significant correlation between CNV and expression [Figure: gene expression data across cancer cases; target gene and candidate causal genes marked]
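The slides do not state which statistic or significance test is used for this association step. As a rough illustration only, the sketch below computes a Pearson correlation across cases between each CNV locus and the expression of one target gene, and keeps loci whose absolute correlation passes a cutoff. The function name, the matrix layout (loci × cases), and the `min_abs_r` cutoff are all assumptions; a real analysis would replace the fixed cutoff with a proper significance test.

```python
import numpy as np


def associated_loci(cnv, target_expr, min_abs_r=0.5):
    """Correlate each CNV locus with a target gene's expression across cases.

    cnv:         array of shape (n_loci, n_cases), hypothetical copy number values
    target_expr: array of shape (n_cases,), expression of one target gene
    min_abs_r:   illustrative cutoff on |Pearson r| (an assumption, not from the talk)
    Returns (r, hits): per-locus correlations and indices of loci passing the cutoff.
    """
    cnv = np.asarray(cnv, dtype=float)
    y = np.asarray(target_expr, dtype=float)
    xc = cnv - cnv.mean(axis=1, keepdims=True)   # center each locus
    yc = y - y.mean()                            # center the target expression
    denom = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Loci with no variation across cases get correlation 0 by convention.
        r = np.where(denom > 0, (xc @ yc) / denom, 0.0)
    hits = np.flatnonzero(np.abs(r) >= min_abs_r)
    return r, hits


if __name__ == "__main__":
    cnv = [[0, 1, 2, 3], [1, 1, 1, 1], [3, 2, 1, 0]]
    expr = [1, 2, 3, 4]
    r, hits = associated_loci(cnv, expr)
    print(r, hits)
```

Genes located in the loci returned in `hits` would then play the role of the candidate causal genes on the slide.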
Uncovering pathways of information flow between CNV and target gene [Figure: gene expression data across cancer cases]
Using expression to guide path discovery [Figure: gene expression data across cancer cases]
Translating probabilities into resistances. Resistance is set to favor the most likely path, based on gene expression values (inversely proportional to the average correlation of the expression of the adjacent genes with the expression of the target gene) [Figure: gene expression data across cancer cases]
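The resistance rule on this slide can be sketched as follows. This is one reading of the slide text, not the published implementation: for an edge between two genes, take the average of their expression correlations with the target gene and use its reciprocal as the resistance. Using the absolute value of the correlation and flooring it at a small `eps` are assumptions added here to keep resistances positive and finite; the function name and input layout are also hypothetical.

```python
import numpy as np


def edge_resistances(expr, edges, target, eps=1e-6):
    """Assign a resistance to each network edge from expression correlations.

    expr:   dict gene -> expression values across cases (hypothetical input)
    edges:  iterable of (gene_u, gene_v) pairs from the interaction network
    target: name of the target gene
    eps:    floor on the average correlation (assumption, avoids division by zero)
    """
    t = np.asarray(expr[target], dtype=float)
    # |Pearson r| of every gene with the target gene (absolute value is an assumption).
    corr = {
        g: abs(np.corrcoef(np.asarray(v, dtype=float), t)[0, 1])
        for g, v in expr.items()
    }
    resistances = {}
    for u, v in edges:
        avg = (corr[u] + corr[v]) / 2.0
        # Resistance is inversely proportional to the average correlation.
        resistances[(u, v)] = 1.0 / max(avg, eps)
    return resistances


if __name__ == "__main__":
    expr = {"T": [1, 2, 3, 4], "A": [2, 4, 6, 8], "B": [4, 3, 2, 1], "C": [1, -1, -1, 1]}
    print(edge_resistances(expr, [("A", "B"), ("A", "C"), ("C", "T")], "T"))
```

Edges between genes that track the target gene closely get low resistance, so current preferentially flows through them.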
Finding subnetworks with significant current flow [Figure: gene expression data across cancer cases]
Goals:
• A method for system-level analysis of the propagation of such perturbations in the network
• Prediction of “causal” mutations
• Identification of master regulators (network hubs) involved in disease
• Identification of pathways dysregulated in disease
Putative causal variation (with lots of additional caveats) [Figure: gene expression data across cancer cases]
Causal copy number aberrations
Goals:
• A method for system-level analysis of the propagation of such perturbations in the network
• Prediction of “causal” mutations
• Prediction of “master regulators” (network hubs) involved in disease
• Prediction of pathways dysregulated in disease
Solve current flow for all pairs and find nodes belonging to many paths [Figure: gene expression data and CNV data across cancer cases]
Hubs
Goals:
• A method for system-level analysis of the propagation of such perturbations in the network
• Prediction of “causal” mutations
• Prediction of “master regulators” (network hubs) involved in disease
• Prediction of pathways dysregulated in disease
Are there common functional pathways? [Figure: CNV data and gene expression data across cancer cases]
Common GO pathways
Goals:
• A method for system-level analysis of the propagation of such perturbations in the network
• Prediction of “causal” mutations
• Prediction of “master regulators” (network hubs) involved in disease
• Prediction of pathways dysregulated in disease
Design details under the hood
• Current flow reduces to solving a set of linear equations (Kirchhoff's laws). Caveat: we had to solve a linear system with 20,000 variables thousands of times for the permutation test, which required new methodology.
• Many biological interactions are directional. This can be taken care of by solving a linear program with corresponding constraints. Caveat: the network is too big for solving thousands of linear programs.
• Null model and p-value estimation
Kim, Wuchty, Przytycka, PLoS Comp Bio 2011
Kim, Przytycki, Wuchty, Przytycka, Phys. Bio. 2011
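To make the first bullet concrete, here is a minimal sketch of how current flow in an undirected resistor network reduces to one linear system. It builds the weighted graph Laplacian from edge conductances, injects one unit of current at a source node, removes it at a sink node, grounds the sink, and solves for node potentials; edge currents then follow from Ohm's law. The function name and toy network are illustrative assumptions. This is a dense, textbook-style solve and does not reproduce the new methodology the slide mentions for 20,000-variable systems, nor the linear-program treatment of directed interactions.

```python
import numpy as np


def current_flow(nodes, resistances, source, sink):
    """Solve Kirchhoff's laws for unit current from source to sink.

    nodes:       list of node names
    resistances: dict (u, v) -> resistance of the undirected edge u-v
    Returns (potentials, currents): dict node -> potential (sink grounded at 0)
    and dict (u, v) -> current flowing from u to v.
    Assumes the network is connected.
    """
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    laplacian = np.zeros((n, n))
    for (u, v), r in resistances.items():
        g = 1.0 / r                      # conductance
        i, j = idx[u], idx[v]
        laplacian[i, i] += g
        laplacian[j, j] += g
        laplacian[i, j] -= g
        laplacian[j, i] -= g
    injected = np.zeros(n)
    injected[idx[source]] = 1.0          # unit current enters at the source
    injected[idx[sink]] = -1.0           # and leaves at the sink
    # Ground the sink: drop its row and column so the system is non-singular.
    keep = [i for i in range(n) if i != idx[sink]]
    phi = np.zeros(n)
    phi[keep] = np.linalg.solve(laplacian[np.ix_(keep, keep)], injected[keep])
    potentials = {name: phi[i] for name, i in idx.items()}
    # Ohm's law on each edge: I = (phi_u - phi_v) / R.
    currents = {(u, v): (phi[idx[u]] - phi[idx[v]]) / r
                for (u, v), r in resistances.items()}
    return potentials, currents


if __name__ == "__main__":
    # Two parallel paths from s to t: via a (total resistance 2) and via b (total 4).
    res = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 2.0, ("b", "t"): 2.0}
    pot, cur = current_flow(["s", "a", "b", "t"], res, "s", "t")
    print(pot, cur)
```

In the toy network, two thirds of the current takes the low-resistance path through `a`, which is the behaviour the resistance assignment is designed to exploit.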
Acknowledgments
Group members: Yoo-Ah Kim, DongYeon Cho, Xiangjun Du, Jan Hoinka, Yang Huang, Raheleh Salari, Damian Wojtowicz
Collaborators: Stefan Wuchty (NCBI), Jozef Przytycki (GWU)
Journal “Wisla” (1902), picture from a local fair in Lublin, Poland: my great-great uncle (the “Giant”)
Acknowledgments
Group members: Yoo-Ah Kim, DongYeon Cho, Xiangjun Du, Jan Hoinka, Yang Huang, Raheleh Salari, Damian Wojtowicz
Collaborators: Stefan Wuchty (NCBI), Brian Oliver (NIDDK), John Malone, Nicolas Mattiuzzo, Justin Andrews (Indiana University), Jozef Przytycki (GWU)
Impact of gene copy number on gene expression in Drosophila melanogaster [Figure: expression fold change (log2) vs. expression (wild type)] Collaboration with the Brian Oliver group (NIDDK)
CNV-related perturbations propagate through the interaction network. Co-complex network from the Artavanis-Tsakonas group (unpublished)
Impact of copy number on gene expression in glioma [Figure: CNV by chromosome; correlation between CNV and expression]
Genotype: effects of genotypic variation: change in amino acid; change in gene structure; copy number variations; …
Genotype: effects of genotypic variation: change in amino acid; change in gene structure; copy number variations; … Phenotype
Genotype: effects of genotypic variation: change in amino acid; change in gene structure; copy number variations; … Molecular phenotypes: gene expression; metabolite level. Phenotype
Copy number variations (CNV) (gene dosage)
• Implicated in a large number of human diseases (cancer, Crohn's disease, autism)
• 28,025 structural variants identified in the 1000 Genomes study (2,000 changes affecting full genes or exons)
• Frequent type of somatic mutation in cancer
Genotype. Molecular phenotypes: gene expression; metabolite level. Phenotype