Genotypic analysis of coreceptor usage: New developments and applications for geno2pheno[coreceptor]
Alexander Thielen, Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, D-66123 Saarbrücken, Germany
Outline
• Genotypic prediction of coreceptor usage
• New developments
• geno2pheno[coreceptor] in different applications
4/3/2008 Genotypic analysis of coreceptor usage – New developments for geno2pheno [coreceptor] Thielen, Alexander
HIV Entry Cell Assay (Trofile)
[Figure: assay schematic. Labels: HIV env expression vector + HIV genomic luc vector; transfection; pseudovirus, capable of a single round of replication; infection of CD4+ CCR5+ and CD4+ CXCR4+ cells.]
CCR5 and CXCR4 antagonists are used to confirm tropism.
Adapted from Petropoulos CJ, et al. Antimicrob Agents Chemother. 2000;44:920-8.
Why do we need genotypic approaches?
Phenotypic assays are:
• very accurate
• used in clinical trials of entry inhibitors (Trofile)
But:
• very expensive
• slow turnaround (up to 5 weeks)
• not always available (samples for Trofile have to be shipped to South San Francisco)
• subject to restrictions (e.g. Trofile: viral load >1000, sometimes dry ice needed)
Genotypic Monitoring of Coreceptor Usage
Matched genotype (V3 region)-phenotype pairs are obtained by genotyping and phenotyping. The sequences are multiply aligned (against a fixed reference alignment):
CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC  R5
CTRPNNNTRKGIHMGPGS-FYVTGEIIGDIRQAHC  R5
CSRPNNNTRKSVHIGPGQAFYATGDVIGDIRQAHC  X4
Statistical learning then relates the phenotype Y to the sequence features X = (X_1, …, X_i, …, X_n).
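As a rough illustration of how an aligned V3 sequence can become the feature vector X = (X_1, …, X_n), the sketch below uses a one-hot (indicator) encoding per alignment column. The encoding, the alphabet ordering and the function name `encode` are assumptions made for illustration; the slide does not say how geno2pheno encodes its input.

```python
# Hypothetical one-hot encoding of aligned V3 sequences (illustration only).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the gap symbol

def encode(aligned_seq):
    """Return a flat 0/1 list with one indicator per (position, symbol)."""
    features = []
    for symbol in aligned_seq:
        features.extend(1 if symbol == a else 0 for a in ALPHABET)
    return features

# Genotype-phenotype pairs as they appear on the slide
pairs = [
    ("CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC", "R5"),
    ("CTRPNNNTRKGIHMGPGS-FYVTGEIIGDIRQAHC", "R5"),
    ("CSRPNNNTRKSVHIGPGQAFYATGDVIGDIRQAHC", "X4"),
]
X = [encode(seq) for seq, _ in pairs]          # one feature vector per sequence
y = [1 if label == "X4" else 0 for _, label in pairs]  # 1 = X4, 0 = R5
```

Because every sequence is aligned to the same reference, all feature vectors have the same length, which is what a statistical learning method such as an SVM requires.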
Performance comparison of different methods
Methods:
• WetCat: Charge Rule (11/25), Decision trees, Support Vector Machines; http://genomiac2.ucsd.edu:8080/wetcat/tropism.html
• WebPSSM: Position specific scoring matrices; http://ubik.microbiol.washington.edu/computing/pssm
• Geno2pheno[coreceptor]: Support Vector Machines; http://coreceptor.bioinf.mpi-inf.mpg.de
Evaluation:
• Dataset of 1110 clonal samples from the Los Alamos database
• 10 replicates of 10-fold cross-validation
• Support Vector Machines and PSSMs significantly better than other methods
• Best performance among all tested methods: Support Vector Machines
• Specificity ~90%, Sensitivity ~80%
(Sing et al, 2004)
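The 11/25 charge rule listed among the methods is simple enough to sketch directly. The version below assumes the commonly used formulation (a positively charged residue, K or R, at V3 position 11 or 25 predicts X4) and assumes the input is already aligned to the 35-residue V3 loop; the function name is hypothetical.

```python
POSITIVE = {"K", "R"}  # assumption: lysine and arginine count as positively charged

def rule_11_25(v3):
    """Predict 'X4' if V3 position 11 or 25 (1-based) is positively charged."""
    if len(v3) < 25:
        raise ValueError("expected a V3 sequence aligned to at least 25 positions")
    return "X4" if v3[10] in POSITIVE or v3[24] in POSITIVE else "R5"

# Example with a V3 sequence: positions 11 and 25 are S and D, so the rule says R5
prediction = rule_11_25("CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC")
```

The rule looks at only two positions, which is one reason learned models that use the whole loop can outperform it.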
Clinical samples
How do known predictors compare on clinically derived data?
920 antiretroviral-naïve samples
Method            Sensitivity      Specificity
11/25 rule        30.5%            93.4%
SVM genomiac      21.8%            89.6%
PSSM SI/NSI       33.8% (43.7%)    95.3% (90%)
PSSM X4/R5        24.5% (43.7%)    96.9% (90%)
Neural Network    44.5%            87.5%
SVM geno2pheno    44.7%            90.6%
(Low et al, 2007)
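For reference, sensitivity and specificity as reported here can be computed from paired predictions and phenotypes as sketched below, with X4 treated as the positive class. The function name and the toy data are made up for illustration and are not taken from the study.

```python
def sensitivity_specificity(predicted, actual, positive="X4"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (invented): 4 X4 samples, 6 R5 samples
actual    = ["X4", "X4", "X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5"]
predicted = ["X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5", "R5", "X4"]
sens, spec = sensitivity_specificity(predicted, actual)
```

Sensitivity is the fraction of X4 samples that are detected; specificity is the fraction of R5 samples correctly called R5.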
Problems with clinical samples
Solution: massively parallel sequencing? (see talk by M. Däumer)
Prevalence of X4 Phenotype by Baseline CD4 Count
Clinical samples
• Predictions from population-based sequences are generally worse
• Approach: incorporation of clinical markers into the prediction model
• Data: HOMER cohort, coreceptor usage determined with Trofile
• Results: significant improvements with clinical markers such as CD4 cell counts and viral loads (Sing et al., 2007)
Geno2pheno[coreceptor] version 2.0
Regions involved in coreceptor binding
V3 is not the only region involved in coreceptor binding. Mutations beyond V3 affecting coreceptor usage have been reported, e.g. in:
• Boyd et al., 1993: A single amino acid substitution in the V1 loop of human immunodeficiency virus type 1 gp120 alters cellular tropism.
• Koito et al., 1995: Small amino acid sequence changes within the V2 domain can affect the function of a T-cell line-tropic human immunodeficiency virus type 1 envelope gp120.
• Carrillo et al., 1996: Human immunodeficiency virus type 1 tropism for T-lymphoid cell lines: role of the V3 loop and C4 envelope determinants.
• Cho et al., 1998: “…both the V1/V2 and V3 regions increased the efficiency of CXCR4 use”
However: no analysis on a large dataset of experimentally determined genotype-phenotype pairs
(Kwong et al., 1998)
V2-V3 dataset
• 916 samples from 312 different patients
• Epidemiological bias reduced: at most one R5 and one X4 sequence per patient allowed (randomly selected)
• Experiments repeated 10 times
• Features / positions had to be significant in all 10 runs
• Position numbering according to Consensus B (Los Alamos, July 2007)
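The per-patient selection described here could look roughly like the following sketch. The tuple layout, the function names and the use of the replicate index as random seed are assumptions for illustration; the sequences in the example are placeholders, not real data.

```python
import random

def subsample_per_patient(samples, seed=0):
    """Keep at most one R5 and one X4 sequence per patient, chosen at random.

    samples: iterable of (patient_id, phenotype, sequence) tuples.
    """
    rng = random.Random(seed)
    groups = {}
    for patient_id, phenotype, sequence in samples:
        groups.setdefault((patient_id, phenotype), []).append(sequence)
    return [(pid, pheno, rng.choice(seqs)) for (pid, pheno), seqs in groups.items()]

def replicates(samples, n=10):
    """Repeat the random selection n times with different seeds."""
    return [subsample_per_patient(samples, seed=i) for i in range(n)]

# Placeholder data: patient p1 has three R5 sequences and one X4 sequence
toy = [("p1", "R5", "seqA"), ("p1", "R5", "seqB"), ("p1", "R5", "seqC"),
       ("p1", "X4", "seqD"), ("p2", "R5", "seqE")]
runs = replicates(toy)
```

Each replicate can then be analysed separately, and only features that are significant in every replicate are kept.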
Data preparation / region of extensive length polymorphism
Sequences profile-aligned with ClustalW
Properties of the different regions
Region           Feature                       Mean in R5   Mean in X4   p-value
V2-stem          # pos. charged amino acids    5.73         5.72         0.9381
V2-stem          # neg. charged amino acids    3.10         2.95         0.1201
V2-stem          charge                        2.63         2.77         0.3926
V2-stem          # N-glycosylation sites       0.91         0.81         0.0247
V2-stem          length of region              32.95        32.86        0.2301
V2-polymorphic   # pos. charged amino acids    0.63         1.03         0.0037
V2-polymorphic   # neg. charged amino acids    1.85         1.48         0.0006
V2-polymorphic   charge                        -1.21        -0.45        < 0.0001
V2-polymorphic   # N-glycosylation sites       0.69         0.83         0.1981
V2-polymorphic   length of region              8.63         9.51         0.0495
V2 (full)        # pos. charged amino acids    6.37         6.75         0.0718
V2 (full)        # neg. charged amino acids    4.95         4.43         0.0004
V2 (full)        charge                        1.41         2.31         0.0001
V2 (full)        # N-glycosylation sites       1.61         1.65         0.6959
V2 (full)        length of region              41.58        42.38        0.0747
C2               # pos. charged amino acids    10.40        10.44        0.8100
C2               # neg. charged amino acids    6.71         6.80         0.5728
C2               charge                        3.68         3.63         0.8269
C2               # N-glycosylation sites       5.77         5.78         0.9309
C2               length of region              98.93        98.98        0.1789
V3               # pos. charged amino acids    6.08         7.86         < 0.0001
V3               # neg. charged amino acids    1.69         1.24         < 0.0001
V3               charge                        4.39         6.62         < 0.0001
V3               # N-glycosylation sites       0.98         0.59         < 0.0001
V3               length of region              34.90        35.08        0.1532
Mutations significantly associated with coreceptor usage
• Fisher’s exact test
• every occurring mutation at every position within V2-V3 tested
Significant over all 10 replicates:
• 105 mutations at 52 positions in total (V2: 17/12, C2: 12/10, V3: 76/30)
• 64 mutations at 42 positions correlated with X4-phenotype (V2: 8/8, C2: 8/8, V3: 48/22)
• 41 mutations at 40 positions correlated with R5-phenotype (V2: 9/9, C2: 4/4, V3: 28/27)
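A per-mutation test of this kind can be sketched as follows: for each position and residue, build a 2x2 table (mutation present or absent against X4 or R5) and apply a two-sided Fisher's exact test. The implementation below uses only the standard library; the helper names and the two-sided convention (summing all tables no more probable than the observed one) are assumptions, and no multiple-testing handling is shown.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)

def mutation_table(sequences, labels, position, residue):
    """Return (a, b, c, d): present/X4, present/R5, absent/X4, absent/R5."""
    a = b = c = d = 0
    for seq, label in zip(sequences, labels):
        present = seq[position] == residue  # position is 0-based here
        if present and label == "X4":
            a += 1
        elif present:
            b += 1
        elif label == "X4":
            c += 1
        else:
            d += 1
    return a, b, c, d

p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Running the test once per replicate and keeping only mutations significant in all of them corresponds to the "significant over all 10 replicates" criterion.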
Prediction results
• C2 slightly better than guessing
• V2 surprisingly “good”
• V2V3 significantly better than V3 alone (P = 0.0019)
• 11/25-rule: Specificity: 96.1%, Sensitivity: 66.7%
• Values marked in the figure: 83.5%, 78.5%
Region   Mean AUC
C2       0.658
V2       0.730
V3       0.914
V2V3     0.933
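The AUC values in the table summarise how well a model's scores separate X4 from R5 samples. A minimal way to compute it, assuming higher scores mean "more X4-like", is the pairwise formulation below; the function name and toy scores are illustrative only.

```python
def auc(scores, labels):
    """AUC as the probability that a random X4 sample (label 1) scores
    higher than a random R5 sample (label 0); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores (invented): two X4 samples, three R5 samples
value = auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 0, 1, 0, 0])
```

An AUC of 0.5 corresponds to guessing and 1.0 to perfect separation, which is the scale on which C2 (0.658) and V2V3 (0.933) are compared.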
Evaluation on clinical isolates
• 268 samples from therapy-naïve patients with Trofile phenotype
• models trained on Los Alamos data
Results:
• sensitivity at specificity of 90%: V3-alone: 54.2%, V2+V3: 62.8%
• area under the ROC curve: V3-alone: 0.778, V2+V3: 0.841
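"Sensitivity at a specificity of 90%" means choosing the score cutoff so that at least 90% of R5 samples fall at or below it, then measuring how many X4 samples lie above it. The sketch below assumes higher scores indicate X4; the function name and toy scores are made up, and the actual cutoff procedure used for the reported numbers is not described on the slide.

```python
def sensitivity_at_specificity(scores, labels, target=0.90):
    """Return (sensitivity, specificity, threshold) for the lowest threshold
    whose specificity reaches `target`; samples scoring above it are called X4."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # X4
    neg = [s for s, y in zip(scores, labels) if y == 0]  # R5
    for t in [float("-inf")] + sorted(set(scores)):
        spec = sum(s <= t for s in neg) / len(neg)
        if spec >= target:
            sens = sum(s > t for s in pos) / len(pos)
            return sens, spec, t
    raise ValueError("target specificity cannot be reached")

# Toy scores (invented): ten R5 samples followed by four X4 samples
scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.7,
          0.5, 0.6, 0.8, 0.3]
labels = [0] * 10 + [1] * 4
sens, spec, threshold = sensitivity_at_specificity(scores, labels)
```

Fixing the specificity makes models directly comparable, as in the V3-alone versus V2+V3 comparison above.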
Tropism reference labs
[Figure: map of participating labs. Labels in extraction order: Labor Lademannbogen Hamburg Dr. Rolf Kaiser Labor Fenner Uni Köln Hamburg Patrick Braun PZB Aachen Dr. Berg Berlin Labor Thiele NRZ Dr. H. Walter Kaiserslautern Uni Erlangen Labor Jajaprax München Dr. Martin Stürmer Uni Frankfurt/ Labor Schönian Harzer/ Raunheim]
Results on German “cohort”
• reference labs sequence the V3 loop
• comparison between Trofile assay and geno2pheno
• dataset: 234 genotype-phenotype pairs in February 2008; 161 (68.8%) R5, 73 (31.2%) X4
Geno2pheno results:
• 10%-FPR: 64.4% sensitivity, 87.6% specificity
• 20%-FPR: 75.3% sensitivity, 76.4% specificity
• large differences between different labs (see talk by M. Obermeier)