Prediction of HIV viral tropism based on NGS data Nico Pfeifer Max Planck Institute for Informatics
Cell entry Wu et al. Structures of the CXCR4 Chemokine GPCR with Small-Molecule and Cyclic Peptide Antagonists Science 19 November 2010: 330 (6007), 1066-1071.
V3 loop binds to coreceptor Wu et al. Structures of the CXCR4 Chemokine GPCR with Small-Molecule and Cyclic Peptide Antagonists Science 19 November 2010: 330 (6007), 1066-1071.
HIV tropism • Relevant coreceptors: CCR5 and CXCR4 • Viruses that can only use the CCR5 coreceptor: R5 • Viruses that can use the CXCR4 coreceptor: X4-capable
Entry inhibitors • Maraviroc – CCR5 antagonist – Approved for patient treatment • AMD-3100 – CXCR4 antagonist – Never approved for HIV treatment
Want to know which patients benefit from taking maraviroc • Assays for tropism determination – Trofile – ESTA (Enhanced Sensitivity Trofile Assay) – Disadvantages: • Long turnaround • Require a large sample volume • Genetic tests (V3 loop of gp120) – Sanger data – Next Generation Sequencing (NGS) data
Tools to predict tropism from genetic data • Sanger data – geno2pheno [coreceptor] [1] – WetCat [2] – WebPSSM [3] • NGS data – Variants of geno2pheno [coreceptor] and WebPSSM [4] 1. Lengauer, T., Sander, O., Sierra, S., Thielen, A., Kaiser, R. Nat Biotechnol. (2007) 2. Pillai, S. et al. AIDS Res. Hum. Retroviruses 19, 145–149 3. Jensen, M. A. et al. J. Virol. 77, 13376–13388 4. Swenson, L. C. et al. J Infect Dis. (2011) 203 (2): 237-245
How do we represent the virus population inside a patient (V3 loop sequences)? CIRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTREGV-MGPG-AIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPG-AIYATGQIIGNIRQAHC CT---N--REGVHMGPG-AIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPG-AIYATGRIIGNIRQAHC CTR-NN-TREGVHMGPG-AIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPGGAIHATGQIIGNIRQAHC CTR-NN-TREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRANNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQARC CTRLNDNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTREGVHMGPGGAIYATRQIIGNIRQAHC CTRLNNNTREGVHMVPGGAIYATGQIIGNIRQAHC CTRLNN-TSEHISIGPGRAWVAARNIIGD-RKAHC CTRLNNNTRVGVHMGPGGAIYATGQIIGNIRQAHC CTRLNN-TSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTSE-ISIGPGRAWVAARNIIGDIRKAHC CTRLNNNT-EHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTSEHISIGPGRAWVAAREIKGDIRKAHC CTRLNNNTGEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTSEHISIGPGRAWVAARN-IGDIRKAHC CTRLNNNTNKHISIEPGRAWVAAREIKGDIRKAHC CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTSEHISIGPGRAWVAARNVIGDIRKAHC CTRLNNNTNKHISIGLGRAWVAAREIKGDIRKAHC CTRLNNNTSEHISIGPGRAWVVARNIIGDIRKAHC CTRLNNNTNKHISIGPGKAWVAAREIKGDIRKAHC CTRLNNNTSERISIGPGRAWVAARNVIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAR-IKRSIRKAHC CTRLNNNTSKHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAARDIKGDIRKAHC CTRLNSNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREI-GDIRKAHC CTRLSNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRPNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC CTRPNNNTRRSIHIGPGRAFYAG---IGDIRQAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHR CTRPNNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDMRKAHC CTRPYANRKKSIHIGTG--FYTIKEIKGNVKQAYC CTRLNNNTNKHISIGPGRAWVAAREIKGGIRKAHC CTRPYANRKKSIHIGTGR-FYTIKEIKGNVKQAYC CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC CTRPYANRRKSIHIGTG--FYTIKEIKGNVKQAYC CTRPYANRRKSIHIGTGR-FYTIKEIKGNVKQAYC CTRLNNNTNKHISIGPGRAWVAARNIIGGIRKAHC CTRPYANSRKSIHIGTG--FYTIKEIKGNVKQAYC CTRLNNNTNKHISIGPGRAWVAARNVIGDIRKAHC CTRVNNNTREGVHMGPG-AICATGQIIGNIRQAHC CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC CTRVNNNTREGVHMGPG-AIYATGQIIGNIRQAHC 
CTRLNNNTNKHISIGPGRTWVAARQIIGDIRKAHC CTRVNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTNKHISLGPGRAWVAARNIIGDIRKAHC YTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC
Principal Component Analysis (PCA) • Represent axes of maximal variance (principal components) Principal component 1 (PC1)
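A minimal sketch of how PC1 could be computed for aligned V3 loop sequences. It assumes a one-hot encoding of each alignment column (the slides do not state which encoding was used) and finds the first principal component by power iteration; the function names are illustrative only.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the alignment gap

def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector (assumed encoding)."""
    vec = []
    for residue in seq:
        vec.extend(1.0 if residue == symbol else 0.0 for symbol in ALPHABET)
    return vec

def center(rows):
    """Subtract the column mean from every row."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_principal_component(rows, iterations=50, seed=0):
    """Power iteration on X^T X for centered data X; returns a unit vector
    whenever the data has non-zero variance."""
    rng = random.Random(seed)
    # Random start: a constant vector would be orthogonal to centered one-hot data.
    v = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]              # X v
        w = [dot(scores, column) for column in zip(*rows)]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # all sequences identical: no direction of variance
        v = [x / norm for x in w]
    return v

sequences = [
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
]
X = center([one_hot(s) for s in sequences])
pc1 = first_principal_component(X)
scores = [dot(row, pc1) for row in X]  # projection of each sequence onto PC1
for seq, score in zip(sequences, scores):
    print(seq, round(score, 3))
```

With these four sequences, identical sequences receive identical PC1 scores and the two variants fall on opposite sides of zero; the sign of PC1 itself is arbitrary.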
PCA CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC
Next Generation Multi-Instance Learning Patient 1 Patient 2 CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTREHISIGPGGAWVAAREIKGDIRKAHC CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC CTRPYANRRKSIHIGTGRAFYTIKEIKGNVKQAYC CTRLNNNTNKHISMGPGRAWVATGQIIGDIRQAHC CTRLNNNTREGVHMGPGRAIYATGQIIGNIRQAHC CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC
Support Vector Machine with normalized set kernel:
$$k_{\mathrm{set}}(X_i, X_j) = \sum_{x_i \in X_i,\; x_j \in X_j} \frac{k_s(x_i, x_j)}{\sqrt{k_s(x_i, x_i)\, k_s(x_j, x_j)}}$$
Gärtner, T., Flach, P. A., Kowalczyk, A., Smola, A. J. Multi-Instance Kernels. International Conference on Machine Learning
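A minimal sketch of a normalized set kernel between two patients, each treated as a bag of V3 loop sequences. The base kernel `k_s` below (the number of aligned positions with identical residues) is an assumption chosen only for illustration, since the slide does not specify the sequence kernel, and the grouping of sequences into two bags is hypothetical.

```python
import math

def k_s(a, b):
    """Assumed base kernel: count of aligned positions with identical residues.
    This equals the dot product of one-hot encodings, so it is a valid kernel."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return float(sum(1 for x, y in zip(a, b) if x == y))

def normalized_k_s(a, b):
    """Instance-level normalization: k_s(a, b) / sqrt(k_s(a, a) * k_s(b, b))."""
    return k_s(a, b) / math.sqrt(k_s(a, a) * k_s(b, b))

def set_kernel(bag_i, bag_j):
    """Sum of normalized instance kernels over all cross-bag sequence pairs."""
    return sum(normalized_k_s(a, b) for a in bag_i for b in bag_j)

# Hypothetical bags; the sequences are taken from the slide, the grouping is not.
patient_1 = [
    "CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC",
    "CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC",
]
patient_2 = [
    "CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC",
]

k_12 = set_kernel(patient_1, patient_2)
k_21 = set_kernel(patient_2, patient_1)
print(k_12)
```

The two bags may differ in size, and the resulting patient-by-patient kernel matrix could then be passed to an SVM implementation that accepts precomputed kernels.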
Improve predictions for last generation sequencing Patient 1 Patient 2 CTRLNNNTREGVHMGPGGAIYATGQIIGNIRQAHC CTRLNNNTSEHISIGPGRAWVAARNIIGDIRKAHC CTRLNNNTNKHISIGPGRAWVAAREIKGDIRKAHC CTRLNNNTREHISIGPGGAWVAAREIKGDIRKAHC CTRLNNNTNKHISIGPGRAWVAARQIIGDIRKAHC CTRLNNNTNKHISMGPGRAWVATGQIIGDIRQAHC CTRLNNNTNKHISIGPGRAWVAARNIIGDIRKAHC
Support Vector Machine with normalized set kernel:
$$k_{\mathrm{set}}(X_i, X_j) = \sum_{x_i \in X_i,\; x_j \in X_j} \frac{k_s(x_i, x_j)}{\sqrt{k_s(x_i, x_i)\, k_s(x_j, x_j)}}$$
Gärtner, T., Flach, P. A., Kowalczyk, A., Smola, A. J. Multi-Instance Kernels. International Conference on Machine Learning
Data • Maraviroc versus Optimized Therapy in Viremic Antiretroviral Treatment-Experienced Patients (MOTIVATE) + 1029 – 876 patients with NGS data of the V3 loop • Also includes patients with X4-capable viruses (according to Trofile) – Treatment: maraviroc once daily or twice daily – Viral loads measured at various time points Swenson, L. C. et al. J Infect Dis. (2011) 203 (2): 237-245
Performance comparison • Predict class label: treatment success • Compare measures between patient classes – Median log10 reduction in plasma viral load (pVL) after eight weeks • 5-fold nested cross-validation
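A minimal sketch of the two evaluation ingredients named above: the median log10 pVL reduction per predicted class, and index splits for 5-fold nested cross-validation. All viral load values and class labels below are hypothetical toy data for illustration, not study results, and the helper names are assumptions.

```python
import math
import random
from statistics import median

def log10_reduction(baseline_copies, week8_copies):
    """log10 drop in plasma viral load between baseline and week 8."""
    return math.log10(baseline_copies) - math.log10(week8_copies)

def median_reduction_by_class(records):
    """records: iterable of (predicted_class, baseline pVL, week-8 pVL)."""
    by_class = {}
    for label, baseline, week8 in records:
        by_class.setdefault(label, []).append(log10_reduction(baseline, week8))
    return {label: median(values) for label, values in by_class.items()}

def k_folds(indices, k, rng):
    """Shuffle the indices and deal them into k folds of near-equal size."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def nested_cv_splits(n_samples, k=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits). The inner splits are built
    from the outer training set only, so model selection never sees the
    outer test fold."""
    rng = random.Random(seed)
    outer = k_folds(range(n_samples), k, rng)
    for i, test in enumerate(outer):
        train = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        inner_folds = k_folds(train, k, rng)
        inner = []
        for a, val in enumerate(inner_folds):
            fit = [idx for b, fold in enumerate(inner_folds) if b != a for idx in fold]
            inner.append((fit, val))
        yield train, test, inner

# Hypothetical toy records: (predicted class, baseline copies/mL, week-8 copies/mL)
toy_records = [
    ("predicted success", 100000, 100),
    ("predicted success", 50000, 500),
    ("predicted success", 10000, 1000),
    ("predicted failure", 100000, 50000),
    ("predicted failure", 20000, 20000),
]
medians = median_reduction_by_class(toy_records)
splits = list(nested_cv_splits(20))
print(medians)
```

In a nested setup, hyperparameters would be tuned on the inner splits and the chosen model scored once on each outer test fold.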