HIV-1 tropism prediction

Mattia CF Prosperi
ahnven@yahoo.it
University of “Roma TRE”, Faculty of Computer Science Engineering, Dept of Computer Science and Automation (DIA)
via della Vasca Navale, 79 – 00149 – Rome, ITALY

Summary
• State of the art
  – From charge rule to structural descriptors
• Roma TRE modelling
  – Data collection
    • Sequence manipulation
    • Enhanced domain coding
  – Univariable analysis and clustering
  – Model technologies
    • Logistic regression and feature selection
    • Validation and comparison with other models
    • Interpretation of relevant features

State of the art
• Charge rule (De Jong, 1992)
• Neural Networks, Decision Trees, Support Vector Machines (Resch, Pillai) on 200-300 examples
• Position Specific Scoring Matrices (Jensen)
• Support Vector Machines (Sing) on 1,100 examples, with AUC maximisation and CD4+ cell count as an additional input variable
• Support Vector Machines + Structural Analysis (Sander, 2007) with AUC maximisation
• Neural Networks for dual-tropism prediction (Lamers, 2008)
• All of these models use only the V3 loop
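As a point of reference for the simplest predictor in this list, the charge rule is commonly stated as the "11/25 rule": a basic residue at V3 position 11 or 25 suggests CXCR4 usage. The sketch below is a minimal illustration under that assumption. The function name, the choice of K and R as the basic residues, and the two example sequences are illustrative and are not taken from the slides.

```python
# Minimal sketch of the "11/25" charge rule (assumed formulation).
# BASIC_RESIDUES is an assumption: some formulations also count histidine.
BASIC_RESIDUES = {"K", "R"}


def charge_rule_predicts_x4(v3_loop: str) -> bool:
    """Return True when V3 position 11 or 25 (1-based) carries a basic residue."""
    if len(v3_loop) < 25:
        raise ValueError("expected an ungapped V3 loop of at least 25 residues")
    return v3_loop[10] in BASIC_RESIDUES or v3_loop[24] in BASIC_RESIDUES


# Illustrative 35-residue V3 loops (hypothetical inputs, not study data).
neutral_at_11_and_25 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
basic_at_11 = "CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"

if __name__ == "__main__":
    print(charge_rule_predicts_x4(neutral_at_11_and_25))
    print(charge_rule_predicts_x4(basic_at_11))
```

The rule needs no training data, which is why it serves as a baseline for the statistical models listed above.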

State of the art (2)
• SVM + Structural Analysis (Sander, 2007) seems to be the best performing model at present
  – 91.56% accuracy
  – 0.93 AUC
  – Minor criticisms concerning sample collection (all distinct sequences were used, regardless of patient, without accounting for the real sequence population distribution)
  – The structural analysis improved on a reference SVM trained only on the V3 dummy variable encoding
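The two figures quoted for the reference model, accuracy and AUC, can be computed from predicted scores as in the sketch below. It is a generic illustration, not the evaluation code of any cited study. The 0.5 decision threshold, the function names, and the toy labels and scores are assumptions.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label (1 = X4)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = X4, 0 = R5.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]

if __name__ == "__main__":
    print(accuracy(toy_labels, toy_scores), auc(toy_labels, toy_scores))
```

Accuracy depends on the chosen threshold, while AUC summarises ranking quality across all thresholds, which is why the slides report both.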

Roma TRE approach
• Data: collection of samples from the “Los Alamos” database
  – Only one sequence per patient (the longest available, no clones), except for sequences with different tropism
  – No problematic sequences
  – All subtypes
  – At least the V3 loop, possibly the whole envelope gene
  – Clinical markers recorded
• Goal: prediction of CXCR4 usage probability (regardless of CCR5 usage; dual-tropic strains are pooled with X4 strains)

Sequence manipulation
• Previous works used multiple alignment (ClustalW or MUSCLE) on either nucleotides or amino acids
• We used local pairwise alignment (Smith-Waterman-Gotoh), with correction/detection of ambiguities and frameshifts, against the HXB2 strain (which is X4)
  – Minor differences with the output of other models
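The core of a Smith-Waterman-Gotoh alignment, local alignment with affine gap penalties, can be sketched as follows. This is a score-only illustration. The match/mismatch scores and gap penalties are arbitrary assumptions, a real pipeline would use a substitution matrix, and the ambiguity and frameshift handling mentioned on the slide is not reproduced here.

```python
def smith_waterman_gotoh(query, reference, match=2, mismatch=-1,
                         gap_open=3, gap_extend=1):
    """Best local alignment score with affine gaps, plus its end cell (i, j).

    gap_open is the cost of the first gapped position, gap_extend the cost of
    each further one. All scoring values are illustrative assumptions.
    """
    neg_inf = float("-inf")
    rows, cols = len(query) + 1, len(reference) + 1
    h = [[0] * cols for _ in range(rows)]        # best score ending at (i, j)
    e = [[neg_inf] * cols for _ in range(rows)]  # ending with a gap in the query
    f = [[neg_inf] * cols for _ in range(rows)]  # ending with a gap in the reference
    best_score, best_end = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            e[i][j] = max(h[i][j - 1] - gap_open, e[i][j - 1] - gap_extend)
            f[i][j] = max(h[i - 1][j] - gap_open, f[i - 1][j] - gap_extend)
            step = match if query[i - 1] == reference[j - 1] else mismatch
            # The zero lets an alignment restart anywhere (local alignment).
            h[i][j] = max(0, h[i - 1][j - 1] + step, e[i][j], f[i][j])
            if h[i][j] > best_score:
                best_score, best_end = h[i][j], (i, j)
    return best_score, best_end


if __name__ == "__main__":
    print(smith_waterman_gotoh("ACGT", "TTACGTTT"))
    print(smith_waterman_gotoh("AAAACCCC", "AAAAGGCCCC"))
```

Aligning every sequence pairwise against one reference keeps position numbering fixed to HXB2 coordinates, whereas a multiple alignment changes whenever sequences are added.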

Domain coding
• Binary dummy variables for specific amino acid changes (plus insertions/deletions and “any” substitution) in the V3 loop and in the envelope
• Physico-chemical coding for position changes
• Subtype
• Clinical markers (HIV RNA load, CD4 and CD8 cell counts)
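One way to picture the binary dummy coding is the sketch below, which turns an aligned query/reference pair into named indicator features. The naming scheme ("298K", "299del", "298any"), the decision to skip insertions, and the toy sequences are assumptions made for illustration. The physico-chemical, subtype and clinical variables are not covered.

```python
def mutation_features(aligned_query, aligned_reference, first_position=1):
    """Collect feature names for changes relative to the reference.

    Both inputs are rows of a pairwise alignment with "-" for gaps, and
    positions are numbered along the reference. Insertions relative to the
    reference are skipped in this sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    features = set()
    position = first_position - 1
    for q, r in zip(aligned_query, aligned_reference):
        if r == "-":
            continue  # insertion: not encoded here
        position += 1
        if q == r:
            continue
        if q == "-":
            features.add(f"{position}del")
        else:
            features.add(f"{position}{q}")    # specific substitution
            features.add(f"{position}any")    # "any" substitution at this site
    return features


def to_dummy_vector(features, vocabulary):
    """Map a feature set onto a fixed, ordered vocabulary of 0/1 variables."""
    return [1 if name in features else 0 for name in vocabulary]


# Toy alignment (hypothetical): reference positions 296-300.
toy_features = mutation_features("CTK-N", "CTRPN", first_position=296)
toy_vector = to_dummy_vector(toy_features, ["298K", "298any", "299del", "300Y"])

if __name__ == "__main__":
    print(sorted(toy_features), toy_vector)
```

Keeping the vocabulary fixed and ordered means every sequence maps to a vector of the same length, which a regression model requires.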

Univariable analysis
• CD4+ cell counts are significantly associated with tropism (low CD4+ → X4)
• Subtype B, D isolates are prevalently X4
• Subtype A, C, 02_AG isolates are prevalently R5

Univariable analysis (2)
• Highly significant positions in the V3 loop (envelope position, with V3 loop position in parentheses): 306 (11), 302 (7), 303 (8), 323 (28), 301 (6), 313 (18), 321 (26), 322 (27), 300 (5), 315 (20), 320 (25), 307 (12), 316 (21), 325 (30), 304 (9), …
• A few positions outside the V3 loop are significant, but slightly over the Benjamini-Hochberg adjusted threshold (adj. p < 0.1): 440, 192, 169
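The Benjamini-Hochberg adjustment behind the threshold above can be sketched as follows. This is the standard step-up procedure written in plain Python. The function name and the toy p-values are illustrative and do not come from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy p-values (hypothetical), one per tested position.
toy_p_values = [0.01, 0.04, 0.03, 0.5]
toy_adjusted = benjamini_hochberg(toy_p_values)
toy_significant = [p < 0.1 for p in toy_adjusted]

if __name__ == "__main__":
    print(toy_adjusted, toy_significant)
```

With hundreds of envelope positions tested, controlling the false discovery rate this way is less conservative than a Bonferroni correction.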

Hierarchical Clustering
• Threshold of 0.35: {318A, ins317}, {311I, 308S, 306del, 307del}, {322I, 320 hydrophilic, 326I}
• Mutations positively associated with X4 viruses tend to behave more independently (306S, 303I, 308K, 300Y and 307T)

Machine Learning
• Logistic Regression (LR)
• Feature selection via filter and embedded methods (univariable analysis, AIC selection, CFS, ridge shrinkage)
• Comparison with other (non-linear) machine learning techniques
  – SVM (same settings as Sander, 2007)
  – Random Forests and Decision Trees (RF, DT)
  – Rule Bases (RIPPER, JRIP)
  – Instance Based Reasoning (IBR)
• Multiple 10-fold cross validation for model performance assessment and model comparison
  – Student’s t-test adjusted (Bengio and Nadeau) for sample overlap and multiple comparisons over 10 independent runs
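The adjusted t-test credited to Bengio and Nadeau is usually given as the corrected resampled t-statistic, which inflates the variance term to account for overlapping training sets across folds. The sketch below assumes that formulation. The function name and the toy differences are illustrative, and the multiple-comparison adjustment mentioned on the slide is not included.

```python
import math
import statistics


def corrected_resampled_t(differences, test_train_ratio):
    """Corrected resampled t-statistic for paired per-fold score differences.

    differences: one performance difference (model A minus model B) per fold
                 and run, e.g. 100 values for 10 runs of 10-fold CV.
    test_train_ratio: n_test / n_train, i.e. 1/9 for 10-fold cross validation.
    The statistic is compared against a t distribution with
    len(differences) - 1 degrees of freedom.
    """
    count = len(differences)
    if count < 2:
        raise ValueError("need at least two differences")
    mean_difference = statistics.fmean(differences)
    variance = statistics.variance(differences)  # sample variance
    if variance == 0:
        raise ValueError("zero variance: the statistic is undefined")
    return mean_difference / math.sqrt((1 / count + test_train_ratio) * variance)


# Toy AUC differences between two models (hypothetical values).
toy_differences = [0.01, 0.02, 0.03, 0.02]
toy_t = corrected_resampled_t(toy_differences, test_train_ratio=1 / 9)

if __name__ == "__main__":
    print(toy_t)
```

Compared with the ordinary paired t-test, which uses only the 1/count term, the extra test_train_ratio term makes the test more conservative.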

Results
• Logistic Regression
  – High accuracy (92.76%) and AUC (0.93)
  – Enhanced domain coding performs significantly better than naïve variable encoding and the V3 loop alone
  – Performs on par with the reference SVM

Results (2)

Conclusions
• Logistic Regression is a powerful and interpretable tool for tropism prediction
  – Importance of envelope region analysis
  – Importance of enhanced variable encoding
  – Importance of feature selection techniques
  – Importance of robust validation and comparison statistics
  – We have a linear model: in the comparison analysis, non-linear models do not seem to improve performance
• The modelling technique is also suitable for combination with structure-based methods
