Mattia CF Prosperi ahnven@yahoo.it University of Roma TRE Faculty - PowerPoint PPT Presentation

Mattia CF Prosperi
ahnven@yahoo.it
University of “Roma TRE”, Faculty of Computer Science Engineering
Dept of Computer Science and Automation (DIA)
via della vasca navale, 79 – 00149 – Rome, ITALY

Summary
• The EuResist project
  – Project partners
  – Aims
  – Collaboration with other projects
• The Integrated Data Base
  – Technologies
• Therapy optimisation issues
  – Theoretical models, validation, comparison with state of the art
• Web-service development
  – User interface
  – Expert validation

The EuResist Project
• Funded by the EU under the 6th Framework Programme
• Partners
  – Machine learning and data bases: IBM (Isr), MPI (Ger), Roma TRE (Ita), RMKI (Hun)
  – Statistical analyses: Kingston University (UK)
  – Clinical and genomic data collection, virology and clinical expertise: University of Siena (Ita), Karolinska Inst (Swe), University of Cologne (Ger)
  – Coordination and administration: Informa CRO (Ita)
• Collaboration and data exchange with the “Virolab” project (also EU-funded)

Aims
• Collect and integrate clinical and genomic data of HIV+ patients
• Perform retrospective statistical studies
• Develop prediction models for therapy optimisation

Data sources
• ARCA (Italy)
• AREVIR (Germany)
• Karolinska (Sweden)
• Luxembourg cohort
• Probably the largest collection of information about HIV+ patients (in terms of sequences and clinical markers) in Europe, and possibly in the world (only EuroSIDA is comparable)

Data base technologies
• IBM used a centralised approach
  – The data are replicated from the individual sources into a new data base
  – This is an older data integration technique: the federated approach, where data stay in the local data bases and are accessed through a virtual integrated view, is now preferred. The centralised approach still has practical advantages, especially with heterogeneous data sources

Data base technologies (2)
• Local sources are mapped to the central DB
• Reliable server
• Quality controls
• Interface for statistical studies and model development
• HL7 compliance

Data base schema
• Normalised schema (important issue from an IT point of view)

Data base size

Therapy optimisation
• Objective: to determine the optimal Combined Anti-Retroviral Therapy (CART) for a patient starting a first-line therapy or experiencing a Treatment Change Episode (TCE), given the patient’s baseline (demographic, genomic, clinical) and historical characteristics

Study Design

State of the art
• Phenotype (in-vitro)
  – VIRCO, Virologic, virtual Geno2pheno
• Rule based methods (in-vivo)
  – Stanford hivdb, REGA, ANRS, HIV-GRADE, various scores for specific drugs (Marcelin, Bertoli…)
    • Based on literature evidence, expert opinions and statistical studies
    • Not cross-validated, but proven to be significantly associated with virological outcomes through linear multivariable analysis
    • Give predictions based only on genotype, without accounting for other variables (e.g. viral load, CD4, demographics), even if their significance is sometimes adjusted for such covariates
    • Don’t work on combination therapies (CART)
• Data driven approaches (in-vivo)
  – RDI (Artificial Neural Networks)
    • Biased study design, not properly validated

The EuResist approach
• Data driven models
• Large sample size
• Robust cross validation
• Comparison with state of the art
• Comparison with expert opinions

Exploring the feature space
• Use of all available information in addition to the baseline genotype and treatment
  – Demographics, treatment history, baseline markers, past genotypes…
  – Derived features
    • Mutagenetic trees (genetic barrier)
    • Bayesian networks for past combination treatments
    • Higher order interactions
• Only a minimal feature set (genotype and treatment) is required to perform a prediction
  – Treatment history or past genotypes are not always available
  – But the use of additional information can improve performance

Modelling techniques
• Three independent engines developed by IBM, RM3 and MPI
• The engines are combined in a meta-engine
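The slides do not state which combination rule the meta-engine uses. As a minimal sketch, assuming each engine outputs a probability of virological success, two simple and purely hypothetical rules are the mean probability and a majority vote (the function names and the 0.5 threshold are illustrative, not taken from the project):

```python
from statistics import mean


def combine_mean(probabilities):
    """Average the success probabilities returned by the individual engines."""
    return mean(probabilities)


def combine_majority(probabilities, threshold=0.5):
    """Return True when more than half of the engines predict success."""
    votes = sum(p >= threshold for p in probabilities)
    return votes * 2 > len(probabilities)


# Made-up outputs of three engines for one candidate therapy.
engine_outputs = [0.6, 0.4, 0.7]
print(combine_mean(engine_outputs), combine_majority(engine_outputs))
```

Averaging keeps a graded probability, which is what a ranking of therapies needs; the vote only gives a yes/no answer.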

Modelling techniques (2)
• All engines use Logistic Regression (LR)
  – IBM uses additional features, training a Bayesian network on past treatments
  – MPI uses additional features, estimating the genetic barrier through mutagenetic trees
  – RM3 uses higher order interactions
    • mutation x mutation
    • drug x drug (x drug)
    • drug x mutation
    • drug x past drug
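The four interaction types listed for RM3 can be illustrated with a small feature builder. This is a sketch under assumptions: features are binary indicators, only second-order terms are built (the optional third drug is omitted), and the naming scheme ("A and B", "X experience") simply mimics the labels in the results table; the function itself is hypothetical.

```python
from itertools import combinations


def interaction_features(mutations, drugs, past_drugs):
    """Return a dict of binary main-effect and pairwise interaction features."""
    muts, cur, past = sorted(mutations), sorted(drugs), sorted(past_drugs)
    features = {}
    # Main effects: mutations, current drugs, previously used drugs.
    for name in muts + cur:
        features[name] = 1
    for p in past:
        features[f"{p} experience"] = 1
    # mutation x mutation and drug x drug
    for a, b in combinations(muts, 2):
        features[f"{a} and {b}"] = 1
    for a, b in combinations(cur, 2):
        features[f"{a} and {b}"] = 1
    # drug x mutation
    for d in cur:
        for m in muts:
            features[f"{m} and {d}"] = 1
    # drug x past drug
    for d in cur:
        for p in past:
            features[f"{d} and {p} experience"] = 1
    return features


feats = interaction_features({"RT_184_V", "RT_67_N"}, {"3TC", "EFV"}, {"EFV"})
print(len(feats))
```

Even this toy input with two mutations and two drugs yields more interaction terms than main effects, which is why feature selection becomes necessary.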

Modelling techniques (3)
• A lot of features!!!
  – Hundreds of mutations (not only literature reported)
  – Hundreds of different CART
  – Other covariates
  – All higher order interactions (thousands!!!)
• Several feature selection techniques used
  – AIC selection
  – Correlation-based Feature Selection (CFS)
  – SVM z-scores
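For the first selection technique, the Akaike Information Criterion of a fitted model is AIC = 2k − 2 ln L, where k is the number of parameters and L the maximised likelihood; lower is better. The sketch below only shows the comparison step, assuming the log-likelihoods come from models fitted elsewhere. The helper names and the numbers are made up for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(candidates):
    """candidates maps a model name to (log_likelihood, n_params)."""
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fits: the larger model fits slightly better but pays
# a penalty for its nine extra parameters.
models = {"small": (-105.0, 3), "large": (-100.0, 12)}
print(best_by_aic(models))
```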

Results
• Individual prediction engines perform similarly
• Combining the engines improves performance
  – Several combination techniques explored
• Using additional information improves performance

Results (2)
• Comparison with state of the art:
  – The combined engine outperforms Stanford hivdb
  – Single engines also do, though by a smaller margin

Results (3)
• Example of logistic model with higher-order interactions
• Variable importance is assessed easily

Variable (success prediction)    odds.ratio  p.value   sign.
Number of drugs in CART          1.9         2.00E-16  ***
HIV RNA baseline LOG cp/ml       0.6         2.66E-12  ***
PR_IAS_54_V                      0.2         1.31E-06  ***
EFV and EFV experience           0.2         2.00E-05  ***
RT_184_V and 3TC                 0.5         2.79E-05  ***
SQV and AZT experience           0.4         0.000146  ***
NFV and PI experience            0.5         0.000224  ***
RT_184_V and NVP                 0.4         0.000344  ***
RT_39_A and RT_211_K             0.4         0.000378  ***
(Intercept)                      4.8         0.000399  ***
RT_67_N and RT_184_V             2           0.00056   ***
RTV experience                   0.5         0.00061   ***
TDF and EFV experience           0.5         0.000633  ***
PR_63_P and PR_90_M              0.6         0.00082   ***
PR_89_M and PR_93_L              3.8         0.000873  ***
PR_IAS_20_M                      0.2         0.001149  **
EFV                              1.8         0.001223  **
PR_IAS_10_I                      0.6         0.002524  **
RT_177_E and RT_207_A            2.3         0.007537  **
PR_IAS_54_L                      0.2         0.007575  **
APV experience                   0.5         0.008579  **
LPV and DDC experience           1.9         0.0087    **
PI_boosted and LPV experience    0.5         0.009403  **
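In a logistic model the odds of success are the baseline odds (the exponentiated intercept) multiplied by each term's odds ratio raised to the value of that term, and the probability is odds / (1 + odds). The sketch below applies this to two values from the table. It is illustrative only: it assumes the intercept row is reported on the odds scale like the other rows, and it sets every other term to zero, which does not describe a realistic patient. The function name is hypothetical.

```python
def success_probability(baseline_odds, odds_ratios, values):
    """Turn odds ratios into a probability for the given covariate values."""
    odds = baseline_odds
    for name, x in values.items():
        # Each term multiplies the odds by its odds ratio to the power x.
        odds *= odds_ratios[name] ** x
    return odds / (1.0 + odds)


odds_ratios = {"PR_IAS_54_V": 0.2}  # value taken from the table
# With the mutation present and all other terms at zero:
print(success_probability(4.8, odds_ratios, {"PR_IAS_54_V": 1}))
# Without it:
print(success_probability(4.8, odds_ratios, {}))
```

An odds ratio below 1 (here 0.2) lowers the predicted probability of success, while one above 1 raises it, which is how the table can be read directly for variable importance.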

Comparison with experts’ opinion
• The “EVE” (Expert Vs Engine) study
  – Aim: assess the performance of the EuResist prediction engine and its agreement with expert opinion
  – Design: a set of TCEs with complete information is defined, and physicians give their opinion on the probability of virological success
  – Evaluation: kappa-statistic (measure of agreement among experts), accuracy, AUC
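The evaluation measures can be sketched in a few lines. The slides do not say which kappa variant was used; the example below assumes Cohen's kappa for two raters as a simplification (agreement among more than two experts would usually need a multi-rater variant such as Fleiss' kappa) and computes AUC as the fraction of success/failure pairs that the scores order correctly. All data in the usage lines are made up.

```python
def cohen_kappa(a, b):
    """Chance-corrected agreement between two raters' labels.

    Undefined (division by zero) when expected agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)


def auc(scores, outcomes):
    """Probability that a success (1) is scored above a failure (0); ties count half."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


expert_1 = [1, 1, 0, 0]
expert_2 = [1, 0, 1, 0]
print(cohen_kappa(expert_1, expert_2))
print(auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```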

Web service
• Technology: Ruby on Rails
  – open source web framework
  – large developer community
  – well documented
  – very good for web-service development

Web service (2)
• The user enters
  – Baseline viral sequence (FASTA or mutation list)
  – Optional covariates
    • Baseline markers (CD4 and HIV RNA)
    • Age, sex, risk group
    • Previously experienced treatments
  – A suitable CART to be evaluated
• The user gets
  – Sequence mutations and subtype match
  – Probability of success (with CI) for the chosen CART
  – A ranking of other suitable therapies (over a set of CART allowed by international guidelines)
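The ranking step can be sketched independently of the web framework. The example assumes a `predict` callable that returns the probability of success for one candidate CART; that callable, the function name, and the toy probabilities are hypothetical stand-ins for the actual engine, and the sketch is in Python rather than the service's Ruby.

```python
def rank_therapies(candidates, predict):
    """Sort candidate therapies by predicted probability of success, best first."""
    scored = [(therapy, predict(therapy)) for therapy in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up predictions for two candidate combinations.
toy_predictions = {
    ("AZT", "3TC", "EFV"): 0.70,
    ("TDF", "FTC", "LPV"): 0.80,
}
for therapy, probability in rank_therapies(list(toy_predictions), toy_predictions.get):
    print("+".join(therapy), probability)
```

Restricting `candidates` to the combinations allowed by the guidelines before ranking keeps the prediction engine and the clinical rules separate.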

Web service (3)

Web service (4)
