History-alignment Models for Bias-aware Prediction of Virological Response to HIV Combination Therapy. Jasmina Bogojeska, Department of Computational Biology and Applied Algorithmics, Max-Planck Institute for Informatics. AREVIR – GenaFor – Meeting, 2012
Problem Setting: Bias-aware prediction of the outcome of combination therapies given to HIV patients. Develop methods that can deal with: evolving trends in treating patients over time; sparse, uneven therapy representation; different treatment backgrounds of the samples.
History-Aware Models. Problem with clinical data: samples have different treatment backgrounds (uneven sample representation regarding level of therapy experience); only the dominant viral strain is sequenced (no information on the latent virus population). Idea: use treatment history information. Therapy sequence: $r(t) = \{\, z \mid \mathrm{start}(z) \le \mathrm{start}(t) \ \text{and} \ \mathrm{patient}(z) = \mathrm{patient}(t) \,\}$, i.e. all therapies of the same patient that started no later than t. Pairwise similarity of therapy sequences: adapt sequence alignment methods!
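The definition of r(t) can be illustrated with a minimal sketch. It assumes each therapy record carries a patient identifier, a start date and a set of drugs; the `Therapy` class, the `therapy_sequence` helper and the example records are hypothetical and only mirror the set definition above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Therapy:
    patient: str       # hypothetical patient identifier
    start: date        # start date of the therapy
    drugs: frozenset   # drugs in the combination (illustrative names)


def therapy_sequence(t, all_therapies):
    """Return r(t): therapies of t's patient that started no later than t, oldest first."""
    seq = [z for z in all_therapies
           if z.patient == t.patient and z.start <= t.start]
    return sorted(seq, key=lambda z: z.start)


# Illustrative records, not taken from the data set described in the talk.
therapies = [
    Therapy("p1", date(2005, 1, 10), frozenset({"AZT", "3TC"})),
    Therapy("p1", date(2007, 6, 1), frozenset({"TDF", "FTC", "EFV"})),
    Therapy("p1", date(2009, 3, 15), frozenset({"TDF", "FTC", "LPV"})),
    Therapy("p2", date(2006, 2, 2), frozenset({"AZT", "3TC", "NVP"})),
]
seq = therapy_sequence(therapies[1], therapies)
```

Note that r(t) includes t itself, because start(t) ≤ start(t) holds trivially.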
Similarity of Therapy Sequences: Therapy sequence alignment. Quantify pairwise therapy similarity. Use drug resistance mutations.
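One way to picture therapy sequence alignment is a global (Needleman-Wunsch style) alignment in which each therapy is represented by a set of resistance mutations. The sketch below is an assumption, not the scoring scheme used in the talk: the Jaccard overlap as the pairwise therapy score, the gap penalty of -0.5, and the example mutation sets are all hypothetical choices.

```python
def jaccard(a, b):
    """Overlap of two mutation sets, between 0 and 1 (assumed therapy similarity)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def alignment_score(seq_a, seq_b, gap=-0.5):
    """Global alignment score of two therapy sequences (lists of mutation sets)."""
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # prefix of seq_a aligned to gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # prefix of seq_b aligned to gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + jaccard(seq_a[i - 1], seq_b[j - 1]),  # align two therapies
                score[i - 1][j] + gap,                                      # gap in seq_b
                score[i][j - 1] + gap,                                      # gap in seq_a
            )
    return score[n][m]


# Illustrative inputs: each therapy is given as a set of resistance mutations.
a = [{"M184V"}, {"K103N", "M184V"}]
b = [{"M184V"}, {"K103N"}]
same = alignment_score(a, a)
mixed = alignment_score(a, b)
```

Identical sequences receive the highest score, and the score drops as the aligned therapies share fewer mutations or require gaps.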
History Alignment Model. Train a separate model for each therapy sequence by using knowledge from similar therapy sequences. Sample-weighted regularized logistic regression:
$$\arg\max_{w_t} \; \frac{1}{|D|} \sum_{(x_i, z_i, h_i, y_i) \in D} S(r(z_i), r(t)) \, l(f(x_i, z_i, h_i, w_t), y_i) + \sigma \, w_t^{T} w_t$$
Here S(r(z_i), r(t)) is the sequence similarity and l is the loss function; the formula also involves a parameter γ whose role is not stated. Consider the treatment backgrounds of the samples, the latent virus population and the current therapy. Account for the sparse representation of the different therapy histories. Account for the missing information on the latent virus population. Account for the sparse, uneven therapy representation.
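A minimal sketch of sample-weighted regularized logistic regression follows. It assumes the per-sample weights are the sequence similarities to the target therapy sequence, the loss is the logistic loss on labels in {1, -1}, and the weighted loss plus an L2 penalty is minimized by plain gradient descent. The function names, learning rate, step count and toy data are hypothetical; this is not the authors' implementation.

```python
import math


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def fit_weighted_logreg(X, y, weights, sigma=0.1, lr=0.5, steps=500):
    """Minimize (1/n) * sum_i weights[i] * log(1 + exp(-y_i * w.x_i)) + sigma * w.w."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [2.0 * sigma * wj for wj in w]            # gradient of the L2 penalty
        for xi, yi, si in zip(X, y, weights):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            coeff = -si * yi * sigmoid(-margin) / n      # weighted logistic-loss gradient factor
            for j in range(d):
                grad[j] += coeff * xi[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy data: first column is a bias term; labels are 1 (success) or -1 (failure).
X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0]]
y = [1, -1, 1, -1]
similarities = [1.0, 0.8, 0.3, 0.1]   # assumed similarities to the target sequence
w_t = fit_weighted_logreg(X, y, similarities)
```

Samples whose therapy sequences resemble the target sequence dominate the fit, while dissimilar samples still contribute with a small weight.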
History Distribution Matching Model. Cluster the training data using the pairwise similarities of their corresponding therapy sequences. Apply the multi-task distribution matching method with clusters as tasks. For each (target) cluster t: multi-class logistic regression; sample-weighted logistic regression.
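The clustering step can be sketched as follows. The talk does not name a clustering algorithm, so this example swaps in a simple k-medoids procedure on a precomputed similarity matrix; the function name, the farthest-first initialization and the toy matrix are assumptions. The distribution matching step itself is not shown.

```python
def cluster_by_similarity(sim, k, iters=20):
    """Group items using a symmetric similarity matrix; returns (medoids, labels)."""
    n = len(sim)
    # Assumed initialization: start at item 0, then add the item least similar
    # to the medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(min(candidates, key=lambda i: max(sim[i][m] for m in medoids)))
    labels = [0] * n
    for _ in range(iters):
        # Assign every item to its most similar medoid.
        labels = [max(range(k), key=lambda c: sim[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member most similar to the rest of its cluster.
            new_medoids.append(max(members, key=lambda m: sum(sim[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Toy pairwise similarities of five therapy sequences (illustrative values).
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.2, 0.1, 0.1, 0.9, 1.0],
]
medoids, labels = cluster_by_similarity(sim, k=2)
```

Each resulting cluster would then play the role of one task in the multi-task distribution matching step.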
History Distribution Matching Model: Added advantages. Address the increasing imbalance in the representation of the effective and ineffective therapies over time:

Data set       training   tuning   test
Sample count   3596       1634     1307
Success rate   69 %       79 %     83 %

Independent estimation of the models for the effective and the ineffective therapies in the distribution matching step. Time-oriented model selection. Address missing treatment history information: the model for each cluster also uses the data from other clusters with appropriately derived relevance weights.
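A time-oriented split can be sketched as below, assuming samples are ordered by therapy start date so that the oldest go to training, more recent ones to tuning, and the most recent to testing. The talk does not state exactly how its sets were formed, so the helper names, the split rule and the toy samples are hypothetical.

```python
from datetime import date


def chronological_split(samples, n_train, n_tune):
    """Split samples by start date: oldest for training, then tuning, newest for testing."""
    ordered = sorted(samples, key=lambda s: s["start"])
    train = ordered[:n_train]
    tune = ordered[n_train:n_train + n_tune]
    test = ordered[n_train + n_tune:]
    return train, tune, test


def success_rate(samples):
    """Fraction of samples labeled 1 (success)."""
    return sum(1 for s in samples if s["label"] == 1) / len(samples)


# Toy samples (illustrative dates and labels).
samples = [
    {"start": date(2008, 5, 1), "label": 1},
    {"start": date(2003, 1, 1), "label": -1},
    {"start": date(2010, 9, 1), "label": 1},
    {"start": date(2005, 7, 1), "label": 1},
    {"start": date(2004, 3, 1), "label": -1},
    {"start": date(2009, 2, 1), "label": 1},
]
train, tune, test = chronological_split(samples, n_train=3, n_tune=2)
```

Tuning on data that is more recent than the training data selects models under conditions closer to those of the newest therapies.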
Results (Treatment History): History-aware models achieve better predictions for treatment-experienced patients.
Results (Therapy Abundances): History-aware models achieve better predictions for rare therapies.
Conclusions. Treatment history-aware methods: information extracted from treatment history enhances the performance for therapy-experienced patients and for rare therapies. History-alignment method: patient-specific models that utilize detailed treatment history information. History distribution matching method: addresses the increasing gap between the representation of successful and failing therapies over time; tackles the problem of missing treatment history information. All of this improves prediction accuracy.
Acknowledgements: Thomas Lengauer and the HIV group at MPI; Rolf Kaiser and his group.
Thank You!
Data

Viral genotype               0 1 0 0 1 …   occurrence of resistance mutations
Current treatment            1 0 0 1 0 …   drugs used in current treatment
Treatment history            1 1 0 0 1 …   drugs used in all previous treatments
Label (success or failure)   1 or -1

6336 labeled samples with 638 different combination therapies
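The binary encoding in this table can be sketched as follows. The mutation and drug vocabularies, the `indicator` and `encode_sample` helpers, and the example sample are hypothetical; only the layout (three indicator vectors plus a label of 1 or -1) follows the table.

```python
# Hypothetical vocabularies; the real feature sets are not listed in the talk.
MUTATIONS = ["M41L", "K103N", "M184V", "T215Y"]
DRUGS = ["AZT", "3TC", "TDF", "EFV", "LPV"]


def indicator(items, vocabulary):
    """Binary vector with a 1 for every vocabulary entry present in items."""
    present = set(items)
    return [1 if v in present else 0 for v in vocabulary]


def encode_sample(mutations, current_drugs, previous_therapies, success):
    """Encode one sample as genotype, current treatment, treatment history and label."""
    history_drugs = set().union(*previous_therapies)   # drugs used in all previous treatments
    return {
        "genotype": indicator(mutations, MUTATIONS),
        "current": indicator(current_drugs, DRUGS),
        "history": indicator(history_drugs, DRUGS),
        "label": 1 if success else -1,
    }


sample = encode_sample(
    mutations=["K103N", "M184V"],
    current_drugs=["TDF", "3TC", "LPV"],
    previous_therapies=[["AZT", "3TC"], ["TDF", "3TC", "EFV"]],
    success=False,
)
```

Concatenating the three vectors gives one feature vector per sample, which is a common way to feed such data to a linear model.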