SUPPORT VECTOR MACHINES FOR DIFFERENTIAL PREDICTION
Finn Kuusisto (1), Vitor Santos Costa (2), Houssam Nassif (3), Elizabeth Burnside (1), David Page (1), and Jude Shavlik (1)
(1) University of Wisconsin-Madison, (2) University of Porto, (3) Amazon
9/18/2014 Support Vector Machines for Differential Prediction
DIFFERENTIAL PREDICTION
Goal: Use modeling techniques to gain insight about the differences between two subgroups of a population.
UPLIFT MODELING (Radcliffe & Simpson, 2008)
How do we choose which customers to target with some marketing activity?
Persuadables: Customers who respond positively to marketing activity.
Sure Things: Customers who respond positively regardless.
Lost Causes: Customers who respond negatively regardless.
Sleeping Dogs: Customers who respond negatively to marketing activity.
UPLIFT MODELING (Radcliffe & Simpson, 2008)
True customer groups are unknown. Each observed outcome mixes two groups:
Target, Response: Persuadables, Sure Things
Target, No Response: Sleeping Dogs, Lost Causes
Control, Response: Sleeping Dogs, Sure Things
Control, No Response: Persuadables, Lost Causes
UPLIFT MODELING
Lift: The number of true positives that a classifier achieves at a given proportion of the population labeled positive.
Uplift: The difference in lift produced by a classifier between target and control subgroups.
The area under the uplift curve (AUU) is the difference between the areas under the target and control lift curves (AUL):
$$\mathrm{AUU} = \mathrm{AUL}_T - \mathrm{AUL}_C$$
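A minimal sketch of the two definitions above, assuming binary labels (1 for response) and a real-valued score per example. The function names and the toy data are illustrative assumptions, not taken from the talk.

```python
def lift_at(scores, labels, proportion):
    """Number of true positives among the top `proportion` of examples by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = round(proportion * len(ranked))  # how many examples are labeled positive
    return sum(label for _, label in ranked[:k])


def uplift_at(target, control, proportion):
    """Lift on the target subgroup minus lift on the control subgroup.

    `target` and `control` are (scores, labels) pairs; the same proportion
    of each subgroup is labeled positive.
    """
    return lift_at(*target, proportion) - lift_at(*control, proportion)


# Toy data, made up for illustration only.
target = ([0.9, 0.8, 0.4, 0.2], [1, 1, 0, 0])
control = ([0.9, 0.7, 0.3, 0.1], [0, 1, 0, 1])
print(uplift_at(target, control, 0.5))
```

Sweeping `proportion` from 0 to 1 traces out the uplift curve; the area under it is the AUU used in the results.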
TASK: ADVERSE COX-2 INHIBITOR EFFECTS
• Non-steroidal anti-inflammatory drug (NSAID)
• Significantly reduced occurrence of adverse gastrointestinal effects common to other NSAIDs (e.g., ibuprofen)
• Rapid and widespread acceptance for treatment of ailments such as arthritis
• Later clinical trials showed increased risk of myocardial infarction (MI), or "heart attack"
Identify patients who are susceptible to an increased risk of MI as a direct result of taking COX-2 inhibitors.
UPLIFT MODELING TO MEDICINE: COX-2 INHIBITORS
Want: Identify patients who demonstrate an increased risk of MI as a direct result of being treated with COX-2 inhibitors.
Main Assumption: Patients with an increased risk of MI due to treatment with COX-2 inhibitors are directly analogous to customers with an increased chance of buying due to targeting: the Persuadables.
METHODS
• Compared SVM^Upl against 4 alternate SVM methods
• 10-fold cross-validation for evaluation
• Cost parameters selected from 10 through 10^-6
• Mann-Whitney test at 95% confidence for per-fold AUU comparison
RESULTS: COX-2 INHIBITORS
Model        AUU    COX-2 AUL   No COX-2 AUL   SVM^Upl p-value
SVM^Upl      50.7   123.4       72.7           -
Two-Cost     20.0   126.2       106.3          0.004 *
COX-2-Only   13.8   151.5       137.7          0.002 *
Standard     1.2    147.7       146.5          0.002 *
Flipped      28.5   102.2       73.6           0.037 *
Baseline     0.0    0.0         0.0            0.002 *
RESULTS: COX-2 INHIBITORS
HOW
Extend previous SVM work maximizing AUC (Joachims, 2005) to maximize AUU instead.
SVM FOR UPLIFT
Let the positive skew of data be:
$$\pi = \frac{P}{P + N}$$
Then (Tuffery, 2011):
$$\mathrm{AUL} = P\left(\frac{\pi}{2} + (1 - \pi)\,\mathrm{AUC}\right)$$
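The relation AUL = P(π/2 + (1 − π) AUC), with π = P / (P + N), can be checked numerically. The sketch below assumes no tied scores, measures the lift curve with the proportion of the population on the x-axis and the count of true positives on the y-axis, and uses the trapezoidal rule. The helper names and data are illustrative assumptions.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly (no ties assumed)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1 for p in pos for n in neg if p > n)
    return correct / (len(pos) * len(neg))


def aul(scores, labels):
    """Trapezoidal area under the lift curve.

    x-axis: proportion of the population labeled positive (0 to 1);
    y-axis: number of true positives.
    """
    ranked = [y for _, y in sorted(zip(scores, labels), reverse=True)]
    n, area, tp = len(ranked), 0.0, 0
    for y in ranked:
        area += (tp + (tp + y)) / 2 / n  # trapezoid of width 1/n
        tp += y
    return area


# Toy data, made up for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]
P = sum(labels)
pi = P / len(labels)
print(aul(scores, labels), P * (pi / 2 + (1 - pi) * auc(scores, labels)))
```

The two printed values agree because each positive contributes a half-step to the lift area and each negative contributes the number of positives ranked above it, which is exactly what the pairwise AUC counts.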
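SVM FOR UPLIFT
$$\mathrm{AUU} = \mathrm{AUL}_T - \mathrm{AUL}_C = P_T\left(\frac{\pi_T}{2} + (1 - \pi_T)\,\mathrm{AUC}_T\right) - P_C\left(\frac{\pi_C}{2} + (1 - \pi_C)\,\mathrm{AUC}_C\right)$$
$$\max(\mathrm{AUU}) \equiv \max\left(P_T(1 - \pi_T)\,\mathrm{AUC}_T - P_C(1 - \pi_C)\,\mathrm{AUC}_C\right) \propto \max\left(\mathrm{AUC}_T - \frac{P_C(1 - \pi_C)}{P_T(1 - \pi_T)}\,\mathrm{AUC}_C\right)$$
With $\lambda = \frac{P_C(1 - \pi_C)}{P_T(1 - \pi_T)}$:
$$\max(\mathrm{AUU}) \equiv \max(\mathrm{AUC}_T - \lambda\,\mathrm{AUC}_C)$$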
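As a small numeric check of this reduction: for fixed group sizes, AUU equals a constant plus P_T(1 − π_T)(AUC_T − λ AUC_C), with λ = P_C(1 − π_C) / (P_T(1 − π_T)), so maximizing one maximizes the other. The function names and the counts below are hypothetical, chosen only for illustration.

```python
def skew(p, n):
    """Positive skew pi = P / (P + N)."""
    return p / (p + n)


def uplift_lambda(p_t, n_t, p_c, n_c):
    """lambda = P_C (1 - pi_C) / (P_T (1 - pi_T))."""
    return (p_c * (1 - skew(p_c, n_c))) / (p_t * (1 - skew(p_t, n_t)))


def auu_from_auc(p_t, n_t, auc_t, p_c, n_c, auc_c):
    """AUU = AUL_T - AUL_C, with each AUL expanded as P (pi/2 + (1 - pi) AUC)."""
    pi_t, pi_c = skew(p_t, n_t), skew(p_c, n_c)
    aul_t = p_t * (pi_t / 2 + (1 - pi_t) * auc_t)
    aul_c = p_c * (pi_c / 2 + (1 - pi_c) * auc_c)
    return aul_t - aul_c


# Hypothetical counts: 30 positives / 70 negatives in target, 20 / 80 in control.
p_t, n_t, p_c, n_c = 30, 70, 20, 80
lam = uplift_lambda(p_t, n_t, p_c, n_c)
const = p_t * skew(p_t, n_t) / 2 - p_c * skew(p_c, n_c) / 2  # AUC-independent part
scale = p_t * (1 - skew(p_t, n_t))                            # positive multiplier
auc_t, auc_c = 0.8, 0.6
print(auu_from_auc(p_t, n_t, auc_t, p_c, n_c, auc_c),
      const + scale * (auc_t - lam * auc_c))
```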
TASK: IN SITU BREAST CANCER
• Most common cancer in women
• Two basic stages: in situ and invasive
  - In situ cancer cells are localized
  - Invasive cancer cells have infiltrated surrounding tissue
• Younger women tend to have more aggressive in situ cancer
• Older women sometimes have indolent in situ cancer
Identify older patients with indolent in situ breast cancer.
UPLIFT MODELING TO MEDICINE: BREAST CANCER
Want: Identify older patients with in situ breast cancer that is distinct from that of younger patients.
Main Assumption: Older patients with in situ breast cancer that is distinct from that of younger patients, who tend to have aggressive cancer, have a decreased risk of invasive progression.
RESULTS: BREAST CANCER
Model        AUU    Older AUL   Younger AUL   SVM^Upl p-value
SVM^Upl      19.2   64.3        45.1          -
Two-Cost     13.5   74.3        60.8          0.432
Older-Only   5.9    67.7        61.9          0.037 *
Standard     11.0   75.4        64.3          0.049 *
Flipped      4.8    53.9        49.1          0.020 *
Baseline     11.0   66.0        55.0          0.004 *
RESULTS: BREAST CANCER
UPLIFT MODELING SIMULATION
• Generated synthetic customer population
• Subjected customer population randomly to simulated marketing activity
• Measured uplift as usual
• Measured ROC with Persuadables as the positive class, others as negative
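The talk does not say how the synthetic population was generated, so the following is only a sketch of the general setup under stated assumptions: the four customer groups are equally likely, treatment is assigned with probability 0.5, and responses follow the group definitions deterministically. No features or classifier are modeled, and all names are hypothetical.

```python
import random

GROUPS = ["persuadable", "sure_thing", "lost_cause", "sleeping_dog"]


def simulate_customers(n, seed=0):
    """Return (group, treated, responded) triples for a synthetic population.

    Equal group proportions and a 50% treatment rate are assumptions
    made purely for illustration.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        group = rng.choice(GROUPS)
        treated = rng.random() < 0.5  # random exposure to the marketing activity
        responded = (
            group == "sure_thing"
            or (group == "persuadable" and treated)
            or (group == "sleeping_dog" and not treated)
        )
        population.append((group, treated, responded))
    return population


population = simulate_customers(1000)
# Positive class for the Persuadable ROC: 1 for Persuadables, 0 for everyone else.
roc_labels = [int(group == "persuadable") for group, _, _ in population]
```

In a simulation the true group of every customer is known, which is what makes the Persuadable ROC measurable; on real data only the treatment and the response are observed.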
UPLIFT MODELING SIMULATION: UPLIFT CURVE
UPLIFT MODELING SIMULATION: PERSUADABLE ROC
CONCLUSIONS & FUTURE WORK
• Extended previous SVM work on AUC maximization to AUU
• Results suggest SVM^Upl achieves better uplift than many alternate SVM methods
• May want to make performance guarantees for control group
• May want to interpret learned model
• Better verification that maximizing uplift is an appropriate goal
THANKS
Questions?
SELECTED REFERENCES
Radcliffe, N. and Simpson, R.: Identifying who can be saved and who will be driven away by retention activity. Journal of Telecommunications Management (2008).
Tuffery, S.: Data Mining and Statistics for Decision Making. John Wiley & Sons, 2nd edn. (2011).
Joachims, T.: A support vector method for multivariate performance measures. In: Proceedings of the 22nd International Conference on Machine Learning (2005).
APPENDIX
SVM FOR ROC (Joachims, 2005)
Define $y'_{ij}$ as a predicted label on pairs of positive ($i$) and negative ($j$) examples, where $y'_{ij} = 1$ if $y'_i > y'_j$, and $-1$ otherwise.
Joachims' algorithm to maximize AUC corresponds to finding $y'_{ij}$ that maximizes:
$$\mathbf{w}^T \Psi(\mathbf{x}, y') + \Delta_{\mathrm{AUC}}(y')$$
Where:
$$\mathbf{w}^T \Psi(\mathbf{x}, y') = \mathbf{w}^T \frac{1}{2} \sum_{i=1}^{P} \sum_{j=1}^{N} y'_{ij}\,(\mathbf{x}_i - \mathbf{x}_j)$$
$$\Delta_{\mathrm{AUC}}(y') = \frac{1}{2} \sum_{i=1}^{P} \sum_{j=1}^{N} (1 - y'_{ij})$$
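To make the two pairwise quantities concrete, here is a sketch that evaluates Ψ(x, y') = ½ Σ y'_ij (x_i − x_j) and Δ_AUC(y') = ½ Σ (1 − y'_ij) for a fixed linear scorer. It is not Joachims' training algorithm; it only computes the terms for given pair labels, and the function names and data are illustrative assumptions.

```python
def pairwise_labels(scores_pos, scores_neg):
    """y'_ij = 1 if positive i is scored above negative j, else -1."""
    return [[1 if sp > sn else -1 for sn in scores_neg] for sp in scores_pos]


def delta_auc(y_pairs):
    """Delta_AUC(y') = 1/2 * sum_ij (1 - y'_ij): the number of swapped pairs."""
    return sum((1 - y) / 2 for row in y_pairs for y in row)


def psi(x_pos, x_neg, y_pairs):
    """Psi(x, y') = 1/2 * sum_ij y'_ij (x_i - x_j), returned as a feature vector."""
    dim = len(x_pos[0])
    out = [0.0] * dim
    for i, xi in enumerate(x_pos):
        for j, xj in enumerate(x_neg):
            for d in range(dim):
                out[d] += 0.5 * y_pairs[i][j] * (xi[d] - xj[d])
    return out


def dot(w, x):
    return sum(wd * xd for wd, xd in zip(w, x))


# Toy data, made up for illustration only.
w = [1.0, 0.0]
x_pos = [[2.0, 0.0], [0.5, 1.0]]
x_neg = [[1.0, 0.0], [0.0, 0.0]]
y_pairs = pairwise_labels([dot(w, x) for x in x_pos], [dot(w, x) for x in x_neg])
swapped = delta_auc(y_pairs)
print(swapped, 1 - swapped / (len(x_pos) * len(x_neg)), psi(x_pos, x_neg, y_pairs))
```

Dividing the swapped-pair count by P × N and subtracting from 1 recovers the AUC of the scorer, which is why a loss of this form targets AUC.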
SVM FOR UPLIFT
Let the positive skew of data be:
$$\pi = \frac{P}{P + N}$$
Then:
$$\mathrm{AUL} = P\left(\frac{\pi}{2} + (1 - \pi)\,\mathrm{AUC}\right)$$
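SVM FOR UPLIFT
$$\mathrm{AUU} = \mathrm{AUL}_A - \mathrm{AUL}_B = P_A\left(\frac{\pi_A}{2} + (1 - \pi_A)\,\mathrm{AUC}_A\right) - P_B\left(\frac{\pi_B}{2} + (1 - \pi_B)\,\mathrm{AUC}_B\right)$$
$$\max(\mathrm{AUU}) \equiv \max\left(P_A(1 - \pi_A)\,\mathrm{AUC}_A - P_B(1 - \pi_B)\,\mathrm{AUC}_B\right) \propto \max\left(\mathrm{AUC}_A - \frac{P_B(1 - \pi_B)}{P_A(1 - \pi_A)}\,\mathrm{AUC}_B\right)$$
Defining:
$$\lambda = \frac{P_B(1 - \pi_B)}{P_A(1 - \pi_A)}$$
Then:
$$\max(\mathrm{AUU}) \equiv \max(\mathrm{AUC}_A - \lambda\,\mathrm{AUC}_B)$$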
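SVM FOR UPLIFT
Now simply redefine the AUC optimization:
$$\mathbf{w}^T \Psi(\mathbf{x}, y') = \mathbf{w}^T \frac{1}{2} \sum_{i=1}^{P_A} \sum_{j=1}^{N_A} y'_{ij}\,(\mathbf{x}_i - \mathbf{x}_j) + \lambda\,\mathbf{w}^T \frac{1}{2} \sum_{k=1}^{N_B} \sum_{l=1}^{P_B} y'_{kl}\,(\mathbf{x}_k - \mathbf{x}_l)$$
$$\Delta_{\mathrm{AUU}}(y') = \frac{1}{2} \sum_{i=1}^{P_A} \sum_{j=1}^{N_A} (1 - y'_{ij}) + \lambda\,\frac{1}{2} \sum_{k=1}^{N_B} \sum_{l=1}^{P_B} (1 - y'_{kl})$$
Here $i, j$ index the positive and negative examples of subgroup A, and $k, l$ index the negative and positive examples of subgroup B, so the added term rewards ranking B's negatives above its positives, consistent with the $-\lambda\,\mathrm{AUC}_B$ term of the objective.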