A Feedback Shift Correction in Predicting Conversion Rates under Delayed Feedback
Shota Yasui, Gota Morishita, Komei Fujita, Masashi Shibata
The Web Conference 2020
Introduction and Problem Setting
Conversion Prediction
Predict the conversion rate (CVR) for each request.
[Figure: a user uses apps; the ad auction sends a bid request to the DSP]
Predicting CVR is important for deciding the bid price.
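Ideal loss function
The following loss should be minimized.
[Equation: the loss, with annotated terms for the features, the conversion, and the model]
The ideal parameters are as follows.
[Equation: the parameters that minimize the loss above]
This is not possible, because we do not observe c due to the delayed feedback.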
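The formula on this slide is not reproduced in the text. As a sketch only, assuming a log loss over n samples and a model f_theta(x) that outputs a conversion probability (this notation is an assumption, not taken from the slide), an ideal objective of this kind could be written as:

```latex
\mathcal{L}_{\text{ideal}}(\theta)
  = -\frac{1}{n}\sum_{i=1}^{n}
    \Big[ c_i \log f_\theta(x_i) + (1 - c_i)\log\big(1 - f_\theta(x_i)\big) \Big],
\qquad
\theta^{*} = \arg\min_{\theta}\ \mathcal{L}_{\text{ideal}}(\theta)
```

Here x_i are the features and c_i is the true conversion label, which is exactly the quantity that delayed feedback hides at training time.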
Delayed Feedback
Delayed Feedback
[Figure: timeline for a certain user, showing the timestamp of the click, the timestamp of the CV, and the delay between them]
● A user sometimes takes time to purchase an item after clicking the ad.
The problem of Delayed Feedback
[Figure: timeline for a certain user; the click timestamp falls before training begins and is included in the training data, while the CV timestamp falls after training begins and is unobserved]
● We cannot observe the CV for this user.
● This sample is treated as a negative label (mislabeled).
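The mislabeling described above can be illustrated with a minimal sketch. The `Click` record, the day-based timestamps, and the function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Click:
    click_ts: float           # click timestamp, in days (hypothetical unit)
    cv_ts: Optional[float]    # conversion timestamp, None if the user never converts


def observed_label(click: Click, train_start: float) -> int:
    """Label seen when training begins: 1 only if the CV has already happened."""
    return int(click.cv_ts is not None and click.cv_ts <= train_start)


def true_label(click: Click) -> int:
    """True label: 1 if the click eventually converts."""
    return int(click.cv_ts is not None)


clicks = [
    Click(click_ts=1.0, cv_ts=2.0),    # converted before training begins
    Click(click_ts=9.0, cv_ts=12.0),   # converts after training begins
    Click(click_ts=5.0, cv_ts=None),   # never converts
]
train_start = 10.0

y = [observed_label(c, train_start) for c in clicks]
c_true = [true_label(c) for c in clicks]
```

The second click is a true conversion, but its observed label is 0 because the conversion arrives after training begins.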
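The relation between Y and C
Y: observable label, C: true label, S: whether the sample is correctly labeled
● C=1 and S=1 (correctly labeled, with the prob of being correctly labeled): Y=1
● C=1 and S=0 (mislabeled, with the prob of mislabel): Y=0
● C=0: Y=0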
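Reading the diagram as saying that a sample is observed as positive only when it truly converts and is correctly labeled, the relation can be written as below. Conditioning on the features x is an assumption carried over from the per-request prediction setting.

```latex
P(Y=1 \mid X=x) = P(C=1 \mid X=x)\, P(S=1 \mid C=1, X=x),
\qquad
P(Y=0 \mid X=x) = 1 - P(Y=1 \mid X=x)
```

Since the second factor is at most 1, the observed positive rate can only understate the true conversion rate.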
Bias in standard supervised approach
The actual loss (ERM) is computed on the observed labels Y, so it does not match the ideal loss on the true labels C.
Inconsistent!
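A toy illustration of this inconsistency, assuming a constant-prediction model and made-up counts (100 clicks, 20 true conversions, of which only 10 are observed by training time). None of these numbers come from the experiments.

```python
import math


def log_loss(p, labels):
    """Average log loss of a constant prediction p against binary labels."""
    return -sum(l * math.log(p) + (1 - l) * math.log(1 - p) for l in labels) / len(labels)


c = [1] * 20 + [0] * 80   # true labels: CVR of 0.2
y = [1] * 10 + [0] * 90   # observed labels: half of the conversions are still delayed

# Search a grid of constant predictions for the loss minimizer.
grid = [i / 100 for i in range(1, 100)]
best_ideal = min(grid, key=lambda p: log_loss(p, c))   # minimizer of the ideal loss
best_erm = min(grid, key=lambda p: log_loss(p, y))     # minimizer of the actual loss
```

In this toy setting the ERM minimizer settles at the observed positive rate (0.1) rather than the true CVR (0.2), and more data would not close that gap.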
Our Solution: Importance Weight Approach
Importance Weight (FSIW) approach
We propose a consistent loss based on the importance weight (propensity score).
[Equation: the ideal loss rewritten as an unbiased loss (consistent?) that uses the importance weight]
Importance Weight (FSIW) approach
Our empirical loss:
[Equation: the empirical loss, with each sample's loss multiplied by the importance weight]
The basic idea is to weight each sample by the conditional density ratio.
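A minimal sketch of an importance-weighted log loss. It assumes the conditional density ratio takes the form P(C=y | x) / P(Y=y | x) and, to stay small, uses a constant-prediction model with made-up rates (true CVR 0.2, observed positive rate 0.1). In practice the weights would have to be estimated rather than known.

```python
import math


def weighted_log_loss(p, labels, weights):
    """Importance-weighted average log loss of a constant prediction p."""
    total = 0.0
    for y, w in zip(labels, weights):
        total += w * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


true_cvr, observed_rate = 0.2, 0.1          # hypothetical values for illustration
y = [1] * 10 + [0] * 90                     # observed (partly mislabeled) labels

w_pos = true_cvr / observed_rate            # assumed ratio P(C=1) / P(Y=1)
w_neg = (1 - true_cvr) / (1 - observed_rate)  # assumed ratio P(C=0) / P(Y=0)
weights = [w_pos if label == 1 else w_neg for label in y]

grid = [i / 100 for i in range(1, 100)]
best_weighted = min(grid, key=lambda p: weighted_log_loss(p, y, weights))
```

With these weights the minimizer moves back to the true CVR of 0.2, even though only the observed labels are used.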
How to estimate FSIW
[Equation: the probabilities that make up the importance weight]
We estimate these probabilities from data old enough to observe S and C.
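A sketch of how the weight could reduce to probabilities that are estimable from sufficiently old data. It assumes the conditional density ratio is P(C=y | X=x) / P(Y=y | X=x), that Y=1 occurs only when C=1 and S=1, and that C=0 always gives Y=0. These forms are derived under those assumptions and are not copied from the slide.

```latex
\begin{aligned}
w(x,\, y=1) &= \frac{P(C=1 \mid X=x)}{P(Y=1 \mid X=x)}
             = \frac{1}{P(S=1 \mid C=1, X=x)} \\
w(x,\, y=0) &= \frac{P(C=0 \mid X=x)}{P(Y=0 \mid X=x)}
             = P(C=0 \mid Y=0, X=x)
\end{aligned}
```

The first line uses P(Y=1 | x) = P(C=1 | x) P(S=1 | C=1, x). The second uses the fact that C=0 implies Y=0, so P(C=0 | x) = P(C=0, Y=0 | x).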
Counterfactual Deadline
[Figure, step 1: the training data spans week 1 to week 3; the part after the counterfactual deadline is discarded, and the models for the importance weight are trained on the rest]
[Figure, step 2: the importance weight is applied to the full training data (week 1 to week 3) to train the CVR model]
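One way to read the counterfactual deadline is sketched below: pretend that training happened at an earlier deadline, record which labels would have been observed then, and compare them with the labels known now. The tuple layout, day numbers, and function name are hypothetical, and the frequency-based estimates ignore features to keep the example short.

```python
def counterfactual_labels(clicks, deadline, now):
    """For clicks before the deadline, return (y_at_deadline, c_as_of_now) pairs."""
    pairs = []
    for click_ts, cv_ts in clicks:
        if click_ts > deadline:
            continue  # discard data after the counterfactual deadline
        c = int(cv_ts is not None and cv_ts <= now)        # label known today
        y = int(cv_ts is not None and cv_ts <= deadline)   # label seen at the deadline
        pairs.append((y, c))
    return pairs


# (click day, conversion day or None); made-up data for illustration
clicks = [(1, 2), (3, 9), (5, 6), (8, 20), (12, 13), (10, None), (16, 18)]
pairs = counterfactual_labels(clicks, deadline=14, now=21)

n_c1 = sum(1 for y, c in pairs if c == 1)
n_y1_c1 = sum(1 for y, c in pairs if y == 1 and c == 1)
n_y0 = sum(1 for y, c in pairs if y == 0)
n_c0_y0 = sum(1 for y, c in pairs if y == 0 and c == 0)

w_pos = n_c1 / n_y1_c1    # estimate of 1 / P(S=1 | C=1)
w_neg = n_c0_y0 / n_y0    # estimate of P(C=0 | Y=0)
```

In a real pipeline these frequencies would be replaced by models conditioned on the features, and the resulting weights would then be applied when training the CVR model on the full three weeks.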
Features of our proposed method
● It is just an importance weight
○ can be used with any CVR model
○ can fit the delay nonparametrically
○ does not increase the time complexity of CVR models
Experiment
Conversion Logs Dataset
● Open data provided by Criteo
● 30 days of click and CV logs
● Used in Chapelle (2014)
● The observation period is 30 days
Experiment procedure
[Figure: timeline of rolling evaluation; for each of day = 22, 23, 24, ..., 28, train on 3 weeks of data and test on that day]
● Iterate for 7 days, then average these results.
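The rolling procedure can be sketched as a list of train/test splits. This assumes days are numbered from 1 and that the three training weeks are the 21 days immediately before each test day. Both are assumptions about the figure, and the function name is hypothetical.

```python
def rolling_splits(first_test_day=22, last_test_day=28, train_days=21):
    """Return (train_day_list, test_day) pairs for a rolling evaluation."""
    splits = []
    for test_day in range(first_test_day, last_test_day + 1):
        # Train on the window that ends the day before the test day.
        train = list(range(test_day - train_days, test_day))
        splits.append((train, test_day))
    return splits


splits = rolling_splits()
```

Each split would be used to fit a model and score it on the test day, and the seven scores would then be averaged.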
Result 1
[Table: results for Pure Logistic Regression, Chapelle (2014), and the Proposed Method]
● Normalized log loss (NLL) is the most important metric
○ we use the predicted probability for bidding
○ log loss (LL) is sensitive to the base CVR
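A sketch of one common way to normalize log loss: divide it by the log loss of a baseline that always predicts the average conversion rate of the evaluation labels. This definition is an assumption and may differ from the one used in the experiments.

```python
import math


def log_loss(labels, preds):
    """Average log loss of predicted probabilities against binary labels."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(labels, preds)
    ) / len(labels)


def normalized_log_loss(labels, preds):
    """Log loss divided by that of a constant base-rate predictor.

    Requires labels to contain both classes, so the base rate is strictly
    between 0 and 1.
    """
    base_rate = sum(labels) / len(labels)
    baseline = log_loss(labels, [base_rate] * len(labels))
    return log_loss(labels, preds) / baseline


labels = [1, 0, 0, 0]
nll_baseline = normalized_log_loss(labels, [0.25, 0.25, 0.25, 0.25])
nll_better = normalized_log_loss(labels, [0.7, 0.1, 0.1, 0.1])
```

Under this definition a value of 1 means no gain over the base-rate predictor, and lower is better. Dividing by the baseline is what makes the metric less sensitive to the base CVR than raw log loss.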
Dynalyst Data
● DSP at CyberAgent, Inc.
● 2 experiments
○ the same procedure as the first experiment
■ focus on three campaigns
■ baseline model is FFM (Juan 2017)
○ Online A/B test
Three Campaigns
● The observation period differs by campaign
○ S: 1 day
○ M: 3 days
○ L: 7 days
Result 2
Only Campaign L shows an improvement.
Follow-up Online Experiment @ Campaign L
● Cost consumption and CV improved.
● CPA was unchanged or slightly decreased.
Conclusion
● We proposed a consistent loss to predict CVR under delayed feedback.
● Our method performs better in two offline experiments and one online experiment.
Thank you for listening!
appendix
[Figure: cumulative distribution of delay]
[Figure: effect of counterfactual deadline]