EvalUMAP Workshop
Generation & Evaluation of Personalised Push-Notifications
Kieran Fraser, Bilal Yousuf, Owen Conlan
ADAPT Centre, Trinity College Dublin
The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
Content
www.adaptcentre.ie
❖ Proposed Challenge
❖ Gym-Push
❖ Evaluation Metrics
❖ Challenge Entry
❖ Results & Discussion
❖ Limitations
❖ Final Thoughts
Proposed Challenge
There is currently no established or standardized means for repeatable and comparative evaluation of algorithms and systems in the UMAP space.
Goal: a Shared Task that “focuses on user model generation using logged mobile phone data, with an assumed purpose of supporting mobile phone notification suggestion.” [1]
1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019
Social Influencer Problems
Proposed Challenge
“the challenge is to create an approach to generate personalized notifications on individuals’ mobile phones, whereby such personalization would consist of deciding what events (SMS received, etc.) to show to the individual and when to show them.” [1]
Challenge 1
• Given 3 months of historical notification data (for training)
• Develop a user model which generates a personalized notification given a context
• Using Gym-Push, the user model is evaluated using test data and evaluation metrics
Challenge 2
• Given a small sample of notification data (no training)
• Develop an adaptive user model which generates a personalized notification given a context
• Using Gym-Push, the user model is evaluated, in simulated “real-time”, using test data and evaluation metrics
1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019
Gym-Push
OpenAI Gym
• Open source toolkit for “developing and comparing reinforcement learning algorithms” [1]
Gym-Push
• Custom OpenAI Gym environment simulating push-notification overload on mobile device users
1. https://gym.openai.com/
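
A minimal sketch of the reset/step interaction loop that Gym-style environments expose. `ToyPushEnv`, its `busy` context field, and its reward rule are hypothetical stand-ins for illustration only; they are not the Gym-Push API or its simulated user.

```python
class ToyPushEnv:
    """Hypothetical Gym-style environment; NOT the actual Gym-Push API."""

    def __init__(self, contexts):
        self.contexts = contexts
        self.t = 0

    def reset(self):
        # Start a new episode and return the first context (observation).
        self.t = 0
        return self.contexts[self.t]

    def step(self, action):
        # Assumed reward rule: the simulated user engages (reward 1) only
        # when a notification is shown (action 1) while the user is not busy.
        context = self.contexts[self.t]
        reward = 1 if (action == 1 and not context["busy"]) else 0
        self.t += 1
        done = self.t >= len(self.contexts)
        obs = None if done else self.contexts[self.t]
        return obs, reward, done, {}


def run_episode(env, policy):
    """Run one episode and return the total reward collected by the policy."""
    obs = env.reset()
    total, done = 0, False
    while not done:
        action = policy(obs)
        obs, reward, done, _ = env.step(action)
        total += reward
    return total


# Made-up contexts: the user is free, then busy, then free again.
contexts = [{"busy": False}, {"busy": True}, {"busy": False}]


def show_when_free(ctx):
    """Toy user model: show the notification only when the user is free."""
    return 0 if ctx["busy"] else 1


total_reward = run_episode(ToyPushEnv(contexts), show_when_free)
```

A user model for either challenge would take the place of `show_when_free`, mapping each context to a decision about the notification.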
Gym-Push
• Ease of installation: pip, docker, hosted
• Multiple communities: RL, UMAP, HCI
• End-user interface
• Established online leaderboard
Gym-Push
Challenge 1
Gym-Push
Challenge 2
Gym-Push
Gym-Push
Train on Real, Test on Synthetic [1]
RMSE F1 scores differ in the range 0.02 – 0.07, indicating that the synthetic data imitates real-world data.
1. Esteban, C., Hyland, S.L., Rätsch, G.: Real-valued (medical) time series generation
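
A minimal sketch of a Train on Real, Test on Synthetic comparison: a model is fitted on real data, then scored on held-out real data and on synthetic data, and the gap between the two F1 scores is inspected. The one-feature threshold classifier and all data values are made up for illustration; they are not the models or data used for Gym-Push.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(xs, ys):
    """Toy classifier: midpoint between the two class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, xs):
    return [1 if x > threshold else 0 for x in xs]


# Made-up data: one numeric feature, binary label.
real_train_x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
real_train_y = [0, 0, 0, 1, 1, 1]
real_test_x = [0.15, 0.4, 0.6, 0.85]
real_test_y = [0, 0, 1, 1]
synth_test_x = [0.2, 0.55, 0.45, 0.9]
synth_test_y = [0, 0, 1, 1]

# Train on real data only.
threshold = fit_threshold(real_train_x, real_train_y)

# Score the same model on real and on synthetic test data.
f1_real = f1_score(real_test_y, predict(threshold, real_test_x))
f1_synth = f1_score(synth_test_y, predict(threshold, synth_test_x))
gap = abs(f1_real - f1_synth)
```

A small `gap` would suggest that the synthetic data behaves like the real data from the model's point of view. The toy data here is deliberately dissimilar, so the gap is large.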
Gym-Push
Evaluation Metrics
• Performance
• Diversity
• Response Time
• Learning Rate
Evaluation Metrics
Simulated User
• AdaBoost classifier chosen
• Trained on 3 months of historical user data
• Average accuracy = 83.8%, average F1 = 72.8%
Metrics: Accuracy, Precision, Recall, F1
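
For reference, the four metrics named above can be computed from binary predictions as sketched below. The function name `classification_metrics` and the example labels are illustrative assumptions, not part of Gym-Push; label 1 is assumed to mean the user engaged with the notification.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: actual user responses vs. simulated-user predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```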
Challenge Entry
Challenge 1
• MLP used for both Generator and Discriminator
• Notifications represented as one-hot encoded (OHE) vectors of length 28
• Trained using RMSProp with mini-batches of 128 over 2000 epochs
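
A minimal sketch of the data preparation implied above: one-hot encoding a notification's categorical features into a single vector and splitting the encoded rows into shuffled mini-batches of 128. The feature names and vocabularies are hypothetical, so the vector here has length 7 rather than the 28 used in the entry; the network and RMSProp training step are not shown.

```python
import random


def one_hot(value, vocabulary):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec


def encode(notification, vocabularies):
    """Concatenate the one-hot encodings of each categorical feature."""
    vec = []
    for feature, vocab in vocabularies.items():
        vec.extend(one_hot(notification[feature], vocab))
    return vec


def mini_batches(rows, batch_size, rng):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    order = list(range(len(rows)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [rows[i] for i in order[start:start + batch_size]]


# Hypothetical features and vocabularies (not the challenge's actual schema).
vocabularies = {
    "app": ["chat", "email", "news"],
    "time_of_day": ["morning", "afternoon", "evening", "night"],
}
notification = {"app": "email", "time_of_day": "night"}
encoded = encode(notification, vocabularies)  # vector of length 3 + 4 = 7

# 300 identical rows, just to show how the batches are sized.
rows = [encoded] * 300
batches = list(mini_batches(rows, batch_size=128, rng=random.Random(0)))
batch_sizes = [len(b) for b in batches]
```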
Results & Discussion
Limitations
Proposed Challenge
• Simulated data
• Simulated user
• Evaluation metrics domain specific
Challenge Entry
• Diversity
• Online-learning
Final Thoughts
• Simulated game environments
• More challenge domains
• Cross domain metrics
• Domain specific metrics
• Integrate with existing evaluation services, e.g. TIRA
• Create a group of evaluators per domain
Thank you. Questions?
Email: kieran.fraser@adaptcentre.ie