online learning of website embeddings

  1. ONLINE LEARNING OF WEBSITE EMBEDDINGS for Accurate Prediction of User Behavior Even when Data are Scarce. Amelia White, Director of Data Science Research. Nov 13, 2019

  2. Expanding Digital Survey Data. Diagram labels: small survey panel; custom model; Dstillery device universe (~200MM devices)

  3. Data Used for Modeling. Example site visits: vanillaicecream.com 4/11/19; buzzfeed.com 4/11/19; nytimes.com 4/11/19; chocicecream.com 4/11/19; buzzfeed.com 4/11/19; nytimes.com 4/11/19. Diagram labels: small survey panel; custom model; Dstillery device universe (~200MM devices); 1.5B web site visits daily

  4. 10 million URLs; millions of users

  5. Need for a Reduced Dimensional Feature Space: 10 million URLs; thousands of users

  6. REDUCED DIMENSIONAL FEATURE SPACE

  7. Taking Ideas from Natural Language Processing
     ● Similar data
     ● Sentences of words
     ● Sequences of web sites visited
     ● High dimensional categorical features

  8. Need for a Reduced Dimensional Feature Space: 10 million URLs; thousands of users

  9. Need for a Reduced Dimensional Feature Space: from 10 million URLs by thousands of users to thousands of users by a 128-dimensional embedding space

  10. Website Embeddings V1: word2vec. Diagram: for the browsing sequence www.hairstyle.com, www.short-hairstyles.co, www.pophaircuts.com, the target URL is looked up as Dictionary(www.short-hairstyles.co) = i, with i = 0,...,K-1 and K = 50,000; row B_i of the embedding matrix B (K×128) feeds a fully connected output layer that gives P(context URL | target URL), e.g. for www.pophaircuts.com
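
As a rough illustration of how browsing sequences could become word2vec training data, the sketch below builds a dictionary of the K most visited URLs and emits (target, context) index pairs. The function names and the window size are assumptions made for this example; the slides give only K = 50,000 and the conditional P(context URL | target URL).

```python
from collections import Counter

def build_dictionary(sequences, k):
    """Map the k most frequently visited URLs to row indices 0..k-1."""
    counts = Counter(url for seq in sequences for url in seq)
    return {url: i for i, (url, _) in enumerate(counts.most_common(k))}

def skipgram_pairs(sequence, dictionary, window=1):
    """Yield (target_index, context_index) pairs from one device's visits."""
    # URLs outside the dictionary are dropped before pairing.
    ids = [dictionary[u] for u in sequence if u in dictionary]
    for pos, target in enumerate(ids):
        lo = max(0, pos - window)
        hi = min(len(ids), pos + window + 1)
        for ctx_pos in range(lo, hi):
            if ctx_pos != pos:
                yield target, ids[ctx_pos]

# The three sites shown on the slide, treated as one device's sequence.
visits = [["www.hairstyle.com", "www.short-hairstyles.co", "www.pophaircuts.com"]]
dictionary = build_dictionary(visits, k=50_000)
pairs = list(skipgram_pairs(visits[0], dictionary, window=1))
```

Each pair is one training example for the K×128 matrix B: the target index selects a row, and the output layer is trained to assign high probability to the context index.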

  11. Training Word2vec
     ● Trained word2vec with the browsing history of all devices seen in a 2 week time period:
        ○ Browsing history of 430,648,822 devices
        ○ Sequence of 15,077,897,800 site visits

  12. Visualizing Embeddings. Websites in cluster #512: www.boardingarea.com, www.thepointsguy.com, www.taxifarefinder.com, www.theflightdeal.com, www.uberestimate.com, www.sleepinginairports.net, www.frugaltravelguy.com, www.airchina.us, www.cathaypacific.com, www.travelskills.com, www.travelsort.com, www.skyteam.com, www.seatmaestro.com, www.flyertalk.com, www.expertflyer.com, www.singaporeair.com, www.estimatefares.com

  13. BEYOND WORD2VEC:
     ● Embedding millions of URLs, with a manageable number of parameters
     ● Online learning of embeddings

  14. EMBEDDING MORE URLS WITH FEWER PARAMETERS

  15. Hash Embeddings

  16. Website Embeddings V2: Hash embeddings. Diagram: Dictionary('www.kohls.com') = m, with m = 0,...,N and N = 10M; hash functions H_1(m) = i and H_2(m) = j, with i,j = 0,...,K, select rows B_i and B_j of the embedding matrix B (K×128); P = importance parameters (N×2), with row P_m; a convolution layer forms the hash embedding, which feeds the output layer
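
A minimal sketch of the lookup this diagram describes, assuming the two selected rows are combined as a weighted sum using the importance parameters P_m. The slide labels the combining step a convolution layer and does not spell out the arithmetic, so the weighted sum, the toy sizes, and the two hash functions below are illustrative assumptions rather than the presenter's implementation.

```python
import random

EMBED_DIM = 128   # embedding width from the slide (K x 128)
N = 1_000         # toy stand-in for the slide's N = 10M dictionary entries
K = 50            # toy stand-in for the number of rows in B (hypothetical)

rng = random.Random(0)
# B: shared embedding matrix, K x 128. P: importance parameters, N x 2.
B = [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)] for _ in range(K)]
P = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(N)]

def h1(m):
    """H1(m) = i; a simple multiplicative hash chosen for illustration."""
    return (m * 2_654_435_761) % (2**32) % K

def h2(m):
    """H2(m) = j; different constants so its collisions differ from h1's."""
    return (m * 40_503 + 12_345) % (2**32) % K

def hash_embedding(m):
    """Return P[m][0] * B[H1(m)] + P[m][1] * B[H2(m)] (assumed combination)."""
    i, j = h1(m), h2(m)
    p_i, p_j = P[m]
    return [p_i * b_i + p_j * b_j for b_i, b_j in zip(B[i], B[j])]

dictionary = {"www.kohls.com": 7}   # hypothetical index m for the example URL
vector = hash_embedding(dictionary["www.kohls.com"])
```

Two URLs that collide under H1 can still be told apart, because they are unlikely to collide under H2 as well and each has its own pair of importance weights.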

  17. Hash Embedding Requires Fewer Parameters [chart; axis label: Number of Parameters]
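
The parameter counts behind this comparison can be worked out directly from the shapes on the earlier slides. The sketch below is a back-of-the-envelope calculation; the row count of 50,000 for the shared matrix is borrowed from the word2vec slide as an assumption, since the slides do not state the K used for the hash embedding.

```python
def full_embedding_params(n_urls, dim):
    """A conventional embedding table: one dim-wide row per URL."""
    return n_urls * dim

def hash_embedding_params(n_urls, n_rows, dim, n_hashes=2):
    """n_hashes importance weights per URL plus a shared n_rows x dim matrix."""
    return n_urls * n_hashes + n_rows * dim

# N = 10M URLs and 128 dimensions come from the slides.
full = full_embedding_params(10_000_000, 128)
# n_rows = 50,000 is an assumption made for this example.
hashed = hash_embedding_params(10_000_000, 50_000, 128)
```

Under these assumptions the full table needs 1,280,000,000 parameters and the hash embedding 26,400,000. The exact saving depends on the row count and the number of hash functions actually used.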

  18. Measuring Embedding Quality for Parameter Selection
     ● Selected a ‘ground truth’ clustering, made from a known high quality embedding
     ● Used the silhouette score to measure how well test embeddings converged to the ground truth clustering as the network trained
     Source: https://platform.ai/blog/page/11/the-silhouette-loss-function-metric-learning-with-a-cluster-validity-index/, Jim Bremner, April 09, 2019
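
For reference, the silhouette value of a point is s(i) = (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points of the nearest other cluster. The sketch below computes the mean s(i) for embeddings under a fixed set of cluster labels. Euclidean distance and the toy points are assumptions for illustration; the slides do not state which distance was used.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_score(points, labels):
    """Mean of s(i) = (b - a) / max(a, b); needs at least two clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:                      # singleton cluster: s(i) is 0
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy 2-D "embeddings": two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])   # labels match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])    # labels cut across the groups
```

Holding the ground-truth labels fixed and recomputing the score on the test embeddings as training proceeds gives a curve that rises as the embeddings come to respect the reference clustering.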

  19. Good Performance with 100x Fewer Parameters [chart; axis labels: Number of Parameters, s(i)]

  20. ONLINE LEARNING OF EMBEDDINGS

  21. Website Embeddings V3: Online Learning of Hash Embeddings. Diagram: H_0('www.kohls.com') = m, with m = 0,...,N; hash functions H_1(m) = i and H_2(m) = j, with i,j = 0,...,K, select rows B_i and B_j of the embedding matrix B (K×128); P = importance parameters (N×2), with row P_m; a convolution layer forms the hash embedding, which feeds the output layer
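
The change from V2 is that the dictionary lookup is replaced by a hash H_0 applied to the URL string itself, so a URL never seen before still maps to an index without rebuilding a vocabulary. The sketch below shows one way to get deterministic hashes for H_0, H_1 and H_2. The use of SHA-256 with a salt prefix, and the values of N and K, are assumptions made for this example.

```python
import hashlib

N = 10_000_000   # importance-parameter rows (N = 10M appears on the V2 slide)
K = 50_000       # rows in the embedding matrix B (assumed; not given here)

def stable_hash(text, salt, modulus):
    """Deterministic hash of text into range(modulus).

    Python's built-in hash() is randomized per process for strings,
    so a digest is used to keep indices stable across runs.
    """
    digest = hashlib.sha256((salt + text).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus

def h0(url):
    """H0(url) = m: index into the importance parameters, no dictionary."""
    return stable_hash(url, "h0:", N)

def h1(m):
    """H1(m) = i: first row of B for bucket m."""
    return stable_hash(str(m), "h1:", K)

def h2(m):
    """H2(m) = j: second row of B for bucket m."""
    return stable_hash(str(m), "h2:", K)

m = h0("www.kohls.com")
i, j = h1(m), h2(m)
```

Because every index is computed from the string, training can consume the visit stream as it arrives, which is what makes the online setting workable.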

  22. Online Learning Optimizes Faster than Batch Learning [chart; axis label: s(i), where higher values mean higher quality embeddings; series: W2V (batch) embeddings, Hash (online) embeddings]

  23. Training the Online Embeddings [diagram; label: B]

  24. Distance in Embedding Space
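
The slide does not say which distance is used. As one common choice, the sketch below uses cosine distance to find the nearest neighbours of a site; the metric, the function names, and the two-dimensional toy vectors are all assumptions for illustration.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity: 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest(query, embeddings, n=3):
    """Return the n URLs whose embeddings are closest to the query URL's."""
    others = (u for u in embeddings if u != query)
    return sorted(
        others,
        key=lambda u: cosine_distance(embeddings[query], embeddings[u]),
    )[:n]

# Hypothetical 2-D vectors; real embeddings would have 128 dimensions.
toy = {
    "www.flyertalk.com": [1.0, 0.1],
    "www.expertflyer.com": [0.9, 0.2],
    "www.kohls.com": [0.1, 1.0],
}
closest = nearest("www.flyertalk.com", toy, n=1)
```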

  25. MODELING USERS IN EMBEDDING SPACE

  26. Need for a Reduced Dimensional Feature Space: from 10 million URLs by thousands of users to thousands of users by a 128-dimensional embedding space

  27. From URL Embeddings to Models
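
The slides do not describe how URL embeddings are turned into per-user features. One simple possibility, assumed here purely for illustration, is to average the embeddings of the sites a device visited, which gives every device a fixed-length vector regardless of how many sites it has seen.

```python
def user_features(visited_urls, embeddings, dim):
    """Average the embeddings of visited URLs; unknown URLs are skipped."""
    known = [embeddings[u] for u in visited_urls if u in embeddings]
    if not known:
        return [0.0] * dim          # no known sites: fall back to zeros
    return [sum(vec[d] for vec in known) / len(known) for d in range(dim)]

# Hypothetical 2-D embeddings for the sites named on the data slide.
toy_embeddings = {
    "vanillaicecream.com": [1.0, 0.0],
    "buzzfeed.com": [0.0, 1.0],
    "nytimes.com": [0.5, 0.5],
}
features = user_features(
    ["vanillaicecream.com", "buzzfeed.com", "nytimes.com"],
    toy_embeddings,
    dim=2,
)
```

With 128-dimensional embeddings, a classifier trained on such vectors has 128 inputs instead of roughly 10 million sparse ones, which is the reduction the earlier slides motivate.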

  29. Embedding Features Outperform Sparse Web Features For Small Data Sets [chart: Comparing Embedding Features to Sparse Web Features; axis label: % Gain in AUC; range: ~1000 training examples to ~1M training examples]

  32. MODELING SURVEY DATA. Diagram labels: small survey panel; custom model; Dstillery device universe (~200MM devices)

  33. Case Study: Predicting Ad Influence for Ice Cream Brand
     ● The Problem:
        ○ A survey company models which people are likely to be influenced by an advertisement for an ice cream brand
        ○ 5.5K survey respondents
        ○ 500 high scoring respondents
     ● Our Goal:
        ○ Predicting the high scoring respondents
        ○ Produce audience of devices that are predicted to be influenceable by ad for ice cream brand

  34. Case Study: Predicting Ad Influence for Ice Cream Brand
     ● Test AUC on predicting high scoring respondents:
        ○ Raw web behavior (sparse web features): 64.1
        ○ Summarized web behavior (clusters of web sites): 63.5
        ○ Cookie Embeddings (website embeddings): 75.8
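
For readers unfamiliar with the metric, AUC is the probability that a randomly chosen positive example (here, a high scoring respondent) receives a higher model score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison on made-up labels and scores. The values on the slide appear to be on a 0 to 100 scale, so 75.8 would correspond to 0.758 in this convention.

```python
def auc(labels, scores):
    """P(random positive outscores random negative); ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up example: 1 marks a high scoring respondent.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples, which is fine for a panel of a few thousand respondents; a rank-based formula gives the same value for larger sets.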

  35. THANK YOU. Presented by Amelia White, awhite@dstillery.com. Contributors: Christopher Jenness, Melinda Han Williams. MLE team: Wickus Martin, Roger Cost, Justin Moynihan, Patrick McCarthy
