Dynamic Embeddings for User Profiling in Twitter
Shangsong Liang (1), Xiangliang Zhang (1), Zhaochun Ren (2), Evangelos Kanoulas (3)
(1) KAUST, Saudi Arabia; (2) JD.com, China; (3) University of Amsterdam, The Netherlands
Overview
• The Task
• Background and Related Work
• Our Method
  • Dynamic User and Word Embedding Model (DUWE)
  • Streaming Keyword Diversification Model (SKDM)
• Experiments
• Conclusion
The Task
Input: a stream of tweets generated by Twitter users over time.
Output: a set of keywords that profiles a given user at different points in time (for example, Sport, Food).
[Figure: Twitter users, their tweets over time, and the keyword profile (Sport, Food) for a given user at time t.]
The Task
For a given user at time t, the keyword profile (for example, Sport, Food) should be Relevant, Diversified, and Dynamic.
Background of the User Profiling Problem
• Expert finding task at the TREC 2005 Enterprise track
  • Given documents that describe expert candidates, answer a query with a ranked list of names in a specific domain
  ☛ Uncovers associations between people and topics
• A generative language modeling approach in Balog et al. (2007)
  • Works on a static document collection
  • Assumes that users' profiles do not change
Need: dynamic user profiling
Dynamic User Profiling Approaches
• ExperTime (Rybak et al. 2014)
• A probabilistic model for learning how personal research interests evolve (Fang and Godavarthy 2014)
Limitations of Current User Profiling Methods
• They treat words as atomic units, leading to a vocabulary mismatch that harms performance
• They represent words and users in disjoint vocabulary spaces, making it difficult to measure the similarity between users and words when constructing the profile
Can words and users be embedded in the same semantic space?
Can their embeddings be modeled in a dynamic environment?
Related Work in Dynamic Topic Models and Dynamic Embedding
• Dynamic topic models: modeling dynamic user interests
  • Topic over time model (Wang et al., KDD 2006)
  • Topic tracking model (Iwata et al., IJCAI 2009)
  • Dynamic user clustering topic model (Liang et al., KDD 2016), etc.
  • None of them is designed for user profiling
• Dynamic word embedding
  • Separate the data into time bins and apply word2vec within each bin (Kim et al. 2014, Hamilton et al. 2016)
  • Or build on a Bayesian skip-gram model (Bamler and Mandt, 2017)
  • All of them embed words only, not users
  • None of them is designed for user profiling
Our Approach
• Dynamic User and Word Embedding Model (DUWE)
  • Infers both users' and words' embeddings over time in the same semantic space
  • Makes it possible to measure the similarity between user embeddings and word embeddings
• Streaming Keyword Diversification Model (SKDM)
  • Retrieves relevant keywords to profile users' current interests over time
  • Diversifies the returned keywords so that they cover all aspects of the users' interests
Dynamic User and Word Embedding
User diffusion: $p(U_t \mid U_{t-1}) \propto \mathcal{N}(U_{t-1}, \alpha_{t-1}^2 I) \cdot \mathcal{N}(0, \alpha_0^2 I)$
Word diffusion: $p(V_t \mid V_{t-1}) \propto \mathcal{N}(V_{t-1}, \beta_{t-1}^2 I) \cdot \mathcal{N}(0, \beta_0^2 I)$
[Figure: graphical model over two consecutive time steps, t-1 and t. Node labels: user representation u (plates of size |U_{t-1}| and |U_t|), word representation v (plates of size V), variables z and y, observed statistics n and m (co-occurrence of words; user-word pairs), and the variances α and β that link the two time steps.]
Diffusion of User Representation
Gaussian prior: $p(U_t \mid U_{t-1}) \propto \mathcal{N}(U_{t-1}, \alpha_{t-1}^2 I) \cdot \mathcal{N}(0, \alpha_0^2 I)$
Following Kalman filtering, we define the variance of the transition kernel for a user embedding from t-1 to t.
• The variance depends on a term measuring how the word distribution of user u changes from the previous time step t-1 to the current time step t.
[Equation: the definition of the variance is an image on the slide and did not survive extraction.]
Diffusion of Word Representation
Gaussian prior: $p(V_t \mid V_{t-1}) \propto \mathcal{N}(V_{t-1}, \beta_{t-1}^2 I) \cdot \mathcal{N}(0, \beta_0^2 I)$
Following Kalman filtering, we define the variance of the transition kernel for a word embedding from t-1 to t.
• The variance depends on a term measuring how the word distribution changes from t-1 to the current time step t.
[Equation: the definition of the variance is an image on the slide and did not survive extraction.]
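The diffusion prior on these slides is a product of two Gaussians, so it is itself Gaussian once renormalized. The sketch below works that out numerically for isotropic variances. The function names and the numeric values are illustrative assumptions, not taken from the paper, and the variances are passed in as fixed numbers rather than computed from the data as DUWE does.

```python
import numpy as np

def diffusion_prior(prev, var_step, var_zero):
    """Mean and variance of N(prev, var_step*I) * N(0, var_zero*I), renormalized.

    var_step plays the role of alpha_{t-1}^2 (or beta_{t-1}^2) and
    var_zero the role of alpha_0^2 (or beta_0^2); both are assumed given.
    """
    precision = 1.0 / var_step + 1.0 / var_zero   # precisions of the two factors add
    var = 1.0 / precision
    mean = var * (prev / var_step)                # the zero-mean factor contributes nothing
    return mean, var

def sample_next(prev, var_step, var_zero, rng):
    """Draw one embedding for time t given the embedding at t-1."""
    mean, var = diffusion_prior(prev, var_step, var_zero)
    return mean + np.sqrt(var) * rng.standard_normal(prev.shape)

# Hypothetical 3-dimensional embedding at time t-1.
u_prev = np.array([1.0, -2.0, 0.5])
mean, var = diffusion_prior(u_prev, var_step=0.1, var_zero=10.0)
u_next = sample_next(u_prev, 0.1, 10.0, np.random.default_rng(0))
```

With a small step variance and a broad zero-centered factor, the mean stays close to the previous embedding and is only slightly shrunk toward the origin, which matches the intuition of a slowly drifting embedding.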
DUWE Model Inference
• Apply skip-gram filtering (Bamler and Mandt 2017) together with a variational inference algorithm to obtain the embeddings
• The posterior distribution over the user and word embeddings, conditional on the observed statistics, combines four factors:
  • the model transition for users
  • the model transition for words
  • a skip-gram model for words, defined over the positive and negative indicator matrices for all word-to-word pairs
  • a skip-gram model for users and words, defined over the positive and negative indicator matrices for all user-to-word pairs
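As a rough illustration of the skip-gram factor for user-word pairs, the sketch below evaluates a negative-sampling style log-likelihood from positive and negative count matrices. It assumes the standard form, log sigmoid(u·v) for positive pairs and log sigmoid(-u·v) for negative pairs. The function names, shapes, and toy data are assumptions for illustration, and the block is not the authors' inference code.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def skipgram_loglik(U, V, pos, neg):
    """Log-likelihood of positive/negative pair counts under a skip-gram model.

    U:   (num_users, d) user embeddings
    V:   (num_words, d) word embeddings
    pos: (num_users, num_words) counts of observed (positive) pairs
    neg: (num_users, num_words) counts of sampled (negative) pairs
    """
    scores = U @ V.T                       # inner product for every user-word pair
    return float(np.sum(pos * log_sigmoid(scores) + neg * log_sigmoid(-scores)))

# Toy data: 2 users, 3 words, 4-dimensional embeddings (all hypothetical).
rng = np.random.default_rng(0)
U = rng.normal(size=(2, 4))
V = rng.normal(size=(3, 4))
pos = np.array([[1, 0, 2], [0, 1, 0]])
neg = np.array([[0, 1, 0], [1, 0, 1]])
ll = skipgram_loglik(U, V, pos, neg)
```

The same function covers the word-to-word factor if the word embeddings are passed for both arguments along with the word co-occurrence matrices. In the full model, this term would be combined with the Gaussian transition priors.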
Streaming Keyword Diversification Model
• Generates the top-K relevant and diversified keywords for profiling users' interests at time t.
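The slide does not spell out how SKDM selects keywords, so the sketch below swaps in a well-known stand-in: greedy Maximal Marginal Relevance (MMR) over embeddings that share one space. It is not the authors' algorithm. It only illustrates how top-K selection can trade relevance to the user against redundancy among the keywords already chosen. The names `select_keywords` and `trade_off` and the toy vectors are hypothetical.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_keywords(user_vec, word_vecs, words, k, trade_off=0.5):
    """Greedy MMR: pick k words that are relevant to the user yet mutually dissimilar."""
    relevance = [cosine(user_vec, w) for w in word_vecs]
    selected = []
    candidates = list(range(len(words)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any keyword already selected.
            redundancy = max(
                (cosine(word_vecs[i], word_vecs[j]) for j in selected), default=0.0
            )
            return trade_off * relevance[i] - (1.0 - trade_off) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [words[i] for i in selected]

# Toy 2-dimensional embeddings: "football" and "soccer" are near duplicates.
words = ["football", "soccer", "pizza"]
word_vecs = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
user_vec = np.array([1.0, 0.8])

diversified = select_keywords(user_vec, word_vecs, words, k=2)
relevance_only = select_keywords(user_vec, word_vecs, words, k=2, trade_off=1.0)
```

With `trade_off=1.0` the selection reduces to ranking by relevance alone and returns the two near-duplicate sport keywords. The default setting replaces the second one with a keyword from a different aspect of the user's interests.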
Experimental Setup
• Dataset
  • 1,375 users randomly sampled from Twitter
  • 3.78 million tweets posted by these users from the time they registered up to May 31, 2015
  • Two types of ground truth: one for evaluating relevance-oriented performance (RGT) and one for evaluating diversity-oriented performance (DGT)
• Evaluation metrics
  • Relevance: Pre (Precision), NDCG, MRR, MAP
  • Semantic versions of these metrics, denoted Pre-S, NDCG-S, MRR-S, MAP-S
  • Diversity: Pre-IA (Intent-Aware Precision), α-NDCG, MRR-IA, MAP-IA
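For readers unfamiliar with the relevance metrics, here is a minimal sketch of Precision@K, reciprocal rank (the per-user quantity averaged in MRR), and NDCG@K for a single ranked keyword list. It assumes binary relevance judgments. The semantic (-S) and intent-aware (-IA) variants are not covered, and the paper's exact definitions may differ from these textbook forms. The example keywords are made up.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k keywords that are in the ground-truth set."""
    return sum(1 for w in ranked[:k] if w in relevant) / k

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant keyword, or 0 if none is retrieved."""
    for rank, w in enumerate(ranked, start=1):
        if w in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2(rank + 1) discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, w in enumerate(ranked[:k], start=1)
        if w in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical profile for one user at one time step.
ranked = ["gym", "pizza", "running", "tv"]
relevant = {"running", "gym", "yoga"}

p3 = precision_at_k(ranked, relevant, 3)
rr = reciprocal_rank(ranked, relevant)
ndcg3 = ndcg_at_k(ranked, relevant, 3)
```

In this example two of the top three keywords are relevant and the first relevant keyword sits at rank 1, so the precision is 2/3 and the reciprocal rank is 1.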
Experimental Setup
• Baselines
  • Non-dynamic embedding models
    • Skip-Gram Model, i.e., word2vec model (SGM)
    • Distributed Representations of Documents (DRD)
  • Dynamic traditional profiling model
    • Predictive Language Model (PLM)
  • Dynamic topic model
    • User Clustering Topic model (UCT)
  • Dynamic embedding models
    • Dynamic Independent Skip-Gram model (DISG)
    • Dynamic Pre-initialized Skip-Gram model (DPSG)
    • Dynamic Independent Distributed Representations of documents (DIDR)
    • Dynamic Pre-initialized Distributed Representations of documents (DPDR)
Overall Performance
• Average relevance performance, with time periods of one month
Overall Performance
• Diversity performance, with time periods of one month
An Example User's Dynamic Profiling Results over Time
Top-6 keywords of an example user's dynamic profile. The user's interests cover a number of aspects and change dramatically over time, from sport, fitness, kitchen, and exercise to education.
Relevance and Diversity Performance over Time
[Figure: two panels, "Relevance performance over time" and "Diversity performance over time".]
Performance w.r.t. Embedding Dimensionality
Conclusions
• Studied the problem of dynamic user profiling in Twitter
• Proposed a Dynamic User and Word Embedding model (DUWE)
• Proposed a Streaming Keyword Diversification Model (SKDM)
• Evaluated the proposed models on a real-world Twitter dataset
Thank you for your attention!
Our paper: http://www.kdd.org/kdd2018/accepted-papers/view/dynamic-embeddings-for-user-profiling-in-twitter
Lab of Machine Intelligence and kNowledge Engineering (MINE): http://mine.kaust.edu.sa/