Final Report
Interest-aware Information Diffusion in Dynamic Social Network
Zhenhao Cao, Ru Wang
Mobile Internet, 2018.6
Outline
• Introduction
• Related Work
• Challenge & Motivation
• Proposed Model
• Experiments
• References
EE447 2018.6 Final Report – Zhenhao Cao, Ru Wang 1/42
Introduction
• Social Network
Introduction – A Taxonomy
• An earlier survey: a taxonomy for information cascade prediction
• Collaborative Filtering methods (√)
• Leverage homophily: insightful
• Get rid of troublesome feature engineering
Introduction – Why CF?
• Key idea behind CF: Homophily
• Transplantable to information diffusion modeling
[Figure: analogy between commodity adoption (adopt / not adopt) and information adoption, e.g. retweeting a post (adopt / not adopt)]
Related Work – Extant CF-based Studies
• CRPM & IRPM [1] (CIKM 2015)
• GPOP [2] (WWW 2017)
• A Collaborative Filtering Model for Personalized Retweeting Prediction [3] (DASFAA 2015)
Challenge & Motivation
• Fuller use of social network information
• Better adaptation to Information Diffusion modeling
• Novel insights into user retweet behavior
Challenge & Motivation
• Fuller use of social network information
  - A flat “snapshot” of users’ historical behaviors
  - Information loss: Permutation? Sequence? Diffusion topologies?
[Figure: a diffusion topology compressed into a single 0/1 column of the Retweet Matrix R]
Challenge & Motivation
• Fuller use of social network information
• Better adaptation to Information Diffusion modeling
  - Leverage diffusion topologies
    * Essence of information diffusion
    * A main difference from recommendation system problems
Challenge & Motivation
• Fuller use of social network information
• Better adaptation to Information Diffusion modeling
• Novel insights into user retweet behavior
[Figure: Interest, Post Attraction, Others’ Influence and Resistance jointly decide “Retweet or not?”]
Our Work
Our Work – ReTrend
• A novel framework for information diffusion
[Figure: ReTrend architecture. Interest-extraction Component (S, C, X, Y, A), Prediction Component (R, Dif), Resistance-extraction Component (Z, T)]
ReTrend – Observable Data
• Four matrices carrying observable data
  - Subscription Matrix (S)
  - Contagion Matrix (C)
  - Resistance Matrix (T)
  - Retweet Matrix (R)
ReTrend – Learning Latent Feature
• Four factor matrices carrying latent feature vectors
  - User Interest Matrix (X)
  - User Influence Matrix (Y)
  - User Resistance Matrix (Z)
  - Item Attraction Matrix (A)
• We assume that this inherent attribute, ‘resistance’, varies across the latent space but stays fixed for a given user
ReTrend – Logic Explanation
• Take Contagion Matrix for example
• Contagion Matrix: |user| × |post|
• Entry $C_{ui}$: count of retweet behaviors triggered by user $u$ w.r.t. post $i$
• $C_{ui}$ reflects two facts:
  - to what degree a user can trigger his friends to retweet the post
  - how attractive the post is
ReTrend – Logic Explanation
• Take Contagion Matrix for example
[Figure: Contagion Matrix C (|user| × |post|) ≈ User Influence Matrix Y × Item Attraction Matrix A, sharing a latent dimension k]
• Assume Gaussian observation noise
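The factorization on this slide can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes Y is stored as k × |user| and A as k × |post| so that C ≈ YᵀA, and it treats the noise variance `sigma2` and all toy sizes as hypothetical. It shows that, under Gaussian observation noise, the negative log-likelihood is the sum of squared reconstruction errors scaled by 1/(2σ²), plus a constant.

```python
import numpy as np

def gaussian_nll(C, Y, A, sigma2=1.0):
    """Negative log-likelihood of C when C[u, i] ~ N(Y[:, u] . A[:, i], sigma2)."""
    pred = Y.T @ A                    # (|user| x |post|) reconstruction
    sse = np.sum((C - pred) ** 2)     # sum of squared errors
    const = 0.5 * C.size * np.log(2.0 * np.pi * sigma2)
    return 0.5 * sse / sigma2 + const

rng = np.random.default_rng(0)
n_users, n_posts, k = 6, 5, 3         # toy sizes (hypothetical)
Y_true = rng.normal(size=(k, n_users))
A_true = rng.normal(size=(k, n_posts))
# Synthetic real-valued stand-in for the contagion counts.
C = Y_true.T @ A_true + 0.1 * rng.normal(size=(n_users, n_posts))

nll_true = gaussian_nll(C, Y_true, A_true, sigma2=0.01)
Y_rand = rng.normal(size=(k, n_users))
nll_rand = gaussian_nll(C, Y_rand, A_true, sigma2=0.01)
```

The generating factors score a lower negative log-likelihood than an unrelated random `Y_rand`, which is why maximizing this likelihood amounts to minimizing squared reconstruction error.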
ReTrend – Logic Explanation
• For Retweet Matrix
• Retweet behavior can be determined by user interest, resistance, parent influence and post attraction
[Equation: retweet prediction formula and its term definitions, not preserved in the extracted text]
ReTrend – Entire Model
• Conditional distribution over all observed data [equation not preserved in the extracted text]
• Place zero-mean spherical Gaussian priors on latent feature vectors
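For reference, a zero-mean spherical Gaussian prior on the user interest vectors can be written as below. This is the standard probabilistic matrix factorization form, given as a sketch: the variance symbol $\sigma_X^2$ and the per-user column notation $X_u$ are assumed here, not taken from the slides, and analogous priors would apply to Y, Z and A.

```latex
p(X \mid \sigma_X^2) = \prod_{u} \mathcal{N}\!\left(X_u \mid \mathbf{0},\ \sigma_X^2 \mathbf{I}\right),
\qquad
-\ln p(X \mid \sigma_X^2) = \frac{1}{2\sigma_X^2} \sum_{u} \lVert X_u \rVert_2^2 + \text{const}.
```

The second identity shows why such priors turn into L2 regularization terms once the log-posterior is taken.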
ReTrend – Entire Model
• By modifying the log-likelihood, we obtain the loss function [equation not preserved in the extracted text]
• SGD for optimization
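The full ReTrend loss is not recoverable from the slide, so the following sketch only illustrates the general pattern for a single term: SGD on a squared-error loss with L2 regularization for one factorization, C ≈ Y Aᵀ over observed entries. The function names, the learning rate `lr`, the regularization weight `lam` and the toy entries are all hypothetical.

```python
import numpy as np

def sgd_factorize(entries, n_users, n_posts, k=4, lr=0.05, lam=0.01,
                  epochs=200, seed=0):
    """Fit Y (users x k) and A (posts x k) so that Y[u] . A[i] ~ c for (u, i, c)."""
    rng = np.random.default_rng(seed)
    Y = 0.1 * rng.normal(size=(n_users, k))
    A = 0.1 * rng.normal(size=(n_posts, k))
    entries = list(entries)
    for _ in range(epochs):
        for idx in rng.permutation(len(entries)):   # visit entries in random order
            u, i, c = entries[idx]
            err = c - Y[u] @ A[i]                   # residual on this entry
            y_old = Y[u].copy()                     # keep old value for A's update
            Y[u] += lr * (err * A[i] - lam * Y[u])  # gradient step with L2 shrinkage
            A[i] += lr * (err * y_old - lam * A[i])
    return Y, A

def squared_loss(entries, Y, A):
    """Sum of squared errors over the observed entries."""
    return sum((c - Y[u] @ A[i]) ** 2 for u, i, c in entries)

# Toy observed (user, post, count) tuples; values are made up for illustration.
toy = [(0, 0, 2.0), (1, 0, 1.0), (2, 1, 0.0), (0, 1, 1.0), (3, 2, 2.0), (1, 2, 0.0)]
Y0, A0 = sgd_factorize(toy, n_users=4, n_posts=3, epochs=0)   # initial factors only
Y1, A1 = sgd_factorize(toy, n_users=4, n_posts=3)             # trained factors
loss_before, loss_after = squared_loss(toy, Y0, A0), squared_loss(toy, Y1, A1)
```

Calling the function with `epochs=0` returns the untrained initialization for the same seed, which gives a baseline against which the trained loss can be compared.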
ReTrend – Retweet-tree Encoding
• How does ReTrend leverage information better?
• Tree-structured essence of information cascade: the retweet-tree
[Figure: a retweet tree]
ReTrend – Retweet-tree Encoding
• Subscription Matrix
[Figure: retweet tree beside the binary (0/1) Subscription Matrix S]
ReTrend – Retweet-tree Encoding
• Retweet Matrix
[Figure: retweet tree encoded as one column of the Retweet Matrix R, shown as (0, 1, 1, 0, 1, 0, 0, 1)]
ReTrend – Retweet-tree Encoding
• Contagion Matrix
[Figure: retweet tree encoded as one column of the Contagion Matrix C, shown as (0, 0, 2, 0, 1, 0, 0, 0)]
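The encoding on these slides can be sketched as follows, assuming a retweet tree is given as (parent, child) pairs meaning "child retweeted the post from parent". The tree below is hypothetical, chosen so that the first eight users reproduce the Retweet and Contagion columns shown on the slides; treating the author as an extra user (index 8) who is counted in C but not in R is also an assumption.

```python
import numpy as np

def encode_retweet_tree(edges, post, R, C):
    """Write one post's retweet tree into column `post` of R and C (in place)."""
    for parent, child in edges:
        R[child, post] = 1       # child adopted (retweeted) the post
        C[parent, post] += 1     # parent triggered one more retweet
    return R, C

n_users, n_posts = 9, 1
R = np.zeros((n_users, n_posts), dtype=int)
C = np.zeros((n_users, n_posts), dtype=int)

# Hypothetical tree: author 8 -> 2, then 2 -> 1, 2 -> 4, and 4 -> 7.
edges = [(8, 2), (2, 1), (2, 4), (4, 7)]
encode_retweet_tree(edges, post=0, R=R, C=C)

retweet_column = R[:8, 0].tolist()     # [0, 1, 1, 0, 1, 0, 0, 1]
contagion_column = C[:8, 0].tolist()   # [0, 0, 2, 0, 1, 0, 0, 0]
```

Each edge updates both matrices at once, so the same tree yields who adopted the post (R) and who caused adoptions (C), which a flat retweet snapshot alone would not distinguish.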
ReTrend – Training
• Dynamic inference on the most likely retweet-tree structure
• Moreover, it is post-transcending
Modification
Matrix Factorization – Drawbacks
• Simple and fixed inner product: low non-linearity [4]
• Complex inference in low-dimensional latent space
• Too many constraints
• Pure linear operation: empirically low performance
MLP Module – Optimization for MF
• Replace multiplication with a simple MLP module
• Increase non-linearity
[Figure: left, Matrix A × Matrix B → Result; right, Matrix A and Matrix B fed into an MLP Module → Result]
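One way to picture this replacement is sketched below. The slides do not preserve the module's details, so everything here is an assumption: the latent vectors of each pair are concatenated, passed through one ReLU hidden layer, and mapped to a scalar score. The names `init_mlp` and `mlp_interaction`, the hidden width and the initialization scale are hypothetical.

```python
import numpy as np

def init_mlp(k, hidden, rng):
    """Random parameters for a one-hidden-layer MLP over concatenated 2k inputs."""
    return {
        "W1": rng.normal(scale=0.1, size=(2 * k, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.1, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def mlp_interaction(P, Q, params):
    """Score every (row of P, row of Q) pair with an MLP instead of P @ Q.T."""
    n, m = P.shape[0], Q.shape[0]
    left = np.repeat(P, m, axis=0)               # each row of P repeated m times
    right = np.tile(Q, (n, 1))                   # Q stacked n times
    x = np.concatenate([left, right], axis=1)    # (n*m, 2k) pair features
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])   # ReLU hidden layer
    out = h @ params["W2"] + params["b2"]        # (n*m, 1) scores
    return out.reshape(n, m)                     # entry [a, b] scores (P[a], Q[b])

rng = np.random.default_rng(0)
P = rng.normal(size=(4, 3))      # e.g. 4 user vectors with k = 3
Q = rng.normal(size=(5, 3))      # e.g. 5 post vectors with k = 3
params = init_mlp(k=3, hidden=8, rng=rng)
scores = mlp_interaction(P, Q, params)           # same shape as P @ Q.T
```

The output has the same shape as the matrix product it replaces, so it can stand in wherever the inner product was used, while the hidden layer lets the score depend non-linearly on the two latent vectors.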
MLP Module – Detail
Experiments – Dataset
• Real-world dataset from Twitter
• More than 90,000 users and 99,696,204 related tweets [1][2]
• 440,000+ subscriptions
• 2,370,000+ retweet behaviors
• 18,210,000+ non-retweet behaviors
• 18,210,000+ resistance tuples
• 2,170,000+ contagion tuples
[1] https://www.aminer.cn/data-sna#Twitter-Dynamic-Net
[2] https://www.aminer.cn/data-sna#Twitter-Dynamic-Action