
Modelling Cascades Over Time in Microblogs
Wei Xie, Feida Zhu, Siyuan Liu and Ke Wang*
Living Analytics Research Centre, Singapore Management University
* Ke Wang is from Simon Fraser University; this work was done while the author was visiting the Living Analytics Research Centre at Singapore Management University.

Motivation
• Business applications such as viral marketing have driven a lot of research effort into predicting whether a cascade will go viral.
• In real life, there are very few truly viral cascades.
• Previous research* shows that temporal features are the key predictor of cascade size.
* Justin Cheng, Lada A. Adamic, P. Alex Dow, Jon M. Kleinberg, Jure Leskovec: Can cascades be predicted? WWW 2014: 925-936.

Time-aware Cascade Model
[Figure: an example cascade over users u_0 … u_5 with re-share times t_0 … t_4, shown at time t and at time t + dt.]
The probability that user $u_i$ re-shares within $[t, t + dt)$ is
$$P_i(t) = h_i\big(t, \{t_j\}_{u_j \in \mathrm{Followee}^{(i)}(t)}; \Theta\big) \cdot dt$$
and the probability of the cascade $C$ evolves as
$$\begin{cases}
P(C(t + dt)) = P(C(t + dt) \mid C(t)) \cdot P(C(t)) \\
P(C(t_0)) = 1 \\
P(C(t + dt) \mid C(t)) = \prod_{u_i \in X^{(1)}(t)} P_i(t) \cdot \prod_{u_{i'} \in X^{(2)}(t)} \big(1 - P_{i'}(t)\big)
\end{cases}$$
where $X^{(1)}(t)$ is the set of users who have re-shared and $X^{(2)}(t)$ is the set of users who haven't yet.
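The recursion on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `steps` data layout, and the reading of X1 as "users who re-share during the step" and X2 as "users who do not" are assumptions made for the example. It works in log space so that products over many users do not underflow.

```python
import math

def step_log_prob(resharing, waiting, hazard, dt):
    """Log of P(C(t + dt) | C(t)) for one time step of width dt.

    resharing: hazard inputs for users assumed to be in X1(t)
    waiting:   hazard inputs for users assumed to be in X2(t)
    hazard:    hypothetical callable returning h_i for one user's input
    """
    log_p = 0.0
    for x in resharing:
        log_p += math.log(hazard(x) * dt)      # P_i(t) = h_i * dt
    for x in waiting:
        log_p += math.log1p(-hazard(x) * dt)   # 1 - P_i'(t)
    return log_p

def cascade_log_prob(steps, hazard, dt):
    """Chain rule: log P(C(t0)) = 0, then add each step's conditional term.

    steps: iterable of (resharing, waiting) pairs, one pair per time step
    """
    return sum(step_log_prob(r, w, hazard, dt) for r, w in steps)
```

With a constant hazard of 0.1 and dt = 0.5, each user re-shares in a step with probability 0.05, so a step with one re-sharer and two waiting users has probability 0.05 · 0.95².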

Observations in Twitter
Observation 1. Only the first re-sharer matters.
$$P_i(t) = h_i(t, t_{j^\star}; \Theta) \cdot dt \quad \text{where} \quad j^\star = \operatorname{argmin}_j \{\, t_j \mid u_j \in \mathrm{Followee}^{(i)}(t) \,\}$$
Observation 2. The chance of a tweet being retweeted decreases as time goes by.
$$P_i(t) = h_i(\tau; \Theta) \cdot dt \quad \text{where} \quad \tau = t - t_{j^\star}$$
and $h_i(\tau)$ is a decreasing function.
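Taken together, the two observations reduce the hazard input to a single number: the time elapsed since the earliest re-share among a user's followees. A small sketch of that computation follows; the function name and the convention of returning `None` for an unexposed user are assumptions for illustration.

```python
def elapsed_since_first_exposure(t, followee_times):
    """Return tau = t - t_{j*}, where t_{j*} is the earliest followee re-share.

    followee_times: re-share times of the user's followees (any order).
    Returns None if no followee has re-shared by time t (user not yet exposed).
    """
    seen = [tj for tj in followee_times if tj <= t]
    if not seen:
        return None
    return t - min(seen)
```

For example, with followee re-shares at times 4, 7 and 12, a user observed at t = 10 has been exposed since time 4, so tau is 6.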

Hazard Function Design
The hazard function is the event rate at time $t$, given survival up to $t$:
$$h(t) = \lim_{dt \to 0} \frac{P(t < T \le t + dt \mid T > t)}{dt} = \frac{f(t)}{1 - F(t)}$$
The cumulative hazard is
$$H(t) = \int_0^t h(u)\,du = \int_0^t \frac{F'(u)}{1 - F(u)}\,du = -\log(1 - F(u))\Big|_0^t = -\log(1 - F(t))$$
so that
$$F(t) = 1 - e^{-H(t)}$$
• Exponential distribution: $H(t) = \dfrac{t}{\lambda} \Rightarrow F(t) = 1 - e^{-t/\lambda}$
• Weibull distribution: $H(t) = \left(\dfrac{t}{\alpha}\right)^{\beta} \Rightarrow F(t) = 1 - e^{-(t/\alpha)^{\beta}}$
For both distributions,
$$H(\infty) = \infty \Rightarrow F(\infty) = 1 - e^{-\infty} \Rightarrow F(\infty) = 1$$
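The relation F(t) = 1 − e^{−H(t)} can be checked numerically. The sketch below uses hypothetical helper names and arbitrary parameter values; it shows that with an unbounded cumulative hazard, as in the exponential and Weibull cases, F(t) approaches 1 for large t.

```python
import math

def cdf_from_cumulative_hazard(H, t):
    """F(t) = 1 - exp(-H(t)) for any cumulative hazard function H."""
    return 1.0 - math.exp(-H(t))

def exponential_H(lam):
    """Cumulative hazard of the exponential distribution: H(t) = t / lambda."""
    return lambda t: t / lam

def weibull_H(alpha, beta):
    """Cumulative hazard of the Weibull distribution: H(t) = (t / alpha) ** beta."""
    return lambda t: (t / alpha) ** beta

# Both cumulative hazards grow without bound, so F(t) tends to 1.
for name, H in [("exponential", exponential_H(10.0)), ("weibull", weibull_H(2.0, 0.5))]:
    print(name, [cdf_from_cumulative_hazard(H, t) for t in (1.0, 100.0, 1e8)])
```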

Hazard Function Design
$$H(\tau) = \lambda \cdot \left(1 - \left(\frac{\tau}{\alpha} + 1\right)^{-\beta}\right)$$
$$h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda \cdot \frac{\beta}{\alpha} \cdot \left(\frac{\tau}{\alpha} + 1\right)^{-(\beta + 1)}$$
Here $\alpha$ is the scale parameter and $\beta$ is the shape parameter. Because $H(\infty) = \lambda$ is finite, $F(\infty) = 1 - e^{-\lambda} \approx \lambda$ for small $\lambda$:
$$F(\infty) \approx H(\infty) = \lambda$$
so $\lambda$ describes the eventual re-tweeting probability.
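A short numerical check of this design, assuming the formulas on this slide; the function names and the parameter values used in any call are illustrative only. It evaluates the cumulative hazard and the hazard rate, and computes the exact limit F(∞) = 1 − e^{−λ}, which is close to λ when λ is small.

```python
import math

def long_tail_H(tau, lam, alpha, beta):
    """Cumulative hazard H(tau) = lam * (1 - (tau / alpha + 1) ** -beta)."""
    return lam * (1.0 - (tau / alpha + 1.0) ** (-beta))

def long_tail_h(tau, lam, alpha, beta):
    """Hazard rate h(tau) = dH/dtau = lam * (beta / alpha) * (tau / alpha + 1) ** -(beta + 1)."""
    return lam * (beta / alpha) * (tau / alpha + 1.0) ** (-(beta + 1.0))

def eventual_retweet_probability(lam):
    """F(infinity) = 1 - exp(-H(infinity)) = 1 - exp(-lam)."""
    return 1.0 - math.exp(-lam)
```

Unlike the exponential and Weibull cases, the cumulative hazard here is bounded by λ, so a user may never retweet at all.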

Hazard Rate Illustration
[Figure, top panel: retweeting rate versus time (minutes) over 0 to 60 minutes, with t_C marked on the time axis. Bottom panel: empirical rate and estimated rate of the hazard versus time (minutes) over 0 to 60 minutes, hazard rate axis from 0 to 16e-4.]

Dataset
From a Singapore-based Twitter data set, we extract all retweets to construct retweeting cascades. In total we obtain 2,425,348 cascades.

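Probabilistic Model Fitting
• TM: Threshold Model
$$h_i(t) = \lambda \cdot s\big(|\mathrm{Followee}^{(i)}(t)|\big) \quad \text{where} \quad s(x) = \frac{1}{1 + e^{-a(x - b)}}$$
• TCM-CH: Constant Hazard
$$H(\tau) = \lambda \cdot \tau, \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda$$
• TCM-EH: Exponential Hazard
$$H(\tau) = \lambda \cdot \left(1 - e^{-k \cdot \tau}\right), \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda \cdot k \cdot e^{-k \cdot \tau}$$
• TCM-LH: Long tail Hazard (our proposed)
$$H(\tau) = \lambda \cdot \left(1 - \left(\frac{\tau}{\alpha} + 1\right)^{-\beta}\right), \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda \cdot \frac{\beta}{\alpha} \cdot \left(\frac{\tau}{\alpha} + 1\right)^{-(\beta + 1)}$$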
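The slides do not show how the parameters are fitted. As a hedged sketch, the baseline hazards listed here can be written down directly, and a standard survival-analysis log-likelihood (an assumption, not necessarily the authors' objective) scores a candidate parameter setting: an observed retweet after delay τ contributes log h(τ) − H(τ), and a user who has not retweeted by the end of observation contributes −H(τ). All function names are hypothetical.

```python
import math

def sigmoid(x, a, b):
    """s(x) = 1 / (1 + exp(-a * (x - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

def tm_hazard(n_followees_reshared, lam, a, b):
    """TM: hazard depends on how many followees have re-shared."""
    return lam * sigmoid(n_followees_reshared, a, b)

def constant_hazard(lam):
    """TCM-CH: returns the pair (H, h) with H(tau) = lam * tau and h(tau) = lam."""
    return (lambda tau: lam * tau, lambda tau: lam)

def exponential_hazard(lam, k):
    """TCM-EH: returns (H, h) with H(tau) = lam * (1 - exp(-k * tau))."""
    return (lambda tau: lam * (1.0 - math.exp(-k * tau)),
            lambda tau: lam * k * math.exp(-k * tau))

def log_likelihood(H, h, retweet_delays, censored_delays):
    """Survival log-likelihood (assumed objective).

    retweet_delays:  tau for users who retweeted after delay tau
    censored_delays: tau observed so far for users who have not retweeted
    """
    ll = sum(math.log(h(tau)) - H(tau) for tau in retweet_delays)
    ll -= sum(H(tau) for tau in censored_delays)
    return ll
```

Maximising this quantity over the parameters with any general-purpose optimiser would give one possible fitting procedure.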

Probabilistic Model Fitting
For each cascade, observe its development in the first $T_0$ for training, and the next $\Delta T$ for testing.
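The split described here can be sketched as follows. The function name, the argument names, and the choice of half-open intervals are assumptions; the slides do not state how boundary times are handled or what values of T_0 and ΔT were used.

```python
def split_cascade(retweet_times, t0, train_window, test_window):
    """Split one cascade's retweet times into training and testing parts.

    t0:           time of the original tweet
    train_window: T_0, length of the observed (training) period
    test_window:  Delta T, length of the following (testing) period
    Intervals are half-open: [t0, t0 + T_0) and [t0 + T_0, t0 + T_0 + Delta T).
    """
    train_end = t0 + train_window
    test_end = train_end + test_window
    train = [t for t in retweet_times if t0 <= t < train_end]
    test = [t for t in retweet_times if train_end <= t < test_end]
    return train, test
```

Retweets that arrive after the testing period are simply ignored by this sketch.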

Probabilistic Model Fitting

Predicting Cascade Growth

Virality Prediction

Thanks

Our work is based on previous cascade models
• J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 12(3):211–223, 2001.
• M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the temporal dynamics of diffusion networks. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011, pages 561–568, 2011.
• S. A. Myers, C. Zhu, and J. Leskovec. Information diffusion and external influence in networks. In The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, Beijing, China, August 12-16, 2012, pages 33–41, 2012.
• M. Gomez-Rodriguez, J. Leskovec, and B. Schölkopf. Modeling information propagation with survival theory. In ICML (3), pages 666–674, 2013.
• N. Du, L. Song, M. Gomez-Rodriguez, and H. Zha. Scalable influence estimation in continuous-time diffusion networks. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 3147–3155, 2013.
