Idea adoption / disease spread / viral marketing
[Figure: a follower network with adoption times — David 1:00pm, Sophie 1:10pm, Bob 1:15pm, Christine 1:18pm, Jacob 1:25pm; an edge D → S means S follows D]
Scenario I: idea adoption
D is the source, so f_D(t) = 1; an edge D → S means S follows D.
Terminating process: each user adopts the product only once. A user's hazard is high while a followee has adopted but the user has not yet adopted. For Jacob (J), who follows Bob (B) and Christine (C):
λ*_J(t) = α_{JB} (1 − f_J(t)) f_B(t) + α_{JC} (1 − f_J(t)) f_C(t)
where f_i(t) is the probability that user i has adopted by time t, and N_J(t) counts J's adoption events.
[Figure: the follower network with adoption times from the previous slide]
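A minimal sketch of this hazard in code, treating f_i(t) as the indicator that user i's adoption has been observed by time t (the weights and adoption times below are hypothetical):

```python
alpha = {("J", "B"): 0.5, ("J", "C"): 0.3}      # hypothetical influence weights onto Jacob

def f(i, t, adoption_time):
    """Indicator that user i has adopted by time t."""
    return 1.0 if i in adoption_time and t >= adoption_time[i] else 0.0

def hazard_J(t, adoption_time):
    """lambda*_J(t): nonzero only while a followee has adopted and J has not (terminating)."""
    not_yet = 1.0 - f("J", t, adoption_time)
    return sum(a * not_yet * f(src, t, adoption_time)
               for (dst, src), a in alpha.items() if dst == "J")

adoption_time = {"B": 1.25, "C": 1.30}          # hypothetical adoption times (hours)
print(hazard_J(1.27, adoption_time))            # only B has adopted yet: 0.5
print(hazard_J(1.50, adoption_time))            # both B and C have adopted: 0.8
```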
Cascades from D in 30 mins
[Figure: one sampled cascade starting from David at 1:00pm — Sophie 1:08pm, Bob 1:17pm, Jacob 1:25pm, Christine 1:48pm]
Cascades from D in 30 mins
[Figure: a second sampled cascade starting from David at 2:00pm — the same network, but the users are infected in a different order and at different times]
Cascades from D in 30 mins
[Figure: a third sampled cascade starting from David at 7:00pm — again a different infection order and timing]
Cascade data
Cascade: a sequence of (node, time) pairs for a particular piece of news. Cascades can start from different sources. Each cascade c is recorded as a vector of infection times, one per user: (t_1^c, t_2^c, t_3^c, …, t_n^c).
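A small sketch of this data layout (user names and times are hypothetical):

```python
import math

users = ["David", "Sophie", "Bob", "Christine", "Jacob"]

# Each cascade stores one infection time per user; math.inf marks "never infected".
cascade_1 = {"David": 1.00, "Sophie": 1.10, "Bob": 1.15, "Christine": 1.18, "Jacob": 1.25}
cascade_2 = {"Sophie": 2.00, "Jacob": 2.20}      # a cascade from a different source

def as_vector(cascade):
    """Flatten a cascade into the (t_1, ..., t_n) vector used by the learning slides below."""
    return [cascade.get(u, math.inf) for u in users]

print(as_vector(cascade_2))                      # [inf, 2.0, inf, inf, 2.2]
```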
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Modeling: Coevolution
Information diffusion and network coevolution
[Figure: a tweet/retweet thread spreading over the follower network — 1pm, D: "Cool paper"; 1:10pm, Sophie @D: "Indeed"; 1:15pm, Bob @S @D: "Very useful"; 1:18pm, Christine @S @D: "Classic"; 1:35pm, Jacob @B @S @D: "Indeed brilliant"; 1:45pm, Jacob starts following David; 2pm, D: "Nice idea"; 2:03pm, Olivia @D: "Agree"; other users in the figure: Tina]
Farajtabar et al. NIPS 2015
Information diffusion and network coevolution
Tweet/retweet event sequence — (user, source, time) triples: (D, D, 1:00) "Cool paper"; (J, D, 1:35) "@B @S @D: Indeed brilliant"; (B, B, 4:00) "It snows"; (J, B, 4:10) "@B: Beautiful"; (J, J, 5:00) "Going out"; …
Link creation event sequence — (follower, followee, time) triples: (J, D, 1:45); (J, S, 5:25); …
Targeted retweet
(D, D): David's own initiative, λ*_{D,D}(t) = η.
Mutually-exciting process: Jacob's intensity for retweeting content that originated from David is high if his followees retweet David frequently:
λ*_{J,D}(t) = β_D A_{JB}(t) (exp(−(t − ·)) ⋆ dN_{B,D})(t) + β_D A_{JC}(t) (exp(−(t − ·)) ⋆ dN_{C,D})(t)
where A_{uv}(t) ∈ {0,1} records whether u follows v at time t, and N_{u,s}(t) counts u's (re)tweets of content originating from source s.
Information driven link creation
(J, D) at 1:45pm: Jacob follows David. Terminating process: A_{JD}(t) checks whether the link is already there. Self-exciting process: the link intensity is high if there is no link yet and J retweets D often:
γ*_{J,D}(t) = (1 − A_{JD}(t)) ( μ_J + α_D (exp(−(t − ·)) ⋆ dN_{J,D})(t) )
where μ_J is Jacob's random-exploration rate.
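A hedged sketch of these two coevolution intensities with the exponential kernel written out (all parameter values and event times below are hypothetical):

```python
import math

def exp_kernel_sum(event_times, t):
    """(exp(-(t - .)) * dN)(t): exponentially discounted count of past events."""
    return sum(math.exp(-(t - s)) for s in event_times if s < t)

def retweet_intensity_JD(t, A, retweets, beta_D=0.8):
    """lambda*_{J,D}(t): mutually exciting, driven by J's followees' retweets of D."""
    return beta_D * sum(A[("J", v)] * exp_kernel_sum(retweets[(v, "D")], t)
                        for v in ("B", "C"))

def link_intensity_JD(t, A, retweets, mu_J=0.05, alpha_D=0.4):
    """gamma*_{J,D}(t): zero once the link exists, self-exciting in J's retweets of D."""
    return (1 - A[("J", "D")]) * (mu_J + alpha_D * exp_kernel_sum(retweets[("J", "D")], t))

# Current network state: J follows B and C, but not yet D; times in hours.
A = {("J", "B"): 1, ("J", "C"): 1, ("J", "D"): 0}
retweets = {("B", "D"): [1.0, 1.2], ("C", "D"): [1.1], ("J", "D"): [1.35]}
print(retweet_intensity_JD(1.5, A, retweets))    # followees retweeted D recently: high
print(link_intensity_JD(1.5, A, retweets))       # no link yet, J retweeted D: elevated
```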
Joint model of retweet + link creation
The link creation process (terminating) alters the diffusion network A(t) ∈ {0,1}; the diffusion network supports the information diffusion process N(t) ∈ {0} ∪ Z₊ (mutually exciting); and information diffusion in turn drives link creation.
Simulation
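All of the intensity models above can be sampled with Ogata's thinning algorithm; a minimal univariate sketch for a self-exciting (Hawkes) intensity with exponential kernel, where the current intensity is a valid dominating rate because the intensity only decays between events (parameter values are illustrative):

```python
import math, random

def simulate_hawkes(mu, alpha, T, seed=0):
    """Ogata's thinning for a Hawkes process with kernel exp(-(t - s)) on [0, T]."""
    random.seed(seed)
    history = []
    intensity = lambda u: mu + alpha * sum(math.exp(-(u - s)) for s in history)
    t = 0.0
    while t < T:
        lam_bar = intensity(t)                    # dominates until the next accepted event
        t += random.expovariate(lam_bar)          # candidate from a rate-lam_bar process
        if t < T and random.random() <= intensity(t) / lam_bar:
            history.append(t)                     # accept with prob intensity / bound
    return history

print(simulate_hawkes(mu=0.2, alpha=0.5, T=10.0))
```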
Link creation parameter controls network type
α_D = 0: Erdős–Rényi random networks.  α_D large: scale-free networks.
[Figure: degree distributions of the generated networks in the two regimes]
Shrinking network diameters
The model generates networks with small, shrinking diameter: small connected components merge, and the diameter shrinks over time.
[Figure: diameter vs. time]
Cascade patterns: structure
The model generates shorter and fatter cascades as α increases (here β = 0.2).
[Figure: frequency of cascade structures for increasing α]
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Modeling: Collaborative Dynamics
Collaborative dynamics
Users × items (p₁, …, p₄): the base rates η_{u,p} and self-excitation weights α_{u,p} form two low-rank user-by-item matrices:
η = [η_{D,p₁} ⋯ η_{D,p₄}; ⋮ ⋱ ⋮; η_{J,p₁} ⋯ η_{J,p₄}],  α = [α_{D,p₁} ⋯ α_{D,p₄}; ⋮ ⋱ ⋮; α_{J,p₁} ⋯ α_{J,p₄}]
Self-exciting process — people tend to go to the same store again and again:
λ*_{D,p₁}(t) = η_{D,p₁} + α_{D,p₁} Σ_{t_j ∈ H_{D,p₁}(t)} exp(−|t − t_j|)
where H_{D,p₁}(t) is the history of (David, p₁) events before time t.
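A one-function sketch of this user–item intensity (parameters and history are hypothetical):

```python
import math

def user_item_intensity(t, eta, alpha, history):
    """lambda*_{u,i}(t) = eta + alpha * sum_j exp(-|t - t_j|) over the pair's past events."""
    return eta + alpha * sum(math.exp(-abs(t - tj)) for tj in history if tj < t)

# (David, p1): base rate plus a boost from recent purchases (times in days).
print(user_item_intensity(t=10.0, eta=0.1, alpha=0.6, history=[2.0, 8.5, 9.7]))
```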
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Learning: Sparse Networks
Hidden diffusion networks
Can we estimate the hidden diffusion network from the observed cascades?
[Figure: observed cascades and the unknown underlying network]
Parametrization of idea adoption model
Same terminating process as before — D is the source (f_D(t) = 1), users adopt only once, and
λ*_J(t) = α_{JB} (1 − f_J(t)) f_B(t) + α_{JC} (1 − f_J(t)) f_C(t)
Parametrization: collect the unknown incoming influence weights of each node into a vector, e.g. for Jacob
w = (α_{JD}, α_{JS}, α_{JB}, α_{JC})ᵀ
[Figure: the follower network with adoption times]
ℓ1-regularized log-likelihood
ℓ(w) − λ‖w‖₁ = Σ_{i=1}^n log⟨w, λ⃗*(t_i)⟩ − ⟨w, Ψ*(T)⟩ − λ‖w‖₁
Likelihood of Jacob's events at times t₁, t₂, t₃ over the observation window [0, T]:
λ*(t₁) · λ*(t₂) · λ*(t₃) · exp(−∫₀^T λ*(τ) dτ)
With the linear parametrization λ*(t) = ⟨w, λ⃗*(t)⟩ this becomes
⟨w, λ⃗*(t₁)⟩ · ⟨w, λ⃗*(t₂)⟩ · ⟨w, λ⃗*(t₃)⟩ · exp(−⟨w, Ψ*(T)⟩),  with Ψ*(T) = ∫₀^T λ⃗*(τ) dτ.
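A minimal sketch of this objective for one node, with hypothetical feature vectors λ⃗*(t_i) and their integral Ψ*(T):

```python
import math

def reg_log_likelihood(w, event_feats, Psi, lam):
    """sum_i log<w, x(t_i)> - <w, Psi> - lam * ||w||_1 for one node and one cascade."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    return (sum(math.log(dot(w, x)) for x in event_feats)
            - dot(w, Psi) - lam * sum(abs(wj) for wj in w))

event_feats = [[1.0, 0.0, 0.5], [0.3, 1.0, 0.0]]   # feature vectors at the two event times
Psi = [2.0, 1.5, 1.0]                              # integral of the feature vector over [0, T]
print(reg_log_likelihood([0.4, 0.2, 0.1], event_feats, Psi, lam=0.05))
```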
Soft-thresholding algorithm
One ℓ1-regularized likelihood estimation problem per node; solve each with projected soft-thresholding on the negative log-likelihood f = −ℓ:
  Set learning rate γ; initialize w⁰; k = 0
  While k ≤ K do
      w^{k+1} = ( w^k − γ ∇f(w^k) − γλ )₊
      k = k + 1
  End while
(The (·)₊ truncation is the soft-thresholding step under the nonnegativity constraint w ⪰ 0.)
[Figure: the soft-thresholding operator]
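A runnable version of this loop on the objective from the previous sketch, using the closed-form gradient ∇f(w) = Ψ − Σᵢ x(tᵢ)/⟨w, x(tᵢ)⟩ (it reuses `event_feats` and `Psi` from above):

```python
def ista(event_feats, Psi, lam, gamma=0.05, K=200):
    """Projected soft-thresholding (ISTA) for min_{w >= 0} -loglik(w) + lam * ||w||_1."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    d = len(Psi)
    w = [0.1] * d                                     # small positive start
    for _ in range(K):
        grad = [Psi[j] - sum(x[j] / dot(w, x) for x in event_feats) for j in range(d)]
        w = [max(w[j] - gamma * grad[j] - gamma * lam, 1e-12) for j in range(d)]
        #    ^ soft-threshold and truncate; the tiny floor keeps log<w, x> well-defined
    return w

print(ista(event_feats, Psi, lam=0.05))
```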
Statistical guarantees
Recovery conditions:
- The eigenvalues of the Hessian Q = ∇²ℓ(w) are bounded in [C_min, C_max]
- The gradient is upper bounded: ‖∇ℓ(w)‖_∞ ≤ C₁
- The hazard is lower bounded: min ⟨w, λ⃗*(t)⟩ ≥ C₂
- Incoherence condition: ‖Q_{S^c S} (Q_{SS})⁻¹‖_∞ ≤ 1 − ε
The constants depend on the network structure, parameter values, observation window, and source node distribution.
Given n > C₃ · d³ log p cascades, set the regularization parameter λ ≥ C₄ · √(log p / n); then the network structure can be recovered with probability at least 1 − 2 exp(−C″ ε² n).
Memetracker
Estimated diffusion network
[Figure: the recovered diffusion network over news sites — blogs vs. mainstream media]
Nan et al. NIPS 2012
Tracking diffusion networks
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Learning: Low Rank Collaborative Dynamics
Collaborative dynamics
Same self-exciting user–item model — people tend to go to the same store again and again:
λ*_{D,p₁}(t) = η_{D,p₁} + α_{D,p₁} Σ_{t_j ∈ H_{D,p₁}(t)} exp(−|t − t_j|)
Regularization: instead of fixing the rank, penalize the nuclear norms ‖η‖_* and ‖α‖_* of the base-rate and self-excitation matrices.
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Learning: Generic Algorithm
Concave log-likelihood of event sequence
Log-likelihood: ℓ(w) = Σ_{i=1}^n log⟨w, λ⃗*(t_i)⟩ − ⟨w, Ψ*(T)⟩ — concave in w!
Likelihood of events at t₁, t₂, t₃ over [0, T]:
λ*(t₁) · λ*(t₂) · λ*(t₃) · exp(−∫₀^T λ*(τ) dτ) = ⟨w, λ⃗*(t₁)⟩ · ⟨w, λ⃗*(t₂)⟩ · ⟨w, λ⃗*(t₃)⟩ · exp(−⟨w, Ψ*(T)⟩)
Challenge in optimization problem
Negative log-likelihood:
min_{w ⪰ 0} Σ_{i=1}^n [ ⟨w, Ψ*(T)⟩ − log⟨w, λ⃗*(t_i)⟩ ] + λ‖w‖₁
The −log term is non-Lipschitz (its gradient blows up near zero), so existing first-order methods need O(1/ε²) iterations.
[Figure: the log curve — equal steps in the argument produce unbounded changes near zero]
Saddle point reformulation
Negative log-likelihood:
min_{w ⪰ 0} Σ_{i=1}^n [ ⟨w, Ψ*(T)⟩ − log⟨w, λ⃗*(t_i)⟩ ] + λ‖w‖₁
Fenchel dual of the log: log x = min_{v>0} ( v x − log v − 1 ), so each −log⟨w, λ⃗*(t_i)⟩ = max_{v_i>0} ( −v_i ⟨w, λ⃗*(t_i)⟩ + log v_i + 1 ), giving the saddle problem
min_{w ⪰ 0} max_{v_i > 0} Σ_{i=1}^n ⟨w, Ψ*(T)⟩ − Σ_{i=1}^n v_i ⟨w, λ⃗*(t_i)⟩ + Σ_{i=1}^n log v_i + λ‖w‖₁
He et al. arXiv 2016
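A quick numeric check of the variational identity behind this reformulation:

```python
import math

def log_via_dual(x, grid=100000, v_max=50.0):
    """log x = min_{v > 0} (v * x - log v - 1), minimized at v = 1/x."""
    vs = (i * v_max / grid for i in range(1, grid + 1))
    return min(v * x - math.log(v) - 1.0 for v in vs)

print(log_via_dual(2.0), math.log(2.0))   # both ~0.6931
```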
Proximal gradient
Given the current iterate (w^t, {v_i^t}) of the bilinear saddle function φ(w, v):
  w̄_j = w_j^t − L⁻¹ ∇_{w_j} φ(w^t, v^t),   w_j^{t+1} = ( w̄_j − λ/L )₊
  v̄_i = v_i^t + L⁻¹ ∇_{v_i} φ(w^t, v^t),   v_i^{t+1} = ( v̄_i + (v̄_i² + 4/L)^{1/2} ) / 2
applied to
min_{w ⪰ 0} max_{v_i > 0} Σ_{i=1}^n ⟨w, Ψ*(T)⟩ − Σ_{i=1}^n v_i ⟨w, λ⃗*(t_i)⟩ + Σ_{i=1}^n log v_i + λ‖w‖₁ — a bilinear form φ(w, v).
Accelerated proximal gradient
Same proximal steps, but the gradients are evaluated at extrapolated points (ŵ^t, v̂^t) instead of the current iterate; the accelerated scheme needs only O(1/ε) iterations instead of O(1/ε²).
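A sketch of the two proximal maps in these updates, taking the gradient steps w̄, v̄ as given (the closed form for v is the positive root of v² − v̄v − 1/L = 0):

```python
import math

def prox_w(w_bar, lam, L):
    """argmin_{w >= 0} lam * ||w||_1 + (L/2) * ||w - w_bar||^2: soft-threshold + truncate."""
    return [max(wj - lam / L, 0.0) for wj in w_bar]

def prox_v(v_bar, L):
    """argmin_{v > 0} -log v + (L/2) * (v - v_bar)^2: closed-form positive root."""
    return [(vb + math.sqrt(vb * vb + 4.0 / L)) / 2.0 for vb in v_bar]

print(prox_w([0.30, -0.10, 0.02], lam=0.05, L=1.0))   # [0.25, 0.0, 0.0]
print(prox_v([0.5], L=1.0))                           # [~1.2808]
```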
Converge much faster
[Figure: objective vs. iteration — the accelerated gradient method converges much faster than the unaccelerated one]
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Inference: Time-Sensitive Recommendation
Collaborative dynamics
Next item prediction — which item will David buy next? Recommend the item with the highest intensity: max_i λ*_{D,i}(t)
Return time prediction — when will David buy the item? Predict the expected return time from the conditional density: ∫_t^∞ τ f*_{D,i}(τ) dτ
Nan et al. NIPS 2015
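A small sketch of both predictions; the return-time integral uses f*(τ) = λ(τ) exp(−∫_t^τ λ) and a simple Riemann sum (all inputs hypothetical):

```python
import math

def next_item(intensities):
    """Next-item prediction: recommend argmax_i lambda*_{u,i}(t)."""
    return max(intensities, key=intensities.get)

def expected_return_time(intensity, t0, horizon, dt=0.001):
    """E[next event time] = integral of tau * f*(tau), where
    f*(tau) = intensity(tau) * exp(-integral_{t0}^{tau} intensity), truncated at horizon."""
    total, cum, tau = 0.0, 0.0, t0
    while tau < horizon:
        lam = intensity(tau)
        total += tau * lam * math.exp(-cum) * dt
        cum += lam * dt
        tau += dt
    return total

print(next_item({"p1": 0.68, "p2": 0.21, "p3": 0.05}))    # 'p1'
print(expected_return_time(lambda tau: 1.0, 0.0, 20.0))   # ~1.0 for a unit-rate process
```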
Music recommendation for Last.fm
Online records of music listening; the time unit is one hour. 1,000 users, 3,000 albums; 20,000 observed (user, album) pairs with more than 1 million events.
[Figures: album prediction accuracy; returning-time prediction error]
Electronic healthcare records
MIMIC II dataset: a collection of de-identified clinical visit records; the time unit is one week. 650 patients and 204 disease codes.
[Figures: diagnosis code prediction accuracy; returning-time prediction error]
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
Inference: Influence Maximization
Inference in idea adoption
Influence estimation — can a piece of news spread, in 1 month, to a million users?
σ(s, t) := E[ Σ_{i∈V} f_i(t) ], the expected number of users infected by time t when s is the source.
Influence maximization — who is the most influential user? max_{s∈V} σ(s, t)
Source localization — where is the origin of the information? max_{s∈V, t∈[0,T]} likelihood of the partial cascade
Rodriguez et al. ICML 2012; Nan et al. NIPS 2013; Farajtabar et al. AISTATS 2015
[Figure: the follower network with a partially observed cascade (1:18pm, 1:30pm, 2:00pm)]
Cascades from D in 1 month
[Figure: three sampled cascades from David, infecting Σ_{i∈V} f_i(t) = 4, 2, and 3 users respectively]
σ(D, t) ≈ (4 + 2 + 3) / 3 = 3
Cascades from B in 1 month
[Figure: three sampled cascades from Bob, infecting Σ_{i∈V} f_i(t) = 4, 2, and 2 users respectively]
σ(B, t) ≈ (4 + 2 + 2) / 3 ≈ 2.67
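A sketch of this naive Monte Carlo estimator, assuming some cascade simulator such as the thinning sketch earlier (here stubbed with the slide's three samples from David):

```python
def influence(source, T, sample_cascade, n=3):
    """sigma(source, T) ~= average number of users infected by T over n sampled cascades."""
    return sum(len(sample_cascade(source, T)) for _ in range(n)) / n

# Stub reproducing the slide's samples from David: 4, 2 and 3 infected users.
samples = iter([{"D", "S", "B", "J"}, {"D", "S"}, {"D", "S", "C"}])
print(influence("D", T=30 * 24, sample_cascade=lambda s, T: next(samples)))   # 3.0
```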
Find most influential user
max_{s∈V} σ(s, t)
Naive estimation: for each of the n sampled graphs and each of the |V| nodes, run a single-source shortest path — O(n(|V|² + |E||V|)) overall. Quadratic in |V|: not scalable!
[Figure: three sampled graphs]
Randomized neighborhood estimation
Assign every node an independent random label r ∼ exp(−r); for each source, keep only the smallest label among the nodes it can reach within the time window. Linear in the number of nodes and edges.
[Figure: one label set — node labels David 2.75, Sophie 1.38, Bob 0.29, Jacob 1.26, Christine 0.33; least reachable labels r*_D = 0.29, r*_S = 0.29, r*_B = 0.29, r*_J = 1.26, r*_C = 0.33]
Randomized neighborhood estimation
The minimum of σ i.i.d. labels r ∼ e^{−r} is distributed as σ e^{−σ r*}, so the least labels reveal the neighborhood size. With m independent label sets,
σ(s, t) ≈ (m − 1) / Σ_{u=1}^m r_s^{(u)}
[Figure: a second label set — accumulated least labels r*_D = (0.29, 0.23), r*_S = (0.29, 0.23), r*_B = (0.29, 0.23), r*_J = (1.26, 0.37), r*_C = (0.33, 3.70)]
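A sketch of this least-label estimator over explicit reachable sets (the graph sampling that produces those sets is assumed to happen elsewhere):

```python
import random

def least_label_estimate(reachable, nodes, m=1000, seed=0):
    """Estimate each source's neighborhood size as (m - 1) / sum of its m least labels,
    drawing a fresh set of Exp(1) node labels for each of the m rounds."""
    random.seed(seed)
    sums = {s: 0.0 for s in reachable}
    for _ in range(m):
        label = {v: random.expovariate(1.0) for v in nodes}
        for s, reach in reachable.items():
            sums[s] += min(label[v] for v in reach)
    return {s: (m - 1) / sums[s] for s in reachable}

# Hypothetical reachable sets within the time window (true sizes 4 and 1).
nodes = ["D", "S", "B", "C", "J"]
reachable = {"D": {"D", "S", "B", "C"}, "J": {"J"}}
print(least_label_estimate(reachable, nodes))   # close to {'D': 4.0, 'J': 1.0}
```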
Computational complexity
σ(s, t) ≈ (m − 1) / Σ_{u=1}^m r_s^{(u)} for all sources: for each of the n sampled graphs and each of the m random label sets, one breadth-first least-label search suffices — roughly O(n m (|V| + |E|)), linear in the number of nodes and edges.
[Figure: three sampled graphs]
Scalability
Ten most influential sites in a month

Site                        Type of site
digg.com                    popular news site
lxer.com                    linux and open source news
exopolitics.blogs.com       political blog
mac.softpedia.com           mac news and rumors
gettheflick.blogspot.com    pictures blog
urbanplanet.org             urban enthusiasts
givemeaning.blogspot.com    political blog
talkgreen.ca                environmental protection blog
curriki.org                 educational site
pcworld.com                 technology news
Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
More Advanced Models
Joint models with rich context
Nan et al. AISTATS 2013; Nan et al. KDD 2015
[Figure: an event timeline 0, t₁, t₂, …, t_n, T aligned with audio, text, image, and other simultaneously measured time series]
Spatial-temporal processes
[Figures: influenza spread, bird migration, crime, smart city]
Continuous-time document streams
[Figure: documents arriving along a continuous timeline]
Nan et al. KDD 2015
Dirichlet-Hawkes processes
Dirichlet (Chinese restaurant) process + recurrent Hawkes process: the topic of the n-th document is an existing topic with probability proportional to that topic's current intensity, or a new topic drawn from the base measure with probability proportional to the concentration parameter:
θ_n | θ_{1:n−1} ∼ Σ_θ [ λ_θ(t_n) / (Σ_{θ'} λ_{θ'}(t_n) + α) ] δ_θ + [ α / (Σ_{θ'} λ_{θ'}(t_n) + α) ] G₀
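A sketch of drawing one topic assignment from this prior, with a unit-rate exponential triggering kernel (all inputs hypothetical):

```python
import math, random

def sample_topic(t_n, topic_events, alpha, seed=None):
    """Existing topic k with prob ~ lambda_k(t_n); a brand-new topic with prob ~ alpha."""
    rng = random.Random(seed)
    lam = {k: sum(math.exp(-(t_n - s)) for s in times if s < t_n)
           for k, times in topic_events.items()}
    u = rng.uniform(0.0, sum(lam.values()) + alpha)
    for k, l in lam.items():
        u -= l
        if u <= 0:
            return k
    return "new_topic"                    # fell into the alpha mass: open a new topic

# A recently active topic is the likely draw; a long-dormant one is not.
print(sample_topic(10.0, {"sports": [9.5, 9.8], "politics": [2.0]}, alpha=0.1, seed=1))
```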
Dark Knight vs. Endeavour
[Figures: learned triggering kernels and temporal dynamics for the two story clusters]
Previous models are parametric
Each parametric form encodes our prior knowledge: Poisson process, Hawkes process, self-correcting process, autoregressive conditional duration process.
Limitations:
- The model may be misspecified
- Hard to encode complex features or markers
- Hard to encode dependence structure
Can we learn a more expressive model of marked temporal point processes?
Recurrent Marked Temporal Point Processes
Recurrent neural network + marked temporal point processes: the hidden vector of the RNN learns a nonlinear dependency over both past timings and markers, yielding a general conditional density for the timing of the next event and a multinomial distribution over its marker.
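A toy scalar sketch of the per-event recurrence in this style of model; the dimensions, nonlinearities, and all parameter values are illustrative assumptions, not the exact published architecture:

```python
import math

def rnn_step(h, dt, marker, W_h=0.5, W_t=0.3, W_m=0.2):
    """Fold the last inter-event time and marker into the (scalar) hidden state."""
    return math.tanh(W_h * h + W_t * dt + W_m * marker)

def intensity(t, t_last, h, v=1.0, w=0.1, b=-1.0):
    """lambda*(t) = exp(v*h + w*(t - t_last) + b): a history term from the hidden
    state plus an explicit dependence on the time elapsed since the last event."""
    return math.exp(v * h + w * (t - t_last) + b)

h, t_last = 0.0, 0.0
for t, marker in [(0.8, 1), (1.5, 0), (3.1, 1)]:   # hypothetical (time, marker) stream
    h = rnn_step(h, t - t_last, marker)
    t_last = t
print(intensity(3.5, t_last, h))
```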
Experiments: synthetic
[Figures: time-prediction error and intensity-function prediction error on data simulated from ACD, Hawkes, and self-correcting processes]
Experiments: real world data
[Figures: time-prediction and marker-prediction performance on NYC Taxi, financial trading, Stack Overflow, and MIMIC-II data]
A unified framework
Probabilistic models and learning methods to understand, predict, and control processes and activity over social and information networks.
Representation: 1. Intensity function  2. Basic building blocks  3. Superposition
Modeling: 1. Idea adoption  2. Network coevolution  3. Collaborative dynamics
Learning: 1. Sparse hidden diffusion networks  2. Low rank collaborative dynamics  3. Generic algorithm
Inference: 1. Time-sensitive recommendation  2. Scalable influence estimation
Introduction to PtPack
A C++ Multivariate Temporal Point Process Package
Features
- Learning sparse interdependency structure of continuous-time information diffusions
- Scalable continuous-time influence estimation and maximization
- Learning multivariate Hawkes processes with different structural constraints: sparse, low-rank, customized triggering kernels
- Learning low-rank Hawkes processes for time-sensitive recommendations
- Efficient simulation of standard multivariate Hawkes processes
- Learning multivariate self-correcting processes
- Simulation of customized general temporal point processes
- Basic residual analysis and model checking of customized temporal point processes
- Visualization of triggering kernels, intensity functions, and simulated events