
Milad Eftekhar, Yashar Ganjali, Nick Koudas Introduction - PowerPoint PPT Presentation



  1. Milad Eftekhar, Yashar Ganjali, Nick Koudas

  2. Introduction
     • Identifying the k most influential individuals is a well-studied problem.
     • We generalize this problem to identify the l most influential groups.
     • Application:
       • Companies often target groups of people.
       • E.g., by billboards, TV commercials, newspaper ads, etc.

  3. Group targeting
     • Groups [Figure: billboard]
     • Advantages
       • Improved performance
       • Natural targets for advertising
       • An economical choice

  4. Fine-Grained Diffusion (FGD)
     • Determine how advertising to a group translates into individual adopters.
     • Run the individual diffusion process on these adopters.

  5. FGD Modeling
     • Graph G′: add a node for each group, and add edges between the node corresponding to a group g_i and its members, with a weight w_i that depends on:
       • the advertising budget, the size of the group, the escalation factor, and the budget needed to convince an individual
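A minimal sketch of the graph augmentation described on this slide. The slide names the four inputs to the edge weight but not the formula, so `group_edge_weight` below is an assumption made only for illustration: it spreads the expected number of initial adopters (escalation factor × budget / cost per individual) evenly over the group's members and caps the result at 1. The function names and the `("group", name)` node labels are also hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def group_edge_weight(budget: float, group_size: int,
                      escalation: float, cost_per_individual: float) -> float:
    """ASSUMED formula (not given in the slides): expected initial adopters
    divided evenly among the group's members, capped at probability 1."""
    expected_adopters = escalation * budget / cost_per_individual
    return min(1.0, expected_adopters / group_size)


def add_group_nodes(edges: Dict[Edge, float],
                    groups: Dict[str, List[Hashable]],
                    budget: float, escalation: float,
                    cost_per_individual: float) -> Dict[Edge, float]:
    """Return a copy of the individual-level edges plus one new node per
    group, linked to every member of that group with the group's weight."""
    augmented = dict(edges)
    for name, members in groups.items():
        weight = group_edge_weight(budget, len(members),
                                   escalation, cost_per_individual)
        group_node = ("group", name)  # hypothetical label for the added node
        for member in members:
            augmented[(group_node, member)] = weight
    return augmented


# Toy data: two individual-level edges and one group of 100 members.
individual_edges = {(1, 2): 0.1, (2, 3): 0.3}
groups = {"billboard": list(range(1, 101))}
augmented = add_group_nodes(individual_edges, groups,
                            budget=1000, escalation=2, cost_per_individual=100)
print(augmented[(("group", "billboard"), 1)])
```

Keeping the weight in its own function makes it easy to swap in the actual formula once it is known.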

  6. FGD Modeling (Cont’d)
     • Escalation factor β: how many more initial adopters we can get by group targeting rather than individual targeting.
     • Individual targeting: advertising budget $1000; cost of convincing an individual $100; 10 initial adopters.
     • Group targeting: advertising budget $1000; billboard with audience = 10000 individuals, cost = $1000; 20 initial adopters.
     • β = 20 / 10 = 2
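The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the variable and function names are illustrative, and the 20 adopters reached by the billboard is the figure given on the slide, not something the code derives.

```python
def escalation_factor(group_adopters: float, individual_adopters: float) -> float:
    """Ratio of initial adopters under group targeting to those under
    individual targeting, for the same advertising budget."""
    return group_adopters / individual_adopters


budget = 1000              # advertising budget in dollars (from the slide)
cost_per_individual = 100  # cost of convincing one individual (from the slide)

# Individual targeting: the budget buys budget / cost adopters.
individual_adopters = budget // cost_per_individual

# Group targeting: the slide states 20 initial adopters for the billboard.
group_adopters = 20

print(escalation_factor(group_adopters, individual_adopters))
```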

  7. FGD Modeling (Cont’d)
     • Escalation factor β
       • Based on the problem structure, the size and shape of the network, the initial advertising method, etc.
       • Individual advertising: β = 1
       • Billboard advertising: β = 200
       • Online advertising: β = 400

  8. Problem statement
     • Goal: find the l most influential groups (blue group-nodes).
     • [Figure labels: "Top-2 influential groups"]
     • NP-hard under the FGD model

  9. topfgd algorithm
     • Diffusion in FGD is monotone and submodular.
     • topfgd: a greedy algorithm that provides a (1 - 1/e) approximation factor.
     • In each iteration, add the group that yields the maximum marginal increase in the final influence.
     • Time: O(l × m × |E_ind| × R)
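The greedy selection described on this slide can be sketched as follows. This is not the authors' implementation. It assumes an independent-cascade process as the individual diffusion model and estimates influence by averaging repeated simulations, neither of which the slides specify. The adjacency-list format, function names, and default parameters are illustrative.

```python
import random
from typing import Dict, Hashable, List, Sequence, Tuple

# node -> list of (neighbor, activation probability)
Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def simulate_cascade(graph: Graph, seeds: Sequence[Hashable],
                     rng: random.Random) -> int:
    """One independent-cascade run (ASSUMED diffusion model); returns the
    number of active nodes at the end, including the seeds."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, prob in graph.get(node, []):
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def estimate_influence(graph: Graph, seed_groups: Sequence[Hashable],
                       rounds: int, rng: random.Random) -> float:
    """Average number of activated individuals over `rounds` simulations;
    the seeded group-nodes themselves are not counted."""
    total = sum(simulate_cascade(graph, seed_groups, rng) for _ in range(rounds))
    return total / rounds - len(seed_groups)


def greedy_top_groups(graph: Graph, group_nodes: Sequence[Hashable],
                      num_groups: int, rounds: int = 200,
                      seed: int = 0) -> List[Hashable]:
    """Repeatedly add the group with the largest estimated influence when
    joined with the groups already chosen. Since the chosen set is fixed
    within an iteration, this is the largest marginal increase."""
    rng = random.Random(seed)
    chosen: List[Hashable] = []
    for _ in range(min(num_groups, len(group_nodes))):
        best_group, best_value = None, float("-inf")
        for group in group_nodes:
            if group in chosen:
                continue
            value = estimate_influence(graph, chosen + [group], rounds, rng)
            if value > best_value:
                best_group, best_value = group, value
        chosen.append(best_group)
    return chosen


# Toy graph with deterministic edges (probability 1.0) so the result is stable.
toy_graph: Graph = {
    "gA": [(1, 1.0), (2, 1.0), (3, 1.0)],
    "gB": [(1, 1.0)],
    "gC": [(4, 1.0), (5, 1.0)],
    1: [(2, 1.0)],
}
print(greedy_top_groups(toy_graph, ["gA", "gB", "gC"], num_groups=2, rounds=10))
```

In the toy graph, group `gB` reaches only individuals already covered by `gA`, so its marginal gain is zero after `gA` has been picked.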

  10. Coarse-Grained Diffusion (CGD)
     • FGD is not practical for large social networks.
     • Idea: incorporate information about individuals without running the diffusion explicitly at the level of individuals.
     • A graph to model inter-group influences [Figure label: Group 1]

  11. CGD Modeling
     • Differences from “Individual Diffusion” models
       • No binary decisions
       • Progress fraction for each group [Figure label: progress fraction = 0.6]
       • Two types of diffusion
         • Inter-group diffusion
         • Intra-group diffusion
     • Submodularity?

  12. CGD Diffusion Model
     • Each newly activated fraction of a group can activate its neighboring groups.
     • As a result of an activation attempt from A to B, some activation attempts also occur between members of B.
     • Continue for several iterations until convergence.
     • [Figure: group progress fractions updating over iterations, with labels 0 → 0.2, 0 → 0.04, and 0.04 → 0.05]

  13. topcgd algorithm
     • Goal: find the l most influential groups.
     • NP-hard under the CGD model
     • Diffusion in CGD is monotone and submodular.
     • topcgd: a greedy algorithm that provides a (1 - 1/e) approximation factor.
     • Time: O(|E_ind| + ml·md + n)
     • d is the number of iterations to converge (~10)

  14. Experimental setup
     • Datasets:
       • DBLP: 800K nodes, 6.3M edges, 3200 groups
     • Comparison
       • Spend the same advertising budget on all algorithms.
       • Measure the final influence (the number of convinced individuals).
       • Run the individual diffusion process on the initially convinced individuals.

  15. Results
     • DBLP-1980: 8000 nodes, 69 groups
     • Compare topid vs. topfgd vs. topcgd
     • Final influence: topfgd and topcgd outperform topid for β > 3
     • Time: topid (30 days), topfgd (an hour), topcgd (0.2 sec)
     • [Figure: final influence (0 to 6000) vs. β (0 to 100) for topfgd, topcgd, and topid]

  16. Results (Cont’d)
     • DBLP: topcgd vs. baselines
       • rnd, small, big, degree
     • Time of topcgd: 100 minutes
     • topfgd and topid are not practical
     • [Figure: final influence (0 to 70000) vs. β (10 to 90) for topcgd, degree, big, rnd, and small]

  17. Conclusion and Future Work
     • Focus on groups rather than individuals
       • Wider diffusion
       • Improved performance
       • Many less-influential individuals vs. fewer more-influential individuals
     • Although CGD aggregates the information about individuals (hence the improved performance), it results in a final influence comparable to that of FGD.
     • We are interested in a generalized model where:
       • Groups are allowed to receive different budgets
       • The cost of advertising to each group is predetermined

  18. Thanks! (Questions?)
