
On the Dynamics of Topic-Based Communities in Online Knowledge-Sharing Networks

Anna Guimarães, Ana Paula Couto da Silva, Jussara Almeida. Department of Computer Science, UFMG (Brazil). September 21, 2015.


  1. On the Dynamics of Topic-Based Communities in Online Knowledge-Sharing Networks. Anna Guimarães, Ana Paula Couto da Silva, Jussara Almeida. Department of Computer Science, UFMG (Brazil). September 21, 2015

  2. Introduction • Online Knowledge-Sharing Networks – Wikis, Q&A sites, discussion forums – User-created and maintained discussions – Wealth of knowledge

  3. Introduction • Online Knowledge-Sharing Networks – Wikis, Q&A sites, discussion forums – User-created and maintained discussions – Wealth of knowledge • Prior research focuses on knowledge extraction by: – Detecting quality content [Agichtein et al., 2008] – Ranking questions and answers [Dalip et al., 2013] – Identifying expert users [Ravi et al., 2014, Wang et al., 2013]

  4. Introduction • More than repositories for knowledge! – Community structure surrounding discussions – Topics and communities subject to temporal changes – Multiple topics, multiple communities • This study: – Community approach to knowledge-sharing networks – Characterization and modeling of community evolution

  5. Case Study: Stack Overflow

  6. Case Study: Stack Overflow Tags

  7. Topic-Based Communities in Stack Overflow • Communities centered around topics – Topics are explicitly defined – Independent of the social interaction graph • Non-exclusive membership to multiple communities

  8. Stack Overflow Dataset • User activity – User ID, Tag ID, Time stamp • Data covering a six-year period – 2008–2014 • Dataset size – Tags: 400 – Posts: 19.8 million – Users: 1.7 million

  9. Topic-Based Communities in Stack Overflow • Temporal analyses of community activity in terms of: – How user behavior affects community sustainability – How users relate to communities in the long run – How users divide their attention across different communities – How communities affect one another

  10. Communities in Stack Overflow: Findings • Significant revisiting behavior – Users continue to contribute to the same community – Revisitors to a community grow more significant over time • Mean fraction of revisits (1st month / 6th month / 12th month) – Revisitors: 0.20 / 0.44 / 0.50 – Revisits: 0.27 / 0.46 / 0.50
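
The revisit fractions reported on slide 10 could be computed along the following lines. This is only a sketch: the slides do not give the exact definitions, so it assumes that a revisitor is a user who already posted under the same tag in an earlier month and that a revisit is any post made by such a user. The function name `revisit_fractions` and the `(user_id, tag, month_index)` record layout are hypothetical, loosely following the dataset fields listed on slide 8.

```python
from collections import defaultdict


def revisit_fractions(events):
    """Compute per-tag, per-month revisit statistics.

    events: iterable of (user_id, tag, month_index) tuples, one per post.
    Returns {tag: {month: (revisitor_fraction, revisit_fraction)}}.

    Assumed definitions (not confirmed by the slides): a revisitor is a user
    who posted under the tag in an earlier month; a revisit is a post made
    by such a user.
    """
    by_tag = defaultdict(lambda: defaultdict(list))
    for user, tag, month in events:
        by_tag[tag][month].append(user)

    result = {}
    for tag, months in by_tag.items():
        seen = set()  # users active under this tag in earlier months
        result[tag] = {}
        for month in sorted(months):
            posters = months[month]
            active = set(posters)
            revisitors = active & seen
            revisits = sum(1 for u in posters if u in seen)
            result[tag][month] = (len(revisitors) / len(active),
                                  revisits / len(posters))
            seen |= active
    return result


# Tiny made-up example: u1 returns in month 2 and posts twice.
example = revisit_fractions([
    ("u1", "python", 1), ("u2", "python", 1),
    ("u1", "python", 2), ("u1", "python", 2), ("u3", "python", 2),
])
```

Averaging these per-community values over all communities for a given month would give numbers comparable in form to the table on the slide, under the stated assumptions.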

  11. Communities in Stack Overflow: Findings • Participation in multiple communities – 32% of users participate in up to 3 communities – Average user participates in 17 communities – Decaying pattern of activity over time [Figure: two panels, Communities vs. Months and Posts vs. Months, over months 2 to 12]

  12. Communities in Stack Overflow: Findings • Migrating behavior – Users traverse different communities over time – Shared member base across communities • Example: Ruby on Rails 3 → Ruby on Rails 4 [Figure: # Members vs. Months, Feb 2013 to Aug 2014; series labels: Rails 3 Members, New Members]

  13. Communities in Stack Overflow: Findings • Migrating behavior – Users traverse different communities over time – Shared member base across communities • Example: MySQL → PHP [Figure: # Members vs. Months, Feb 2013 to Aug 2014; series labels: MySQL, New Members]

  14. Communities in Stack Overflow: Findings • Key aspects dictating community evolution – Intra-community aspects: user revisits, continued activity – Inter-community aspects: shared member base, user migration

  15. How can we then describe community evolution?

  16. CERIS Model • CERIS – Community Evolution model with Revisits and Inter-community effectS • Goal: describe community activity (number of posts) over time • Incorporates revisits and community relationships

  17. CERIS Model • CERIS extends state-of-the-art models – Phoenix-R evolution model with revisits [Figueiredo et al., 2014] – Competition model [Beutel et al., 2012] • Epidemiology approach to network dynamics – Objects in the network are modeled as infections

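  18. CERIS Model • Users are initially exposed to different communities [Diagram: state $S$]

  19. CERIS Model • Users become infected by participating in a community [Diagram: states $S$, $I_1$, $I_2$; rates $\beta_1$, $\beta_2$]

  20. CERIS Model • Users can recover by ceasing activity in a community [Diagram: states $S$, $I_1$, $I_2$; rates $\beta_1$, $\beta_2$, $\gamma_1$, $\gamma_2$]

  21. CERIS Model • Or they can be infected by additional communities [Diagram: states $S$, $I_1$, $I_2$, $I_{1,2}$; rates $\beta_1$, $\beta_2$, $\varepsilon\beta_1$, $\varepsilon\beta_2$, $\gamma_1$, $\gamma_2$]

  22. CERIS Model • Revisits to the same community captured by hidden states [Diagram: states $S$, $I_1$, $I_2$, $I_{1,2}$, $V_1$, $V_2$, $V_{1,2}$; rates $\beta_1$, $\beta_2$, $\varepsilon\beta_1$, $\varepsilon\beta_2$, $\gamma_1$, $\gamma_2$, $\omega_1$, $\omega_2$, $\omega_{1,2}$]

  23. CERIS Model [Diagram: the state diagram of slide 22, with states $S$, $I_1$, $I_2$, $I_{1,2}$, $V_1$, $V_2$, $V_{1,2}$ and rates $\beta$, $\varepsilon\beta$, $\gamma$, $\omega$; additional labels $s_1, \ldots, s_n$ and $v_1$ joined by "+" signs]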
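
To make the state diagram of slides 18 to 23 more concrete, the following sketch simulates a heavily simplified, single-community version of the dynamics. It is not the CERIS model itself: the slides do not give the update equations, so the transitions used here ($S \to I$ at rate $\beta S I / N$, $I \to V$ at rate $\gamma$, $V \to I$ at rate $\omega$) are an assumption in the spirit of the revisit model the slides cite, and the inter-community coupling through $\varepsilon$ is left out entirely. The function name `simulate_siv` and all parameter values are hypothetical.

```python
def simulate_siv(beta, gamma, omega, s0, i0, steps):
    """Discrete-time S/I/V dynamics for ONE community (illustrative only).

    Assumed transitions (not taken from the slides):
      S -> I : beta * S * I / N   (users start participating)
      I -> V : gamma * I          (users cease activity, move to hidden state)
      V -> I : omega * V          (users revisit the community)
    Returns a list of (S, I, V) tuples, one per time step, including t = 0.
    """
    s, i, v = float(s0), float(i0), 0.0
    n = s + i + v  # total population, conserved by the updates below
    history = [(s, i, v)]
    for _ in range(steps):
        new_infections = beta * s * i / n
        recoveries = gamma * i
        revisits = omega * v
        s -= new_infections
        i += new_infections - recoveries + revisits
        v += recoveries - revisits
        history.append((s, i, v))
    return history


# Made-up parameters, chosen only to produce a rise-and-decay curve.
trajectory = simulate_siv(beta=0.5, gamma=0.1, omega=0.05,
                          s0=990, i0=10, steps=60)
```

In this simplification the $I$ component plays the role of the active members of a community over time; how CERIS maps states to the number of posts is not specified in the slides.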

  24. CERIS Model • Analyzes the time series for the number of posts in the communities simultaneously • Contagious process occurs following “shocks” – Wavelet method to identify activity peaks as shock candidates – e.g., when a new related community becomes active • Model fitting with the Levenberg-Marquardt algorithm and Minimum Description Length
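
As an illustration of the fitting step named on slide 24, here is a minimal hand-written Levenberg-Marquardt loop. It fits a toy two-parameter curve $y = a e^{-bt}$ to synthetic data, not the CERIS model, whose residual function the slides do not spell out. The function names, the damping schedule (divide or multiply by 10) and the stopping thresholds are assumptions made for this sketch; in practice a library routine such as SciPy's `scipy.optimize.least_squares` with `method="lm"` would be the more usual choice.

```python
import math


def normal_equations(params, ts, ys):
    """Return (cost, J^T J, J^T r) for the toy model y = a * exp(-b * t)."""
    a, b = params
    cost, jtj, jtr = 0.0, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
    for t, y in zip(ts, ys):
        e = math.exp(-b * t)
        r = a * e - y            # residual: model minus observation
        j = (e, -a * t * e)      # d(residual)/da, d(residual)/db
        cost += r * r
        for p in range(2):
            jtr[p] += j[p] * r
            for q in range(2):
                jtj[p][q] += j[p] * j[q]
    return cost, jtj, jtr


def levenberg_marquardt(ts, ys, start, max_iter=200):
    """Minimize the sum of squared residuals with a damped Gauss-Newton step."""
    params, lam = list(start), 1e-3
    cost, jtj, jtr = normal_equations(params, ts, ys)
    for _ in range(max_iter):
        if cost < 1e-20 or lam > 1e12:
            break
        # Solve (J^T J + lam * diag(J^T J)) * delta = -J^T r for the 2x2 case.
        a11, a22, a12 = jtj[0][0] * (1 + lam), jtj[1][1] * (1 + lam), jtj[0][1]
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        delta = ((-jtr[0] * a22 + jtr[1] * a12) / det,
                 (-jtr[1] * a11 + jtr[0] * a12) / det)
        trial = [params[0] + delta[0], params[1] + delta[1]]
        try:
            new = normal_equations(trial, ts, ys)
        except OverflowError:
            new = None           # treat a numerically wild step as rejected
        if new is not None and new[0] < cost:
            params, (cost, jtj, jtr) = trial, new
            lam /= 10.0          # step accepted: trust the Gauss-Newton model more
        else:
            lam *= 10.0          # step rejected: move toward gradient descent
    return params


# Synthetic, noise-free data generated from a = 5.0, b = 0.3.
ts = list(range(20))
ys = [5.0 * math.exp(-0.3 * t) for t in ts]
fitted = levenberg_marquardt(ts, ys, start=(4.0, 0.5))
```

The Minimum Description Length criterion mentioned on the same slide would sit on top of such a fit, trading goodness of fit against the number of shocks and parameters; that part is not sketched here.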

  25. CERIS Model Results [Figure: two panels. HTML and CSS: series css, html, model; x axis 2009 to 2014. iOS versions: series ios5, ios6, ios7, model; x axis 2012 to 2014]

  26. CERIS Model Results • Model results: – Reasonably accurate fittings – Captures different patterns of activity – Captures concurrent evolution of related communities • RMSE – HTML and CSS: 3046.895 – iOS versions: 13.612 – All (mean, daily): 21.131
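
For reference, the RMSE figures on slide 26 follow the standard root-mean-square error between an observed series and the model output. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long numeric sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Made-up values, only to show the call.
error = rmse([0.0, 0.0], [3.0, 4.0])
```

Because RMSE is expressed in the units of the series, values for communities with very different post volumes are not directly comparable.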

  27. CERIS Model Results • Model outputs used to quantify the relationship between communities • Flow of users between communities: $\mathrm{flow}_{C_1,C_2}(t) = \varepsilon\beta_2(t)$ and $\mathrm{flow}_{C_2,C_1}(t) = \varepsilon\beta_1(t)$

  28. CERIS Model Results [Figure: two panels, Top 100 and Top 15, with Communities on both axes and a color scale from 0.0 to 0.9; Top 15 labels: java, javascript, c#, php, android, jquery, python, html, c++, ios, mysql, css, asp.net, objective-c, .net]

  29. Conclusions • Knowledge-sharing networks as a community environment – Topic-based communities defined by users interacting with topics of their interest • Investigation of topic-based communities in Stack Overflow – User activity in terms of communities they belong to – Impact of related communities • New model to describe community evolution – Incorporates key factors behind community activity – Good portrayal of the co-evolution of multiple communities

  30. Thank you! Anna Guimarães, anna@dcc.ufmg.br

  31. References I Agichtein, E., Castillo, C., Donato, D., Gionis, A., and Mishne, G. (2008). Finding High-Quality Content in Social Media. In Proc. WSDM. Beutel, A., Prakash, B. A., Rosenfeld, R., and Faloutsos, C. (2012). Interacting Viruses in Networks: Can Both Survive? In Proc. ACM SIGKDD.

  32. References II Dalip, D. H., Gonçalves, M. A., Cristo, M., and Calado, P. (2013). Exploiting User Feedback to Learn to Rank Answers in Q&A Forums: A Case Study with Stack Overflow. In Proc. ACM SIGIR. Figueiredo, F., Almeida, J. M., Matsubara, Y., Ribeiro, B., and Faloutsos, C. (2014). Revisit Behavior in Social Media: The Phoenix-R Model and Discoveries. In Proc. PKDD.

  33. References III Hansen, M. H. and Yu, B. (2001). Model Selection and the Principle of Minimum Description Length. Journal of the American Statistical Association, 96(454). Moré, J. J. (1978). The Levenberg-Marquardt Algorithm: Implementation and Theory. In Numerical Analysis, pages 105–116. Springer. Ravi, S., Pang, B., Rastogi, V., and Kumar, R. (2014). Great Question! Question Quality in Community Q&A. In Proc. ICWSM.

  34. References IV Wang, X., Butler, B. S., and Ren, Y. (2013). The Impact of Membership Overlap on Growth: An Ecological Competition View of Online Groups. Organization Science, 24(2):414–431.
