Model of Complex Networks based on Citation Dynamics
Lovro Šubelj & Marko Bajec
University of Ljubljana, Faculty of Computer and Information Science
LSNA ’13
L. Šubelj (University of Ljubljana) Citation Network Model LSNA ’13 1 / 14
Introduction
Real-world networks are scale-free, small-world, etc.
Social networks are degree assortative (Newman and Park, 2003).
→ These properties are captured by many models in the literature.
However, non-social networks are degree disassortative!
Figure: Part of the Cora citation network with highlighted hubs.
For simplicity, we consider only undirected networks.
Models of complex networks
Forest Fire model (Leskovec et al., 2007)
Let p be the burning probability.
1. i chooses an ambassador a and links to it;
2. i selects $x_p \sim G\left(\frac{p}{1-p}\right)$ neighbors $a_1, \dots, a_{x_p}$ and links to them;
3. $a_1, \dots, a_{x_p}$ are taken as the ambassadors of i.
[Figure: example network with nodes w, v, z, y, x, the new node i and the ambassador a.]
Networks are scale-free, small-world, degree assortative, etc.
Natural interpretation for citation networks!
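The three steps of the Forest Fire model can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that G(p/(1−p)) denotes a geometric distribution with mean p/(1−p), that the network starts from a single node, and that ambassadors and neighbors are chosen uniformly at random. The names `geometric` and `forest_fire` are hypothetical.

```python
import random


def geometric(p, rng):
    """Sample x >= 0 with P(x) = p**x * (1 - p), i.e. with mean p / (1 - p)."""
    x = 0
    while rng.random() < p:
        x += 1
    return x


def forest_fire(n, p, seed=0):
    """Grow an undirected network of n nodes (assumed reading of the slide)."""
    if not 0 <= p < 1:
        raise ValueError("p must lie in [0, 1)")
    rng = random.Random(seed)
    adj = {0: set()}  # assumption: start from a single isolated node
    for i in range(1, n):
        adj[i] = set()
        first = rng.randrange(i)      # step 1: choose an ambassador
        visited = {first}
        frontier = [first]
        while frontier:
            a = frontier.pop()
            adj[i].add(a)             # i links to every ambassador
            adj[a].add(i)
            # step 2: select x_p not yet visited neighbors of a
            candidates = sorted(v for v in adj[a] if v != i and v not in visited)
            rng.shuffle(candidates)
            chosen = candidates[:geometric(p, rng)]
            visited.update(chosen)
            frontier.extend(chosen)   # step 3: they become ambassadors
    return adj


if __name__ == "__main__":
    net = forest_fire(1000, 0.3, seed=42)
    edges = sum(len(nbrs) for nbrs in net.values()) // 2
    print("nodes:", len(net), "edges:", edges)
```

Because every visited node is linked, the set of ambassadors and the set of linked nodes coincide in this sketch.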
Models of complex networks
Author citation dynamics
1. the author chooses a paper (i.e., an ambassador) and cites it;
2. the author selects some of its references and cites them;
3. the latter are taken as the ambassadors.
[Figure: example network with nodes w, v, z, y, x, the new node i and the ambassador a.]
Assumption → authors read all papers they cite (and vice versa).
Only ≈ 20% of cited papers are read (Simkin and Roychowdhury, 2003).
Authors read or cite papers due to two (independent) processes!
Models of complex networks
Citation model (our)
Let q be the linking probability.
1. i chooses an ambassador a;
2. i selects $x_p \sim G\left(\frac{p}{1-p}\right)$ neighbors $a_1, \dots, a_{x_p}$;
   i selects $x_q \sim G\left(\frac{q}{1-q}\right)$ neighbors and links to them;
3. $a_1, \dots, a_{x_p}$ are taken as the ambassadors of i.
[Figure: example network with nodes w, v, z, y, x, the new node i and the ambassador a.]
Networks are scale-free, small-world, degree disassortative, etc.
Nodes do not (necessarily) link to their ambassadors!
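The Citation model separates the two processes: one draw (with p) decides which neighbors become ambassadors, and an independent draw (with q) decides which neighbors are linked. The sketch below illustrates this under several assumptions that are not stated on the slide: G(μ) is a geometric distribution with mean μ, the network starts from two linked nodes (otherwise no link could ever form), choices are uniformly random, and a new node that ends up without links is simply left isolated. The names `geometric` and `citation_model` are hypothetical.

```python
import random


def geometric(p, rng):
    """Sample x >= 0 with P(x) = p**x * (1 - p), i.e. with mean p / (1 - p)."""
    x = 0
    while rng.random() < p:
        x += 1
    return x


def citation_model(n, p, q, seed=0):
    """Grow an undirected network of n nodes (assumed reading of the slide)."""
    if not (0 <= p < 1 and 0 <= q < 1):
        raise ValueError("p and q must lie in [0, 1)")
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumption: seed network of two linked nodes
    for i in range(2, n):
        adj[i] = set()
        first = rng.randrange(i)          # step 1: choose an ambassador
        visited = {first}
        frontier = [first]
        while frontier:
            a = frontier.pop()
            neighbors = sorted(adj[a] - {i})
            # step 2 (p): select neighbors that become ambassadors
            unread = [v for v in neighbors if v not in visited]
            rng.shuffle(unread)
            chosen = unread[:geometric(p, rng)]
            visited.update(chosen)
            frontier.extend(chosen)       # step 3
            # step 2 (q): independently select neighbors and link to them
            uncited = [v for v in neighbors if v not in adj[i]]
            rng.shuffle(uncited)
            for v in uncited[:geometric(q, rng)]:
                adj[i].add(v)
                adj[v].add(i)
    return adj


if __name__ == "__main__":
    net = citation_model(1000, 0.37, 0.59, seed=42)
    edges = sum(len(nbrs) for nbrs in net.values()) // 2
    print("nodes:", len(net), "edges:", edges)
```

Note that node i never links to an ambassador merely because it was visited; a link appears only when the q-draw selects that node as a neighbor of some ambassador.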
Models of complex networks
Alternative models
[Figure: four panels on the same example network (nodes w, v, z, y, x, the new node i and the ambassador a), one per model.]
Forest Fire (Leskovec et al., 2007)
Butterfly (McGlohon et al., 2008)
Copying (Krapivsky and Redner, 2005)
Citation model (our)
Models of complex networks
Analysis of the models
S and T are the sets of ambassadors and linked nodes, respectively.
Forest Fire model: S = T
Butterfly model: S ⊇ T
Copying model: S ⊆ T
Citation model: S, T arbitrary
[Figure: the four models illustrated on the same example network (nodes w, v, z, y, x, the new node i and the ambassador a).]
Why degree disassortativity?
Linking to the ambassadors increases assortativity.
Absence of such a process prevents assortativity (Newman and Park, 2003).
Heterogeneous networks are disassortative (Johnson et al., 2010).
Experimental analysis
Comparison of the models (k & r)
Only the Citation model gives degree disassortative networks (i.e., r < 0).
[Figure: network degree k and degree mixing r against burning probability p (top row) and linking probability q (bottom row) for the Forest Fire, Butterfly, Copying and Citation models.]
Shaded regions show the most likely parameter values (Laurienti et al., 2011).
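For readers who want to check the sign of r on their own networks, the degree mixing coefficient can be computed as the Pearson correlation of the degrees at the two ends of each link, which is the standard definition for undirected networks. The sketch below assumes the network is stored as a dictionary of neighbor sets; the function name `degree_mixing` is hypothetical.

```python
def degree_mixing(adj):
    """Pearson correlation of degrees at both ends of every link.

    adj maps each node to the set of its neighbors (undirected network).
    Every link is counted in both directions, so both ends share one mean.
    """
    xs, ys = [], []
    for u, nbrs in adj.items():
        for v in nbrs:
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    if not xs:
        raise ValueError("the network has no links")
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        raise ValueError("r is undefined when all linked nodes have equal degree")
    return cov / var


if __name__ == "__main__":
    # A star is the extreme disassortative case: the hub links only to leaves.
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    print(degree_mixing(star))
```

A value r > 0 means that hubs tend to link to hubs (assortative), while r < 0 means that hubs tend to link to low-degree nodes (disassortative).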
Experimental analysis
Comparison of the models (l, C & Q)
All models give (scale-free) small-world networks with high modularity.
[Figure: mean distance l, network clustering C and network modularity Q against burning probability p (top row) and linking probability q (bottom row) for the Forest Fire, Butterfly, Copying and Citation models.]
Experimental analysis
Parameter estimation
s is the number of ambassadors, $s = |S|$.
$$s \le \frac{1-p}{1-2p} \quad \text{and} \quad k \le \frac{2qs}{1-q-(1-q)^{s+1}}$$
For a given k and fixed q, the system can be solved for p.
[Figure: network degree k and number of ambassadors s against burning probability p and linking probability q; legend entries 100, 500 and 1000.]
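One way to solve the system for p is sketched below. It reads the two bounds on this slide as s ≤ (1−p)/(1−2p) and k ≤ 2qs/(1−q−(1−q)^(s+1)), and, as an assumption, treats both as equalities. Under that reading k grows monotonically with p on [0, 0.5), so a simple bisection suffices. The function names are hypothetical.

```python
def ambassadors_bound(p):
    """Upper bound on the number of ambassadors s, valid for p < 0.5."""
    return (1 - p) / (1 - 2 * p)


def degree_bound(p, q):
    """Upper bound on the network degree k for given p and q."""
    s = ambassadors_bound(p)
    return 2 * q * s / (1 - q - (1 - q) ** (s + 1))


def solve_p(k, q, tol=1e-10):
    """Find p in [0, 0.5) whose degree bound equals k (bisection)."""
    lo, hi = 0.0, 0.499999
    if not degree_bound(lo, q) <= k <= degree_bound(hi, q):
        raise ValueError("no p in [0, 0.5) reproduces this k for the given q")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if degree_bound(mid, q) < k:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    k = degree_bound(0.37, 0.59)
    print("degree bound:", round(k, 3))
    print("recovered p:", round(solve_p(k, 0.59), 4))
```

At p = 0 the bound reduces to k = 2/(1−q), so for a fixed q no smaller target degree can be matched under this reading.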
Experimental analysis
Cora citation network
                p      q      n       m       k        r
Cora network    -      -      23166   89157   7.697   −0.055
Forest Fire     0.46   -      23166   88828   7.669    0.211
Citation        0.37   0.59   23166   89888   7.760   −0.047
Percentage of papers considered is 66% (# references is just 3.85)!
[Figure: degree distribution P(k) and neighbor degree k_N against node degree k for the Cora network and the Forest Fire and Citation models.]
For other network properties see the paper and (Šubelj and Bajec, 2012).
Experimental analysis
arXiv citation network
                 p      q      n       m        k         r
arXiv network    -      -      27400   352021   25.695   −0.030
Citation         0.46   0.67   27400   350699   25.598   −0.068
Percentage of papers considered is 49% (# references is 12.85)!
[Figure: degree distribution P(k) and neighbor degree k_N against node degree k for the arXiv network and the Citation model.]
Conclusions
Model for citation networks with the most common properties.
(Non-social) degree non-assortative networks → nodes must not link to their ambassadors!
Networks also show dichotomous mixing (Hao and Li, 2011).
Future work:
extension to directed networks,
network traversal (isolated nodes),
analyses on reliable data (e.g., WoS).
Questions & comments
lovro.subelj@fri.uni-lj.si
http://lovro.lpt.fri.uni-lj.si/
D. Hao and C. Li. The dichotomy in degree correlation of biological networks. PLoS ONE, 6(12):e28322, 2011. doi: 10.1371/journal.pone.0028322.
S. Johnson, J. J. Torres, J. Marro, and M. A. Muñoz. Entropic origin of disassortativity in complex networks. Phys. Rev. Lett., 104(10):108702, 2010. doi: 10.1103/PhysRevLett.104.108702.
P. L. Krapivsky and S. Redner. Network growth by copying. Phys. Rev. E, 71(3):036118, 2005. doi: 10.1103/PhysRevE.71.036118.
P. J. Laurienti, K. E. Joyce, Q. K. Telesford, J. H. Burdette, and S. Hayasaka. Universal fractal scaling of self-organized networks. Physica A, 390(20):3608–3613, 2011. doi: 10.1016/j.physa.2011.05.011.
J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data, 1(1):1–41, 2007.
M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted graphs and disconnected components: Patterns and a generator. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 524–532, New York, NY, USA, 2008.
M. E. J. Newman and J. Park. Why social networks are different from other types of networks. Phys. Rev. E, 68(3):036122, 2003. doi: 10.1103/PhysRevE.68.036122.
M. V. Simkin and V. P. Roychowdhury. Read before you cite! Compl. Syst., 14:269–274, 2003.
L. Šubelj and M. Bajec. Clustering assortativity, communities and functional modules in real-world networks. e-print arXiv:1208.2518v1, pages 1–21, 2012.
L. Šubelj and M. Bajec. Model of complex networks based on citation dynamics. In Proceedings of the WWW Workshop on Large Scale Network Analysis, page 4, Rio de Janeiro, Brazil, 2013.