
Filippo Radicchi: Graph-based ranking algorithms (BioCenit Research Lab @ Universitat Rovira i Virgili)



  1. Filippo Radicchi Graph-based ranking algorithms BioCenit Research Lab @ Universitat Rovira i Virgili

  2. Why analyze bibliographic data? Scientific motivation: electronic databases store a huge amount of information about scientific publications (in 2006 alone: ~10^4 journals, ~10^6 papers, ~10^7 citations). Collaboration networks; citation networks.

  3. Why analyze bibliographic data? Practical motivation: citations are the fundamental units used to measure the scientific relevance of papers, journals, scientists, research groups and institutions.

  4. A practical example. [Tables: indicators by discipline (Mathematics, Physics, Biology, Computer Sci., Chemistry) and academic rank (Researcher, Ass. Professor, Full Professor)] P = number of publications, T = period of activity, C = total number of citations. Source: "Indicatori di Attività Scientifica e di Ricerca", Consiglio Universitario Nazionale (CUN), Dec. 2008

  5. A practical example. Indicators normalized by academic age:
     1) Number of papers: $I(N_p, A_A) = 10\, N_p / A_A$
     2) Total number of citations: $I(N_C, A_A) = N_C / A_A$
     3) Contemporary h-index $h_c$, built on the score $S(i, t_i, t) = \frac{4}{t - t_i + 1}\, C(i, t_i, t)$; A. Sidiropoulos et al., Scientometrics 72, 253 (2007)
     $N_p$: total number of papers; $A_A$: academic age; $N_C$: total number of citations; $C(i, t_i, t)$: citations accumulated by paper $i$, published in year $t_i$ and measured in year $t$.
     Calculated over a population of 1400 Italian physicists; R. Manella and P. Rossi, arXiv:1207.3499 (2012)
     Source: "Abilitazione Scientifica Nazionale - La normalizzazione degli indicatori per l'età accademica (ANVUR)", Jul. 2012
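As a rough illustration of indicator 3, the sketch below computes a contemporary h-index from per-paper citation counts. It assumes the score S = 4 C / (t - t_i + 1) and the usual h-type rule (the largest h such that h papers have a score of at least h). The function name, the `gamma` parameter and the sample record are hypothetical, not taken from the slides.

```python
def contemporary_h_index(papers, current_year, gamma=4.0):
    """papers: list of (publication_year, citation_count) tuples.

    Each paper gets the age-discounted score
        S = gamma * citations / (current_year - publication_year + 1)
    and the index is the largest h such that h papers have S >= h.
    """
    scores = sorted(
        (gamma * cites / (current_year - year + 1) for year, cites in papers),
        reverse=True,
    )
    h = 0
    for rank, score in enumerate(scores, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical record: (publication year, citations), measured in 2012.
example = [(2012, 3), (2010, 9), (2005, 40), (2000, 6)]
h_c = contemporary_h_index(example, current_year=2012)
```

With this weighting a recent paper with few citations can count as much as an older, more cited one, which is the point of the age normalization.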

  6. On a larger sample of scientists: 35,000 profiles on Google Scholar Citations

  7. The network structure of citation data is often neglected in research evaluation. D.J. de Solla Price, Science 149, 510 (1965)
     "Standard" approach vs. network approach:
     papers: citation counts vs. CiteRank
     journals: impact factor vs. Eigenfactor
     scientists: h-index, g-index, ... vs. ?

  8. Graph-based ranking of scientists. Data: Physical Review Series I (PRI), Physical Review (PR), Physical Review Letters (PRL), Physical Review A (PRA), Physical Review B (PRB), Physical Review C (PRC), Physical Review D (PRD), Physical Review E (PRE), Reviews of Modern Physics (RMP), between 1893 and 2006. Panels: Paper Citation Network; Weighted Author Citation Network

  9. Weighted author citation network. Key-words: "complex network", "scale-free network", "small-world network", etc.

  10. Dynamical representation. Divide the 8,783,994 total references into homogeneous intervals. $M_I$ = number of intervals; $M_R$ = number of references in each interval. With $M_I = 18$, $M_R \approx 488{,}000$. [Figure: reference axis marked at 0, $M_R/2$, $M_R$, $3M_R/2$, $2M_R$, ..., $(M_I - 2)M_R$, $(M_I - 3/2)M_R$, $(M_I - 1)M_R$, $(M_I - 1/2)M_R$, $M_I M_R$; time labels 1893-1966 and 2006]
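A minimal sketch of the equal-count split, assuming the references are already sorted chronologically and that the intervals are plain, non-overlapping chunks. The half-integer marks on the slide's axis may indicate a different windowing, which this sketch does not attempt to reproduce. The function name and toy data are hypothetical.

```python
def split_into_intervals(sorted_refs, n_intervals):
    """Split a chronologically sorted list into n_intervals contiguous
    chunks holding (as nearly as possible) the same number of references."""
    total = len(sorted_refs)
    bounds = [(k * total) // n_intervals for k in range(n_intervals + 1)]
    return [sorted_refs[bounds[k]:bounds[k + 1]] for k in range(n_intervals)]


# Numbers quoted on the slide: 8,783,994 references, 18 intervals.
TOTAL_REFERENCES = 8_783_994
N_INTERVALS = 18
refs_per_interval = TOTAL_REFERENCES / N_INTERVALS  # about 488,000

# Toy data: one entry per reference, labeled by the year of the citing paper.
toy_years = [1893, 1950, 1966, 1970, 1985, 1990, 1999, 2003, 2006]
chunks = split_into_intervals(toy_years, 3)
```

Intervals defined this way span very different numbers of years, since far more references are made in recent decades than in early ones.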

  11. Science Author Rank Algorithm (SARA). Diffusion equation, with terms labeled on the slide: weight of the arc from j to i; out-strength of the node j. Each paper carries a "scientific credit", equally divided among its authors. SARA scores depend on the choice of the redistribution probability q.
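One way to read "equally divided among its authors" is sketched below: a citation from a paper with n authors to a paper with m authors adds 1/(n*m) to every arc from a citing author to a cited author, so each citation contributes a total weight of 1. This weighting is an assumption made for illustration; the slide text does not spell it out. All names and the toy data are hypothetical.

```python
from collections import defaultdict


def author_citation_weights(paper_authors, citations):
    """Build arc weights of an author citation network.

    paper_authors: dict mapping paper id -> list of author names
    citations:     iterable of (citing_paper, cited_paper) pairs
    Returns a dict {(citing_author, cited_author): weight}.
    """
    weights = defaultdict(float)
    for citing, cited in citations:
        sources = paper_authors[citing]
        targets = paper_authors[cited]
        # Assumed rule: one unit of credit per citation, shared equally
        # over all (citing author, cited author) pairs.
        share = 1.0 / (len(sources) * len(targets))
        for a in sources:
            for b in targets:
                weights[(a, b)] += share
    return dict(weights)


# Toy example: p1 (authors A, B) and p3 (author A) both cite p2 (author C).
paper_authors = {"p1": ["A", "B"], "p2": ["C"], "p3": ["A"]}
citations = [("p1", "p2"), ("p3", "p2")]
weights = author_citation_weights(paper_authors, citations)
```

Under this rule the total weight in the network equals the number of citations, so prolific co-authorship does not inflate the credit in circulation.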

  12. Science Author Rank Algorithm

  13. Comparison with different metrics: benchmarking SARA. Considered prizes: Nobel prize, Wolf prize, Boltzmann medal, Dirac medal and Planck medal

  14. Best physicists according to SARA. NP = Nobel prize, WP = Wolf prize, BM = Boltzmann medal, DM = Dirac medal, PM = Planck medal

  15. physauthorsrank.org

  16. Ranking tennis players

  17. ATP points distribution as of 2009 (source: wikipedia.org)
      4 Grand Slams: Australian Open, Roland Garros, Wimbledon, US Open
      9 Masters 1000: Indian Wells, Miami, Monte Carlo, Madrid, Rome, Canada, Cincinnati, Shanghai, Paris
      11 500 Series: Rotterdam, Memphis, Acapulco, Dubai, Barcelona, Hamburg, Washington, Beijing, Tokyo, Basel, Valencia
      40 250 Series: Doha, Chennai, Brisbane, Sydney, Auckland, ...
      Best results in 18 tournaments: 4 Grand Slams, 8 Masters 1000, best 4 results in 500 Series and best 2 results in 250 Series
      ATP World Tour Finals: reserved for the best 8 players in the ranking
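The "best results in 18 tournaments" rule can be sketched as a capped sum per tournament category. This is a simplification: it takes the top 8 Masters 1000 results rather than modeling which events are mandatory, it ignores the ATP World Tour Finals, and the point values in the example are hypothetical rather than official ATP figures.

```python
def atp_ranking_points(grand_slams, masters_1000, series_500, series_250):
    """Sum a player's counted results under the 18-tournament rule:
    4 Grand Slams + 8 Masters 1000 + best 4 of 500 Series + best 2 of 250 Series.
    Each argument is a list of points earned in that category."""

    def best(points, n):
        # Keep only the n largest results of the category.
        return sum(sorted(points, reverse=True)[:n])

    return (
        best(grand_slams, 4)
        + best(masters_1000, 8)
        + best(series_500, 4)
        + best(series_250, 2)
    )


# Hypothetical season (point values are illustrative only).
total = atp_ranking_points(
    grand_slams=[2000, 720, 360, 180],
    masters_1000=[1000, 1000, 90, 90, 90, 90, 90, 90, 90],
    series_500=[500, 300, 90, 45, 20],
    series_250=[250, 150, 90],
)
```

Because only a capped number of results count, the ATP score reflects a single season's best performances and ignores who the opponents were, which is what the network-based score later in the talk is meant to capture.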

  18. ATP points 2009 ATP points 2008

  19. ATP data cover all tournaments since 1968

  20. The Open Era 3700 players, 3600 tournaments, 133000 matches

  21. Tennis contact graph: each match is a directed edge from the loser to the winner. Edges are weighted: $w_{ij}$ = total number of matches between i and j won by j
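A minimal sketch of the construction described on this slide, assuming matches are given as (winner, loser) pairs. The function name and the toy match list are hypothetical.

```python
from collections import defaultdict


def build_contact_graph(matches):
    """matches: iterable of (winner, loser) pairs.

    Returns a nested dict w where w[i][j] is the number of matches
    between i and j won by j, i.e. the weight of the arc from loser i
    to winner j.
    """
    w = defaultdict(lambda: defaultdict(int))
    for winner, loser in matches:
        w[loser][winner] += 1
    return {i: dict(row) for i, row in w.items()}


# Toy data: A beats B twice, B beats A once, B beats C once.
matches = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "C")]
graph = build_contact_graph(matches)

# Out-strength of a node: total number of matches that player lost.
out_strength = {i: sum(row.values()) for i, row in graph.items()}
```

Pointing arcs from loser to winner means that credit flows towards players who win, and beating a strong opponent transfers more of it.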

  22. Top players in Grand Slams Only players with at least two Grand Slam titles between 1968 and 2010

  23. Tennis is "complex". Matthew effect in career longevity: A.M. Petersen et al., Proc. Natl. Acad. Sci. USA 108, 18 (2011)

  24. Prestige score. Terms of the equation as labeled on the slide: random diffusion; correction for dangling nodes; relocation. Further labels: for a Grand Slam tournament; for a tournament
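The three labeled terms match the structure of a PageRank-style diffusion, which can be sketched by power iteration as below. This is an illustration under stated assumptions, not necessarily the exact equation on the slide: relocation and the dangling-node correction are taken to be uniform over all players, and the default q = 0.15, the function name and the toy graph are all assumptions.

```python
def prestige_scores(weights, q=0.15, tol=1e-12, max_iter=1000):
    """PageRank-style diffusion on a weighted directed graph.

    weights: dict {loser: {winner: w}} giving the arc weights.
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = set(weights)
    for row in weights.values():
        nodes.update(row)
    nodes = sorted(nodes)
    n = len(nodes)
    out_strength = {j: sum(weights.get(j, {}).values()) for j in nodes}

    scores = {i: 1.0 / n for i in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes (players with no outgoing arcs).
        dangling = sum(scores[j] for j in nodes if out_strength[j] == 0)
        # Relocation term plus uniform redistribution of dangling score.
        new = {i: q / n + (1 - q) * dangling / n for i in nodes}
        # Random diffusion along arcs, proportional to arc weight.
        for j in nodes:
            if out_strength[j] == 0:
                continue
            for i, w in weights[j].items():
                new[i] += (1 - q) * scores[j] * w / out_strength[j]
        converged = sum(abs(new[i] - scores[i]) for i in nodes) < tol
        scores = new
        if converged:
            break
    return scores


# Toy graph: B lost twice to A; C lost once to A and once to B.
toy = {"B": {"A": 2}, "C": {"A": 1, "B": 1}}
scores = prestige_scores(toy)
```

In the toy graph A never loses, so A is a dangling node; without the correction term its score would leak out of the system at every step.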

  25. #3 : John McEnroe Career prize money: $12,547,797 Career record: 875–198 (81.55%) Career titles: 104 including 77 listed by the ATP

  26. #2 : Ivan Lendl Career prize money: $21,262,417 Career record: 1071–239 (81.8%) Career titles: 144 including 94 listed by the ATP

  27. #1 : Jimmy Connors Career prize money: $8,641,040 Career record: 1241–277 (81.75%) Career titles: 148 including 109 listed by the ATP

  28. Prestige Rank

  29. Relation with other scores

  30. Relation with other scores: 2009 ATP year-end rank

  31. Best player of the year

  32. Best player of the year

  33. Best players in Grand Slams

  34. What did people think about this ranking?

  35. What did players think about this ranking? journalist: “There is a weird study by an American physician...” Pete: “Who is this guy!?!?!?!”

  36. References
      F. Radicchi, S. Fortunato, B. Markines and A. Vespignani, "Diffusion of scientific credits and the ranking of scientists", Phys. Rev. E 80, 056103 (2009)
      F. Radicchi, "Who is the best player ever? A complex network analysis of the history of professional tennis", PLoS ONE 6, e17249 (2011)
      F. Radicchi, S. Fortunato and A. Vespignani, "Citation networks", in Models of Science Dynamics: Encounters Between Complexity Theory and Information Sciences, eds. A. Scharnhorst, K. Börner and P. van den Besselaar (Springer, 2012)

  37. Can bibliographic data be used for research evaluation? h-index of scientists by discipline; P. Ball, Nature 448, 737 (2007)
      Physics: Edward Witten 110; Marvin Cohen 94; Philip W. Anderson 91; Manuel Cardona 86; Frank Wilczek 68
      Chemistry: George Whitesides 135; Elias J. Corey 132; Martin Karplus 129; Alan Heeger 114; Kurt Wüthrich 113
      Computer science: Hector Garcia-Molina 70; Deborah Estrin 68; Ian Foster 67; Scott Shenker, Jeffrey D. Ullman, Don Towsley 65

  38. Description of the dataset: subject categories, journals, papers, publication year, number of citations. Papers are classified in 172 scientific disciplines (from Acoustics to Zoology)

  39. Different scientific disciplines Source of data March 2008

  40. Different scientific disciplines

  41. Different scientific disciplines
