Network analysis in bibliometrics - Lovro Šubelj

network analysis in bibliometrics
Lovro Šubelj
University of Ljubljana, Faculty of Computer and Information Science
CWTS ’17

Slovenia “chicken”
• Pannonian flat (like NL :)
• Alps (≤ 2864 m)
• Ljubljana
• karst (caves & wine)
• seaside (< 50 km :( )

University of Ljubljana
• since 1919 (271st in CWTS Leiden Ranking 2017)
• 26 members (23 faculties & 3 academies)
• 40,110 students & 5,730 staff in 2016

Faculty of Computer and Information Science
• since 1996 (cs study since 1973)
• ≈1,300 students & ≈180 staff
• BSc, MSc, PhD (cs, prog, math, mm)
• research (cs, db, is, dm, ml, ai, nets)

networks courses

talk outline
1. reliability of bibliographic databases
   Šubelj, L., Fiala, D., & Bajec, M. (2014). Scientific Reports, 4, 6496.
   Šubelj, L., Bajec, M., Boshkoska, B. M., et al. (2015). PLoS ONE, 10(5), e0127390.
2. modeling paper citation networks
   Šubelj, L., & Bajec, M. (2013). In Proceedings of the LSNA ’13, pp. 527–530.
   Šubelj, L., Žitnik, S., & Bajec, M. (2014). In Proceedings of the NetSci ’14, p. 1.
3. clustering paper citation networks
   Šubelj, L., Van Eck, N. J., & Waltman, L. (2016). PLoS ONE, 11(4), e0154404.

reliability of bibliographic databases
• databases are the basis for research & evaluation
• databases can differ substantially (different databases often give quite different conclusions)
• content & structure can differ substantially (coverage, timespan, features, accuracy, acquisition etc.)
• only informal notions of their reliability (particular case: reliability of the structure of citation networks)

structure of citation networks
• statistics of citation networks
• mostly consistent, with some outliers (outliers due to data acquisition in most cases)
• comparison over one statistic
• comparison over many statistics? (same problem in the machine learning community)

methodology of database comparison
• network statistics → residuals → database rank
• mean ranks of databases over many statistics
• residuals since “true database” is not known (database reliability seen as consistency with other databases)

[Flowchart: four-step testing procedure]
1. Studentized statistics residuals x̂_ij: two-tailed Student t-tests, H0: x̂_ij = 0 at P-value = 0.1, Student t-distribution with d.f. N − 2
2. Pairwise Spearman correlations ρ_ij: two-tailed Fisher independence z-tests, H0: ρ_ij = 0 at P-value = 0.01, standard normal distribution
3. Residuals mean ranks R_i: one-tailed Friedman rank test, H0: R_i = R_j at P-value = 0.1, χ²-distribution with d.f. N − 1
4. Residuals mean ranks R_i: two-tailed Nemenyi post-hoc test, H0: R_i = R_j at P-value = 0.1, Studentized range with d.f. N
Each test branches on H0 versus H1 (∃ x̂_ij : H1 or ∀ x̂_ij : H0 for the residuals, ∃ ρ_ij : H1 or ∀ ρ_ij : H0 for the correlations).
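The ranking steps can be illustrated with a small sketch. It is not the authors' implementation: it assumes that databases are ranked within each statistic by the absolute value of their residual (rank 1 = most consistent), and it uses the textbook Friedman statistic without tie correction. The function names and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values ascending (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mean_ranks(residuals: Dict[str, List[float]]) -> Dict[str, float]:
    """residuals[db][s] is the residual of database db for statistic s.

    Assumption: a smaller absolute residual means a better (lower) rank.
    """
    names = list(residuals)
    n_stats = len(residuals[names[0]])
    totals = {name: 0.0 for name in names}
    for s in range(n_stats):
        row = [abs(residuals[name][s]) for name in names]
        for name, rank in zip(names, average_ranks(row)):
            totals[name] += rank
    return {name: totals[name] / n_stats for name in names}


def friedman_statistic(ranks: Dict[str, float], n_stats: int) -> float:
    """Friedman chi-square statistic from mean ranks (no tie correction).

    Under H0 (all mean ranks equal) it is compared against a chi-square
    distribution with (number of databases - 1) degrees of freedom.
    """
    k = len(ranks)
    expected = (k + 1) / 2  # mean rank of every database under H0
    return 12 * n_stats / (k * (k + 1)) * sum((r - expected) ** 2 for r in ranks.values())
```

Calling `mean_ranks` on a dictionary that maps each database name to its list of residuals gives the R_i values. Passing those to `friedman_statistic` together with the number of statistics gives the value tested in step 3.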

comparison of citation networks
• comparison of different citation networks (results robust to selection of networks, statistics, patterns etc.)
[Figure: mean ranks of databases on a 1–6 axis at P-value = 0.1, panel A (P → P): WoS, DBLP, Cora, PubMed, arXiv, APS]
• comparison of different information networks

comparison of bibliographic networks
• A: paper citation networks (information networks)
• B: author citation networks (social-information networks)
• C: author collaboration networks (social networks)
[Figure: mean ranks of WoS, DBLP, Cora, PubMed, arXiv and APS on a 1–6 axis at P-value = 0.1, in three panels: A (P → P), B (A ↔ A), C (A − A)]
there is no “best” database!

models of citation networks
• generative models of citation networks (to reason about structure, evolution, dynamics, future etc.)
• many possible applications in bibliometrics
[Figure: example network with nodes i, a, x, y, z]

forest fire network model
• each new node i forms links as follows
  1. i selects an initial ambassador a and links to a
  2. i selects neighbors y, z of a and links to y, z
  3. y, z are taken as new ambassadors of i
[Figure: example network with nodes i, a, x, y, z, v, w]
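The three steps above can be sketched as a short simulation. This is a simplified illustration rather than the model used in the talk. It assumes that each not-yet-visited neighbor of an ambassador is selected independently with probability `p`, and that only outgoing links are followed. The function name `forest_fire_graph` and its parameters are hypothetical.

```python
import random
from collections import deque
from typing import Dict, Set


def forest_fire_graph(n: int, p: float, seed: int = 0) -> Dict[int, Set[int]]:
    """Grow a directed graph node by node; links[i] is the set of nodes i links to."""
    rng = random.Random(seed)
    links: Dict[int, Set[int]] = {0: set()}  # the first node has nothing to link to
    for i in range(1, n):
        ambassador = rng.randrange(i)        # step 1: pick an initial ambassador a
        visited = {ambassador}
        queue = deque([ambassador])
        while queue:
            a = queue.popleft()
            for y in sorted(links[a]):       # step 2: consider the neighbors of a
                if y not in visited and rng.random() < p:
                    visited.add(y)
                    queue.append(y)          # step 3: y becomes a new ambassador
        links[i] = visited                   # i links to every node it visited
    return links
```

With `p = 0` every new node links only to its initial ambassador. Larger values of `p` let the "fire" spread further and produce denser graphs.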

forest fire citation model
• each new paper i cites as follows
  1. i selects an initial paper a and cites a
  2. i selects references y, z of a and cites y, z
  3. y, z are taken as new reading for i
[Figure: example network with nodes i, a, x, y, z, v, w]
• this assumes that authors read all cited papers and vice versa
• only ≈20% of references are read (Simkin & Roychowdhury, 2003)

realistic citation model
• each new paper i cites as follows
  1. i selects an initial paper a and can cite a
  2. i selects references y, z of a and can cite y, z
  3. some references are taken as new reading for i
[Figure: example network with nodes i, a, x, y, z, v, w]
• read & cited papers are modeled independently
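One way to read "modeled independently" is sketched below. It is an illustration under stated assumptions, not the model from the talk: every paper that i considers is cited with probability `q_cite` and, independently, taken as new reading with probability `p_read`. The function name and both parameters are hypothetical.

```python
import random
from collections import deque
from typing import Dict, Set, Tuple


def citation_model(
    n: int, p_read: float, q_cite: float, seed: int = 0
) -> Tuple[Dict[int, Set[int]], Dict[int, Set[int]]]:
    """Return (cited, read): the papers each paper cites and the papers it reads."""
    rng = random.Random(seed)
    cited: Dict[int, Set[int]] = {0: set()}
    read: Dict[int, Set[int]] = {0: set()}
    for i in range(1, n):
        a = rng.randrange(i)                 # step 1: initial paper a is read...
        read_i = {a}
        cited_i = {a} if rng.random() < q_cite else set()  # ...and may be cited
        seen = {a}
        queue = deque([a])
        while queue:
            paper = queue.popleft()
            for ref in sorted(cited[paper]):  # step 2: references of a read paper
                if ref in seen:
                    continue                  # each reference is considered once
                seen.add(ref)
                if rng.random() < q_cite:     # citing decision
                    cited_i.add(ref)
                if rng.random() < p_read:     # reading decision, drawn independently
                    read_i.add(ref)
                    queue.append(ref)         # step 3: ref becomes new reading
        cited[i] = cited_i
        read[i] = read_i
    return cited, read
```

Setting `q_cite = 1` and `p_read = 1` makes the cited and read sets coincide, which corresponds to the assumption that authors read everything they cite. Lower values separate the two sets.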

directed citation model
• directed dynamics much more complicated
• model reproduces WoS citation networks
• clear optimum (peak) in model parameters

implications of citation model
one read paper ≈ five two cited papers!

clustering citation networks
• clustering papers based on direct citation relations (research areas or topics of papers)
• systematic comparison of a large number of methods (network clustering and partitioning)
there is no “best” method!

thank you!
network convexity: LCN2 seminar next Friday at 4pm in Snellius
