“Peer Review” vs “Metrics”: Competing Models?
Loet Leydesdorff
University of Amsterdam, Amsterdam School of Communication Research (ASCoR)
loet@leydesdorff.net
Models do not have to be compatible
• Peer review is another model (or procedure) for the purpose of distinguishing between excellent and non-excellent research;
• Models provide windows on the complexities under study.
The Quality of the Models
FIGURE 2: Distribution of publications; successful applicants (left) and unsuccessful (right) (left axis: number of publications; number of citations/10)
FIGURE 3: Superposition of the distribution of publications of successful and best unsuccessful applicants (left axis: number of publications; number of citations/100)
Possible explanations
• The rational arguments are exhausted in the final round → bias kicks in;
• Cognitive: disciplinary core; “play it safe”;
• Social: “old boys’ networks”; trust;
• Institutional: nearness / distance.
Conclusion: a lottery sometimes works better.
• Van den Besselaar, P., & Leydesdorff, L. (2009). Past performance, peer review, and project selection: A case study in the social and behavioral sciences. Research Evaluation, 18 (4), 273-288. • Bornmann, L., Leydesdorff, L., & Van den Besselaar, P. (2010). A Meta-evaluation of Scientific Research Proposals: Different Ways of Comparing Rejected to Awarded Applications. Journal of Informetrics, 4 (3), 211-220.
Quality of the Models: Specification of Error
• A model of a complex system opens a window for control;
• The principle of “Requisite Variety” (Ashby):
  – Without requisite variety, unintended consequences can be expected to prevail;
  – One cannot steer a complex system; the focus should be on estimating the error;
• Models without error estimations should not be trusted.
“The decline of British science” (r = -.349)
Source: Problems with the ‘measurement’ of national scientific performance, Science and Public Policy, 15 (1988), 149-152, at p. 150.
Leydesdorff, L. (1991). On the ‘Scientometric Decline’ of British Science. One additional graph in reply to Ben Martin. Scientometrics, 20(2), 363-368.
Leiden Rankings
Normalization is needed in citation analysis. However, the basis for the normalization changes each year:
• 2013: WoS Categories (approx. 220)
• 2014: 82 clusters of direct citation relations
• 2015: 3,822 “micro-fields”
• 2016: 4,113 “micro-fields”
• (2017: 4,003 “micro-fields”)
Spearman rho 2013 vs. 2016 = .938** (n = 424; top-10%).
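A minimal sketch (in Python, with illustrative numbers rather than the actual Leiden data) of how such a rank correlation between two editions can be computed; scipy.stats.spearmanr is assumed to be available:

```python
# Hedged sketch: hypothetical PP(top 10%) values for the same five universities
# in the 2013 and 2016 editions; the slide reports rho = .938 for n = 424.
from scipy.stats import spearmanr

pp10_2013 = [18.7, 12.4, 9.8, 15.2, 11.1]   # illustrative values only
pp10_2016 = [14.3, 11.9, 9.5, 14.8, 10.7]

rho, p_value = spearmanr(pp10_2013, pp10_2016)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
```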
“Carnegie Mellon University”
• 2013: 24th among 500 universities; PP10 = 18.7%
• 2016: 67th among 842 universities; PP10 = 14.3% (a drop of 4.4 percentage points)
NB: The extension of the sample affects the ranks, but not the PP10 (a “size-independent indicator”).
Figure: The participation of Carnegie Mellon University in the top-10% class of papers, using the Leiden Rankings for subsequent years as a time series, versus the reconstruction using the 2016 model; all journals included.
“Carnegie Mellon University”
• 2013: 24th among 500 universities; PP10 = 18.7%
• 2016: 67th among 842 universities; PP10 = 14.3% (a drop of 4.4 percentage points)
• Reconstructed for 2013 with the 2016 model: PP10 = 15.5%, i.e. 3.2 points less than 18.7%
• 3.2 of the 4.4 points (72.7%) are due to the model; the remaining drop from 15.5% to 14.3% (1.2 points, 27.3%) is due to the data.
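The decomposition on this slide can be reproduced as a short worked calculation (a sketch using the figures reported above; the variable names are mine):

```python
# Decomposing Carnegie Mellon's drop in PP10 between the 2013 and 2016
# Leiden Rankings into a model effect and a data effect (figures from the slide).
pp10_2013 = 18.7                 # 2013 edition, 2013 field classification
pp10_2016 = 14.3                 # 2016 edition, 2016 field classification
pp10_2013_reconstructed = 15.5   # 2013 data re-evaluated with the 2016 model

total_drop   = pp10_2013 - pp10_2016                  # 4.4 percentage points
model_effect = pp10_2013 - pp10_2013_reconstructed    # 3.2 points
data_effect  = pp10_2013_reconstructed - pp10_2016    # 1.2 points

print(f"due to the model: {model_effect / total_drop:.1%}")  # approx. 72.7%
print(f"due to the data:  {data_effect  / total_drop:.1%}")  # approx. 27.3%
```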
Leydesdorff, L., Wouters, P., & Bornmann, L. (2016, in press). Professional and citizen bibliometrics: complementarities and ambivalences in the development and use of indicators — a state-of-the-art report. Scientometrics 109 (3), 2129-2150. doi: 10.1007/s11192-016-2150-8
Conclusions
• In the case of quantitative models, we can detect the sources of error;
  – quality control of the evaluation;
• The sum of subjective preferences is not an intersubjective agreement, but a poorly understood process;
  – quality control is subjective?
• “Mixed models” cannot be quality-controlled.
• Normative use of analytical models requires a reflexive translation.
Policy implications
• Error is non-trivial in evaluation models;
• Both qualitative and quantitative models generate and institutionalize errors;
• The data are a minor source of error;
• Policies can be based on erroneous inferences;
• Policy makers “love” descriptive statistics;
• Turn to “expectations” vs. “observations”.
The use of “lotteries”
• Do not make selections among the best applicants, but use a lottery instead;
• No bias!
• Applications can be given weights, for example higher weights for less privileged applicants (see the sketch below);
• Losing a lottery has no social consequences;
• Less bureaucracy; transparency; low costs.
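As an illustration of the weighted-lottery idea, a minimal Python sketch follows; the applicant names, weights, and the helper weighted_lottery are hypothetical and only show one way such a draw could be organized:

```python
import random

# Hypothetical shortlist of fundable applications with weights (illustrative
# assumptions, not data from the talk). A higher weight could, for example,
# be assigned to a less privileged applicant.
applications = [
    ("A", 1.0),
    ("B", 1.0),
    ("C", 1.5),  # weighted up, e.g. for a less privileged applicant
    ("D", 1.0),
    ("E", 1.5),
]
n_grants = 2

def weighted_lottery(apps, k, rng=random):
    """Draw k distinct applications, each time with probability proportional to its weight."""
    pool = list(apps)
    winners = []
    for _ in range(k):
        weights = [w for _, w in pool]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        winners.append(pool.pop(pick)[0])
    return winners

print(weighted_lottery(applications, n_grants))
```

The weights make the relative chances of the shortlisted applications explicit, which keeps the procedure transparent and cheap to administer, in line with the points above.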