Some Criteria for Building Reliable Bibliometric Indicators for Measuring Research Performance
On data sources, data quality and caveats in designing and applying indicators
Wolfgang Glänzel
Centre for R&D Monitoring and Dept MSI, KU Leuven, Belgium
Structure of presentation
1. Prologue
2. From Information to Evaluation
3. Appropriateness of Data Sources
4. Quality and Cleanness of Data
4.1 Basic problems of author identification
4.2 Institutional assignment
5. Soundness of Methodology
6. Subject Granularity
7. Aggregation Level
8. Epilogue
Glänzel, Cinvestav, Mexico-City, 2019
Chapter 1 From Information to Evaluation – The Big Turn
Prologue
The way from bibliographic data to appropriate bibliometric indicators for measuring research performance is a long and stony road.
• In what follows I will attempt to sketch the basic criteria for a correct use of bibliographic databases for indicator building and their sensible use in research assessment.
◦ As a rule of experience one could assume that bibliometricians spend about 80% of their time on data processing, cleaning and indicator testing.
From Information to Evaluation
Eugene Garfield (1925–2017)
He was the founder and chairman of the Institute for Scientific Information (now part of Clarivate Analytics). In the early 1960s he developed the Science Citation Index, the world's first large multi-disciplinary citation database.
Although the SCI was developed for advanced information retrieval and for services in scientific information, it has become the common source for scientometric studies.
"The SCI was not originally created either to conduct quantitative studies, calculate impact factors, nor to facilitate the study of history of science".
Garfield, From information retrieval to scientometrics – is the dog still wagging its tail? 2009
From Information to Evaluation
In the 1970s and 1980s, scientometrics/bibliometrics rose sharply and found a new orientation.
• Besides information science and sociology of science, science policy became the third driving force in the evolution of scientometrics.
The evolution from "little scientometrics" to "big scientometrics" (Glänzel & Schoepflin, 1994) is characterised by two cardinal signs (Glänzel & Wouters, 2013).
From Information to Evaluation
1. Scientometrics evolved from a sub-discipline of library and information science to an instrument for evaluation and benchmarking (Glänzel, 2006; Wouters, 2013).
◦ As a consequence, several scientometric tools came to be used in a context for which they were not designed.
2. Due to the dynamics in evaluation, the focus has shifted away from macro studies towards meso and micro studies of both actors and topics.
◦ More recently, the evaluation of research teams and individual scientists has become a central issue in services based on bibliometric data.
From Information to Evaluation
☛ While in information services a certain incompleteness (false negatives), even some errors (false positives) might be tolerable as the results might still be useful, in benchmarking and evaluative contexts such errors can have fatal consequences.
◦ Moed (2010) raised the question of errors, namely, of what is an acceptable "error rate" in the assessment process.
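The notions of false negatives, false positives and an "error rate" can be made concrete with a minimal sketch. It assumes that a manually verified publication list exists for the unit under assessment; the function name and the publication identifiers are hypothetical and only serve as illustration.

```python
def retrieval_errors(verified_ids, retrieved_ids):
    """Compare a retrieved publication set with a verified one.

    verified_ids:  identifiers known to belong to the unit (assumed ground truth)
    retrieved_ids: identifiers returned by the database search
    """
    verified, retrieved = set(verified_ids), set(retrieved_ids)
    hits = verified & retrieved
    false_negatives = verified - retrieved   # missed publications (incompleteness)
    false_positives = retrieved - verified   # wrongly assigned publications
    recall = len(hits) / len(verified) if verified else 1.0
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    return {
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "recall": recall,
        "precision": precision,
    }


# Hypothetical example: five verified papers, four retrieved, one of them wrong.
result = retrieval_errors({"p1", "p2", "p3", "p4", "p5"}, {"p1", "p2", "p3", "p9"})
print(result["recall"], result["precision"])
```

Which levels of recall and precision are acceptable depends on the context: a value that is harmless in an information service may distort the assessment of a single researcher.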
Chapter 2 Appropriateness of Data Sources
Appropriateness of Data Sources
Basic demands on bibliographic databases for possible bibliometric use include the following.
• Indicators can only be as good as the underlying data source allows.
• The database must provide a basis for global comparisons and benchmarking exercises.
Appropriateness of Data Sources
The appropriateness of databases for bibliometric use includes basic issues such as
• subject coverage (e.g., applied sciences, SSH)
• source coverage (full vs. selective coverage)
• availability of necessary information (co-authors, abstracts, references, author–affiliation link, etc.)
• completeness of information (e.g., all authors, all addresses)
Further criteria concern, among others,
• publication type (journal, book, proceedings …)
• document type (research article, review, letter …)
• selection criteria (journals, books, proceedings)
• possible redundancy of information (translations and different publication types of similar documents)
Chapter 3 Quality and Cleanness of Data
Quality and Cleanness of Data
The quality of data is mainly affected by the following groups:
• the authors of the publications indexed in the database
• the editors of the journals covered by the database
• the database producer
• the user of the database
Strongly affected fields and items are
• author names
• citations
• addresses and institutions
• document identifiers
• funding information
Quality and Cleanness of Data
Authors themselves are responsible for many errors, including errors in names, references, addresses and titles. Note that wrong titles of cited work may result in incorrect KeyWords Plus in the WoS Core Collection.
Data extracted from bibliographic databases require careful cleaning and processing before possible bibliometric use. This applies to
• Author identification
• Address cleaning and standardisation
• Institute identification and assignment
• Document identification
Quality and Cleanness of Data
Typical errors caused by authors are
• various types of misspelling
• incomplete information
• erroneous data
Data therefore need careful cleaning, but many errors cannot practically be corrected on a large scale.
☞ At lower aggregation levels, partial correction is possible, but requires supplementary information from external sources.
☞ On a large scale, semi-automated processes can be applied, with remaining uncertainty.
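A semi-automated cleaning step can be sketched as follows. This is only an illustration under simple assumptions, not the procedure used by any database producer: name strings are reduced to a normalised key (diacritics removed, punctuation dropped, upper case), and strings sharing a key are grouped. The helper names `name_key` and `group_variants` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def name_key(raw_name):
    """Reduce an author name string to a crude matching key."""
    # Decompose accented characters and drop the combining marks (ä -> a).
    decomposed = unicodedata.normalize("NFKD", raw_name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace everything except letters and spaces, then collapse whitespace.
    letters_only = re.sub(r"[^A-Za-z ]+", " ", no_marks)
    return " ".join(letters_only.upper().split())


def group_variants(names):
    """Group raw name strings that share the same normalised key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


groups = group_variants(["SCHUBERT, A", "SCHUBERT A", "Schubert, A.", "SCUBERT A"])
for key, variants in groups.items():
    print(key, variants)
```

Punctuation and case variants collapse into one group, whereas the misspelling "SCUBERT A" stays separate. Resolving such cases needs fuzzy matching or manual checks, which is where the remaining uncertainty comes from.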
Quality and Cleanness of Data
Example of an incorrectly cited document

1st Author   Journal         PY    Vol  1st Page  Cites  Share
SCHUBERT, A  SCIENTOMETRICS  1989  16   3         196    85.2%
SCHUBERT A   SCIENTOMETRICS  1989  16   1         17     7.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   8         1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   18        1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   218       1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   239       1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   432       1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   463       2      0.9%
SCHUBERT A   SCIENTOMETRICS  1989  16   478       1      0.4%
SCHUBERT A   SCIENTOMETRICS  1989  16   –         2      0.9%
SCHUBERT A   SCIENTOMETRICS  1988  16   3         3      1.3%
SCHUBERT A   SCIENTOMETRICS  1987  16   3         1      0.4%
SCUBERT A    SCIENTOMETRICS  1989  16   3         1      0.4%
BRAUN T      SCIENTOMETRICS  1989  16   3         2      0.9%

Source: WoS Core Collection (Retrieved: May 2017)
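The shares in this example follow directly from the citation counts. The short sketch below recomputes them. The counts are taken from the table, while the tuple layout, the `CORRECT` key and the function name are assumptions made for illustration, and `None` stands for a missing first page.

```python
# (first author, publication year, volume, first page, cites)
VARIANTS = [
    ("SCHUBERT A", 1989, 16, 3, 196),
    ("SCHUBERT A", 1989, 16, 1, 17),
    ("SCHUBERT A", 1989, 16, 8, 1),
    ("SCHUBERT A", 1989, 16, 18, 1),
    ("SCHUBERT A", 1989, 16, 218, 1),
    ("SCHUBERT A", 1989, 16, 239, 1),
    ("SCHUBERT A", 1989, 16, 432, 1),
    ("SCHUBERT A", 1989, 16, 463, 2),
    ("SCHUBERT A", 1989, 16, 478, 1),
    ("SCHUBERT A", 1989, 16, None, 2),
    ("SCHUBERT A", 1988, 16, 3, 3),
    ("SCHUBERT A", 1987, 16, 3, 1),
    ("SCUBERT A", 1989, 16, 3, 1),
    ("BRAUN T", 1989, 16, 3, 2),
]

# Assumed correct form of the cited reference.
CORRECT = ("SCHUBERT A", 1989, 16, 3)


def citation_shares(variants, correct):
    """Return total cites, cites in the correct form, and the correct share."""
    total = sum(row[4] for row in variants)
    matched = sum(row[4] for row in variants if row[:4] == correct)
    return total, matched, matched / total


total, matched, share = citation_shares(VARIANTS, CORRECT)
print(f"total={total}, correct={matched}, share={share:.1%}")
# total=230, correct=196, share=85.2%
```

Under exact matching, the remaining 34 citations (about 14.8%) would not be credited to the paper unless the variants are unified.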
Quality and Cleanness of Data
Edited vs. authored books – another source of uncertainty and errors

1st Editor/Author                        Book title            PY    1st Page  Cites
MOED HE                                  HDB QUANTITATIVE TEC  2004  –         1
MOED HF                                  HDB QUANTATIVE SCI T  2004  –         1
Moed, HF...Glanzel, W...Schmoch, U       HDB QUANTITATIVE SCI  2005  19        1
MOED HF                                  HDB QUANTITATIVE SCI  2005  CH11      1
Moed, H. F....Glanzel, W....Schmoch, U.  HDB QUANTITATIVE SCI  2005  –         15
Moed, H.F....Glanzel, W....Schmoch, U.   HDB QUANTITATIVE SCI  2004  1         1
Moed, Henk F.                            HDB QUANTITATIVE SCI  2004  389       1
Moed, H. F....Glanzel, W....Schmoch, U.  HDB QUANTITATIVE SCI  2004  785       1
Moed, H. F                               HDB QUANTITATIVE SCI  2004  –         82
Moed, W. Glanzel                         HDB QUANTITATIVE SCI  2004  51        1

Source: WoS Core Collection (Retrieved: May 2017)
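Abbreviated and misspelled book titles like those above can be flagged as candidate variants by approximate string matching. The following minimal sketch uses `difflib.SequenceMatcher` from the Python standard library. The greedy clustering, the function names and the threshold of 0.85 are assumptions chosen for illustration, and any resulting cluster would still need manual verification.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1] between two title strings."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_titles(titles, threshold=0.85):
    """Greedily attach each title to the first cluster whose
    representative (its first member) is similar enough."""
    clusters = []
    for title in titles:
        for cluster in clusters:
            if similarity(title, cluster[0]) >= threshold:
                cluster.append(title)
                break
        else:
            # No sufficiently similar cluster found: start a new one.
            clusters.append([title])
    return clusters


titles = [
    "HDB QUANTITATIVE SCI",
    "HDB QUANTATIVE SCI T",
    "HDB QUANTITATIVE TEC",
    "SCIENTOMETRICS",
]
for cluster in cluster_titles(titles):
    print(cluster)
```

String similarity alone cannot decide whether a citation refers to the edited book as a whole or to a single chapter. Publication year, first page and editor or author names still have to be taken into account.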