A Systematic Literature Review on Evaluation of Digital Tools for Authoring Evidence-based Clinical Guidelines

Soudabeh KHODAMBASHI and Øystein NYTRØ
Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway

Abstract. To facilitate the clinical guideline (GL) development process, different groups of researchers have proposed tools that enable computer-supported authoring and publishing of GLs. In a previous study we interviewed GL authors in different Norwegian institutions and identified tool shortcomings. In this follow-up study our goal is to explore to what extent GL authoring tools have been evaluated by researchers, guideline organisations, or GL authors. This article presents results from a systematic literature review of evaluations (including usability) of GL authoring tools. A controlled database search and backward snowballing were used to identify relevant articles. Of the 12692 abstracts found, 188 papers were reviewed in full and 26 papers were identified as relevant. The GRADEPro tool has attracted some evaluation; however, popular tools and platforms such as DECIDE, Doctor Evidence, JBI-SUMARI and the G-I-N library have not been subject to specific evaluation from an authoring perspective. Overall, we found that little attention has been paid to evaluation of the tools. We could not find any evaluation of how tools integrate with and support the complex GL development workflow. The results of this paper are highly relevant to GL authors, tool developers and GL publishing organisations seeking to improve and control the GL development and maintenance process.

Keywords. Evidence-based medicine, clinical guidelines, evaluation

Introduction

Clinical guidelines (GLs) are "systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances" [1]. To facilitate the GL development process, different groups of researchers have proposed and developed tools that enable computer-supported (digital) authoring and publishing [2, 3]. These tools partially or fully support the GL development process. Different organisations seem to employ varying strategies and methods in the GL authoring workflow. The reported software tools also vary in their functionality and features. Hence, there is no "standard" tool (set) for GL development. Some of the tools focus on the GL development and maintenance process, but also support publishing, presentation and dissemination [4]. In our previous case study, we identified substantial shortcomings of GL tools (including content management systems (CMS)) in a total of four organisations in Norway [5]. The study was based on interviews and observations of authors maintaining digital GLs. As part of that empirical study, we concluded that a review of
tool evaluation was necessary. Hence, this paper systematically reviews the literature on evaluation of the GL development tools identified and discussed in our previous studies (see Section 2) [2, 3]. We note that our literature search did not include software tools for developing computer-interpretable/executable GLs.

1. Digital Tools for Authoring Clinical Guidelines

In our previous review, based on a systematic literature search using PubMed and Google Scholar [3], contact with Norwegian GL authoring organisations [2, 5], and overviews made by the Guidelines International Network (G-I-N) [6], we identified a total of 21 unique tools and platforms/repositories supporting GL authoring. We categorised the identified tools according to the parts of the GL authoring workflow they covered [3]. Figure 1 presents the identified tools and their intended process coverage. Håndboka ('The Handbook') is one representative (Norwegian) CMS designed specifically for GL authoring.

Figure 1. Tools supporting the guideline authoring process.

2. Material and Methods

To find published evaluations of these tools, we searched PubMed and Google Scholar, as shown in Figure 2. Tool names were used as the search criteria. The most recent search was conducted in March 2017. Titles and abstracts of all hits were retrieved and then screened. Where the title and abstract of a paper were unclear or ambiguous, we screened the full text. After screening for relevance and removing duplicates, 182 papers were retained for further review. To find further relevant literature, we used backward snowballing, i.e. checking the papers cited by the 182 retained papers. This added 6 extra papers for full review. After screening the resulting 188 papers, only 26 papers were identified as relevant for inclusion in this study.

Figure 2. Selection process of articles.
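As an informal illustration of the selection workflow in Figure 2, the following Python sketch outlines the deduplication, title/abstract screening, and backward-snowballing steps. The record fields, the screening predicate, and the fetch_references helper are hypothetical simplifications introduced here for exposition; the actual screening in the study was performed manually by the reviewers.

    # Hypothetical sketch of the article selection workflow (Figure 2).
    # Record fields, the screening predicate and fetch_references are
    # illustrative assumptions; the real screening was a manual process.

    def screen(record):
        # Title/abstract screening: keep records that look like tool evaluations.
        text = (record["title"] + " " + record["abstract"]).lower()
        return "evaluation" in text or "usability" in text

    def backward_snowball(included, known_ids, fetch_references):
        # One round of backward snowballing: follow citations of included papers.
        extra = []
        for record in included:
            for cited in fetch_references(record):
                if cited["id"] not in known_ids:
                    known_ids.add(cited["id"])
                    extra.append(cited)
        return extra

    def select_articles(search_hits, fetch_references):
        unique = list({r["id"]: r for r in search_hits}.values())  # remove duplicates
        retained = [r for r in unique if screen(r)]                # 182 papers retained
        known_ids = {r["id"] for r in retained}
        retained += backward_snowball(retained, known_ids, fetch_references)  # +6 papers
        return retained  # 188 papers for full review; 26 finally included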
3. Results

Table 1 presents a condensed review of the evaluations reported for each tool, showing the focus of each evaluation. Some of the tools mentioned in Figure 1 (the DECIDE tool, the Doctor Evidence platforms, JBI-SUMARI and the G-I-N guideline library) were not found to be the subject of any evaluation, and are therefore not present in the table.

Table 1. Details of evaluation of each tool (based on Figure 1).

MAGICapp: How it supports the World Health Organisation (WHO) GL authoring process (functions and features) [2]; feedback from users in an organisation regarding how it supports them in the process of authoring, covering the supported functionality: collaboration, communication, version control, archiving system, reference manager, support of standard terminologies (ICPC, ICD, SNOMED-CT, ATC, RxNorm, MeSH), management of users' feedback, guideline templates, drawing of workflow-based GLs, and export file formats [5]; feedback based on restructuring of available recommendations into the multilayered format supported by MAGICapp [7].

GRADEPro: Support of the WHO GL development process [2]; software tested with users in workshops and in authoring processes [8]; feedback from GL methodologists about the Evidence to Decision (EtD) framework [9]; assessment of the GRADE methodology and the tool for systematic reviews (SR) in terms of inter-rater agreement and identification of areas of uncertainty [10]; formative evaluation during the design and update of the EtD framework, followed by summative user testing [11]; assessment of the effects of formatting alternatives in GRADE evidence profiles on GL panelists' preferences, comprehension, and accessibility [12]; evaluation of how authors explain the reasons for downgrading and upgrading the strength of recommendations according to the GRADE methodology, and of where guidance is needed to create concise, clear, accurate, and relevant explanatory footnotes and comments [13]; testing of what information to include in the evidence summary tables and how to present it [14].

Internet Portal for guideline development: Feedback on using the tool and its effect on reduction of the time and cost of development [15].

Håndboka: Support of the WHO GL development process [2].