AUTOMATED VALIDATION OF CLINICAL INCIDENT TYPES
J. GUPTA, I. KORINSKA, J. PATRICK
SCHOOL OF INFORMATION TECHNOLOGY, SYDNEY UNIVERSITY
HEALTH INFORMATICS CONFERENCE, BRISBANE, 5 AUG 2015
MOTIVATIONS
• Improve patient safety and the quality of healthcare services
• Over a million clinical incidents in one state, of which under 2% are scrutinised
• Current performance extraction processes:
  • are labour intensive
  • rely on inefficient software
  • use classification models that have not been statistically tested
• Urgent need for innovation in classifying IIMS datasets and for an improved software architecture
OBJECTIVES
• Determine the best-performing statistical text classifiers (STC)
• For multiclass (13 and 12) clinical incident types, going beyond 2 classes
• Demonstrate methods to improve the specificity and sensitivity of the classification models, e.g.
  • impact of balanced vs unbalanced design
  • size (number of reports)
  • classifier's effect (Clinician vs Expert)
OVERVIEW - SOFTWARE ARCHITECTURE & DATA WAREHOUSE
Incident Information Management System (IIMS)
Clinical incident types:
   1 AA      2 AV      3 BHP     4 BBP     5 CM
   6 DOC     7 FALL    8 HAI     9 MED    10 NUT
  11 PATH   12 PC     13 PU
Class sets: 12 Classes, 13 Classes
Data pool: 7 hospital datasets
Period: 2004 - 2008
RESEARCH DESIGN
Experiment  Clinical incident types  N. fields     Classifier's effect  Size     Balance effect
1a          13*                      14 fields§    Clinician            5448**   Unbalanced
1b          12^                      10 fields§§   Clinician            5148^^   Unbalanced
2a          12^                      10 fields§§   Clinician            1200~    Balanced
2b          12^                      10 fields§§   Expert               1200~    Balanced

Algorithms/Statistical Classifiers Used: Naïve Bayes (NB), Naïve Bayes Multinomial (NBM), J48, and Support Vector Machine using radial basis function (SVM_RBF)

*  AA, AV, BHP, BBP, CM, DOC, FALL, HAI, MED, NUT, PATH, PU, PC
^  AA, AV, BHP, BBP, CM, DOC, FALL, HAI, MED, NUT, PATH, PU
** 500, 500, 500, 500, 500, 500, 500, 361, 500, 250, 306, 500, 30 = 5448
^^ 500, 500, 500, 500, 500, 500, 500, 361, 500, 250, 306, 500 = 5148
~  100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100 = 1200
§  10 categorical and 4 free text = 14 fields
§§ 6 categorical and 4 free text = 10 fields
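The balanced design in Experiment 2 (100 reports per incident type) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the slides do not describe. The function name `balanced_sample`, the fixed random seed, and the report texts are assumptions made for illustration; only the class sizes 500, 361 and 250 are taken from the counts listed for FALL, HAI and NUT.

```python
import random
from collections import defaultdict


def balanced_sample(reports, per_class=100, seed=0):
    """Draw the same number of reports from every incident type.

    `reports` is a list of (incident_type, text) pairs. A class with fewer
    than `per_class` reports raises an error instead of being padded.
    """
    by_type = defaultdict(list)
    for incident_type, text in reports:
        by_type[incident_type].append(text)

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = []
    for incident_type in sorted(by_type):
        texts = by_type[incident_type]
        if len(texts) < per_class:
            raise ValueError(f"{incident_type}: only {len(texts)} reports")
        # Sample without replacement within each class.
        for text in rng.sample(texts, per_class):
            sample.append((incident_type, text))
    return sample


# Made-up pool: three incident types with unequal counts (hypothetical texts).
pool = (
    [("FALL", f"fall report {i}") for i in range(500)]
    + [("HAI", f"infection report {i}") for i in range(361)]
    + [("NUT", f"nutrition report {i}") for i in range(250)]
)
balanced = balanced_sample(pool, per_class=100)
print(len(balanced))
```

Sampling without replacement keeps every selected report unique, which is one plausible reading of a "balanced" subset; oversampling the smaller classes would be a different design.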
TOOLS AND MEASURES
WEKA classifiers:
• Naïve Bayes (NB) and Naïve Bayes Multinomial (NBM)
• Decision Trees (J48)
• Support Vector Machine with radial basis kernel function (SVM_RBF)
Standard accuracy measures calculated:
• Percentage correctly classified, Recall
• Precision, F-measure, Kappa statistic
• Area under the curve (AUC) of the receiver operating characteristic (ROC)
Confusion matrix analysis
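Most of the measures listed above can be derived from a confusion matrix. The following Python sketch shows the standard definitions of percentage correctly classified, per-class recall, precision and F-measure, and Cohen's kappa. It is not WEKA's implementation; the function name `metrics_from_confusion` and the 3-class matrix are invented for illustration and are not results from the study.

```python
def metrics_from_confusion(matrix):
    """matrix[i][j] = number of reports of true class i predicted as class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]                      # true counts
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # predicted counts
    correct = sum(matrix[i][i] for i in range(n))

    accuracy = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    per_class = []
    for i in range(n):
        tp = matrix[i][i]
        recall = tp / row_totals[i] if row_totals[i] else 0.0
        precision = tp / col_totals[i] if col_totals[i] else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        per_class.append(
            {"recall": recall, "precision": precision, "f_measure": f_measure}
        )
    return accuracy, kappa, per_class


# Invented 3-class confusion matrix (rows: true class, columns: predicted).
toy_matrix = [
    [80, 15, 5],
    [10, 85, 5],
    [20, 10, 70],
]
accuracy, kappa, per_class = metrics_from_confusion(toy_matrix)
print(f"accuracy={accuracy:.4f} kappa={kappa:.4f}")
for i, m in enumerate(per_class):
    print(i, {k: round(v, 4) for k, v in m.items()})
```

Reading the off-diagonal cells row by row shows which incident types are confused with each other, which is the kind of confusion matrix analysis the slide refers to. AUC is omitted because it needs per-report scores, not just the matrix.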
[Figures: recall rate for DT and SVM, Experiment 2a and Experiment 1a]
[Figures: recall rate for NB and NBM, Experiment 2a and Experiment 1a]
EXP1: A & B
Algorithms        DT              NB              NBM             SVM_RBF
CIT               13      12      13      12      13      12      13      12
Accuracy [%]      73.66   75.54   69.71   71.86   78.29   80.44   79.06   68.89
Kappa statistic   0.71    0.73    0.67    0.69    0.76    0.79    0.77    0.66
Precision         0.74    0.74    0.71    0.71    0.79    0.72    0.79    0.79
AUC               0.89    0.89    0.90    0.90    0.96    0.91    0.89    0.89

EXP2: A & B
Algorithms        DT                    NB                    NBM                   SVM_RBF
Classifier        Expert     Clinician  Expert     Clinician  Expert     Clinician  Expert     Clinician
Accuracy [%]      70.17      65.91      70.08      69.60      81.32      79.58      54.92      41.12
Kappa statistic   0.68       0.63       0.67       0.67       0.80       0.78       0.51       0.33
Precision         0.70       0.66       0.71       0.71       0.81       0.80       0.69       0.63
AUC               0.89       0.85       0.89       0.91       0.97       0.96       0.41       0.66
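The classifier's effect in Experiment 2 can be read off the accuracy figures reported above by subtracting the Clinician value from the Expert value for each algorithm. The short Python sketch below does only that arithmetic on the reported percentages; the variable names are illustrative.

```python
# Accuracy [%] from Experiment 2 (balanced design, 1200 reports), as reported.
exp2_accuracy = {
    "DT": {"expert": 70.17, "clinician": 65.91},
    "NB": {"expert": 70.08, "clinician": 69.60},
    "NBM": {"expert": 81.32, "clinician": 79.58},
    "SVM_RBF": {"expert": 54.92, "clinician": 41.12},
}

# Percentage-point difference: Expert-labelled minus Clinician-labelled reports.
gain = {
    alg: round(acc["expert"] - acc["clinician"], 2)
    for alg, acc in exp2_accuracy.items()
}

# Algorithm with the highest accuracy on Expert-labelled reports.
best = max(exp2_accuracy, key=lambda alg: exp2_accuracy[alg]["expert"])

for alg, diff in gain.items():
    print(f"{alg}: {diff:+.2f} percentage points")
print("highest Expert accuracy:", best)
```

These are differences between single reported accuracy values, so they indicate direction and size only; the slides give no variance estimates from which significance could be judged.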
RESULTS: EXPERIMENT 1 & 2 - PRECISION SCORE
KEY FINDINGS & WHAT NEXT
• Using classifiers for multiclass datasets such as Clinical Incident Types in IIMS is achievable
• The confusion matrix is useful in improving the classifiers' performance
• The standard measures of performance used in this study are adequate to determine changes
• The NBM classifier works well with Clinical Incident Types
• Large datasets can be processed with high accuracy and minimal human resources
• A balanced design and a reduced sample size improve efficiency, but the models respond differently
• Explore application of automation and software architecture improvement
• Explore improving the performance of the classifier further
• Explore application to real-time data to drive change in the quality of healthcare services