Early detection of bowel cancer using primary care electronic health records Jacqueline Birks Centre for Statistics in Medicine University of Oxford Sensors in Medicine 2017 London 5 October 2017
Colorectal Cancer • 4th most common cancer in UK • 41,265 new cases in 2014 • 72% of cases occurring in those older than 65 years • 5-year survival dependent on stage at diagnosis
Diagnosis of colorectal cancer Dukes’ staging system • A - early stage • Good prognosis after surgery (5 year survival 95%) • B & C - intermediate stages • D - advanced cancer • which has spread to other parts of the body, usually liver and lungs • Poor prognosis (5 year survival 8%)
Diagnosis of colorectal cancer • Presentation with symptoms • Losing weight, change in bowel habit, bleeding from rectum, pain in abdomen • Screening test • Colonoscopy • Sigmoidoscopy • Faecal occult blood test (FOBT) • Repeated test • colonoscopy
Routine screening tests for colorectal cancer in England • Faecal occult blood test • Offered at 60-75 years every 2 years • Uptake of invitations is less than 60% • Colonoscopy • In the process of being made available to everyone aged 55 years
New approaches to detection of colorectal cancer • Development of algorithms to predict risk from routinely collected primary care data • using symptoms reported by patients • using complete blood counts (CBC) collected for any reason which may reveal signs of anaemia and iron deficiency (haemoglobin, red blood cell distribution width, mean cell volume)
ColonFlag • Machine learning algorithm to identify patients with colorectal cancer • Developed by Medial EarlySign, Israel • Data set of 2 million people in Israel • Comprehensive data • Regular CBCs with 20 indices • Demographic data • Linked to cancer registry • Reference: Kinar et al. (2016) Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts. Journal of the American Medical Informatics Association 23:879-890
ColonFlag - methodology • Included all patients aged 40 years or more, 2003-2011 • Extracted for each patient • All CBCs with dates of tests • Year of birth • Sex • Date of any diagnosis of colorectal cancer
ColonFlag - derivation • Machine learning process derived a prediction algorithm using 80% of dataset • for each CBC a risk index (ColonFlag) on a scale 0-100 was calculated using • all indices of CBC available (missing indices imputed) • age at CBC • sex • trends in CBC indices for the patient prior to current CBC • date of diagnosis if patient identified as a case
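The 80%/20% division of the dataset can be illustrated with a short sketch. It assumes the split is made at patient level, so that all CBCs of one person fall on the same side; the slide does not state this, and `split_patients`, its parameters and the seed are hypothetical names, not taken from Kinar et al.

```python
import random

def split_patients(patient_ids, derivation_fraction=0.8, seed=0):
    """Randomly split patient IDs into derivation and validation sets.

    Assumption: the split is by patient, not by individual CBC, so one
    person's tests never appear in both sets.
    """
    ids = sorted(set(patient_ids))   # de-duplicate, fixed starting order
    rng = random.Random(seed)        # seeded so the split is reproducible
    rng.shuffle(ids)
    cut = int(round(len(ids) * derivation_fraction))
    return set(ids[:cut]), set(ids[cut:])

# Toy example with 1000 made-up patient IDs
derivation, validation = split_patients(range(1000))
print(len(derivation), len(validation))
```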
ColonFlag – Validation Assessing performance 3-6 months before diagnosis • Validation carried out on patients aged 50-75 years in the remaining 20% of the dataset • Risk index calculated for each CBC • Included • cases: the most recent risk score in the 3-6 month window before diagnosis, if available • controls: a randomly selected risk score with at least 3 months of follow-up
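The rule for choosing a case's risk score can be sketched as follows. This is an illustration only: the function name and data layout are hypothetical, the 3-6 month window is approximated as 90-180 days, and the dates and scores are invented.

```python
from datetime import date

def select_case_score(cbcs, diagnosis_date, min_days=90, max_days=180):
    """Return the most recent (test_date, score) in the window, or None.

    cbcs: list of (test_date, score) tuples for one patient.
    Assumption: 3-6 months before diagnosis is taken as 90-180 days.
    """
    in_window = [
        (d, s) for d, s in cbcs
        if min_days <= (diagnosis_date - d).days <= max_days
    ]
    # The latest test date inside the window wins
    return max(in_window, key=lambda pair: pair[0]) if in_window else None

cbcs = [
    (date(2010, 1, 10), 35.0),   # too long before diagnosis
    (date(2010, 4, 1), 62.0),    # inside the window
    (date(2010, 5, 20), 71.0),   # inside the window and most recent
    (date(2010, 8, 1), 90.0),    # too close to diagnosis
]
print(select_case_score(cbcs, date(2010, 8, 31)))
```

A patient with no CBC in the window yields `None` and would not contribute a score for that window.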
ColonFlag – Validation Assessing performance 3-6 months before diagnosis Results (derivation set / validation set) • Total number of patients: 606,403 / 173,251 • Total number with a CBC: 466,107 / 139,205 • Mean age: 58.7 / 58.6 • % females: 53.6 / 53.1 • Number of cases: 2,437 / 698
ColonFlag – Validation Assessing performance 3-6 months before diagnosis Overall performance of the ColonFlag • Area under the ROC curve = 82 ± 1% • At sensitivity = 50%, specificity = 88 ± 2%
Independent validation using UK routine primary care patient data • Population of patients from Clinical Practice Research Datalink (CPRD) registered between 2000-2015 • Primary endpoint first diagnosis of colorectal cancer in the primary care record • Inclusion criteria • at least one CBC • age at least 40 years by latest date of inclusion • at least 2 years of follow up after registering with a practice • no prior history of colorectal cancer
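The inclusion criteria could be applied as a simple filter. The sketch below is an assumption-laden illustration: the `Patient` fields are hypothetical and are not CPRD variable names, age is approximated from year of birth, 2 years is taken as 730 days, and 2015 is assumed as the latest year of inclusion.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Patient:
    # All field names are hypothetical, not CPRD variable names.
    birth_year: int
    registration_date: date
    end_of_follow_up: date
    n_cbcs: int
    prior_crc: bool

def is_eligible(p: Patient, latest_inclusion_year: int = 2015) -> bool:
    """Apply the four inclusion criteria listed on the slide."""
    has_cbc = p.n_cbcs >= 1
    # Assumption: age judged from year of birth only
    old_enough = latest_inclusion_year - p.birth_year >= 40
    # Assumption: 2 years of follow-up taken as 730 days
    followed_up = (p.end_of_follow_up - p.registration_date).days >= 2 * 365
    return has_cbc and old_enough and followed_up and not p.prior_crc

eligible = Patient(1950, date(2001, 1, 1), date(2010, 1, 1), 3, False)
no_cbc = Patient(1950, date(2001, 1, 1), date(2010, 1, 1), 0, False)
print(is_eligible(eligible), is_eligible(no_cbc))
```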
Independent validation using UK routine primary care patient data • Data required for the calculation of the ColonFlag risk score for each CBC • Patient identity • Patient date of birth • Sex • Date of CBC • Up to 20 indices of the CBC (missing indices are imputed) • A member of the Medial team produced the dataset, with a ColonFlag score for each valid CBC
Independent validation using UK routine primary care patient data • Primary analysis • Followed methods of Kinar et al but used a time window of 18-24 months • Logistic regression analysis with ColonFlag as the only predictor of first diagnosis of colorectal cancer
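A logistic regression with one predictor can be fitted in a few lines. The sketch below uses Newton-Raphson in plain Python on invented data; it is not the study's code, and `fit_logistic` is a hypothetical name. With a single predictor and a positive slope, the fitted probability is a monotone function of the score, so ranking patients by probability is the same as ranking them by the score itself.

```python
import math

def fit_logistic(scores, outcomes, iterations=25):
    """Fit P(case) = 1 / (1 + exp(-(a + b * score))) by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(scores, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p            # gradient of the log-likelihood
            g_b += (y - p) * x
            h_aa += w               # information matrix entries
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        # Solve the 2x2 system: step = inverse(information) * gradient
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Invented scores (0-100) and outcomes (1 = case), deliberately overlapping
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
outcomes = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
a, b = fit_logistic(scores, outcomes)
print(a, b)
```

The sketch does not guard against perfectly separated data, where the maximum likelihood estimate does not exist and the iteration would fail.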
Independent validation using UK routine primary care patient data • Discrimination • Plotted the receiver operating characteristic (ROC) curve • Calculated the area under the curve (C-statistic) • Calculated specificity at sensitivity = 50% • Calculated sensitivity at specificity = 99.5% • Calibration (agreement between observed and predicted outcomes) could not be assessed because the risk score does not provide a measure of absolute risk
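The three discrimination measures can be computed directly from case and control scores. The following is a minimal sketch on invented scores, assuming a patient is flagged when the score is at or above the cut-off; the function names are hypothetical, and the pairwise AUROC is only practical for small samples.

```python
import math

def auroc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def specificity_at_sensitivity(case_scores, control_scores, sensitivity=0.5):
    """Highest cut-off flagging at least the target share of cases."""
    ranked = sorted(case_scores, reverse=True)
    n_needed = max(1, math.ceil(sensitivity * len(ranked)))
    cutoff = ranked[n_needed - 1]
    spec = sum(k < cutoff for k in control_scores) / len(control_scores)
    return cutoff, spec

def sensitivity_at_specificity(case_scores, control_scores, specificity=0.995):
    """Lowest observed cut-off reaching the target specificity, or None."""
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        spec = sum(k < cutoff for k in control_scores) / len(control_scores)
        if spec >= specificity:
            sens = sum(c >= cutoff for c in case_scores) / len(case_scores)
            return cutoff, sens
    return None

# Invented scores, not study data
cases = [55, 70, 85, 95]
controls = [10, 20, 30, 40, 50, 60, 75, 90]
print(auroc(cases, controls))
print(specificity_at_sensitivity(cases, controls, 0.5))
print(sensitivity_at_specificity(cases, controls, 0.75))
```

The last call uses a target of 0.75 because eight toy controls cannot resolve a specificity of 99.5%; that target needs a large control group.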
Independent validation using UK routine primary care patient data • Further analyses • Case-control study matching cases and non-cases by sex, year of birth and year of selected CBC • Cohort study of patients with a ColonFlag in 2012 followed until 2015 to estimate predictive values
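Predictive values in a cohort follow from counting flagged and unflagged patients against their later diagnoses. A minimal sketch, assuming one score per patient and the rule "flag if score >= cut-off"; the function name, cut-off and cohort below are invented for illustration.

```python
def predictive_values(scores_and_outcomes, cutoff):
    """Return (PPV, NPV) for the rule 'flag if score >= cutoff'.

    scores_and_outcomes: list of (score, diagnosed) pairs, one per patient;
    diagnosed is True if colorectal cancer was recorded during follow-up.
    """
    tp = fp = tn = fn = 0
    for score, diagnosed in scores_and_outcomes:
        flagged = score >= cutoff
        if flagged and diagnosed:
            tp += 1
        elif flagged:
            fp += 1
        elif diagnosed:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if tp + fp else None   # share of flagged who are cases
    npv = tn / (tn + fn) if tn + fn else None   # share of unflagged who are not
    return ppv, npv

# Invented cohort, not study data
cohort = [(95, True), (90, False), (85, True), (60, False),
          (40, False), (30, True), (20, False), (10, False)]
ppv, npv = predictive_values(cohort, cutoff=85)
print(ppv, npv)
```

Unlike sensitivity and specificity, both values depend on how common the disease is in the cohort, which is why a cohort design rather than a case-control design is needed to estimate them.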
Independent Validation Results of logistic regression analysis with ColonFlag as the only predictor of colorectal cancer 18-24 months before diagnosis • AUROC: 0.778 (95% CI 0.771, 0.781) • Specificity at sensitivity = 50%: 82.73% (95% CI 82.68, 82.78); score cut-off = 83.47 • Sensitivity at specificity = 99.5%: 3.91% (95% CI 3.40, 4.48); score cut-off = 99.78
Summary of findings • Applied to routinely collected primary care data from the UK, the ColonFlag produced AUC values comparable to those in the Israeli population from which it was derived, for similar intervals before diagnosis • Most of the predictive power is due to age • Reference: Birks J, Bankhead C, Holt T, Fuller A, Patnick J. (2017) Evaluation of a prediction model for colorectal cancer: retrospective analysis of 2.5 million patient records. Cancer Medicine
Potential uses of the algorithm 1 As an adjunct to the NHS Bowel Cancer Screening Programme • In combination with the quantified result of FIT (to be introduced in 2018): a lower cut-off could be used for higher-risk individuals than for lower-risk individuals • To identify non-attenders for colonoscopy follow-up after a positive test and encourage them to attend • To identify non-participants in the screening programme and encourage them to participate
• In primary care • To screen the entire population over 50 for bowel cancer risk using FBC (full blood count) results where available • To take an FBC from each individual aged 50+ periodically and check the risk result • To check an individual’s risk on every blood count • To check an individual exhibiting minor symptoms (this might not bring the date of diagnosis forward by then, but may be useful where there are comorbidities)
Acknowledgements Nuffield Department of Primary Care Health Sciences, University of Oxford Clare Bankhead Alice Fuller Tim Holt Nuffield Department of Population Health, University of Oxford Julietta Patnick The study was funded by the NIHR Oxford Biomedical Research Centre