Optimizing Risk Prediction for High Utilizers in a Safety Net System
Zeyu (Zach) Li, MPH; Kathleen Tatem, MPH; Spriha Gogia, PhD, MPH; Jeremy Ziring*; Remle Newton-Dame, MPH; Jesse Singer, DO, MPH; Dave Chokshi, MD, MSc, FACP
* NYU School of Medicine, New York, NY
The presenting author and co-authors have no relevant financial relationships to disclose.
INTRODUCTION
About New York City Health + Hospitals
• Largest municipal health system in the country
• Safety-net system: mandate to care for the uninsured/underserved in New York City
• 11 hospitals, 6 diagnostic and treatment centers, 5 long-term care facilities, 70+ ambulatory care centers, correctional health services
• >1 million patients, 6 million visits per year
[Map: the Bronx, Manhattan, Queens, Brooklyn, Staten Island]
Risk Stratification Approach
• Define “risk”
• Stratification: stratify the population into high, medium and low risk with a risk score algorithm
• Segmentation: segment the high risk population into intervenable groups
• Targeting: target the drivers of high risk in each segment with effective programming
METHODS
Study Population
N = 833,969 adult patients with encounters at H+H during the measurement year (Q3 2016 – Q2 2017)
Excluded:
• Pregnant women
• Actively incarcerated patients
• Patients with only ancillary care (e.g., radiology visits, blood draws)
• Patients missing name, date of birth or sex in the patient record
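
The exclusion criteria above could be applied as a simple record filter. This is a minimal sketch: the `PatientRecord` type and all of its field names are hypothetical, since the slide does not describe how the source data are structured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    # All field names are hypothetical; the slide does not describe the schema.
    name: Optional[str]
    date_of_birth: Optional[str]
    sex: Optional[str]
    age: int
    pregnant: bool
    actively_incarcerated: bool
    only_ancillary_care: bool


def in_study_population(p: PatientRecord) -> bool:
    """Return True if the record passes the listed inclusion/exclusion rules."""
    if p.age < 18:  # adults only (18+ assumed from the age categories used)
        return False
    if p.pregnant or p.actively_incarcerated or p.only_ancillary_care:
        return False
    # Exclude records missing name, date of birth or sex.
    return bool(p.name and p.date_of_birth and p.sex)


included = in_study_population(
    PatientRecord("A. Patient", "1970-01-01", "F", 48, False, False, False)
)
excluded = in_study_population(
    PatientRecord(None, "1970-01-01", "F", 48, False, False, False)
)
```
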
Outcomes of Interest
Limited consensus on the definition of “high risk” in the literature: visits, re-visits, days
Goal: predict high ED/inpatient (“acute”) utilization in the prediction year (Q3 2017 – Q2 2018) from data in the measurement year (Q3 2016 – Q2 2017)
Outcomes tested:
• 10+ acute days: original algorithm (logistic regression)* outcome (top 1%)
• 5+ acute days: to get at medium/high risk populations (top 5%)
• No. of acute days: continuous outcome
* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910357/
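
The three outcomes can all be derived from one count of acute days in the prediction year. The sketch below assumes that acute days are the sum of ED days and inpatient days; the slide does not state how ED visits are converted to days, so `derive_outcomes` and its arguments are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Outcomes:
    acute_days: int   # continuous outcome: no. of acute days
    five_plus: bool   # binary outcome: 5+ acute days
    ten_plus: bool    # binary outcome: 10+ acute days


def derive_outcomes(ed_days: int, inpatient_days: int) -> Outcomes:
    """Derive the three tested outcomes for one patient's prediction year.

    Assumption: acute days = ED days + inpatient days.
    """
    acute_days = ed_days + inpatient_days
    return Outcomes(acute_days, acute_days >= 5, acute_days >= 10)


example = derive_outcomes(ed_days=2, inpatient_days=4)
```
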
Data Prep and Flow
[Flow diagram: data sources (clinical, utilization, social determinants, demographics); candidate predictors: 70+; modeling dataset; final dataset]
Candidate Predictors
Clinical
• Count of chronic conditions (Elixhauser)
• Individual chronic conditions from Elixhauser (31 in total), including: HTN Complicated, HTN Uncomp, DM Complicated, DM Uncomp, Substance Use, Alcohol Use, Solid Tumor, Met. Cancer, Obesity, Renal Failure, Liver Disease, Congestive Heart Failure
• Sickle Cell (CCS grouper)
• Frailty indicator
• Antipsychotic Rx
• Anticoagulant Rx
Demographics
• Continuous Binned Age
• Age Category (18-44, 45-64, 65-80, 81+)
• Sex
• Marital status
• Race/ethnicity
• Preferred spoken language
• Payer group: Medicare, Medicaid, Self Pay, Commercial, Other
Utilization
• No. of ED visits
• No. of inpatient (IP) visits
• No. of ED/IP days
• 10+ ED/IP days
• 90+ ED/IP days
• No. of outpatient visits
• No. of primary care visits
• 1+ non-emergent ED visits
• 1+ primary care-treatable ED visits
Social Determinants
• No. of zip code changes
• Neighborhood poverty level
• No. of payer changes
• Missed visits
• Homelessness
• History of incarceration
Model Development & Validation Strategy
• Training set (70%): candidate models tuned, with 5-fold validation for all models
• Validation set (30%): final model
[Diagram: 5-fold train/test splits of the training data, tuned candidate models, final model]
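
The 70/30 split with 5-fold validation inside the training portion can be sketched as follows. This is a generic illustration using only the standard library, not the SAS/R workflow used by the authors; `split_70_30`, `k_folds`, the seed and the toy patient IDs are all hypothetical.

```python
import random


def split_70_30(patient_ids, seed=0):
    """Shuffle IDs and split them into a 70% training and 30% validation set."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def k_folds(ids, k=5):
    """Yield (train_ids, holdout_ids) pairs; each ID is held out exactly once."""
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        train = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield train, folds[i]


# Toy example with 1,000 hypothetical patient IDs.
development, validation = split_70_30(range(1000))
folds = list(k_folds(development, k=5))
```

Candidate models would be tuned on the five train/holdout pairs, and only the chosen model would then be scored on `validation`.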
RESULTS
Evaluating Model Performance
Software: SAS, R
Algorithms: Logistic, LASSO, CART

Outcome                      10+ Days   10+ Days   5+ Days    No. of Days   No. of Days
No. of Variables             33         30         34         17            18
Overall Model Performance
  AUROC                      0.86       0.83       0.79       N/A           N/A
  RMSE                       N/A        N/A        N/A        3.32          3.33
Top 1% Model Statistics
  PPV                        44.6%      44.6%      44.8%      47.6%         48.4%
  Sensitivity                16.2%      16.2%      16.2%      17.3%         15.0%
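
The "Top 1%" statistics can be read as follows: flag the 1% of patients with the highest scores, then compute PPV (share of flagged patients who had the outcome) and sensitivity (share of all patients with the outcome who were flagged). The sketch below assumes a binary observed outcome such as 10+ acute days; the slide does not say which observed outcome was used for the continuous models, and `top_fraction_metrics` and the toy data are hypothetical.

```python
def top_fraction_metrics(scores, outcomes, fraction=0.01):
    """Return (PPV, sensitivity) when the top `fraction` of scores is flagged."""
    n_flagged = max(1, round(len(scores) * fraction))
    # Rank patient indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:n_flagged]
    true_pos = sum(1 for i in flagged if outcomes[i])
    total_pos = sum(1 for o in outcomes if o)
    ppv = true_pos / n_flagged
    sensitivity = true_pos / total_pos if total_pos else float("nan")
    return ppv, sensitivity


# Toy data: 200 patients scored 0..199; four of them had the outcome.
scores = list(range(200))
outcomes = [i in (199, 150, 100, 50) for i in range(200)]
ppv, sensitivity = top_fraction_metrics(scores, outcomes, fraction=0.01)
```

In this toy example the top 1% is two patients, one of whom had the outcome, so PPV is 0.5 and sensitivity is 0.25.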
Final Model Coefficients
LASSO Model Predicting No. of Acute Days

Demographics
  Medicare Pt                                      0.06
  Self-pay Pt                                     -0.04
  Other Payer Pt                                  -0.08
Utilization
  No. of ED Visits                                 0.35
  No. of IP Visits                                 0.36
  No. of ED/IP days (<30)                          0.08
  Pt with avoidable ER visit                      -0.08
Clinical Indicators
  Alcohol Use Dx                                   0.19
  Psychoses Dx                                     1.17
  Substance Use Dx                                 0.38
  Congestive Heart Failure Dx                      0.05
  Renal Failure Dx                                 0.02
  No. of Elixhauser (count chronic conditions)     0.21
  Pt with antipsychotic Rx                         0.40
Social Determinants
  No. of Zip changes                               0.08
  No. of Payer changes                             0.004
  Pt w/Missed visits (2+)                          0.04
  Homeless                                         0.28
  History of Incarceration                         0.47

Note: continuous variables are more influential than they appear
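
One way to read the note about continuous variables: a count variable's coefficient is multiplied by the count, while a binary flag contributes its coefficient at most once. The sketch below uses a few coefficients as read from this slide and assumes they apply to raw, untransformed values on a linear scale; the intercept is not reported, so this shows per-variable contributions only, not a predicted number of days. The `contributions` helper and the example patient are hypothetical.

```python
# Coefficients as read from the slide (subset); the intercept is not reported.
COEFFICIENTS = {
    "No. of ED visits": 0.35,
    "No. of IP visits": 0.36,
    "Homeless": 0.28,
    "History of incarceration": 0.47,
}


def contributions(patient):
    """Return coefficient * value for each predictor present in `patient`."""
    return {name: COEFFICIENTS[name] * value for name, value in patient.items()}


# Hypothetical patient: 5 ED visits, 1 inpatient visit, homeless flag set.
example = contributions(
    {"No. of ED visits": 5, "No. of IP visits": 1, "Homeless": 1}
)
```

Under these assumptions, five ED visits contribute about 1.75 to the linear predictor, several times the 0.28 contributed by the homelessness flag, even though the two raw coefficients look similar.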
From Data to Patient Care: Risk Stratification
[Figure: Predicted vs. Observed Avg. Acute Days, Q3 2017-Q2 2018; x-axis: Predicted Acute Days]
Predicted acute days vs. average observed days: near-linear relationship
DISCUSSION
From Data to Patient Care – Model Deployment
• High risk patient lists: lists of patients predicted to be high risk are shared with facilities quarterly
• Epic integration: high risk flag integrated in Epic to promote care coordination of patients before/during their visit
Model Deployment: High Risk definition in our system
• Top 5% of scored patients (translates to 3.6 days)
• 57.5% of high risk patients return for an ED or inpatient visit
From Data to Patient Care - Segmentation
Contact us: PopHealthHighRisk@nychhc.org
THANK YOU!
APPENDIX
Risk Scoring at NYC H+H: A Brief History
2015: Algorithm to identify high risk ACO (Medicare) patients (Risk Score 1.0)
2016/2017: Risk Score 2.0, a payer-agnostic algorithm to predict high utilization (10+ acute days) at NYC Health + Hospitals
• Prioritizes utilization, behavioral diagnoses like schizophrenia, and alcohol diagnoses
• Quarterly lists of top 1% high risk patients sent to facilities
• Algorithm published in JGIM earlier this year*
2017/2018: Distribution of high risk patient lists and dashboard with action segments
• Includes summary information on utilization, diagnoses, payer and segments of high risk patients at NYC H+H overall and by facility
2018/2019: Risk Score 3.0 development
• Risk score re-optimization with augmented social determinants and machine learning methods
* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910357/
Risk Score 3.0: Literature Review
81+ articles reviewed
RISING RISK (n=9)
• Difficult to predict
• Various approaches: Bloomers and Persisters, Binary Outcomes, Population Segmentation, Preventable Costs, Impactability
RISK SCORE CREATION (n=31)
• Common risk prediction modeling methods: multivariable regression, LASSO, random forest, CART
• Different outcomes of interest: service utilization, clinical, mortality, and hospitalization
• Prediction on general utilization is rare
• Numerous predictors (see next slide)
SEGMENTATION (n=24)
• Mix of statistical methods available for segmentation: CART, K-means clustering, latent class analysis (MPLUS needed)
• Expert led segment development
EPIC EHR INTEGRATION (n=9)
• A few organizations have successfully integrated home-grown risk scores into Epic
• Epic Healthy Planet can be leveraged for Risk Stratification
Homelessness Identification
Patients identified as homeless based on structured data documentation:
Demographics (Registration)
• Homeless/Undomiciled/Shelter listed in address field
• Hospital or shelter address listed in address field
• ‘Person is homeless’ box checked in registration system
• 10+ zip code changes within previous 12 months
Clinical diagnoses (Medical Record)
• ICD-10 homelessness code (Z59.0) on problem list
Billing data (Bill / Claim)
• ICD-10 homelessness code (Z59.0) in billing data
This does not include any free text information, such as Social Work psychosocial assessments
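
The structured-data rules above can be sketched as a single flag. This is an illustration only: it assumes that meeting any one criterion is enough, the `StructuredRecord` fields are hypothetical, and matching an address against known hospitals or shelters is represented by a precomputed boolean because it would need a reference list that the slide does not describe.

```python
import re
from dataclasses import dataclass, field

# Keywords taken from the slide's first address-field rule.
ADDRESS_PATTERN = re.compile(r"homeless|undomiciled|shelter", re.IGNORECASE)
HOMELESSNESS_CODE = "Z59.0"  # ICD-10 code named on the slide


@dataclass
class StructuredRecord:
    # All field names are hypothetical.
    address: str = ""
    address_is_hospital_or_shelter: bool = False
    homeless_box_checked: bool = False
    zip_changes_12mo: int = 0
    problem_list_codes: set = field(default_factory=set)
    billing_codes: set = field(default_factory=set)


def is_homeless(r: StructuredRecord) -> bool:
    """Flag a patient if any structured-data criterion is met (OR is assumed)."""
    return (
        bool(ADDRESS_PATTERN.search(r.address))
        or r.address_is_hospital_or_shelter
        or r.homeless_box_checked
        or r.zip_changes_12mo >= 10
        or HOMELESSNESS_CODE in r.problem_list_codes
        or HOMELESSNESS_CODE in r.billing_codes
    )


flag_by_address = is_homeless(StructuredRecord(address="UNDOMICILED"))
flag_by_billing = is_homeless(StructuredRecord(billing_codes={"Z59.0"}))
flag_none = is_homeless(StructuredRecord(address="123 Main St", zip_changes_12mo=2))
```
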
Cohort Characteristics

Characteristics                          Development Cohort       Validation Cohort
                                         N=583 778, No. (%)       N=250 191, No. (%)
10+ Acute Days (Prediction Year)         16 109 (2.76%)           6 904 (2.76%)
Past Utilizations
  ED Visits:
    0                                    235 315 (40.3%)          100 670 (40.2%)
    1-2                                  299 943 (51.4%)          128 817 (51.5%)
    3-4                                  33 385 (5.72%)           14 315 (5.72%)
    4+                                   15 135 (2.59%)           6 389 (2.55%)
  IP Visits:
    0                                    507 558 (86.9%)          217 556 (87.0%)
    1                                    58 982 (10.1%)           25 368 (10.1%)
    2                                    10 735 (1.84%)           4 587 (1.83%)
    3                                    3 478 (0.60%)            1 384 (0.55%)
    4+                                   3 025 (0.52%)            1 296 (0.52%)
  1+ Emergent PC Treatable ER visits     12 711 (2.18%)           5 424 (2.17%)
Patient Demographics
  Marital Status:
    Married, Life Partner, Missing       151 869 (26.0%)          64 576 (25.8%)
    Single                               386 952 (66.3%)          166 509 (66.6%)
    Separated, Widowed, Divorced         44 957 (7.70%)           19 106 (7.64%)
  Male                                   252 796 (43.3%)          108 339 (43.3%)
  Age:
    18-44                                298 053 (51.1%)          127 695 (51.0%)
    45-64                                202 976 (34.8%)          87 131 (34.8%)
    65-80                                67 022 (11.5%)           28 618 (11.4%)
    81+                                  15 727 (2.69%)           6 747 (2.70%)
  Ethnicity/Race:
    Non-Hispanic White                   54 031 (9.26%)           22 893 (9.15%)
    Hispanic                             197 699 (33.9%)          85 274 (34.1%)
    Non-Hispanic Black                   204 564 (35.0%)          87 524 (35.0%)
    Other                                127 484 (21.8%)          54 500 (21.8%)
  Speaks Non-English                     177 324 (30.4%)          76 779 (30.7%)
  Payer (Most Recent):
    Medicaid                             222 352 (38.1%)          95 455 (38.2%)
    Medicare                             72 792 (12.5%)           31 234 (12.5%)
    Self-pay                             194 071 (33.2%)          83 216 (33.3%)
    Other                                94 563 (16.2%)           40 286 (16.1%)