WIILSUG Conference, Milwaukee, WI, June 20, 2018
Advanced Analytics Consulting Services
Targeting Return-to-Work Intervention by Predicting Prolonged Workers' Compensation Claims
Mei Najim, CSPA, Advanced Analytics Consultant and Advisor
Mrs. Mei Najim provides advanced analytics consulting services, including developing full life cycle predictive modeling processes from raw data exploration to model implementation in IT data systems, thorough documentation, and related training. Mei has over 14 years of hands-on advanced analytics and machine learning experience with large and complex data sets in various predictive analytics settings (claims, underwriting, pricing), along with extensive actuarial analytics experience in the insurance industry, including pricing, reserving, and research & development. She has presented at many conferences to share and discuss her papers and expertise in predictive analytics with industry analytics experts. Mei holds a Bachelor of Science in Actuarial Science from Hunan University and two Master of Science degrees, in Applied Mathematics and in Statistics, from Washington State University. Mei is a member of the American Statistical Association and a Certified Specialist in Predictive Analytics (CSPA) of the Casualty Actuarial Society.
AGENDA
- Predictive Analytics in Insurance Industry Overview
- Return-to-Work Day 30 Model
- Q & A
Five Main Areas Using Predictive Analytics
Predictive Analytics in Insurance Industry Overview
Predictive analytics supports profitable growth in five main areas:
- Marketing
- Underwriting
- Reserving
- Claims
- Pricing
A Series of Claim Predictive Models
Given standard claim handling practice and the data collected along the way, a series of models can be built, each tied to a point on the claim process timeline, to score open claims in real time. Building each model on the data available at its point on the timeline can improve model performance at that point, which can optimize cost and benefit for the claim handling unit.
Models applied to open claims along the timeline: Day 1 Model, Day 30 Model, Day 45 Model, Day 60 Model, Day 90 Model, ……
Model outputs: each model produces claim scores.
As the claim ages: more data fields become available and static, model accuracy increases, and business value decreases.
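One way to picture the series of models is as a lookup from claim age to the most recent milestone model. The sketch below is only an illustration: the milestone days come from the slide, but the routing rule (use the latest milestone already reached), the function name `pick_model_day`, and the sample claims are assumptions, not the presenter's implementation.

```python
from bisect import bisect_right

# Milestones (days since the claim was opened), taken from the slide's timeline.
MODEL_DAYS = [1, 30, 45, 60, 90]

def pick_model_day(claim_age_days):
    """Return the latest milestone the claim has reached (assumed routing rule)."""
    if claim_age_days < MODEL_DAYS[0]:
        return None  # claim has not yet reached the first milestone
    idx = bisect_right(MODEL_DAYS, claim_age_days) - 1
    return MODEL_DAYS[idx]

# Hypothetical open claims: claim id -> days since the claim was opened.
open_claims = {"A": 3, "B": 30, "C": 52, "D": 200}
assignments = {cid: pick_model_day(age) for cid, age in open_claims.items()}
print(assignments)
```

Each claim would then be scored by the model for its assigned milestone, so a claim is rescored as it passes each day on the timeline.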
AGENDA
- Predictive Analytics in Insurance Industry Overview
- Return-to-Work Day 30 Model (current section)
- Q & A
Executive Summary
- Motivation: Insurance industry data show that prolonged return to work is one of the main drivers of increased claim duration and total cost. The sooner the injured worker returns to work, the less the worker suffers and the lower the total claim cost.
- Objective: Identify the set of claims where return to work is likely to be prolonged beyond day 30 after the claim is opened, and where outcomes could be improved by a more efficient claim handling process, proper treatment, and assistance in returning to work.
- Benefit: Improved claim outcomes, measured by impact on cost and duration (need to index/severity adjust), and claimant satisfaction.
A Life Cycle Modeling Process Overview
The modeling life cycle is a loop of nine stages:
- Business Goal
- Data Acquisition
- Data Preparation
- Variable Creation
- Variable Selection
- Model Building
- Model Validation
- Model Testing
- Model Implementation
Step 1. Business Goal(s) and Model Design
Objectives:
- The business goal is to identify open claims with a high chance that return to work will occur after day 30 from claim opening, so that claim adjusters can help injured workers return to work earlier.
- The model design is a return-to-work model with a binary target variable (Yes/No), scored as of day 30, that predicts the likelihood that the injured worker returns to work after day 30.
Target variable creation:
- Challenge: Several return-to-work dates may exist, including partial return and full-duty return, etc.
- A "return to work after day 30" flag proxy could be created, depending on what really matters to the company's business goals.
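The flag proxy described above can be sketched as a simple date comparison. This is a minimal illustration under stated assumptions: the field names (`opened`, `full_duty_rtw`), the choice of the full-duty return date as the one that matters, and leaving the flag undefined when no return date is recorded are all hypothetical choices that a company would settle according to its own business goals.

```python
from datetime import date

def rtw_after_day_30(claim_open_date, rtw_date, threshold_days=30):
    """Return 1 if return to work came more than `threshold_days` after the
    claim was opened, 0 if it came on or before that day, and None when no
    return-to-work date is recorded (left undecided in this sketch)."""
    if rtw_date is None:
        return None
    days_to_rtw = (rtw_date - claim_open_date).days
    return 1 if days_to_rtw > threshold_days else 0

# Hypothetical closed claims; full-duty return is assumed to be the date of interest.
claims = [
    {"claim_id": "A", "opened": date(2017, 1, 2), "full_duty_rtw": date(2017, 1, 20)},
    {"claim_id": "B", "opened": date(2017, 1, 2), "full_duty_rtw": date(2017, 3, 15)},
    {"claim_id": "C", "opened": date(2017, 1, 2), "full_duty_rtw": None},
]
flags = {c["claim_id"]: rtw_after_day_30(c["opened"], c["full_duty_rtw"]) for c in claims}
print(flags)
```

Swapping in a partial-return date, or treating a missing date as "Yes", would change the target and therefore the model, which is why the definition should be agreed with the business first.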
Step 1. Business Goal(s) and Model Design
Target Variable Creation: 39% of claim frequency represents 84% of total incurred loss
Chart: WC Indemnity Closed Claims 2008-2017, by RTW 30+ DAYS FLAG
- Yes: 39% of frequency, 84% of total incurred loss
- No: 61% of frequency, 16% of total incurred loss
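The four percentages on this slide imply how much more severe the flagged claims are on average. The short calculation below uses only the slide's figures; the variable names are illustrative, and the "relative severity" measure (loss share divided by frequency share) is a derived quantity rather than something stated in the presentation.

```python
# Shares read from the slide: RTW 30+ days flag, Yes vs. No.
freq_share = {"Yes": 0.39, "No": 0.61}
loss_share = {"Yes": 0.84, "No": 0.16}

# Average severity of each group relative to the overall average claim.
relative_severity = {k: loss_share[k] / freq_share[k] for k in freq_share}

# How many times larger the average "Yes" claim is than the average "No" claim.
ratio = relative_severity["Yes"] / relative_severity["No"]
print({k: round(v, 2) for k, v in relative_severity.items()}, round(ratio, 1))
```

On these figures the average flagged claim costs roughly eight times the average unflagged claim, which is the arithmetic behind targeting this group.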
Step 2. Data Scope and Acquisition
Objectives:
Data scope:
- Coverage code = “WC”
- 10 years of WC indemnity closed claims
- Client status = “current”, etc.
Data acquisition:
- Accident, claim, claimant, payment, managed care, demographic, etc.
Rule of thumb: If rare claims/outliers could randomly happen again, do not exclude them, because claims with high severity impact/drive the overall average loss cost in insurance data.
Step 3. Data Preparation
Objective: Based on the business goal(s) and data scope, the data was reviewed, cleansed, imputed, and transformed to be prepared for the next step, variable creation.
- Univariate analysis/descriptive analytics
- Conduct a trend study to apply to financial data fields
Examples:
- Cleansing: Total incurred > $0, exclude terminated clients, etc.
- Imputation: Address the missing values (Age, Bill Audit, etc.)
- Transformation: Take a log, square root, or exponential, etc., if the data is skewed
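The three example operations (cleansing, imputation, transformation) can be sketched on a handful of records. Everything specific here is an assumption made for illustration: the field names, the made-up sample values, the use of the median to impute age, and the choice of a natural log for the skewed loss field.

```python
import math
from statistics import median

# Hypothetical claim records (values invented for illustration only).
claims = [
    {"total_incurred": 1200.0, "age": 34, "client_terminated": False},
    {"total_incurred": 0.0, "age": 51, "client_terminated": False},    # fails incurred > 0
    {"total_incurred": 98000.0, "age": None, "client_terminated": False},
    {"total_incurred": 4300.0, "age": 45, "client_terminated": True},  # terminated client
    {"total_incurred": 560.0, "age": 29, "client_terminated": False},
]

# Cleansing: keep claims with total incurred > $0 from non-terminated clients.
kept = [dict(c) for c in claims
        if c["total_incurred"] > 0 and not c["client_terminated"]]

# Imputation: fill missing age with the median of known ages (assumed method),
# and keep an indicator so the model can still see that the value was missing.
median_age = median(c["age"] for c in kept if c["age"] is not None)
for c in kept:
    c["age_imputed"] = c["age"] is None
    if c["age"] is None:
        c["age"] = median_age
    # Transformation: natural log of a right-skewed, strictly positive field.
    c["log_total_incurred"] = math.log(c["total_incurred"])

print(len(kept), median_age)
```

The log is safe here only because the cleansing step already removed non-positive incurred amounts.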
Step 4. Variable Creation (a.k.a.: Feature Engineering)
Objective: Create variables that make both statistical and business sense
Examples:
Data fields that could be used directly
- Initial Treatment, Number of Dependents, Gender, Marital Status, Age, etc.
Create new variables
- Lags between dates:
  • Lag (accident date – max medical improvement date)
  • Month and week of accident date, etc.
- Groups:
  • Body Part group/Injury Type group
  • Comorbidity group based on ICD and CPT codes
Text analytics to create “variables” based on unstructured data
- Text analytics uses algorithms to derive patterns and trends from unstructured (free-form text) data through statistical and machine learning methods as well as natural language processing techniques
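The date-based variables listed above can be derived with the standard library alone. In this sketch the function name, the output keys, and the sign convention (max medical improvement date minus accident date, so the lag is positive) are assumptions; the sample dates are invented.

```python
from datetime import date

def date_features(accident_date, mmi_date):
    """Build lag and calendar variables from an accident date and a
    max-medical-improvement (MMI) date; the MMI date may be missing."""
    lag_days = (mmi_date - accident_date).days if mmi_date is not None else None
    return {
        "lag_accident_to_mmi_days": lag_days,
        "accident_month": accident_date.month,
        "accident_iso_week": accident_date.isocalendar()[1],  # ISO week number
    }

# Hypothetical claim: accident on 2017-03-06, MMI reached on 2017-06-14.
features = date_features(date(2017, 3, 6), date(2017, 6, 14))
print(features)
```

Group variables such as body part or comorbidity groups would typically be built the same way, from a lookup table that maps raw codes to a smaller set of business-meaningful categories.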
Step 5. Variable Selection (a.k.a.: Feature Selection)
Objective: To reduce 500+ variables to a manageable level before applying the machine learning algorithms
- Variable profiling/screening: Missing value ratios, etc.
- High correlation filters: Identify variables that are correlated with each other to avoid multicollinearity
- Multivariate analyses: Cluster analysis, principal component analysis, and factor analysis, etc.
- Stepwise
There are also some built-in variable selection methods, depending on the specific type of statistical tool
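The first two screens, missing value ratios and a high correlation filter, can be sketched as follows. The thresholds (50% missing, absolute Pearson correlation of 0.9), the greedy keep-the-first-variable rule, the variable names, and the sample values are all assumptions chosen for illustration; the slides do not state the cutoffs used.

```python
from math import sqrt

def missing_ratio(values):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def screen(columns, max_missing=0.5, max_abs_corr=0.9):
    """Drop variables with too many missing values, then greedily drop any
    variable highly correlated with one that has already been kept."""
    candidates = [name for name, vals in columns.items()
                  if missing_ratio(vals) <= max_missing]
    selected = []
    for name in candidates:
        redundant = False
        for other in selected:
            # Compare only rows where both variables are present.
            pairs = [(a, b) for a, b in zip(columns[name], columns[other])
                     if a is not None and b is not None]
            x, y = zip(*pairs)
            if abs(pearson(x, y)) > max_abs_corr:
                redundant = True
                break
        if not redundant:
            selected.append(name)
    return selected

# Hypothetical candidate variables (values invented for illustration).
columns = {
    "weekly_wage": [500, 650, 800, 720, 910, 400],
    "annual_wage": [26000, 33800, 41600, 37440, 47320, 20800],  # 52 x weekly
    "age": [34, 51, 29, 45, 38, 60],
    "bill_audit": [None, None, None, None, 1.0, 0.0],
}
selected = screen(columns)
print(selected)
```

Which member of a correlated pair to keep is a business decision as much as a statistical one; this sketch simply keeps whichever appears first.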
Step 6. Model Building (a.k.a.: Model Fitting)
Objective: After serious data mining work, multiple machine learning algorithms are used to build a few candidate models (usually 3)
- GLM logistic regression
- Decision tree
- Random forests
- Gradient boosting
- Neural network, etc.
Interactions and correlations should usually be examined before finalizing the models
Examples of Main Drivers
Date of Max Medical Improvement, NCCI Injury Type, Body Part Code, Comorbidity Group, Nature Result Group, Average Weekly Wage, Benefit State, etc.
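As a minimal sketch of the first candidate on the list, the code below fits a logistic regression to a binary target by plain batch gradient descent. It is a stand-in written without any modeling library, not the presenter's tooling; the two features (a comorbidity indicator and a standardized weekly wage), the eight invented records, and the learning rate and epoch count are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that the RTW 30+ days flag is Yes."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical training data: [comorbidity flag, standardized weekly wage].
X = [[0, -1.0], [0, -0.5], [0, 0.2], [1, 0.0],
     [1, 0.5], [1, 1.0], [0, 0.8], [1, -0.8]]
y = [0, 0, 0, 1, 1, 1, 1, 0]  # 1 = return to work after day 30

w, b = fit_logistic(X, y)
p_high = predict_proba(w, b, [1, 1.0])
p_low = predict_proba(w, b, [0, -1.0])
print(p_high > p_low)
```

In practice each candidate (tree, random forest, gradient boosting, and so on) would be fitted on the same training data and compared on a holdout sample before one is chosen.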