Utilizing Predictive Modeling to Improve Policy through Improved Targeting of Agency Resources: A Case Study on Placement Instability among Foster Children

Dallas J. Elgin, Ph.D., IMPAQ International
Randi Walters, Ph.D., Casey Family Programs

2016 APPAM Fall Research Conference

Image Credit: The Strengths Initiative
The Utility of Predictive Modeling for Government Agencies

• Challenge: Government agencies operate in an environment that increasingly requires using limited resources to meet nearly limitless demands.
• Opportunity: Advances in computing technology and administrative data can be leveraged via predictive modeling to estimate the likelihood of future events.
• Goal: To provide an improved understanding of the methodology and identify associated best practices.
What is Predictive Modeling?

• The process of selecting a model that best predicts the probability of an outcome (Geisser, 1993), or of generating an accurate prediction (Kuhn & Johnson, 2013).
• Over the past several decades, predictive modeling has been utilized in a variety of fields to predict diverse outcomes.
• Within child welfare, predictive models have been used to inform decision-making:
  – Risk assessment instruments
  – Maltreatment recurrence, future involvement, child fatalities
Case: Placement Instability

• Data: 2013 Adoption and Foster Care Analysis and Reporting System (AFCARS)
  – Publicly-available dataset resembling administrative data
  – Sample: 15,000 foster care children who were in care throughout 2013
• Operationalization: 3 or more moves, i.e., a total of 4 or more placements (Hartnett, Falconnier, Leathers & Testa, 1999; Webster, Barth & Needell, 2000)
  – 11,649 children with 3 or fewer placements
  – 3,351 children with 4 or more placements
Methodological Approach: Data Partition Strategy

• The entire dataset of 15,000 children was split into 2 groups:
  – A training set used to train the models (75% of the dataset = 11,250 children)
  – A test set used to validate the models (25% of the dataset = 3,750 children)
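This partition can be reproduced with caret, the R package referenced later in this deck. A minimal sketch, assuming hypothetical object and variable names (afcars, placements), since the actual AFCARS field names are not shown here:

```r
# Minimal sketch of the 75/25 split described above, using caret.
library(caret)

set.seed(2013)

# Hypothetical binary outcome: 4 or more placements vs. 3 or fewer.
# "placements" is an assumed column name, not the real AFCARS field.
afcars$instability <- factor(ifelse(afcars$placements >= 4,
                                    "FourOrMore", "ThreeOrFewer"))

# createDataPartition() preserves the outcome's class proportions
train_idx <- createDataPartition(afcars$instability, p = 0.75, list = FALSE)
train_set <- afcars[train_idx, ]   # ~11,250 children
test_set  <- afcars[-train_idx, ]  # ~3,750 children
```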
Methodological Approach: Model Training Strategy

• Train a collection of 10 models using the training set:

  Model Type                                 Model                                         Interpretability   Computation Time
  Linear Discriminant Analysis Models        Logistic Regression                           High               Low
                                             Partial Least Squares Discriminant Analysis   High               Low
                                             Elastic Net/Lasso                             High               Low
  Non-Linear Classification Models           K-Nearest Neighbors                           Low                High
                                             Neural Networks                               Low                High
                                             Support Vector Machines                       Low                High
                                             Multivariate Adaptive Regression Splines      Moderate           Moderate
  Classification Trees & Rule-Based Models   Classification Tree                           High               High
                                             Boosted Trees                                 Low                High
                                             Random Forest                                 Low                High

• Utilize ROC curves to evaluate the trade-off between:
  1. The true-positive rate (sensitivity)
  2. The false-positive rate (1 - specificity)
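A minimal sketch of this training strategy in caret, abbreviated to two of the ten models; the predictor formula and cross-validation settings are illustrative assumptions, not the presentation's actual specification:

```r
# Fit candidate models on the training set and compare them by ROC AUC.
library(caret)

ctrl <- trainControl(method = "cv", number = 10,
                     classProbs = TRUE,             # needed for ROC
                     summaryFunction = twoClassSummary)

# Exclude the raw placement count: the outcome was derived from it,
# so keeping it as a predictor would leak the answer into the model.
set.seed(2013)
logit_fit <- train(instability ~ . - placements, data = train_set,
                   method = "glm", family = binomial,
                   metric = "ROC", trControl = ctrl)

set.seed(2013)
rf_fit <- train(instability ~ . - placements, data = train_set,
                method = "rf", metric = "ROC", trControl = ctrl)

# Compare resampled ROC values across the fitted models
summary(resamples(list(logistic = logit_fit, random_forest = rf_fit)))
```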
Model Performance on the Test Set

• The 3 models with the highest ROC scores were applied to the test set (3,750 observations). In each table, rows are predicted classes and columns are observed classes.

  Neural Network Model      Observed: 4 or More Placements   Observed: 3 or Fewer Placements
  Predicted: 4 or More      535                              153
  Predicted: 3 or Fewer     302                              2,759

  Random Forest Model       Observed: 4 or More Placements   Observed: 3 or Fewer Placements
  Predicted: 4 or More      537                              157
  Predicted: 3 or Fewer     300                              2,755

  Boosted Tree Model        Observed: 4 or More Placements   Observed: 3 or Fewer Placements
  Predicted: 4 or More      540                              158
  Predicted: 3 or Fewer     297                              2,754

• Overall Accuracy = 87.8% - 87.9%
  – 3 or Fewer Placements = 90.1% - 90.2%
  – 4 or More Placements = 77.4% - 77.8%
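A sketch of the test-set validation step, assuming the rf_fit object from the training sketch above; caret's confusionMatrix() reports the accuracy, sensitivity, and specificity summarized in the bullets:

```r
# Generate class predictions on the held-out test set
pred <- predict(rf_fit, newdata = test_set)

# Cross-tabulate predictions against observed outcomes;
# positive = the placement-instability class
confusionMatrix(pred, test_set$instability, positive = "FourOrMore")
```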
Improving Model Accuracy

• An iterative process involving transforming variables, 'fine-tuning' model parameters, or a combination of both
• 'Fine-tuning' the parameters of the neural network model improved overall accuracy to 88.2%

  Un-tuned Neural Network Model   Observed: 4 or More Placements   Observed: 3 or Fewer Placements
  Predicted: 4 or More            535                              153
  Predicted: 3 or Fewer           302                              2,759

  Tuned Neural Network Model      Observed: 4 or More Placements   Observed: 3 or Fewer Placements
  Predicted: 4 or More            569                              176
  Predicted: 3 or Fewer           268                              2,736
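A sketch of the 'fine-tuning' step, assuming caret's nnet method for the neural network; the grid values are illustrative, not the values used in the presentation:

```r
# Search over nnet's two tuning parameters: hidden units and weight decay
nnet_grid <- expand.grid(size  = c(3, 5, 7, 9),
                         decay = c(0, 0.01, 0.1))

set.seed(2013)
nnet_tuned <- train(instability ~ . - placements, data = train_set,
                    method = "nnet", metric = "ROC",
                    tuneGrid = nnet_grid, trControl = ctrl,
                    trace = FALSE, maxit = 500)

nnet_tuned$bestTune  # parameter combination with the highest ROC
```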
Improving Model Accuracy: Cost-Sensitive Tuning

• Classification trees were re-fit with increasing cost penalties on false negatives (positive class = 4 or more placements):

  Cost Penalty   True Positives   False Positives   False Negatives   True Negatives   Sensitivity   Specificity
  None           515              181               322               2,731            0.615         0.938
  2              620              354               217               2,558            0.741         0.878
  5              756              758               81                2,154            0.903         0.740
  10             790              970               47                1,942            0.944         0.667
  20             803              1,161             34                1,751            0.959         0.601

• Considerable improvements in reducing false negatives, but at the expense of notable increases in the number of false positives (see the sketch below).
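One way to impose cost penalties like those above is rpart's loss matrix; this sketch uses a penalty of 5 on false negatives and the hypothetical objects from the earlier sketches:

```r
# Cost-sensitive classification tree: charge more for false negatives
# (predicting stability for a child who has 4 or more placements).
library(rpart)

# Rows/columns follow factor level order; the first level here is
# "FourOrMore". Entry [i, j] is the cost of classifying an observation
# of class i as class j, so the diagonal must be zero.
loss <- matrix(c(0, 5,    # observed FourOrMore: false negative costs 5
                 1, 0),   # observed ThreeOrFewer: false positive costs 1
               nrow = 2, byrow = TRUE)

tree_cost <- rpart(instability ~ . - placements, data = train_set,
                   parms = list(loss = loss))

pred_cost <- predict(tree_cost, newdata = test_set, type = "class")
table(Predicted = pred_cost, Observed = test_set$instability)
```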
Best Practices for Designing & Implementing Predictive Models

1. Predictive Models Can Improve Upon, but Not Replace, Traditional Decision-Making Processes within Government Agencies.
2. Government Agencies Should Clearly Articulate the Methodological Approach and the Predictive Accuracy of their Models.
3. Consider Opportunities for Incorporating Community Engagement into the Predictive Modeling Process.
~Questions & Feedback~

Dallas Elgin, Research Associate, IMPAQ International
delgin@impaqint.com

Randi Walters, Senior Director of Knowledge Management, Casey Family Programs
Placement Instability

• What is it? Occurs when a child in the care of a child welfare system experiences multiple moves to different placement settings
• Why does it matter? Placement instability can have significant consequences for children:
  – Greater risk of impaired development & psychosocial well-being
  – Greater uncertainty surrounding a child's future
  – Greater likelihood of re-entry and/or emancipation
• Is it a big issue? 25% of foster care children experience three or more moves while in care (Doyle, 2007)
Improving Model Accuracy: Cost-Sensitive Tuning

• False-negative predictions may be unacceptable: a failure to correctly identify placement instability could result in unnecessary exposure to adverse events
• Cost-sensitive models impose cost penalties on false negatives to minimize the likelihood of these predictions
Data

• 2013 Adoption and Foster Care Analysis and Reporting System (AFCARS): Federal data provided by the states on all children in foster care
• Sample: 15,000 foster care children who were in care throughout 2013
  – 77.66% of children in the sample have 3 or fewer placements
  – 22.34% of children in the sample have 4 or more placements
3 Highest Performing Models on the Training Set

• Boosted Trees: build upon traditional classification tree models
  – Fit a sequence of decision trees, each one concentrating on the errors of its predecessors, and then aggregate the trees to form a single predictive model
• Random Forests: build upon traditional classification tree models by utilizing bootstrapping methods to build a collection of independent decision trees
  – Considering a smaller random subset of predictors at each split minimizes the likelihood of a high degree of correlation among the trees
• Neural Networks: resemble the physiological structure of the human brain and nervous system
  – Use multiple layers of interconnected units to process pieces of information
Linear Discriminant Analysis Models

• Utilize linear functions to categorize observations into groups based on predictor characteristics
• Examples: logistic regression, partial least squares discriminant analysis, and Elastic Net/Lasso models
• These models commonly have:
  – A high degree of interpretability
  – A low amount of computational time
Non-Linear Classification Models

• Utilize non-linear functions to categorize observations
• Examples: k-nearest neighbors, neural networks, support vector machines, and multivariate adaptive regression splines
• These models commonly have:
  – Low to moderate interpretability
  – Moderate to high computational time
Classification Trees and Rule-Based Models

• Utilize rules to partition observations into smaller homogeneous groups
• Examples: classification trees, boosted trees, and random forests
• These models commonly have:
  – Low to high interpretability
  – High computational time
Model Performance on the Test Set

• The 3 models with the highest ROC values were applied to the test set of 3,750 children
Identifying Prominent Predictors

  Variable Name                                  Neural Network   Random Forest   Boosted Trees   Average
                                                 Ranking          Ranking         Ranking         Ranking
  Date of Latest Removal                         3                1               1               1.7
  Beginning Date for Current Placement Setting   2                3               4               3.0
  Date of First Removal                          1                5               5               3.7
  Child's Date of Birth                          4                6               6               5.3
  Emotionally Disturbed Diagnosis                8                11              8               9.0
  Discharge Date of Child's Previous Removal     5                12              11              9.3
  Currently Placed in Non-Relative Foster Home   9                14              13              12.0
  Currently Placed in an Institution             6                20              10              12.0
  Number of Days in Current Placement Setting    36               4               3               14.3
  Female Child                                   16               17              18              17.0

• The caret package's variable importance feature provides one option for characterizing the general effects of predictors within predictive models
• The feature was run on the neural network, random forest, and boosted tree models to identify the most important variables (see the sketch below)
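A sketch of the variable importance step with caret's varImp(), assuming the fitted model objects from the earlier sketches:

```r
# Extract model-specific variable importance scores
nnet_imp <- varImp(nnet_tuned)
rf_imp   <- varImp(rf_fit)

print(rf_imp)           # importance scores scaled 0-100 by default
plot(rf_imp, top = 10)  # plot the 10 most important predictors
```

Rankings like those in the table above can then be derived by ordering each model's scores and averaging the ranks across models.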