Decision Trees: an application to LTC Insurance
Thibault ANTOINE, Head of Critical Illness R&D Centre
Guillaume BIESSY, PhD candidate, LTC & Disability R&D Centre
SCOR Global Life, April 2016
Contents
1 Introduction
2 Decision Trees
3 Portfolio study: Example 1
4 Portfolio study: Example 2
Introduction
Some examples of machine learning methods:
- Support Vector Machines
- Boosting
- Decision trees
- Random forests
- Neural networks
- ...
These methods rely on letting the algorithm learn the structure of the data instead of imposing a model.
Each method has its own advantages and drawbacks.
Introduction
Figure 1: Overview of strengths and weaknesses for several methods. Source: Conference on Actuarial & Data Science, SCOR, November 2015.
[Figure not reproduced: a grid rating each algorithm (Association Rules, Sparse Regressions, SVM, Decision Trees, Random Forests, Neural Networks & Deep Learning, K-nearest neighbors) as Very good, Good, Average or Poor on criteria covering the number of variables, distribution hypotheses, missing data, threshold effects, the global hierarchy between variables, predictive power and interpretation of the results.]
Introduction
Popular belief: machine learning is a set of obscure methods that yield a slight gain in accuracy, and only when a tremendous amount of data is available.
When to use machine learning?
- A lot of individuals, but also...
- A lot of covariates
- Heterogeneous covariates
- Limited knowledge about the data
Introduction
Good news: in LTC we have several covariates and limited knowledge about their effects, as LTC is a complex risk. This favors the use of machine learning.
Bad news 1: the amount of data is limited.
Bad news 2: in LTC we work with survival data, which are censored. Machine learning methods cannot be applied directly.
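To illustrate why censored data cannot be handled naively, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration and is not taken from the portfolios: the exponential lifetimes, the fixed censoring time, and the function names are all hypothetical. It compares the plain average of observed durations with an estimate based on exposure and observed deaths.

```python
import random

def simulate_censored(n, mean_lifetime, censor_time, seed=0):
    """Return (observed_time, death_observed) pairs with fixed right-censoring.

    Lifetimes are drawn from an exponential distribution (an assumption
    chosen only to keep the example simple).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        lifetime = rng.expovariate(1.0 / mean_lifetime)
        observed = min(lifetime, censor_time)
        data.append((observed, lifetime <= censor_time))
    return data

def naive_mean(data):
    """Average of observed durations, ignoring the censoring flag."""
    return sum(t for t, _ in data) / len(data)

def exposure_based_mean(data):
    """Total exposure divided by the number of observed deaths.

    Under a constant hazard this is the maximum-likelihood estimate
    of the mean lifetime.
    """
    exposure = sum(t for t, _ in data)
    deaths = sum(1 for _, died in data if died)
    return exposure / deaths

data = simulate_censored(n=10_000, mean_lifetime=5.0, censor_time=3.0)
print(f"naive mean of observed durations: {naive_mean(data):.2f}")
print(f"exposure-based mean lifetime:     {exposure_based_mean(data):.2f}")
```

The naive average can never exceed the censoring time, so it understates the true mean lifetime, whereas the exposure-based estimate uses the censored trajectories as exposure only.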
Introduction
Why use machine learning methods in LTC insurance?
- Get an idea of the relative importance of covariates in a dataset
- Bring added value to portfolio studies
- Increase our knowledge about the risk and the effects of covariates
- Quickly compare several portfolios
Concrete examples:
- Life expectancy based on underlying pathology: classification of pathologies, better knowledge of the risk both qualitatively and quantitatively.
- Probability of becoming disabled given the characteristics of the insured life (age at subscription, gender, amount of premium, level of underwriting, substandard risk): impact of underwriting, adverse selection effects, segmentation.
Decision Trees
Why decision trees?
- Transparency on the choices made by the algorithm at each step
- Intuitive interpretation of the results
Figure 2: Example of a decision tree. The variable of interest is the life expectancy in the disabled state.
Decision Trees
Two families of decision trees:
- Classification tree: the variable to be explained takes a handful of values. Example: incidence of dependency over a given interval.
- Regression tree: the variable to be explained is continuous. Example: life expectancy in dependency.
To obtain a decision tree, there are two main steps:
1. Build the tree
2. Prune the tree
Classification and regression trees use different methods for both steps.
Decision Trees – Build the Tree
For this step, a criterion to create new branches is required.
For the classification tree, an impurity function needs to be defined.
- Popular functions include entropy and the twoing function. Others can be defined; they just need to meet a few criteria (symmetry, unique maximum and minimum).
- A new branch is created if it reduces the global impurity of the tree, i.e. if the size-weighted impurity of the two new leaves is lower than the impurity of the initial leaf.
For the regression tree, the criterion used is the minimization of the quadratic error within each group.
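The two splitting criteria can be sketched as follows. This is a minimal illustration and not the implementation used in the study: the function names and the toy data are hypothetical, and entropy is used as the impurity function (the twoing function is not shown).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

def sse(values):
    """Sum of squared deviations from the mean (quadratic error of a leaf)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_regression_split(x, y):
    """Threshold on x that minimizes the total quadratic error of the two groups."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        error = sse(left) + sse(right)
        if best is None or error < best[1]:
            best = (threshold, error)
    return best

# Hypothetical classification example: a candidate split of eight insured lives.
parent = ["disabled"] * 4 + ["healthy"] * 4
left = ["disabled"] * 3 + ["healthy"]
right = ["disabled"] + ["healthy"] * 3
print(impurity_decrease(parent, left, right))

# Hypothetical regression example: age at onset vs. years survived in LTC.
ages = [70, 75, 80, 85, 90, 95]
years = [6.0, 5.5, 5.0, 2.0, 1.5, 1.0]
print(best_regression_split(ages, years))
```

A split is worth keeping when `impurity_decrease` is positive. For the regression case, the search simply tries every midpoint between consecutive distinct values of the covariate.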
Decision Trees – Pruning
At this step, we want to reduce the size of the tree to avoid overfitting. There are several ways to do this:
- Split the data into a learning sample and a test sample (usually two thirds / one third).
- n-fold validation: the dataset is divided into n samples of roughly equal size. Every sample except one is used as the learning sample and the remaining one as the test sample. We repeat the procedure so that each sample serves once as the test sample, and take the mean error.
The choice of the final tree aims at minimizing either:
- The global mean error
- The global mean error + the standard deviation of this global mean error, computed for the tree which minimizes the global mean error. This approach introduces an extra safety margin and results in even smaller trees where each leaf is significant.
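The selection step can be sketched as below. This is an illustration under stated assumptions, not the procedure actually run on the portfolios: the per-fold errors are made-up numbers, the function names are hypothetical, and the second criterion is read as the usual "one standard error" rule (keep the smallest tree whose mean error does not exceed the minimum mean error plus its standard deviation). The trees themselves are not fitted here.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices and split them into k folds of roughly equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_error(fold_errors):
    """Mean error over the folds and the standard error of that mean."""
    mean = statistics.mean(fold_errors)
    std_err = statistics.stdev(fold_errors) / len(fold_errors) ** 0.5
    return mean, std_err

def select_tree(candidates, use_one_se_rule=True):
    """Pick a tree size from a list of (n_leaves, fold_errors) pairs.

    Without the rule: the size with the lowest mean error.
    With the rule: the smallest size whose mean error is within one
    standard error of that lowest mean error.
    """
    summary = [(n, *cross_validated_error(errs)) for n, errs in candidates]
    best = min(summary, key=lambda s: s[1])
    if not use_one_se_rule:
        return best[0]
    threshold = best[1] + best[2]
    return min(n for n, mean, _ in summary if mean <= threshold)

# Hypothetical test errors over 5 folds for trees of increasing size.
candidates = [
    (2, [0.30, 0.32, 0.31, 0.29, 0.33]),
    (4, [0.22, 0.25, 0.21, 0.24, 0.23]),
    (8, [0.21, 0.24, 0.20, 0.25, 0.22]),
    (16, [0.24, 0.27, 0.23, 0.28, 0.25]),
]
print(select_tree(candidates, use_one_se_rule=False))
print(select_tree(candidates, use_one_se_rule=True))
```

On these made-up errors the second criterion picks a smaller tree than the first, which is the extra safety margin described above.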
Portfolio study: Example 1 – Portfolio of annuitants
Available covariates:
- Time survived in LTC (variable of interest, can be censored)
- Age at onset of dependency
- Gender
Characteristics of the portfolio:
- Total exposure: 49,170 person-years
- Non-censored trajectories: 70.6 %
DISCLAIMER:
- For confidentiality purposes, the scale of the figures has been changed
- The figures have been obtained on a subset of our portfolios
- The definitions used are not “standard”
Portfolio study: Example 1 – Data representation
Portfolio study: Example 1 – Tree
Figure labels: time spent in LTC, number of deaths, number of insured lives in the cluster.
Portfolio study: Example 1 – Full Tree
Portfolio study: Example 1 – Pruned Tree
Portfolio study: Example 1 – Over-pruned Tree
Portfolio study: Example 2 – Portfolio of annuitants
Available covariates:
- Time survived in the disabled state (variable of interest)
- Age at onset of dependency
- Gender
- Pathology (11 categories)
- Amount of annuity bought
- Residence (home / institution)
- Level of premium for substandard risk
- Level of medical underwriting (void or basic)
Characteristics of the portfolio:
- Total exposure: 9,175 person-years
- Non-censored trajectories: 61.8 %
DISCLAIMER:
- For confidentiality purposes, the scale of the figures has been changed
- The figures have been obtained on a subset of our portfolios
- The definitions used are not “standard”
Portfolio study: Example 2 – Data representation
Portfolio study: Example 2 – Full Tree on pathologies
Portfolio study: Example 2 – Pruned Tree on pathologies
Portfolio study: Example 2 – Full Tree, all variables
Portfolio study: Example 2 – Pruned Tree, all variables
Conclusion
- At the very least, decision trees are an interesting tool for the actuary.
- Results are easy to share, especially with non-actuaries.
- These methods can also be applied to other risks, among them disability.
- Disability offers a better context in which to apply these methods: more data, more covariates, limited benefit period.
Thank you for your attention!