Survival Data Mining using Enterprise Miner and Proportional Hazard Cox Model
25th June 2015, Manchester – UK
Professor Jorge Ribeiro Patrick Ribeiro
- Enterprise Miner 13.2: Survival Analysis Node
- SAS/ETS Econometrics Time Series: PROC ARIMA, PROC AUTOREG
- SAS/OR Operational Research: Simulation Studio 13.2
Model 1 - Time to Next Purchase: Survival Discrete Model
Enterprise Miner 13.2: Survival Analysis Node
1.1 - Model 1 - Time to Next Purchase: Survival Discrete Model
1.2 - Model 1 - Time to Next Purchase: Steps Plan
"People are much more likely to get on a bus if they know where it is going."
Final Model - Hazard Function
Final Model - Benefit graph
Final Model
SAS/ETS – Econometrics Time Series: PROC ARIMA / PROC AUTOREG
SAS/ETS – Econometrics Time Series: The Cross-Correlation Function
[Figure: cross-correlation between monthly series L and H at lags 1, 2 and 4]
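As a rough illustration of what a cross-correlation at a given lag measures, the sketch below correlates one series with a lagged copy of another. It is a simplified pairwise Pearson version, not the estimator PROC ARIMA uses (statistical packages normally apply a slightly different normalisation), and the series values and the names `leading`, `target` and `lagged_correlation` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def lagged_correlation(leading, target, lag):
    """Correlate target[t] with leading[t - lag] for a non-negative lag."""
    if lag == 0:
        return pearson(leading, target)
    return pearson(leading[:-lag], target[lag:])


# Hypothetical monthly data: `target` repeats `leading` two months later.
leading = [3.0, 5.0, 4.0, 8.0, 6.0, 9.0, 7.0, 10.0, 12.0, 11.0, 14.0, 13.0]
target = [0.0, 0.0] + leading[:-2]

# Same lags as on the slide; the correlation peaks at the true delay (lag 2).
for lag in (1, 2, 4):
    print(lag, round(lagged_correlation(leading, target, lag), 3))
```

A peak at a particular lag suggests how many periods one series leads the other, which is what guides the choice of lagged inputs in a transfer-function model.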
SAS/ETS – Econometrics Time Series: PROC ARIMA / PROC AUTOREG
Primary Event Variables
- Point/Pulse: Royal Wedding, Bank Holiday
- Step: Price
- Ramp: Marketing Campaign
[Figure: shape of each event variable around the event time t_event]
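The three event shapes can be sketched as dummy variables that are later supplied as inputs to a time series model. This is a minimal illustration using one common convention (the ramp starts counting in the period after the event; other conventions exist). The function names and the event position are hypothetical and do not come from the slides.

```python
def pulse(n, event_at):
    """Point/pulse: 1 in the event period only, 0 elsewhere."""
    return [1 if t == event_at else 0 for t in range(n)]


def step(n, event_at):
    """Step: 0 before the event, 1 from the event period onwards."""
    return [1 if t >= event_at else 0 for t in range(n)]


def ramp(n, event_at):
    """Ramp: 0 up to and including the event, then rising by 1 per period."""
    return [max(0, t - event_at) for t in range(n)]


# Hypothetical example: 8 periods, event in period index 3.
print(pulse(8, 3))  # one-off shock, e.g. a bank holiday
print(step(8, 3))   # permanent level shift, e.g. a price change
print(ramp(8, 3))   # gradually building effect, e.g. a marketing campaign
```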
SAS/OR – Operational Research: Simulation Studio 13.2
2 - Model 2 - Call Centre Demand Model
2.1 - Model 2 – Call Centre (Wait Time Max = 90, Wait Time Goal = 30)
2.2 - Model 2 – Call Centre (Wait Time Max = 90, Wait Time Goal = 30)
2.3 - Model 2 – Call Centre (Wait Time Max = 90, Wait Time Goal = 30)
2.4 - Model 2 – Call Centre (Wait Time Max = 90, Wait Time Goal = 30)
2.5 - Model 2 – Call Centre
3.1 - Model 3 – Stress Test and Scenario Analysis
62 days for data preparation; 6 days for modelling
Step 1 – Economic variables
- Unemployment
- GDP
- Inflation
- Cash rate
- Credit availability
- House prices
- Commercial property prices
- Commodity prices
- Swap rates
- Equity prices
... Cox Proportional Hazards Model

$$h_i(t) = h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_k X_{ik}}$$

- $h_0(t)$: baseline hazard function; involves time but not the predictor variables
- $\beta_1 X_{i1} + \dots + \beta_k X_{ik}$: linear function of a set of predictor variables; does not involve time
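A short derivation may help explain why the exponentiated coefficients are read as hazard ratios. It is a standard consequence of the model form and is written here with the same symbols; the customer index $j$, the variable index $m$ and the increment $c$ are introduced only for illustration.

```latex
\[
\frac{h_i(t)}{h_j(t)}
  = \frac{h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_k X_{ik}}}
         {h_0(t)\, e^{\beta_1 X_{j1} + \dots + \beta_k X_{jk}}}
  = e^{\beta_1 (X_{i1} - X_{j1}) + \dots + \beta_k (X_{ik} - X_{jk})}
\]
\[
X_{i1} = X_{j1} + c,\quad X_{im} = X_{jm}\ (m \ge 2)
\;\Longrightarrow\;
\frac{h_i(t)}{h_j(t)} = e^{c\,\beta_1}
\]
```

The baseline hazard cancels, so the ratio does not depend on time. A difference of $c$ units in one predictor, with the others held fixed, multiplies the hazard by $e^{c\,\beta_1}$.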
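Step 3 – Model

    PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);   /* robust sandwich variance, aggregated over the ID variable */
      CLASS Risk;
      MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;   /* counting-process input; DEFAULT = 0 marks censored intervals */
      ID CUSTOMER_ID;
      HAZARDRATIO Risk / DIFF=REF;                    /* each band compared with the reference band */
      HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
      HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
    RUN;

Mapping of PD_Band to Risk:

    PD_Band     Risk
    1 to 5      1
    6 to 11     5
    12 to 16    09
    17 to 18    12
    19 to 20    15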
SAS Results
For each 1-unit increase in GDP, the hazard of default goes down by an estimated 16.7%.
$e^{-0.18257} = 0.833$, so $100 \times (0.833 - 1) = -16.7\%$.
SAS Results
For each 1-unit increase in Unemployment, the hazard of default increases by an estimated 25.5%.
$e^{0.22684} = 1.255$, so $100 \times (1.255 - 1) = 25.5\%$.
SAS Results
A customer in Band 01 has only 8.7% of the risk of default of a customer in Band 15, the reference band (that is, a 91.3% lower risk).
$e^{-2.44279} = 0.087$, so $100 \times (0.087 - 1) = -91.3\%$.
Hazard ratio (Band 01 vs Band 15) = 0.087
SAS Results
A customer in Band 01 has only 8.7% of the risk of default of a customer in Band 15, the reference band (that is, a 91.3% lower risk). The comparison is requested with the statement HAZARDRATIO Risk / DIFF=REF;
Hazard ratio (Band 01 vs Band 15) = 0.087, so $100 \times (0.087 - 1) = -91.3\%$.
SAS Results

    PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);
      CLASS Risk (PARAM=REF REF='15');
      MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;
      ID CUSTOMER_ID;
      HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
      HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
    RUN;

Output 7 (P1GDP, increases of 2, 3 and 5 units):
$100 \times (0.694 - 1) = -30.6\%$
$100 \times (0.578 - 1) = -42.2\%$
$100 \times (0.401 - 1) = -59.9\%$
SAS Results

    PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);
      CLASS Risk (PARAM=REF REF='15');
      MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;
      ID CUSTOMER_ID;
      HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
      HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
    RUN;

Output 8 (UNEMPLOYMENT, increases of 2, 3 and 5 units):
$100 \times (1.574 - 1) = 57.4\%$
$100 \times (1.975 - 1) = 97.5\%$
$100 \times (3.109 - 1) = 210.9\%$
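The hazard ratios in Output 7 and Output 8 can be reproduced by hand from the two coefficients quoted on the earlier results slides, since a change of c units multiplies the hazard by exp(c × coefficient). The sketch below is only a cross-check of that arithmetic, not a replacement for PROC PHREG; the names `COEFFICIENTS`, `hazard_ratio` and `percent_change` are hypothetical.

```python
from math import exp

# Coefficients as quoted on the results slides (estimated by PROC PHREG).
COEFFICIENTS = {"P1GDP": -0.18257, "UNEMPLOYMENT": 0.22684}

# Same increments as the UNITS= option of the HAZARDRATIO statements.
UNITS = (1, 2, 3, 5)


def hazard_ratio(beta, units):
    """Hazard ratio for an increase of `units` in one covariate."""
    return exp(beta * units)


def percent_change(hr):
    """Percentage change in the hazard implied by a hazard ratio."""
    return 100.0 * (hr - 1.0)


for name, beta in COEFFICIENTS.items():
    for units in UNITS:
        hr = hazard_ratio(beta, units)
        print(f"{name:13s} +{units}: HR = {hr:.3f}, change = {percent_change(hr):+.1f}%")
```

Because the effect compounds multiplicatively, a 5-unit increase is the 1-unit hazard ratio raised to the fifth power rather than five times the 1-unit percentage.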
Survival Function: Scenario Analysis 1 and Scenario Analysis 2
Scenario settings shown on the slide: P1GDP = 0.8 and P1GDP = 1.1; Unemployment = 6 and Unemployment = 10.
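Under a Cox model, a scenario changes the survival curve through the relation S(t | x) = S0(t) ^ exp(x'β), where S0(t) is the baseline survival function. The sketch below illustrates that mechanism with the two macro-economic coefficients quoted on the results slides. The baseline survival values are hypothetical, the pairing of the scenario values (0.8 with 6, 1.1 with 10) is an assumption, and the Risk band is held at its reference level, so the numbers are not the deck's results.

```python
from math import exp

# Coefficients quoted on the results slides.
BETA = {"P1GDP": -0.18257, "UNEMPLOYMENT": 0.22684}


def relative_hazard(covariates):
    """exp(x'beta) for the macro-economic covariates only."""
    return exp(sum(BETA[name] * value for name, value in covariates.items()))


def scenario_survival(baseline_survival, covariates):
    """Cox model: S(t | x) = S0(t) ** exp(x'beta)."""
    rh = relative_hazard(covariates)
    return [s0 ** rh for s0 in baseline_survival]


# HYPOTHETICAL baseline survival curve (all covariates at zero); not from the slides.
baseline = [1.00, 0.99, 0.98, 0.96, 0.95]

# ASSUMED pairing of the scenario values shown on the slide.
scenario_a = {"P1GDP": 0.8, "UNEMPLOYMENT": 6}
scenario_b = {"P1GDP": 1.1, "UNEMPLOYMENT": 10}

curve_a = scenario_survival(baseline, scenario_a)
curve_b = scenario_survival(baseline, scenario_b)

for period, (a, b) in enumerate(zip(curve_a, curve_b)):
    print(period, round(a, 4), round(b, 4))
```

In SAS the equivalent curves would normally come from PROC PHREG itself, for example through a BASELINE statement with a covariate data set describing each scenario.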
Forecast under Scenario
Go Further
- Introduction to Survival Analysis using PH Cox Models
- Applying Survival Analysis for Business
- Survival Data Mining: Programming Approach
- Survival Data Mining Using Enterprise Miner
Go Further – Books
Questions
Web page: www.modellingtraining.com
Email: info@modellingtraining.com
Tel: 01943 430241, 07880 474564
- SAS code
- Results
- PDF