
Predicting Carrier Load Cancellations - PowerPoint PPT Presentation

Predicting Carrier Load Cancellations. Authors: Ali Al-Habib, Nicolas Favier. Advisor: Dr. Christopher Mejia. MIT Center for Transportation & Logistics, Research Fest, May 22, 2018.


  1. Predicting Carrier Load Cancellations. Authors: Ali Al-Habib, Nicolas Favier. Advisor: Dr. Christopher Mejia. MIT Center for Transportation & Logistics, Research Fest, May 22, 2018.

  2. Agenda
     1 INTRODUCTION: Trucking industry background & load cancellation impacts
     2 DATA ANALYSIS: Descriptive analytics of load cancellation over three-year dataset
     3 MODELING: Predictive models applied on the dataset to identify main cancellation drivers
     4 RESULTS: Models results presented in confusion matrices and results analysis
     5 CONCLUSION: Recommended actions and future research challenges

  3. 1 INTRODUCTION

  4. Motivation
     • 400 million truckloads
     • 185 million FTL truckloads
     • 32 million cancellations
     • ~$145 per cancellation
     • Estimated impact ≅ $4.6B per year
     Source: Freight Facts and Figures, by U.S. Department of Transportation Bureau of Transportation Statistics 2015; CSCMP's Annual State of Logistics Report, by AT Kearney; & data analysis from the sponsor company
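
The estimated impact on the Motivation slide appears to be the product of the cancellation count and the per-cancellation cost. The short check below reproduces it; the cancellation share is derived here and assumes the 32 million cancellations are counted against the 185 million FTL truckloads, which the slide does not state.

```python
# Figures taken from the Motivation slide.
cancellations_per_year = 32_000_000   # cancelled truckloads per year
cost_per_cancellation = 145           # USD, approximate ("~$145")
ftl_truckloads = 185_000_000          # full-truckload (FTL) shipments per year

# Estimated yearly impact = cancellations x cost per cancellation.
estimated_impact = cancellations_per_year * cost_per_cancellation
print(f"Estimated impact: ${estimated_impact / 1e9:.2f}B per year")

# Assumption (not on the slide): cancellations are a subset of FTL truckloads.
cancellation_share = cancellations_per_year / ftl_truckloads
print(f"Cancellations as a share of FTL truckloads: {cancellation_share:.1%}")
```

The product is $4.64B, which rounds to the ≅ $4.6B shown on the slide.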

  5. Process
     • 3-Year Dataset of Full Truckloads: 3.6M records of full truckload during 2015, 2016, 2017
     • Main Drivers for Truckload Cancellations: Descriptive analytics to identify the main cancellation drivers
     • Predictive Model to Predict Cancellation Probability: Evaluating different models to predict future loads cancellations

  6. Potential Cancellation Drivers (cause-and-effect diagram). Driver groups shown include time, load characteristics, trip characteristics, shipper characteristics, carrier characteristics, carrier history, internal factors, and external factors. Drivers named on the diagram include appointment time, day of the week, load type, book lead time, facility dwell time, loading and unloading facility time, service level, contract, equipment type, length of haul, high-value and high-risk shipments, weight, origin, destination, number of stops, deadhead, empty time, on-time pickup, on-time delivery, load cost, load rate, spot price, shipper ID, shipper industry, shipper size, shipper length of relationship, carrier rep, rep tenure, weather, natural disaster, geography, number of claims/incidence, carrier issues, safety rating, bounce/carrier, carrier ID, carrier size, carrier type, loads/year, and carrier length of relationship.

  7. 2 DATA ANALYSIS

  8. Behavior Over Time (chart): Cancellation ratios over time, monthly from 2015-1 to 2017-10, on a 0% to 25% axis. Series: Contract Cancellation Ratio, Spot Cancellation Ratio, Total Cancellation Ratio.

  9. Location Factor (map): Loads & cancellation ratios by city

  10. Shippers & Carrier Factors (charts): Cancellation ratios by shipper industry; cancellation ratios by carrier length of relation with the company

  11. Time Factors (charts): Cancellation ratios by duration between booking & load pickup; cancellation ratios by day of the week; cancellation ratios by pickup time

  12. 3 MODELING

  13. Data Preparation
      1 Load-Level Data: Convert data from stop to load level data
      2 Outliers Processing: Remove outlier records to avoid undesired impact
      3 Correlation: Remove correlated attributes using Correlation & Multi-Collinearity Analysis
      4 Predictor Screening: Identify the most significant predictors in the data
      5 Build the Model: Build multiple models to predict cancellations & assess results
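
The correlation step in the preparation list above can be illustrated with a minimal sketch. It is not the authors' procedure: the attribute names, the sample values, and the 0.9 cutoff are all assumptions chosen for illustration, and the slide does not say which correlation measure was used (Pearson is assumed here).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Keep attributes in order, skipping any that correlate strongly
    (|r| > threshold) with an attribute that was already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical load-level attributes (illustrative values only).
columns = {
    "miles":     [100, 250, 400, 600, 900],
    "load_cost": [300, 700, 1100, 1700, 2500],  # nearly proportional to miles
    "num_stops": [2, 4, 2, 3, 2],
}

kept = drop_correlated(columns)
print(kept)  # ['miles', 'num_stops']
```

In this toy data `load_cost` is dropped because it is almost a linear function of `miles`, while `num_stops` is retained.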

  14. Modeling
      • Logistic Regression: Used as Main Model; Categorical Output; Self-Explanatory
      • Machine Learning (multiple algorithms: Neural Networks, K-Nearest Neighbor, Random Forest): Harder to Explain; Used to Validate Logistic Regression Results

  15. 4 RESULTS

  16. Available Dataset: Predictor Screening and Model Results

      Confusion matrix (Available Dataset):

                      Predicted No   Predicted Yes     Total
        Actual No          652,501           2,956   655,457
        Actual Yes         129,727           1,971   131,698
        Total              782,228           4,927   787,155

      Error: 16.86%
      Missed Bounces: 98.50%

      Model                  Error     Missed Bounces
      Neural Networks        16.73%    99.95%
      Random Forest          16.61%    99.48%
      K-Nearest Neighbor     19.90%    84.44%
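
The slide does not define "Error" and "Missed Bounces". The definitions in the sketch below are inferred, on the grounds that they reproduce the slide's percentages from its own confusion matrix: error as the share of all loads that were misclassified, and missed bounces as the share of actual cancellations predicted as "No".

```python
def confusion_metrics(tn, fp, fn, tp):
    """Return (error, missed_bounces) from confusion-matrix counts.

    error          = misclassified loads / all loads
    missed_bounces = cancelled loads predicted "No" / all cancelled loads
    (Both definitions are inferred from the slide's numbers, not stated on it.)
    """
    total = tn + fp + fn + tp
    error = (fp + fn) / total
    missed_bounces = fn / (fn + tp)
    return error, missed_bounces

# Counts from the Available Dataset confusion matrix.
error, missed = confusion_metrics(tn=652_501, fp=2_956, fn=129_727, tp=1_971)
print(f"Error: {error:.2%}, Missed Bounces: {missed:.2%}")
```

With these counts the function gives 16.86% and 98.50%, matching the slide.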

  17. Data Enrichment: Cancellation Ratios and Severe Weather Data* added to form the Enriched Dataset
      • Example: Carrier (80887) & City (Rochelle) Bounce Ratio = 1/12 = 0.08333
      • Repeated loads are counted only once for the ratio calculation
      • The CarrierCity Bounce Ratio is averaged over each stop to give an aggregated carrierCityBounce Ratio on load level
      *Source: National Centers for Environmental Information
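
A minimal sketch of the carrier-city bounce ratio described on this slide follows. The record layout, the load IDs, the Chicago rows, and the fallback of 0.0 for unseen carrier-city pairs are assumptions made for illustration; only the Rochelle example (1 bounce in 12 loads) comes from the slide.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stop-level history: (load_id, carrier_id, city, bounced).
# Carrier 80887 in Rochelle has 12 distinct loads, 1 of which bounced.
# Load "R1" appears twice to show that repeated loads are counted once.
history = [("R1", 80887, "Rochelle", True), ("R1", 80887, "Rochelle", True)]
history += [(f"R{i}", 80887, "Rochelle", False) for i in range(2, 13)]
history += [("C1", 80887, "Chicago", False), ("C2", 80887, "Chicago", True)]

def carrier_city_ratios(rows):
    """Bounce ratio per (carrier, city), counting each load ID once."""
    loads = defaultdict(dict)  # (carrier, city) -> {load_id: bounced}
    for load_id, carrier, city, bounced in rows:
        loads[(carrier, city)][load_id] = bounced
    return {key: sum(seen.values()) / len(seen) for key, seen in loads.items()}

def load_level_ratio(carrier, stop_cities, ratios):
    """Average the carrier-city ratio over a load's stops.
    Unseen pairs fall back to 0.0 (an assumption, not from the slide)."""
    return mean(ratios.get((carrier, city), 0.0) for city in stop_cities)

ratios = carrier_city_ratios(history)
print(round(ratios[(80887, "Rochelle")], 5))  # 0.08333
print(load_level_ratio(80887, ["Rochelle", "Chicago"], ratios))
```

Keying the inner dictionary by load ID is what removes repeated loads before the ratio is taken.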

  18. Enriched Dataset: Predictor Screening and Model Results

      Confusion matrix (Enriched Dataset):

                      Predicted No   Predicted Yes     Total
        Actual No          638,652          16,880   655,532
        Actual Yes          52,155          79,468   131,623
        Total              690,807          96,348   787,155

      Error: 8.77%
      Missed Bounces: 39.62%

      Model                  Error     Missed Bounces
      Neural Networks        8.67%     39.04%
      Random Forest          8.70%     42.13%
      K-Nearest Neighbor     9.33%     44.32%

  19. Additional Dataset
      • Dataset (~3-year data): Cancellation Ratios Calculation (100%); Training (80%); Testing (20%)
      • Additional 3-month data: Ratios Testing

      Confusion matrix (Enriched Dataset):

                      Predicted No   Predicted Yes     Total
        Actual No          638,652          16,880   655,532
        Actual Yes          52,155          79,468   131,623
        Total              690,807          96,348   787,155

      Error: 8.77%
      Missed Bounces: 39.62%

      Confusion matrix (New Dataset):

                      Predicted No   Predicted Yes     Total
        Actual No           59,883           3,735    63,618
        Actual Yes           8,903           1,722    10,625
        Total               68,786           5,457    74,243

      Error: 17.02%
      Missed Bounces: 83.79%

      Model                  Error     Missed Bounces
      Neural Networks        16.78%    84.70%
      Random Forest          16.19%    87.98%
      K-Nearest Neighbor     16.41%    86.66%

  20. Unpredictability Testing
      • Available Historical Data: the additional 3-month data is split into > 10 Historical Records (33%) and <= 10 Historical Records (67%)
      • Prediction Time Horizon: the same split, narrowed to a 7-day Horizon (3%)

      Confusion matrix (Available Historical Data):

                      Predicted No   Predicted Yes     Total
        Actual No           21,449             368    21,817
        Actual Yes           2,222             542     2,764
        Total               23,671             910    24,581

      Error: 10.54%
      Missed Bounces: 80.39%

      Confusion matrix (Prediction Time Horizon):

                      Predicted No   Predicted Yes     Total
        Actual No            2,147              31     2,178
        Actual Yes             176              44       220
        Total                2,323              75     2,398

      Error: 8.63%
      Missed Bounces: 80.00%

  21. Multiple Clusters, Multiple Models

      Scenario                                               Test Error   Missed Bounces
      Logistic Regression (Threshold=0.5) - Base Scenario    17.02%       83.79%

      Cost Clustering
        Low Cost (<= $500)                                   18.20%       99.06%
        Mid Cost                                             16.67%       98.46%
        High Cost (>= $6000)                                 8.49%        100.00%

      Miles Clustering
        Same day delivery (<= 250 mi)                        16.07%       99.18%
        Next Day delivery                                    18.08%       98.18%
        Long Haul (>= 550 mi)                                18.08%       98.18%

      Book To Pickup Hours Clustering
        Less than 24h                                        8.53%        100.00%
        Between 24h and 48h                                  16.91%       100.00%
        Between 48h and 72h                                  20.58%       99.99%
        More than 72h                                        22.33%       99.58%

  22. Threshold Sensitivity Analysis

  23. Threshold Sensitivity Analysis (chart): x-axis is the threshold, from 0 to 0.60; left axis is loads, from 0 to 40,000; right axis is the % of bounces predicted correctly out of total bounces, from 0% to 80%. Series: FN (Missed Bounces), FP (Missed Not Bounces), Bounces Predicted Correctly (%).
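
The three series on the threshold chart can be computed with a simple sweep over candidate thresholds. The sketch below is illustrative only: the scores and outcomes are made up, and the rule "predict a bounce when the probability is at or above the threshold" is an assumption about how the threshold is applied.

```python
def sweep(probabilities, actual, thresholds):
    """For each threshold return (threshold, FN, FP, share of bounces caught).

    A load is predicted to bounce when its probability >= threshold
    (assumed decision rule).
    """
    total_bounces = sum(actual)
    rows = []
    for t in thresholds:
        predicted = [p >= t for p in probabilities]
        fn = sum(a and not p for a, p in zip(actual, predicted))  # missed bounces
        fp = sum(p and not a for a, p in zip(actual, predicted))  # false alarms
        caught = (total_bounces - fn) / total_bounces
        rows.append((t, fn, fp, caught))
    return rows

# Hypothetical model scores and outcomes (True = load was cancelled).
probabilities = [0.05, 0.12, 0.18, 0.22, 0.35, 0.40, 0.55, 0.08, 0.15, 0.65]
actual = [False, False, True, False, True, False, True, False, True, True]

results = sweep(probabilities, actual, [0.10, 0.17, 0.50])
for t, fn, fp, caught in results:
    print(f"threshold={t:.2f}  FN={fn}  FP={fp}  bounces caught={caught:.0%}")
```

Lowering the threshold reduces FN and raises FP, which is the tradeoff the chart visualizes.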

  24. 5 CONCLUSION

  25. Next Steps
      Threshold Change
      • Use the model with lower threshold (0.17)
      • Predict up to 42% of cancelled loads
      • Tradeoff ratio 4 : 1 (predicted cancellation : actual cancellation)
      Further Research
      • Surveys to capture range of cancellation reasons
      • Record actual reasons for each cancellation
      • Capture details related to these reasons
      • Record additional information for each load:
        • Loads sequence at truck level
        • Carrier booked capacity
        • Rejection rate

  26. Challenges: Load Sequence Scenario; Overbooking Scenario (diagrams showing Company A, Company B, Company C, and the Selected Route)

  27. Thank You! Q&A. Ali Al-Habib, Nicolas Favier
