predicting real time transaction fraud

Predicting Real-Time Transaction Fraud. Sami Niemi, PhD, Barclays



  1. Predicting Real-Time Transaction Fraud Sami Niemi, PhD Barclays, Quantitative Analytics, Fraud Detection #StrataData - Predicting real-time transaction fraud using supervised learning

  2. Contents: 1. Background; 2. Raw Data; 3. Data Processing; 4. Development; 5. Validation; 6. Implementation; 7. Summary

  3. Background – Definitions and Examples
     • 3rd Party Fraud: an individual, or group of people, create or use a third party's identity in order to apply for products or take over an account without the consent or knowledge of the third party.
     • Card Transaction Fraud:
       – Card Present (CP), e.g. lost, stolen, counterfeit/clone
       – Card not Present (CnP), e.g. identity theft, hacking, fake online shops

  4. Background – Motivation (global view)

  5. Background – Motivation (UK view). Source: Fraud the Facts 2018 by UK Finance

  6. Background – Challenges
     • Fraudsters Adapt and Invent New MOs
     • Real-Time Runtime Requirements
     • Front Page News Material

  7. The aim of the project was to develop and implement new Debit CP and CnP real-time fraud detection models that can reduce fraud losses and protect genuine customers.

  8. Raw Data – Sources
     • Debit Card Transactions
     • Non-Mon Events
     • Confirmed Frauds
     • Other Cards and Accounts
     • Payment Instrument Info
     • Customer Info
     • Account Info

  9. Data Processing – Quality Assurance and Data Exploration
     • Data Quality
       – Reconciliation, Volumes, and Amounts
       – Daily and Monthly Summary Statistics
       – Anomaly and Outlier Detection
     • Exploration
       – Trend Analysis and Anomaly Detection
       – Distributions (PDFs and bar charts for Fraud / Non)
       – Correlations (covariance, correlation w/ target, etc.)
       – Thresholding
     • Report
       – Issue Generation and Resolution
       – Documentation and Governance

  10. Data – High Level Statistics
     • Volumes
       – Total: 220–300M debit card transactions with a total value of £9–11B per month
       – CP: 110M contactless, 20M ATM
       – CnP: 85M e-commerce + telephony
     • Customers
       – 10M unique customers per month
       – transacting in 220 countries
       – using 12M debit cards
       – with 1.9M different merchants
     • Frauds (fraud rates)
       – CP: less than 0.01%, depending on segment
       – CnP: less than 0.15%, depending on segment

  11. Development – Datasets (Debit, 14 months)
     • CP: Train 12 months; OOT 2 recent months; Sample 45M transactions
     • CnP: Train 12 months; OOT 2 recent months; Sample 55M transactions
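A minimal sketch of a 12 + 2 month split like the one on this slide, reading OOT as "out-of-time" (an assumption; the slide does not expand the abbreviation). The record layout, the `split_out_of_time` name, and the cutoff date are all hypothetical and only illustrate holding out the most recent months.

```python
from datetime import date


def split_out_of_time(transactions, cutoff):
    """Split records into (train, oot) by transaction date.

    Records dated before `cutoff` go to train; the rest form the
    out-of-time set, so the model is evaluated on months it never saw.
    """
    train = [t for t in transactions if t["date"] < cutoff]
    oot = [t for t in transactions if t["date"] >= cutoff]
    return train, oot


# Hypothetical toy data: one transaction per month over 14 months.
transactions = [
    {"id": i, "date": date(2017 + (i // 12), (i % 12) + 1, 15)}
    for i in range(14)
]

# First 12 months for training, the 2 most recent months held out.
train, oot = split_out_of_time(transactions, cutoff=date(2018, 1, 1))
```

Splitting by time rather than at random keeps future behaviour out of the training data, which matters when fraud patterns change over time.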

  12. Data Processing – Feature Engineering and many more (e.g. merchant)… finally, ratios between values and current transaction.

  13. Development – Feature Selection
     • Univariate (20k)
       – Remove zero or extremely low variance
       – Remove if all or extremely high level of missing values
       – Very low Information Value or Spearman rank correlation
     • Model (10k)
       – Lasso coefficient importance
       – Random Forest feature importance
     • Wrapper (1k)
       – Recursive Feature Elimination
     • Domain (500)
       – Business Review
       – Implementation Considerations
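The univariate stage can be sketched as a simple filter on variance and missing-value share. This is an illustration only: the `univariate_filter` name, the column layout, and both thresholds are assumptions, since the slide gives no cut-off values, and the Information Value and Spearman checks are left out for brevity.

```python
from statistics import pvariance


def univariate_filter(columns, min_variance=1e-8, max_missing=0.9):
    """Return names of features that pass the variance and missing-value checks.

    `columns` maps a feature name to a non-empty list of values,
    with None marking a missing value. Thresholds are hypothetical.
    """
    kept = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        missing_share = 1 - len(present) / len(values)
        if missing_share > max_missing:
            continue  # all or almost all values are missing
        if len(present) < 2 or pvariance(present) <= min_variance:
            continue  # zero or extremely low variance
        kept.append(name)
    return kept


# Hypothetical toy feature table.
columns = {
    "amount": [1.0, 5.0, 2.5, 9.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],
    "never_populated": [None, None, None, None],
}
selected = univariate_filter(columns)
```

Cheap filters like these run first because they need no model fit, which keeps the later model-based and wrapper stages tractable.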

  14. Development – Feature Selection & Business Review. Debit CP model feature: ratio of current transaction amount and maximum contactless in last X days. [Figure legend: Genuine, Fraud]
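One way to read this feature is the current amount divided by the largest contactless amount seen in the preceding X days. The sketch below assumes that reading; the record fields, the `amount_to_max_ratio` name, and the choice to return None when there is no history are hypothetical.

```python
from datetime import datetime, timedelta


def amount_to_max_ratio(current, history, window_days):
    """Ratio of the current amount to the max contactless amount in the window.

    Only transactions strictly before the current one and no older than
    `window_days` are considered. Returns None when none qualify.
    """
    start = current["time"] - timedelta(days=window_days)
    prior = [
        t["amount"]
        for t in history
        if t["contactless"] and start <= t["time"] < current["time"]
    ]
    if not prior:
        return None
    return current["amount"] / max(prior)


# Hypothetical history for a single card.
now = datetime(2018, 3, 10, 12, 0)
history = [
    {"time": now - timedelta(days=1), "amount": 5.0, "contactless": True},
    {"time": now - timedelta(days=3), "amount": 10.0, "contactless": True},
    # Ignored: not contactless.
    {"time": now - timedelta(days=2), "amount": 200.0, "contactless": False},
    # Ignored: outside the 7-day window.
    {"time": now - timedelta(days=30), "amount": 25.0, "contactless": True},
]
current = {"time": now, "amount": 30.0, "contactless": True}

ratio = amount_to_max_ratio(current, history, window_days=7)  # 30.0 / 10.0
```

In a real-time setting the windowed maximum would normally be maintained incrementally rather than recomputed from raw history on every transaction.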

  15. Development – Model Development Cycle: Select Features, Pick Model, Train, Evaluate, Review and Optimize

  16. Development – Hyper-parameter Optimization. Example of Bayesian hyper-parameter optimization using hyperopt. [Figure axis labels: Performance, Number of Iterations, Fraction of Features in a Split]

  17. Validation – CP Model Performance
     • Precision Recall Curve: AUC ~0.23
     • ROC Curve: AUC ~0.95
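The gap between the two AUC values is typical of heavily imbalanced data. The sketch below uses made-up scores, not the model's, and computes ROC AUC by pairwise ranking and average precision as one common estimate of the area under the precision-recall curve. How the slide's figures were computed is not stated, so both estimators are assumptions.

```python
def roc_auc(labels, scores):
    """Probability that a random fraud outscores a random genuine case.

    Ties count as half a win. Quadratic in input size, fine for a toy example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each fraud case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical data: 100 genuine transactions and a single fraud.
genuine_scores = [i / 100 for i in range(100)]  # 0.00 .. 0.99
labels = [0] * 100 + [1]
scores = genuine_scores + [0.955]  # four genuine cases score higher

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
```

Here the fraud outranks 96 of 100 genuine cases, giving a ROC AUC of 0.96, yet it only sits fifth in the ranking, so average precision is 0.2. With rare positives, a handful of higher-scoring genuine transactions is enough to pull precision down sharply.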

  18. Validation – CP Model Performance. [Figure, two panels: Transaction Detection Rate and Value Detection Rate, each against False Positive Rate [bps]; series: New model, Incumbent model]

  19. Validation – CnP Model Performance. [Figure, two panels: Transaction Detection Rate and Value Detection Rate, each against False Positive Rate [bps]; series: New model, Incumbent model]
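The CP and CnP performance charts plot two detection rates against the false positive rate in basis points. The slides do not define these metrics, so the sketch below assumes common definitions: transaction detection rate as the share of fraud transactions flagged, value detection rate as the share of fraud value flagged, and false positive rate as flagged genuine transactions per 10,000 genuine transactions. Field names and the threshold are hypothetical.

```python
def detection_rates(transactions, threshold):
    """Compute the three chart metrics at one score threshold."""
    flagged = [t for t in transactions if t["score"] >= threshold]
    frauds = [t for t in transactions if t["fraud"]]
    caught = [t for t in flagged if t["fraud"]]
    genuine_count = len(transactions) - len(frauds)
    false_positives = len(flagged) - len(caught)
    return {
        # 1 basis point = 0.01%, hence the factor of 10,000.
        "fpr_bps": 10_000 * false_positives / genuine_count,
        "transaction_detection_rate": len(caught) / len(frauds),
        "value_detection_rate": (
            sum(t["amount"] for t in caught) / sum(t["amount"] for t in frauds)
        ),
    }


# Hypothetical scored transactions: 10,000 genuine and 2 fraudulent.
transactions = (
    [{"score": 0.1, "amount": 20.0, "fraud": False} for _ in range(9_995)]
    + [{"score": 0.9, "amount": 20.0, "fraud": False} for _ in range(5)]
    + [
        {"score": 0.95, "amount": 900.0, "fraud": True},
        {"score": 0.2, "amount": 100.0, "fraud": True},
    ]
)

metrics = detection_rates(transactions, threshold=0.8)
```

In this toy case one of two frauds is caught (50% by count) but it carries 90% of the fraud value, at a false positive rate of 5 bps. Sweeping the threshold would trace out curves of the kind shown on the slides.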

  20. Validation – CP Model Interrogation. [Figure labels: Fraud Risk; Time since a new card was issued; series: Chip used, Chip not-used]

  21. Implementation – Development Artefacts
     • Model Artefacts
       – Model Specification (JSON)
       – Model File (txt)
       – Validation Data (parquet)
     • Feature Artefacts
       – Feature Specification (JSON)
       – Validation Data (parquet)

  22. Implementation – Process
     • Artefacts from Nexus using Jenkins
     • Model File and Implementation Validation
     • Feature Code Gen and Validation
     • Feature Maturation and Shadow Operations
     • Production Validation and Go-Live

  23. Summary
     • An increasing number of customers become victims of fraud, especially remote purchase fraud (e.g. e-commerce).
     • To improve fraud prevention and customer experience, we undertook
       – Development of generation 1 models for Debit CP and CnP using tree ensemble algorithms
       – 12 months of training data were converted to about 20k features to develop the best possible models
       – Both models are in implementation, with shadow operations due in May and go-live during summer
     • R&D for generation 2 models (e.g. RNNs, autoencoders) is ongoing with promising results, but implementation requires more work…

  24. Rate today’s session: session page on conference website, O’Reilly Events App
