
Data-driven AI: Using data about models to accelerate ML development



  1. How Captricity built a human-level handwriting recognition engine using data-driven AI. Using data about models to accelerate ML development. Ramesh Sridharan, @tweetsbyramesh

  2. Machine learning has the potential to change industries

  3. …but ML in real-world production workflows can be hazardous

  4. Example: Combining models (R&D: phase 1) • Validation data ✓ → Model 1 ✓ • Validation data ✓ → Model 2 ✓

  5. Example: Combining models (R&D: phase 2) • Held-out test data → Stacked model • Status marks: ✓ ✓ ✓

  6. Example: Combining models (Production: week 1) • Real-world data → Stacked model • Status marks: ✓ ✓ ✗

  7. Example: Combining models (Production: week 4) • Real-world data → Stacked model • Status marks: ✗ ✗ ✗

  8. Example: Combining models (Production: week 4) • Real-world data → Stacked model • Status marks: ✗ ✗ ✗ • Silent failures go undetected • Can’t inspect model inputs/outputs • Rerunning models can be costly • Debugging is hard

  9. Challenges • When input conditions change, ML models can be unpredictable • Unpredictability slows productionizing ML models

  10. Outline • How Captricity works • Data-driven ML deployment • Data-driven ML development

  11. Outline • How Captricity works • Data-driven ML deployment • Data-driven ML development

  12. How Captricity works

  13. How Captricity works (dummy data)

  14. How Captricity works (dummy data)

  15. How Captricity works • Example form fields (dummy data): Smoker, 561-80-0123, Tristan Chan, 7/22/1950 • Diagram labels: Machine Learning, Crowdsourcing, Decision algorithms, Training data, review, To customer

  16. Challenge: scan quality

  17. Outline • How Captricity works • Data-driven ML deployment • Data-driven ML development

  18. Challenge: How can we accelerate the deployment of ML research into production?

  19. Solution: track all models • Diagram: Input → Model → Output, with Metrics • Tracked fields: input, output, correctness, model_snapshot, …
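
A minimal sketch of what one tracked record might look like, assuming the four fields named on the slide (input, output, correctness, model_snapshot). The `PredictionRecord` class, the `PredictionLog` store, and the sample IDs are hypothetical illustrations, not Captricity's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PredictionRecord:
    """One logged prediction; field names follow the slide."""
    input: str                          # reference to the model input (hypothetical image ID)
    output: str                         # what the model predicted
    model_snapshot: str                 # identifier of the exact model version used
    correctness: Optional[bool] = None  # stays None until ground truth arrives


@dataclass
class PredictionLog:
    """Hypothetical in-memory store; a real system would likely use a database."""
    records: list = field(default_factory=list)

    def track(self, record: PredictionRecord) -> None:
        self.records.append(record)

    def accuracy(self, model_snapshot: str) -> Optional[float]:
        """Accuracy over the graded records of one model snapshot."""
        graded = [r for r in self.records
                  if r.model_snapshot == model_snapshot and r.correctness is not None]
        if not graded:
            return None
        return sum(r.correctness for r in graded) / len(graded)


log = PredictionLog()
log.track(PredictionRecord("img-001", "F757558", "v3.0", correctness=True))
log.track(PredictionRecord("img-002", "E757558", "v3.0", correctness=False))
log.track(PredictionRecord("img-003", "561-80-0123", "v3.0"))  # not graded yet
print(log.accuracy("v3.0"))
```

Keeping `correctness` optional lets every prediction be logged at inference time and graded later, once a crowd or expert answer is available.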

  20. Provide access to aggregate metrics • Company-wide daily email • ML performance snapshot • Critical business metrics

  21. Challenge: models will fail ✗ ✓

  22. Challenge: models will fail ✗ ✓ Solution: model tracking enables identification, debugging and data curation

  23. Challenge: state changes ✓ • Model: F757558 • Crowd: F757558

  24. Challenge: state changes ✗ • Model: F757558 • Crowd: F757558 • Customer/expert: E757558

  25. Challenge: state changes ✗ • Model: F757558 • Crowd: F757558 • Customer/expert: E757558 • Solution: capture everything needed to reproduce state

  26. Parallel testing • Data → Model v3.0 → Metrics: 91% • Data → Model v4.0 → Metrics: 94% ✓
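
Parallel testing can be sketched as running the current and candidate models on the same data and comparing their metrics side by side. The functions `accuracy` and `parallel_test`, the toy models, and the sample data below are all hypothetical; the slide does not describe how the comparison is implemented.

```python
from typing import Callable, Sequence


def accuracy(model: Callable[[str], str],
             inputs: Sequence[str],
             labels: Sequence[str]) -> float:
    """Fraction of inputs for which the model output matches the label."""
    correct = sum(model(x) == y for x, y in zip(inputs, labels))
    return correct / len(inputs)


def parallel_test(current, candidate, inputs, labels):
    """Run both models on the same data and report metrics side by side."""
    current_acc = accuracy(current, inputs, labels)
    candidate_acc = accuracy(candidate, inputs, labels)
    return {
        "current": current_acc,
        "candidate": candidate_acc,
        "candidate_is_better": candidate_acc > current_acc,
    }


# Toy stand-ins for two model versions (hypothetical behaviour).
def model_v3(text: str) -> str:
    return text.upper()


def model_v4(text: str) -> str:
    return text.strip().upper()


inputs = ["abc", " def", "ghi ", "jkl"]
labels = ["ABC", "DEF", "GHI", "JKL"]
report = parallel_test(model_v3, model_v4, inputs, labels)
print(report)
```

Because both models see identical inputs, any difference in the reported metrics comes from the models rather than from a shift in the data.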

  27. Automatic model activation • Diagram labels: Model, Crowd, Evaluation, Metrics

  28. Challenge: How do we accelerate the deployment of research into production? Key learning: Monitor and instrument all predictions from all ML models

  29. Data-driven ML deployment • Carefully track every prediction from every model • Provide easy access to aggregation and reporting • Track any and all factors correlated with low accuracy • Capture all state to reproduce results – Training data – Model snapshot – Pre- and post-processing
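
One way to look for factors correlated with low accuracy is to group the tracked predictions by a candidate factor and compare accuracy per group. This is a sketch under assumptions: the record layout, the `accuracy_by_factor` helper, and the `scan_quality` factor are hypothetical examples.

```python
from collections import defaultdict


def accuracy_by_factor(records, factor):
    """Group graded prediction records by one factor and compute accuracy per group."""
    totals = defaultdict(lambda: [0, 0])  # factor value -> [correct, total]
    for rec in records:
        bucket = totals[rec[factor]]
        bucket[0] += int(rec["correct"])
        bucket[1] += 1
    return {value: correct / total for value, (correct, total) in totals.items()}


# Hypothetical tracked records; "scan_quality" is an illustrative factor.
records = [
    {"model_snapshot": "v3.0", "scan_quality": "good", "correct": True},
    {"model_snapshot": "v3.0", "scan_quality": "good", "correct": True},
    {"model_snapshot": "v3.0", "scan_quality": "poor", "correct": False},
    {"model_snapshot": "v3.0", "scan_quality": "poor", "correct": True},
]
by_quality = accuracy_by_factor(records, "scan_quality")
print(by_quality)
```

The same helper works for any recorded field, for example `accuracy_by_factor(records, "model_snapshot")`, which is why logging every factor alongside each prediction pays off.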

  30. Outline • How Captricity works • Data-driven ML deployment • Data-driven ML development

  31. Challenge: How do we determine which (sub-)problems to tackle with ML?

  32. Evaluation • Diagram labels: Input, Model, Results, Evaluation

  33. Challenge: How do we determine which (sub-)problems to tackle with ML? Key learning: Collect data about input problem space, and use it to prioritize subproblems
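
As an illustration of using input data to prioritize subproblems, the sketch below ranks subproblems by estimated error count (input volume times error rate). This scoring rule is an assumed heuristic, not something stated on the slides, and the subproblem names and numbers are made up.

```python
def prioritize(subproblems):
    """Rank subproblems by estimated errors: input volume times error rate."""
    scored = [(name, stats["volume"] * stats["error_rate"])
              for name, stats in subproblems.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical measurements of the input problem space.
subproblems = {
    "dates": {"volume": 5000, "error_rate": 0.02},
    "names": {"volume": 2000, "error_rate": 0.10},
    "checkboxes": {"volume": 8000, "error_rate": 0.005},
}
ranking = prioritize(subproblems)
for name, estimated_errors in ranking:
    print(name, round(estimated_errors))
```

A high-volume subproblem with a low error rate can rank below a smaller one that fails more often, which is the kind of trade-off that only shows up once the inputs are measured.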

  34. Key Learnings • Gather data on all predictions from all models – Enables debugging, deployment, and decision-making – Capture relevant state information • Use data about inputs to drive problem-solving. Questions? Ramesh Sridharan, @tweetsbyramesh, rameshs@captricity.com
