data-driven AI Using data about models to accelerate ML development - PowerPoint PPT Presentation

How Captricity built a human-level handwriting recognition engine using data-driven AI
Using data about models to accelerate ML development
Ramesh Sridharan, @tweetsbyramesh

Machine learning has the potential to change industries

…but ML in real-world production workflows can be hazardous

Example: Combining models (R&D: phase 1)
[Diagram: Model 1 and Model 2, each run on validation data; all marks ✓]

Example: Combining models (R&D: phase 2)
[Diagram: stacked model run on held-out test data; all marks ✓]

Example: Combining models (Production: week 1)
[Diagram: stacked model run on real-world data; marks ✓ ✓ ✗]

Example: Combining models (Production: week 4)
[Diagram: stacked model run on real-world data; all marks ✗]
• Silent failures go undetected
• Can’t inspect model inputs/outputs
• Rerunning models can be costly
• Debugging is hard

Challenges
• When input conditions change, ML models can be unpredictable
• Unpredictability slows productionizing ML models

Outline
• How Captricity works
• Data-driven ML deployment
• Data-driven ML development

How Captricity works

How Captricity works (dummy data)

How Captricity works
[Diagram (dummy data): sample form values "Smoker", "561-80-0123", "Tristan Chan", "7/22/1950". Labeled parts: Machine Learning, Crowdsourcing, Training data, Decision algorithms, review, To customer.]

Challenge: scan quality

Challenge: How can we accelerate the deployment of ML research into production?

Solution: track all models
[Diagram: Input, Model, Output, Metrics]
Tracked fields: input, output, correctness, model_snapshot, …
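A minimal sketch of what tracking every model call could look like, assuming a simple in-memory log. The field names input, output, correctness and model_snapshot come from the slide; PredictionRecord, PredictionLog, the snapshot label and the toy model are hypothetical names used only for illustration, not Captricity's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PredictionRecord:
    """One tracked prediction; fields mirror the names on the slide."""
    input: str
    output: str
    model_snapshot: str                  # identifies the exact model version used
    correctness: Optional[bool] = None   # unknown until a reference answer arrives


class PredictionLog:
    """Hypothetical in-memory store; a real system would likely use a database."""

    def __init__(self) -> None:
        self.records: List[PredictionRecord] = []

    def predict_and_track(self, model: Callable[[str], str],
                          snapshot: str, item: str) -> PredictionRecord:
        """Run the model and record its input, output and snapshot together."""
        record = PredictionRecord(input=item, output=model(item),
                                  model_snapshot=snapshot)
        self.records.append(record)
        return record

    def mark(self, record: PredictionRecord, reference: str) -> None:
        """Fill in correctness once a trusted answer is available."""
        record.correctness = (record.output == reference)

    def accuracy(self, snapshot: str) -> Optional[float]:
        """Accuracy over graded records of one snapshot, or None if none are graded."""
        graded = [r for r in self.records
                  if r.model_snapshot == snapshot and r.correctness is not None]
        if not graded:
            return None
        return sum(r.correctness for r in graded) / len(graded)


log = PredictionLog()
toy_model = str.upper  # stand-in for a real recognition model
first = log.predict_and_track(toy_model, "snapshot-a", "f757558")
second = log.predict_and_track(toy_model, "snapshot-a", "e757558")
log.mark(first, "F757558")    # matches the toy model's output
log.mark(second, "E757559")   # does not match
print(log.accuracy("snapshot-a"))
```

Because each record keeps the input and the snapshot next to the output, a failed prediction can later be looked up and inspected without rerunning the model.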

Provide access to aggregate metrics
• Company-wide daily email
• ML performance snapshot
• Critical business metrics

Challenge: models will fail ✗ ✓
Solution: model tracking enables identification, debugging and data curation

Challenge: state changes
[Diagram: Model reads F757558 and Crowd reads F757558, marked ✓; once Customer/expert reads E757558, the result is marked ✗]
Solution: capture everything needed to reproduce state

Parallel testing
[Diagram: the same data goes to Model v3.0 and Model v4.0, each producing metrics; values shown are 91% and 94% ✓]
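A sketch of parallel testing under simple assumptions: two models are scored on the same labeled data and the metrics are compared side by side. The function names, the toy models and the sample data are hypothetical; accuracy stands in for whatever metric the real system reports.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]
LabeledData = List[Tuple[str, str]]  # (input, reference answer) pairs


def accuracy(model: Model, data: LabeledData) -> float:
    """Fraction of items where the model output equals the reference."""
    correct = sum(1 for item, reference in data if model(item) == reference)
    return correct / len(data)


def parallel_test(current: Model, candidate: Model,
                  data: LabeledData) -> Dict[str, object]:
    """Score both models on identical data so the metrics are comparable."""
    current_acc = accuracy(current, data)
    candidate_acc = accuracy(candidate, data)
    return {
        "current": current_acc,
        "candidate": candidate_acc,
        "candidate_wins": candidate_acc > current_acc,
    }


# Toy stand-ins for two model versions (assumptions for illustration only).
def current_model(text: str) -> str:
    return text.strip()


def candidate_model(text: str) -> str:
    return text.strip().upper()


data = [(" ab ", "AB"), ("CD", "CD"), ("ef", "EF"), (" GH", "GH")]
report = parallel_test(current_model, candidate_model, data)
print(report)
```

Running both versions on the same inputs keeps the comparison fair: any difference in the metric comes from the models, not from a change in the data.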

Automatic model activation
[Diagram: Model, Metrics, Crowd, Evaluation]

Challenge: How do we accelerate the deployment of research into production?
Key learning: Monitor and instrument all predictions from all ML models

Data-driven ML deployment
• Carefully track every prediction from every model
• Provide easy access to aggregation and reporting
• Track any and all factors correlated with low accuracy
• Capture all state to reproduce results
  – Training data
  – Model snapshot
  – Pre- and post-processing
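One way to capture the three pieces of state listed above is to store a fingerprint of each next to every result. The sketch below is an assumption, not the method from the talk: it hashes the training data, the model snapshot and the processing configuration with SHA-256, and the names fingerprint, capture_state and the config keys are hypothetical.

```python
import hashlib
import json
from typing import Dict


def fingerprint(payload: bytes) -> str:
    """Stable content hash; equal bytes always give the same digest."""
    return hashlib.sha256(payload).hexdigest()


def capture_state(training_data: bytes, model_snapshot: bytes,
                  processing_config: Dict[str, object]) -> Dict[str, str]:
    """Fingerprint everything needed to reproduce a result."""
    # sort_keys makes the serialized config independent of dict ordering.
    config_bytes = json.dumps(processing_config, sort_keys=True).encode("utf-8")
    return {
        "training_data": fingerprint(training_data),
        "model_snapshot": fingerprint(model_snapshot),
        "processing": fingerprint(config_bytes),
    }


state_a = capture_state(b"rows", b"weights", {"deskew": True, "lowercase": False})
state_b = capture_state(b"rows", b"weights", {"lowercase": False, "deskew": True})
state_c = capture_state(b"rows", b"weights", {"deskew": False, "lowercase": False})

print(state_a == state_b)                              # same state, different key order
print(state_a["processing"] == state_c["processing"])  # pre-processing changed
```

Comparing the stored fingerprints of two runs shows which part of the state changed between them, here only the processing configuration.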

Challenge: How do we determine which (sub-)problems to tackle with ML?

Evaluation
[Diagram: Input, Model, Results, Evaluation]

Challenge: How do we determine which (sub-)problems to tackle with ML?
Key learning: Collect data about input problem space, and use it to prioritize subproblems
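A sketch of how data about inputs might drive prioritization, assuming each tracked prediction carries an input category and a correctness flag. The categories, the sample records and the ranking rule (most errors first) are assumptions for illustration; the talk does not specify a ranking rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_subproblems(
        records: Iterable[Tuple[str, bool]]) -> List[Tuple[str, int, float]]:
    """Rank input categories by how many errors each one contributes.

    Each record is (category, was_correct). Returns a list of
    (category, volume, error_rate) tuples, most errors first.
    """
    volume: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for category, was_correct in records:
        volume[category] += 1
        if not was_correct:
            errors[category] += 1
    ranked = sorted(volume, key=lambda c: errors[c], reverse=True)
    return [(c, volume[c], errors[c] / volume[c]) for c in ranked]


# Hypothetical tracked results: (input category, was the prediction correct?)
records = (
    [("dates", True)] * 3 + [("dates", False)] * 3
    + [("names", True)] * 2 + [("names", False)]
    + [("checkboxes", True)]
)
ranking = rank_subproblems(records)
for category, count, error_rate in ranking:
    print(category, count, round(error_rate, 2))
```

Ranking by error count rather than error rate favors categories that are both common and hard, which is one reasonable reading of prioritizing by the input problem space.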

Key Learnings
• Gather data on all predictions from all models
  – Enables debugging, deployment, and decision-making
  – Capture relevant state information
• Use data about inputs to drive problem-solving

Questions?
Ramesh Sridharan, @tweetsbyramesh, rameshs@captricity.com
