Ava: From Data to Insights Through Conversation
Rogers Jeffrey Leo John, Navneet Potti, and Jignesh M. Patel
University of Wisconsin–Madison
01/10/2017
Motivation
• "Why do my customers churn?"
• A legal pyramid scheme (org chart)
[Figure: org-chart pyramid with the labels "VP", "Data Scientist", and "Focus for this talk"]
Issues
1. Lost in translation
2. Long turn-around time
3. Correctness!
4. Reproducibility
…
Data Science Pipeline
Data Loading
• from a csv file
Data Cleaning
• fill missing values
Feature Engineering
• pick/create appropriate features
Model Selection and Training
• pick an ML model based on the input and task at hand
Parameter Tuning
• hyperparameter optimization
Save model for deployment
• As a UDF, PMML file, …
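The pipeline stages above can be sketched end to end in a few lines. This is a minimal illustration using only the Python standard library, not Ava's implementation: the toy data, the column names, the threshold "model", and the JSON export are all assumptions made for the example (the slide mentions UDF and PMML as real deployment formats).

```python
import csv
import io
import json
import statistics

# Hypothetical toy data; the empty cell in the second row is a missing value.
RAW_CSV = """tenure_months,monthly_spend,churned
2,80,1
3,,1
4,70,1
20,30,0
24,35,0
30,25,0
"""

def load(text):
    """Data loading: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows, column):
    """Data cleaning: fill missing values in `column` with the column mean."""
    present = [float(r[column]) for r in rows if r[column] != ""]
    mean = statistics.mean(present)
    return [{**r, column: float(r[column]) if r[column] != "" else mean}
            for r in rows]

def features(rows):
    """Feature engineering: keep one numeric feature and the label."""
    return [(float(r["tenure_months"]), int(r["churned"])) for r in rows]

def accuracy(threshold, data):
    """Model: predict churn when tenure is below the threshold."""
    hits = sum((x < threshold) == bool(y) for x, y in data)
    return hits / len(data)

def tune(data, candidates):
    """Parameter tuning: pick the threshold with the best accuracy.
    For brevity this scores on the training data, with no held-out set."""
    return max(candidates, key=lambda t: accuracy(t, data))

rows = clean(load(RAW_CSV), "monthly_spend")
data = features(rows)
best = tune(data, candidates=[1, 5, 10, 25, 40])

# Save model for deployment: JSON stands in for a UDF or PMML file here.
model_json = json.dumps(
    {"model": "threshold", "feature": "tenure_months", "threshold": best}
)
print(model_json)
```

Each function corresponds to one stage on the slide, which is also what makes the stages easy to swap out or reorder.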
Insights
• Often the task is …
• Conversation: Constrained Natural Language. Rely on Natural Language Translation rather than Natural Language Understanding.
• Composition: Storyboard. Composable internal architecture and convenient system abstraction.
• Code: In an interactive notebook (e.g. iPython). Target the lower level of the pyramid and allow the programmer to take control.
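One way to picture "translation rather than understanding" is a fixed set of sentence templates that map directly to notebook code. The sketch below is an assumption for illustration only: the sentence shapes, the `TEMPLATES` table, and the `translate` helper are hypothetical and not taken from Ava. The generated strings use pandas calls (`pd.read_csv`, `fillna`) but are never executed here.

```python
import re

# Hypothetical controlled-language templates: each pattern maps one fixed
# sentence shape to one line of notebook code.
TEMPLATES = [
    (re.compile(r"^load (?P<path>\S+\.csv)$"),
     "df = pd.read_csv({path!r})"),
    (re.compile(r"^fill missing values in (?P<col>\w+) with (?P<val>\d+)$"),
     "df[{col!r}] = df[{col!r}].fillna({val})"),
]

def translate(utterance):
    """Return generated code for a recognized sentence, or None otherwise.
    Input is lowercased, so this sketch does not preserve case in paths."""
    text = utterance.strip().lower()
    for pattern, code in TEMPLATES:
        match = pattern.match(text)
        if match:
            return code.format(**match.groupdict())
    return None

print(translate("Load churn.csv"))
print(translate("fill missing values in age with 0"))
print(translate("why do my customers churn?"))
```

Sentences outside the templates are rejected rather than guessed at, which is the trade-off a constrained language makes: less freedom for the user, no ambiguity for the system.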
Architecture
The Ava Storyboard Concept …
What is new?
Previous work
• Imitation Game: Turing 1950
• CNL → remove ambiguity
• Natural Language → Query: Use feedback to refine ambiguity
Ava
• Constrain to walk along a pre-scripted storyboard
• Storyboard is a Finite State Machine
• The user creates their "own" story. It always ends well :)
The right encapsulation for:
1. Technology trend
• NLT v/s NLU
• Different ML backends
2. Developer skills
• Business analysts vs statistician
[Figure: storyboard states C_i and C_j joined by a transition chat_ij, labeled "Controlled Natural Language"]
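The idea of a storyboard as a finite state machine can be sketched as a transition table: each state accepts a small set of replies, and each reply leads to the next state. The state names and accepted replies below are hypothetical and chosen only to mirror the pipeline stages; they do not come from Ava.

```python
# Hypothetical storyboard: each state lists the user replies it accepts
# and the state that each reply leads to.
STORYBOARD = {
    "load_data": {"use csv": "clean_data"},
    "clean_data": {"fill missing": "pick_model", "drop missing": "pick_model"},
    "pick_model": {"decision tree": "done", "logistic regression": "done"},
    "done": {},
}

def step(state, utterance):
    """Follow one transition; an unrecognized reply keeps the current state."""
    return STORYBOARD[state].get(utterance.strip().lower(), state)

def run(utterances, start="load_data"):
    """Replay a conversation and return the sequence of visited states."""
    path = [start]
    for utterance in utterances:
        path.append(step(path[-1], utterance))
    return path

path = run(["Use CSV", "what?", "fill missing", "Decision Tree"])
print(path)
```

Because every accepted reply moves toward `done` and every other reply leaves the state unchanged, any conversation that finishes does so in a valid end state, which is one reading of "it always ends well".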
Results from a user study with 16 participants
Distribution of the time taken by participants to complete the first model.
[Figure: time in minutes (axis from 0 to 60) for Python vs. Ava]
Summary of AVA
• The conversation is the code
• The data chatbots are coming
• Ava democratizes "data science"
• Benefits: increased human productivity, reproducibility, rapid exploration, a powerful collaboration mechanism, …
• More automation along every dimension
Thanks!
Jean-Michel Ané, Victor Cabrera, Mark Craven, Pamela Herd, David Page, Federico E. Rey, Garret Suen, and Kent Weigel
University of Wisconsin