
Ava: From data to insights through conversations. A review by Apaar



  1. Ava: From data to insights through conversations. A review by Apaar Shanker. DATA ANALYTICS USING DEEP LEARNING, GT CS 8803 // FALL 2018

  2. Paper under review: Ava: From Data to Insights Through Conversation. Authors: Rogers Jeffrey Leo John, Navneet Potti, Jignesh M. Patel (Computer Sciences Department, University of Wisconsin-Madison). Publication: CIDR '17. doi:http://pages.cs.wisc.edu/~jignesh/publ/Ava.pdf

  3. The current paradigm of data-driven decision making

  4. Issues with the current model
     1. Lost in translation
     2. Long turnaround time
     3. Correctness
     4. Reproducibility
     5. Cognitive overload due to a surfeit of models and libraries

  5. Proposed Solution
     Key Observations:
     - Controlled natural language methods are now practically implemented as interfaces to software toolboxes
     - The data science workflow can be templatized

     We can use a chat-bot as a natural language UI to set up a data science pipeline by drawing on templates stored in a library.
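The idea of a chat-bot front end that resolves a request against a template library can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TEMPLATE_LIBRARY` contents, the `parse_request` helper, and the single recognized phrase form ("max depth N") are hypothetical and are not taken from the Ava paper.

```python
import re

# Hypothetical template library: task name -> parameters with default values.
TEMPLATE_LIBRARY = {
    "decision tree": {"max_depth": None},
    "linear regression": {},
}


def parse_request(utterance: str) -> dict:
    """Map a controlled natural-language request to a template name and parameters."""
    text = utterance.lower()
    for task, defaults in TEMPLATE_LIBRARY.items():
        if task in text:
            params = dict(defaults)
            # Recognize one fixed phrase form: "max depth <integer>".
            match = re.search(r"max depth (\d+)", text)
            if match and "max_depth" in params:
                params["max_depth"] = int(match.group(1))
            return {"task": task, "params": params}
    raise ValueError(f"no template matches: {utterance!r}")


request = parse_request("Train a decision tree with max depth 5")
print(request)
```

A controlled vocabulary keeps the mapping deterministic: a request either matches a known template or is rejected, which is the property that makes a conversational interface to a toolbox tractable.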

  6. [Figure-only slide]

  7. Typical Data Science Workflow. The workflow is an (often cyclic) graph of meta tasks and tasks. The actual pipeline is a subgraph of the workflow graph. Once a workflow has been finalized, only the pipeline (the dotted blue boxes in the figure) needs to be preserved.
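The graph view of the workflow can be sketched as follows. The node names, the edges, and the choice of which nodes are kept are all hypothetical; the slide only states that the workflow is an often cyclic graph and that the pipeline is a subgraph of it.

```python
# Hypothetical workflow graph as an adjacency list: node -> successor nodes.
WORKFLOW = {
    "load_data": ["clean_data"],
    "clean_data": ["train_model"],
    "train_model": ["evaluate_model"],
    "evaluate_model": ["clean_data"],  # cycle: revisit cleaning after evaluation
}


def induced_subgraph(graph: dict, keep: set) -> dict:
    """Return the subgraph containing only the nodes in `keep` and the edges between them."""
    return {
        node: [succ for succ in successors if succ in keep]
        for node, successors in graph.items()
        if node in keep
    }


# Assumed selection: the steps that must be preserved once the workflow is finalized.
pipeline = induced_subgraph(WORKFLOW, {"load_data", "clean_data", "train_model"})
print(pipeline)
```

Dropping the evaluation node also drops the back edge, so the preserved pipeline in this example is acyclic even though the exploratory workflow was not.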

  8. Data Science Workflow can be Templatized

         from sklearn.tree import DecisionTreeRegressor

         # The slide passed criterion='mse'; scikit-learn 1.0 renamed it to 'squared_error'.
         model = DecisionTreeRegressor(criterion='squared_error', splitter='best', max_depth=None)
         model.fit(X_train, y_train)  # X_train, y_train: training data prepared earlier
         y_pred = model.predict(X_test)

     There is a clean separation of specification (parameter values) and template, such that a task can be composed by simply substituting parameters into a pre-defined code template.
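The separation of specification and template described on this slide can be sketched with the standard library's `string.Template`. The `DECISION_TREE_TEMPLATE` text, the placeholder names, and the `render_task` helper are hypothetical; the sketch only generates code text and does not show how Ava stores or executes its templates.

```python
from string import Template

# Hypothetical code template; each $name is a slot for one parameter value.
DECISION_TREE_TEMPLATE = Template(
    "from sklearn.tree import DecisionTreeRegressor\n"
    "model = DecisionTreeRegressor(criterion=$criterion, "
    "splitter=$splitter, max_depth=$max_depth)\n"
    "model.fit(X_train, y_train)\n"
    "y_pred = model.predict(X_test)\n"
)


def render_task(template: Template, spec: dict) -> str:
    """Substitute a specification (parameter values) into a code template."""
    # repr() keeps strings quoted and leaves None unquoted in the generated code.
    return template.substitute({name: repr(value) for name, value in spec.items()})


# 'squared_error' is the current scikit-learn name for the criterion once called 'mse'.
spec = {"criterion": "squared_error", "splitter": "best", "max_depth": None}
code = render_task(DECISION_TREE_TEMPLATE, spec)
print(code)
```

Because `substitute` raises `KeyError` when a slot has no value, an incomplete specification fails at render time rather than producing broken code.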

  9. [Figure-only slide]

  10. Introducing AVA

  11. AVA in action

  12. Architecture (diagram labels: REST API, JPype)

  13. [Figure-only slide]

  14. Results. A group of 16 students with some ML background (via coursework) and Python proficiency were asked to do supervised learning on a Kaggle dataset.

  15. Issues and Enhancements
      ❖ Accuracy of the AVA models versus human models
      ❖ The addition of templates to the repository can be automated.
      ❖ Work on the knowledge-base based recommendation system?
      ❖ Handling unstructured data:
         ➢ A customizable file-parser
      ❖ Handling larger than memory input data
      ❖ Uncertainty quantification in the output as a model guideline
      ❖ Where is the Code?
