
Human-in-the-Loop Interpretability Prior

Human-in-the-Loop Interpretability Prior. Isaac Lage¹, Andrew Slavin Ross¹, Been Kim², Samuel J. Gershman¹ & Finale Doshi-Velez¹ (¹Harvard University, ²Google Brain). Poster: Today, 10:45 AM - 12:45 PM, Room 210 & 230 AB #119


  1. Title slide: Human-in-the-Loop Interpretability Prior (authors, affiliations, and poster details as listed above)

  2. Interpretability (image credit: clipart-library.com)

  3. Optimizing for Interpretability. Previous work: choose a proxy for interpretability → optimize the proxy for interpretability → user study.

  4. Optimizing for Interpretability. Previous work: choose a proxy for interpretability → optimize the proxy for interpretability → user study. Open questions: Which proxy? How to use results to choose a better proxy?

  5. Optimizing for Interpretability. Human-in-the-loop interpretability: a loop between "update model" and "user study". No proxy! Update the model directly with results!

  6. Interpretability Prior. Goal: bias the model to be human-interpretable (Bayesian inference).

  7. Interpretability Prior. First: formulate an interpretability-encouraging prior.

  8. Optimizing for Interpretability. Previous work: choose a proxy for interpretability → optimize the proxy for interpretability → user study. Can define a prior. Which prior captures human interpretability?

  9. Optimizing for Interpretability. Human-in-the-loop interpretability: a loop between "update model" and "user study". Evaluate the interpretability-encouraging prior.

  10. Interpretability Prior. First: formulate an interpretability-encouraging prior. Then: identify the MAP solution.

  11. Interpretability Prior. Likelihood: easy. Evaluate computationally. No users!

  12. Interpretability Prior. Prior: hard. No closed form. Evaluate with user studies! Likelihood: easy. Evaluate computationally. No users!

  13. Interpretability Prior. Prior: hard. No closed form. Evaluate with user studies! Challenge: approximate the MAP solution with few evaluations of the prior.

  14. Simplified Cartoon of Our Approach. Step 1: Identify diverse, high-likelihood models.

  15. Simplified Cartoon of Our Approach. Step 1: Identify diverse, high-likelihood models. Candidate MAP 1: Likelihood = HIGH. Candidate MAP 2: Likelihood = HIGH. Candidate MAP 3: Likelihood = HIGH.

  16. Simplified Cartoon of Our Approach. Step 1: Identify diverse, high-likelihood models. Candidate MAP 1: Likelihood = HIGH, Prior = ?. Candidate MAP 2: Likelihood = HIGH, Prior = ?. Candidate MAP 3: Likelihood = HIGH, Prior = ?.

  17. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features.

  18. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features. User study 1: Prior = MEDIUM.

  19. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features. User study 1: Prior = MEDIUM. Prior estimate: Prior = HIGH?

  20. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features. User study 1: Prior = MEDIUM. User study 2: Prior = LOW.

  21. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features. User study 1: Prior = MEDIUM. User study 2: Prior = LOW. Prior estimate: Prior = HIGH?

  22. Simplified Cartoon of Our Approach. Step 2: Bayesian optimization with user studies. Similarity based on explanation features. User study 1: Prior = MEDIUM. User study 2: Prior = LOW. User study 3: Prior = HIGH.

  23. Main Takeaways
      • We optimize for interpretability directly with human feedback
      • Our approach efficiently identifies human-interpretable and predictive models
      • MAP approximations correspond to different interpretability proxies on different datasets
      [Figure: Census dataset; x-axis: Number of Iterations; annotation: "MORE Interpretable"]
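
The two-step cartoon in slides 14 to 22 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate models are reduced to a log-likelihood and a vector of explanation features, the similarity is a plain RBF kernel, the acquisition rule is an upper confidence bound, and `run_user_study` is a hypothetical stand-in for a real human-subject evaluation that returns a log-prior (interpretability) score. All function names, parameters, and toy numbers below are invented for the example.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between rows of a and rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def gp_posterior(x_seen, y_seen, x_all, noise=1e-2):
    """Zero-mean GP posterior mean and std at x_all given observations."""
    k_seen = rbf_kernel(x_seen, x_seen) + noise * np.eye(len(x_seen))
    k_cross = rbf_kernel(x_all, x_seen)
    # Solve for K^-1 y and K^-1 k_cross^T in one call.
    solved = np.linalg.solve(k_seen, np.column_stack([y_seen, k_cross.T]))
    mean = k_cross @ solved[:, 0]
    var = 1.0 - np.einsum("ij,ji->i", k_cross, solved[:, 1:])
    return mean, np.sqrt(np.clip(var, 0.0, None))


def approximate_map(log_lik, features, run_user_study, budget, beta=2.0):
    """Approximate argmax of log-likelihood + log-prior with few user studies.

    run_user_study(i) is a hypothetical callback: it evaluates candidate i
    with users and returns a log-prior (interpretability) score.
    """
    seen, scores = [], []
    next_idx = int(np.argmax(log_lik))  # no prior information yet
    for _ in range(budget):
        seen.append(next_idx)
        scores.append(run_user_study(next_idx))
        y = np.array(scores, dtype=float)
        mean, std = gp_posterior(features[seen], y - y.mean(), features)
        # Optimistic estimate of the log-posterior for every candidate.
        ucb = log_lik + mean + y.mean() + beta * std
        ucb[seen] = -np.inf  # never repeat a user study
        if np.all(np.isinf(ucb)):
            break  # every candidate has been evaluated
        next_idx = int(np.argmax(ucb))
    # Return the evaluated candidate with the best observed log-posterior.
    best = max(zip(seen, scores), key=lambda p: log_lik[p[0]] + p[1])
    return best[0], seen


# Toy data mirroring the cartoon: three equally likely candidates whose
# (made-up) log-prior scores are MEDIUM, LOW, and HIGH.
log_lik = np.array([-10.0, -10.0, -10.0])
features = np.array([[0.0], [1.0], [2.0]])
true_log_prior = np.array([-1.0, -2.0, 0.0])

best, order = approximate_map(
    log_lik, features, lambda i: float(true_log_prior[i]), budget=3
)
print("best candidate:", best, "evaluation order:", order)
```

The Gaussian process shares information between candidates with similar explanation features, which is what lets a single user study update the prior estimate for candidates that have not been studied yet.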
