
Machine Learning: Opening the Pandora’s Box

By Dhiana Deva - Machine Learning Engineer at Spotify. QCon São Paulo - May 2019


  1. Machine Learning: Opening the Pandora’s Box. By Dhiana Deva - Machine Learning Engineer at Spotify. QCon São Paulo - May 2019

  2. Agenda
      • About me
      • Open the Pandora’s Box
      • Start with simple
      • Aim for the skies
      • Hit half-way there
      • Enjoy the journey!

  3. About Me

  4. Me @ QCon Rio 2015

  5. Me @ QCon São Paulo 2019

  6. Open the Pandora’s Box!

  7. Introducing Machine Learning

  8. Introducing Machine Learning is like opening the Pandora’s Box: Problems, Problems, Problems, Problems

  10. Introducing Machine Learning is like opening the Pandora’s Box: Constraints, Assumptions, Issues, Risks

  11. Constraints

  12. Be aware (not afraid) of constraints. What decisions can you affect? What are the system implications? What does your ML Infra support? Illustration from the book “Creative People Must be Stopped” by David A. Owens

  13. Example Constraints
      • Business Constraints: Metrics, Business logic, Legal needs
      • Data Constraints: Volume, Features, Labels
      • Systems Constraints: Available levers, Infrastructure support, Systems implications, Engineering effort

  14. Addressing Constraints
      Investigate, communicate, and address each constraint strategically by either:
      • Accepting and working under its boundaries
      • Expanding its boundaries
      WARNING: Hitting an unexpected critical constraint too late in the process can kill your ML product!

  15. Assumptions

  16. Assumptions: “You have no idea, but you pretend you know.” Assumptions bridging between “Known Unknowns” and “Known Knowns”.
      • You might not have enough data to back your hypothesis.
      • Historical data is biased by existing heuristics.
      • The hypothesis behind your ML product might be based on a critical assumption.

  17. Example Assumptions
      • Are the metrics sensitive to the levers the ML approach is pulling?
      • How do customers behave under changes in the logic?
      • Impact analysis assumptions:
        - Cost of misclassification
        - Benefit of correct classification
        - Assumptions for worst case scenario
        - Parameters for more optimistic scenarios
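The impact analysis assumptions on this slide can be made explicit as parameters. The following is a minimal sketch, not the speaker's method: the class `ImpactAssumptions`, the function `expected_impact`, and every number are hypothetical placeholders chosen only to show how the same outcome looks under a worst case and a more optimistic scenario.

```python
from dataclasses import dataclass


@dataclass
class ImpactAssumptions:
    """Hypothetical per-decision values; every number used below is a placeholder."""
    benefit_true_positive: float  # benefit of a correct positive classification
    cost_false_positive: float    # cost of a wrong positive classification
    cost_false_negative: float    # cost of a missed positive


def expected_impact(tp: int, fp: int, fn: int, a: ImpactAssumptions) -> float:
    """Net impact of a set of classification outcomes under the given assumptions."""
    return (tp * a.benefit_true_positive
            - fp * a.cost_false_positive
            - fn * a.cost_false_negative)


# Two assumed scenarios for the same (made-up) classification counts.
worst_case = ImpactAssumptions(benefit_true_positive=1.0,
                               cost_false_positive=5.0,
                               cost_false_negative=2.0)
optimistic = ImpactAssumptions(benefit_true_positive=3.0,
                               cost_false_positive=1.0,
                               cost_false_negative=0.5)

counts = dict(tp=80, fp=10, fn=20)
worst = expected_impact(**counts, a=worst_case)
best = expected_impact(**counts, a=optimistic)
print(f"worst case: {worst}, optimistic: {best}")
```

If the sign of the result flips between the two scenarios, as it does with these placeholder numbers, the cost and benefit parameters are probably the assumptions worth validating first.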

  18. Addressing Assumptions
      • Experiment early and focus on learning the parameters needed for better impact analysis and for more sophisticated approaches later.
      • Consider reframing the initial problems to be solved, so that the most critical assumptions are validated first.
      • To be able to move forward with an unbiased approach, collect randomized data.
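One possible way to collect randomized data is to override the existing heuristic on a small share of decisions and log which ones were randomized. This is a sketch under assumptions: `choose_action`, the action names, and the 10% exploration rate are all hypothetical, and the slides do not prescribe any particular scheme.

```python
import random


def choose_action(heuristic_action, actions, explore_rate, rng):
    """Return (action, was_randomized).

    With probability `explore_rate` the heuristic is ignored and an action is
    drawn uniformly at random, so that slice of the log is not biased by it.
    """
    if rng.random() < explore_rate:
        return rng.choice(actions), True
    return heuristic_action, False


rng = random.Random(42)      # fixed seed so the sketch is reproducible
actions = ["show", "hide"]   # hypothetical levers
log = [choose_action("show", actions, 0.1, rng) for _ in range(10_000)]

# Only the randomized slice should be used as unbiased data.
randomized = [action for action, was_random in log if was_random]
print(f"{len(randomized)} of {len(log)} decisions were randomized")
```

Logging the `was_randomized` flag matters as much as the randomization itself: without it the unbiased slice cannot be separated from the heuristic-driven traffic afterwards.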

  19. Issues

  20. Machine Learning itself might not be the issue! Is there latency introduced? Did the systems need to be changed, decoupled or refactored? Issues from systems implications might impact your metrics and should not be attributed to Machine Learning. You don’t want to compare apples and oranges!

  21. Example Issues
      • Data: Instrumentation, Metrics
      • System: Latency, Bugs
      • Other: UX, CX

  22. A/A Test

  23. Unveiling Issues: Running A/A Tests
      • A: existing system, existing heuristic
      • A*: new system, existing heuristic
        - ML “turned-off”
        - Bypassing the ML decision
      What to expect?
      • A should be equal to A* on:
        - Operational metrics
        - Business metrics
        - CS metrics
      • If the two A’s perform differently:
        - Trust me, there’s an issue!
        - Time to investigate!
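Whether A and A* "perform differently" on a rate-style metric can be checked with a standard two-proportion z-test. The sketch below is one assumed way to do it, not the analysis used in the talk: the function name, the conversion counts, and the choice of test are illustrative, and other metrics (latency, for instance) would need a different test.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). A small p-value suggests the two arms differ.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both arms all-success or all-failure: no evidence
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Made-up counts: A (existing system) vs A* (new system, ML bypassed).
z_same, p_same = two_proportion_z_test(500, 10_000, 500, 10_000)
z_diff, p_diff = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"identical arms: p={p_same:.3f}; diverging arms: p={p_diff:.4f}")
```

A low p-value between two arms that should behave identically points at the system change (latency, bugs, instrumentation) rather than at Machine Learning, which is the point of running the A/A comparison.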

  24. Addressing Issues
      In case a discrepancy is found in the A/A Test analysis:
      • Which metric is showing discrepancies?
      • What could have caused it?
      • What is the impact of this discrepancy?
      Decide whether to fix it based on the size of its impact.

  25. A/A/B Test. Run an A/A/B Test if time sensitive! But only trust the A/B part once you have validated the A/A part!
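The rule on this slide can be written as a small gate over two already computed p-values. This is a hypothetical sketch: `interpret_aab`, its messages, and the 0.05 threshold are assumptions, and a non-significant A/A result is only weak evidence that the setup is sound.

```python
def interpret_aab(p_value_aa: float, p_value_ab: float, alpha: float = 0.05) -> str:
    """Read the A/B comparison only when the A/A comparison shows no discrepancy."""
    if p_value_aa < alpha:
        # A and A* differ although both run the existing heuristic.
        return "investigate: A/A discrepancy, do not trust the A/B result yet"
    if p_value_ab < alpha:
        return "A/B difference detected on a setup that passed the A/A check"
    return "no A/B difference detected"


print(interpret_aab(p_value_aa=0.40, p_value_ab=0.01))
print(interpret_aab(p_value_aa=0.01, p_value_ab=0.01))
```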

  26. Risks

  27. Careful about “Squeeze Toys”. Optimizing for metric A might lead to risking metric B. “If you optimize your business to maximize one metric, something important happens. Just like one of those bulging stress-relief squeeze toys, squeezing it in one place makes it bulge out in another.” Quote from the book “Lean Analytics” by Benjamin Yoskovitz and Alistair Croll

  28. Addressing Risks
      • Before experimenting: Simulate worst case scenarios, Simulate random baseline
      • After experiment: Calculate experiment costs
      PS: Same goes when collecting randomised data.
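Simulating a worst case and a random baseline before experimenting can be as simple as replaying labelled historical data through stand-in policies. The sketch below assumes boolean labels and made-up costs; `total_cost` and the three lambda policies are hypothetical, and the policies receive the true label only so that the worst and perfect cases can be constructed.

```python
import random


def total_cost(labels, decide, cost_fp, cost_fn):
    """Sum the misclassification costs of `decide` over boolean `labels`."""
    cost = 0.0
    for label in labels:
        prediction = decide(label)
        if prediction and not label:
            cost += cost_fp   # false positive
        elif label and not prediction:
            cost += cost_fn   # false negative
    return cost


rng = random.Random(0)
labels = [rng.random() < 0.2 for _ in range(1_000)]  # synthetic labels
COST_FP, COST_FN = 1.0, 4.0                          # placeholder costs

# Worst case: every decision is wrong. Random baseline: a coin flip.
worst_case = total_cost(labels, lambda label: not label, COST_FP, COST_FN)
random_baseline = total_cost(labels, lambda label: rng.random() < 0.5, COST_FP, COST_FN)
perfect = total_cost(labels, lambda label: label, COST_FP, COST_FN)
print(f"worst={worst_case}, random={random_baseline}, perfect={perfect}")
```

The worst case bounds what an experiment could cost, and the random baseline gives a floor that any ML approach, or any randomized data collection, should be compared against.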

  29. Start with simple!

  30. Illustration from the book “Feature Engineering for Machine Learning” by Alice Zheng and Amanda Casari.

  31. “Doing simple sanity checking to make sure things are what you think they are can sometimes get you much further in the end than web scraping and a big fancy machine learning algorithm. It may not seem cool and sexy, but it’s smart and good practice. People might not invite you to a meetup to talk about it. It may not be publishable research, but at least it’s legitimate and solid work.” Quote from the book “Doing Data Science” by Cathy O’Neil and Rachel Schutt. Chapter contributed by Claudia Perlich.

  32. Iterate! Illustration from the “Analytics Solutions Unified Method” (ASUM-DM) by IBM.

  33. Iterate! Addressing the constraints, assumptions, risks and issues. Illustration from the “Analytics Solutions Unified Method” (ASUM-DM) by IBM.

  39. ML Systems are complex systems! Illustration from the paper “Hidden Technical Debt in Machine Learning Systems” by D. Sculley et al. (Google), 2015

  40. Start with simple! Illustration adapted from the paper “Hidden Technical Debt in Machine Learning Systems” by D. Sculley et al. (Google), 2015

  41. Iterate with strategic, proportional investments across the ML stack. Illustration adapted from the paper “Hidden Technical Debt in Machine Learning Systems” by D. Sculley et al. (Google), 2015

  42. And so on… Illustration adapted from the paper “Hidden Technical Debt in Machine Learning Systems” by D. Sculley et al. (Google), 2015

  43. Aim for the skies!

  44. What’s the limit of what’s achievable? Machine Learning is a powerful tool, but buy-in and sponsorship are much needed. A big vision is vital for Machine Learning products.

  45. Questions - cheat sheet
      • What if you had all the levers that you could possibly pull?
      • What if you could optimize all the aspects of the business and user experience?
      • What if you would break it down to multiple Machine Learning products?
      • What if you had all the data you would like to use?
      • What if you had the ideal Machine Learning infrastructure?
      • What if you would use the ideal Machine Learning model and approach?
      • What if you had all monitoring in place to quickly catch any issues?

  46. Vision - cheat sheet
      “Improve _____ and reduce _____ by _____ the right _____ and _____ with the right _____ and the right _____”
      Annotations on the template: Multi-Objective Optimization, Multiple ML Products, Multiple Levers

  47. Hit half-way there!

  48. Good enough is better than perfect!
      • You might discover other interesting opportunities for Machine Learning.
      • You might discover other interesting opportunities even without Machine Learning.
      • You might discover there’s a third party service for your domain.
      • Machine Learning is part of the solution, not the whole solution.

  49. Avoid harms. Try to understand how decisions impact outcomes. Learn more: check out the slides from the tutorial “Algorithmic Bias in Practice” at ACM FAT* 2019. Illustration from “AAAI 2017 Spring Symposium Series - Designing the UX of ML Systems” by Henriette Cramer and Jenn Thom

  50. Enjoy the journey!
