Transmogrification: The Magic of Feature Engineering
Leah McGuire and Mayukh Bhaowal


  1. Transmogrification: The Magic of Feature Engineering Leah McGuire and Mayukh Bhaowal

  2. ML algorithms take center stage in AI. [Pipeline: Raw Data → Feature Engineering (the bottleneck) → Modeling]

  3. Mythical Numeric Matrix

     X1  X2  X3  X4  X5 | Y
      0   1   0   0   0 | A
      1   1   1   0   0 | B
      0   0   1   1   0 | B
      1   1   1   1   1 | A
      1   0   1   0   0 | A

  4. Use the data types

  5. Automatic Feature Engineering
     ● Numeric: imputation, track null value, log transformation for large range, scaling (z-normalize), smart binning
     ● Categorical: imputation, track null value, one-hot encoding, dynamic top-K pivot, smart binning, label-count encoding, category embedding
     ● Text: tokenization, hash encoding, TF-IDF, Word2Vec, sentiment analysis, language detection
     ● Temporal: time difference, circular statistics, time extraction (day, week, month, year), closeness to major events
     ● Spatial: augment with external data (e.g. average income), fraudulent behavior (e.g. impossible travel speed), geo-encoding
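
As a concrete illustration of two of the numeric transformations above, here is a minimal Scala sketch (plain Scala, not TransmogrifAI's API; names are illustrative): mean imputation with a null-tracking indicator column, and z-normalization.

     object NumericTransforms {
       // Impute missing values with the column mean and track which were null.
       def imputeWithNullTrack(xs: Seq[Option[Double]]): (Seq[Double], Seq[Double]) = {
         val present = xs.flatten
         val mean    = if (present.isEmpty) 0.0 else present.sum / present.size
         val imputed = xs.map(_.getOrElse(mean))
         val isNull  = xs.map(x => if (x.isEmpty) 1.0 else 0.0)
         (imputed, isNull)
       }

       // z-normalization: (x - mean) / stddev, guarding against zero variance.
       def zNormalize(xs: Seq[Double]): Seq[Double] = {
         val mean = xs.sum / xs.size
         val std  = math.sqrt(xs.map(x => math.pow(x - mean, 2)).sum / xs.size)
         xs.map(x => if (std == 0.0) 0.0 else (x - mean) / std)
       }
     }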

  6. Transmogrification: val featureVector = Seq(age, phone, email, subject, zipCode).transmogrify()
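
A sketch of how those features might be declared with TransmogrifAI's FeatureBuilder before transmogrify() is called; the Contact case class and its fields are illustrative assumptions, not code from the talk:

     import com.salesforce.op._
     import com.salesforce.op.features.FeatureBuilder
     import com.salesforce.op.features.types._

     // Illustrative source record; not from the talk.
     case class Contact(age: Double, phone: String, email: String,
                        subject: String, zipCode: String)

     val age     = FeatureBuilder.RealNN[Contact].extract(_.age.toRealNN).asPredictor
     val phone   = FeatureBuilder.Phone[Contact].extract(_.phone.toPhone).asPredictor
     val email   = FeatureBuilder.Email[Contact].extract(_.email.toEmail).asPredictor
     val subject = FeatureBuilder.Text[Contact].extract(_.subject.toText).asPredictor
     val zipCode = FeatureBuilder.PostalCode[Contact].extract(_.zipCode.toPostalCode).asPredictor

     // transmogrify() applies type-appropriate transformations to each feature
     // and concatenates the results into one numeric feature vector.
     val featureVector = Seq(age, phone, email, subject, zipCode).transmogrify()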

  7. Impact on Feature Engineering
     ● Email → Top Email Domain, Email Is Spammy
     ● Phone → Country Code, Phone Is Valid
     ● Age → Age [0-15], Age [15-35], Age [>35]
     ● Subject → Top 10 TF-IDF Terms
     ● Zipcode → Average Income
     The derived columns are concatenated into a single feature vector.

  8. The Black Swan of Perfectly Interpretable Models Leah McGuire, Mayukh Bhaowal

  9. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  10. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  11. The Question Why did the machine learning model make the decision that it did?

  12. Translation #1 How do I fix this model? — Data Scientist

  13. Translation #2 Do we have our bases covered, in case of a regulatory audit? — Legal Counsel

  14. Translation #3 Does Einstein know what I know? How do I use this prediction? — Non-Technical End User

  15. [Diagram: an ensemble. The Input is scored by models P_1(c | f) … P_k(c | f) … P_n(c | f), whose predictions are summed (Σ) to produce the Output]
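
A minimal Scala sketch of the ensemble in the diagram (the model representation is an illustrative assumption): each of n models produces a probability P_k(c | f), and the ensemble combines them with a normalized sum.

     // Each model maps a feature vector to P_k(c | f); the ensemble averages them.
     def ensembleScore(models: Seq[Vector[Double] => Double],
                       features: Vector[Double]): Double =
       models.map(p => p(features)).sum / models.size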

  16. Model Insights Report

  17. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  18. Debuggability. Top contributing features for surviving the Titanic: 1. Gender, 2. pClass, 3. Body. ("Body" is the body identification number, assigned only to passengers who died: a top feature like this signals label leakage and a model that needs debugging.)

  19. Trust How can you trust a man that wears both a belt and suspenders? Man can't even trust his own pants.

  20. [Diagram: 2×2 grid comparing Human vs Machine decisions, Right vs Wrong]

  21. Bias

  22. Legal

  23. Black defendants receive higher risk scores https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  24. Actionable

  25. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  26. It’s complicated

  27. Choosing an explanation technique is a decision flow:
     ● Can you use a simple model?
     ● Are the raw features fed into the model interpretable?
     ● Does the consumer care about how features affect the model, or just feature insights?
     ● Does the consumer care about individual predictions?
     The answers lead to Feature Weights / Importance, Feature Impact, Secondary Model, or Model Agnostic methods, each available at the Global or Local level.

  28. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  29. The best model or the model you can explain?

  30. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  31. Where did you get the feature matrix?

     X1  X2  X3  X4  X5 | Y
      0   1   0   0   0 | A
      1   1   1   0   0 | B
      0   0   1   1   0 | B
      1   1   1   1   1 | A
      1   0   1   0   0 | A

  32. Feature Engineering
     ● Email → Top Email Domain, Email Is Spammy
     ● Phone → Country Code, Phone Is Valid
     ● Age → Age [0-15], Age [15-35], Age [>35]
     ● Subject → Top 10 TF-IDF Terms
     ● Zipcode → Average Income
     The derived columns are concatenated into a single feature vector.

  33. Metadata!!!
     ● The name of the feature the column was made from
     ● The name of the RAW feature(s) the column was made from
     ● Everything you did to get the column
     ● Description of the value in the column
     ● Any grouping information across columns
     https://ontotext.com/knowledgehub/fundamentals/metadata-fundamental/
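
A sketch of that per-column metadata as a Scala structure; the field names are illustrative assumptions, loosely modeled on what TransmogrifAI records in its model insights:

     case class ColumnMetadata(
       parentFeature: String,      // name of the feature the column was made from
       rawFeatures: Seq[String],   // name(s) of the raw feature(s) it derives from
       stagesApplied: Seq[String], // everything done to produce the column
       valueDescription: String,   // what the value in the column means
       grouping: Option[String]    // grouping across columns, e.g. a one-hot group
     )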

  34. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  35. Interpretability: Global vs Local

  36. The decision flow, Global solutions: Can you use a simple model? Are the raw features fed into the model interpretable? Does the consumer care about how features affect the model, or just feature insights? Does the consumer care about individual predictions? → Feature Weights / Importance (Global), Feature Impact (Global), Secondary Model (Global), Model Agnostic (Global)

  37. Feature Weight / Importance (Global)

  38. Predict House Price

  39. Predict Titanic Passenger Survival
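
A minimal sketch of reading global feature weights and importances from Spark ML models (Spark usage is an assumption here, not code from the talk):

     import org.apache.spark.ml.classification.{LogisticRegression, RandomForestClassifier}
     import org.apache.spark.sql.DataFrame

     def globalWeights(train: DataFrame): Unit = {
       // Linear model: coefficients are directly interpretable global weights.
       val lrModel = new LogisticRegression()
         .setFeaturesCol("features").setLabelCol("label").fit(train)
       println(s"Coefficients: ${lrModel.coefficients}")

       // Tree ensemble: featureImportances gives one global importance per column.
       val rfModel = new RandomForestClassifier()
         .setFeaturesCol("features").setLabelCol("label").fit(train)
       println(s"Importances: ${rfModel.featureImportances}")
     }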

  40. [Diagram: an ensemble. The Input is scored by models P_1(c | f) … P_k(c | f) … P_n(c | f), whose predictions are summed (Σ) to produce the Output]

  41. Feature Impact (Global - the hard way): drop one feature column, retrain, and measure the change in model quality.

     X1  X2  X3  X4  X5 | Y
      0   1   0   0   0 | A
      1   1   1   0   0 | B
      0   0   1   1   0 | B
      1   1   1   1   1 | A
      1   0   1   0   0 | A

  42. Feature Impact (Global - the hard way)
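
A sketch of that procedure; the trainAndScore function is an illustrative stand-in for your full training pipeline:

     // Global feature impact "the hard way": for each feature, retrain the model
     // without that column and measure how much a quality metric drops.
     def featureImpact(
         columns: Seq[String],
         trainAndScore: Seq[String] => Double  // returns e.g. validation AUC
     ): Seq[(String, Double)] = {
       val baseline = trainAndScore(columns)
       columns.map { col =>
         val impact = baseline - trainAndScore(columns.filterNot(_ == col))
         col -> impact  // large positive impact = the model loses a lot without it
       }.sortBy(-_._2)
     }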

  43. Issues with Feature Importance / Weight / Impact (Global) http://resources.esri.com/help/9.3/arcgisengine/java/gp_toolref/spatial_statistics_toolbox/multicollinearity.htm

  44. Secondary Model: [Diagram: Input → Prediction, with a secondary model trained to produce the Explanation]

  45. Secondary Model (Global)

  46. Secondary Model (Global) https://www.statmethods.net/advgraphs/images/corrgram1.png
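
A sketch of a global secondary (surrogate) model with Spark ML (an assumption, not the talk's code): fit a small, readable model to the predictions of the complex model and read its structure as the explanation.

     import org.apache.spark.ml.classification.GBTClassifier
     import org.apache.spark.ml.regression.DecisionTreeRegressor
     import org.apache.spark.sql.DataFrame

     def surrogate(train: DataFrame): Unit = {
       // 1. Train the accurate but opaque model.
       val blackBox = new GBTClassifier()
         .setFeaturesCol("features").setLabelCol("label").fit(train)

       // 2. Relabel the training data with the black box's own predictions.
       val relabeled = blackBox.transform(train)
         .withColumnRenamed("prediction", "bbPrediction")

       // 3. Fit a shallow tree to mimic the black box; its splits explain it.
       val tree = new DecisionTreeRegressor()
         .setFeaturesCol("features").setLabelCol("bbPrediction").setMaxDepth(3)
         .fit(relabeled)
       println(tree.toDebugString)
     }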

  47. What we do:
     ● All the metadata about how you got the feature
     ● Correlation
     ● Mutual information
     ● Feature weight / importance
     ● Feature distribution

  48. What we do:

     {
       "featureName" : "sex",
       "derivedFeatures" : [ {
         "stagesApplied" : [ "pivotText_OpSetVectorizer" ],
         "derivedFeatureValue" : "Male",
         "corr" : -0.5185045877245239,
         "mutualInformation" : 0.19652543270839468,
         "contribution" : 0.1763534388489181,
         ….
       }, {
         "stagesApplied" : [ "pivotText_OpSetVectorizer" ],
         "derivedFeatureValue" : "Female",
         "corr" : 0.518504587724524,
         "mutualInformation" : 0.19652543270839468,
         "contribution" : 0.18080355705344647,
         ….
       } ]
     }

  49. Roadmap for this talk: Why explain your model? → What does it mean to explain your model? → Complications of feature engineering → Interpretability vs accuracy tradeoff → How to explain your model: Global (full model) solutions and Local (record level) solutions

  50. The decision flow, Local solutions: Can you use a simple model? Are the raw features fed into the model interpretable? Does the consumer care about how features affect the model, or just feature insights? Does the consumer care about individual predictions? → Feature Weights / Importance (Local), Feature Impact (Local), Secondary Model (Local), Model Agnostic (Local)

  51. Feature Weight (Local)

  52. Predict House Price [Example record with feature values: 852, 2, 1, 36]

  53. Feature Weight (Local)

  54. Feature Impact (LOCO)
     Record: {"age": 17.0, "embarked": "C", "name": "Attalah, Miss. Malake", "pClass": "3", "parch": "0", "sex": "female", "sibSp": "0", "survived": 0.0, "ticket": "2627"}
     Score = 0.62. Why? sex = "female" (+0.13), pClass = 3 (-0.05), ...
     https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning
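
A sketch of local LOCO (leave one covariate out) scoring; score and baseline are illustrative stand-ins for the trained model and per-feature baseline values:

     // Local LOCO: score the record as-is, then re-score with each feature
     // withheld (replaced by a baseline such as the training-set mean). The
     // score difference is that feature's contribution for this record.
     def locoExplain(
         record: Map[String, Double],
         baseline: Map[String, Double],
         score: Map[String, Double] => Double
     ): Seq[(String, Double)] = {
       val full = score(record)
       record.keys.toSeq.map { feature =>
         val withheld = record.updated(feature, baseline(feature))
         feature -> (full - score(withheld))  // positive = pushed the score up
       }.sortBy(-_._2.abs)
     }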

  55. Secondary Model (LIME): fit a simple, interpretable model to perturbed copies of the record, weighted by their proximity to it, and read the local explanation from that model. https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning

  56. Secondary Model (Correlation): contribution ≈ norm(feature value) × corr(feature, outcome) https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning
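
A one-function sketch of that correlation heuristic (names are illustrative):

     // Local contribution of each feature: its normalized value for this record
     // times the feature's global correlation with the label.
     def correlationExplain(
         normalized: Map[String, Double],  // record's z-normalized feature values
         corr: Map[String, Double]         // per-feature correlation with the label
     ): Seq[(String, Double)] =
       normalized.map { case (f, v) => f -> v * corr(f) }.toSeq.sortBy(-_._2.abs)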

  57. What we do:
     ● Use case determines LOCO or correlation
     ● Use case determines what level of features we show
