
Classification, March 19, 2020, Data Science CSCI 1951A, Brown University


  1. Classification
     March 19, 2020
     Data Science CSCI 1951A, Brown University
     Instructor: Ellie Pavlick
     HTAs: Josh Levin, Diane Mutako, Sol Zitter

  2. Today
     • Generative vs. Discriminative Models
     • KNN, Naive Bayes, Logistic Regression
     • SciKit Learn Demo

  3. Supervised vs. Unsupervised Learning
     • Supervised: explicit data labels
       • Sentiment analysis: review text -> star ratings
       • Image tagging: image -> caption
     • Unsupervised: no explicit labels
       • Clustering: find groups of similar customers
       • Dimensionality reduction: find features that differentiate individuals

  6. Classification
     • One Goal: P(Y|X), where X is the features and Y is the label

  7. Classification
     • One Goal: P(Y|X)
       • P(email is spam | words in the message)
       • P(genre of song | tempo, harmony, lyrics…)
       • P(article clicked | title, font, photo…)

  8. K Means [Figure: scatter plot; axes: tempo and harmonic complexity]

  9. K Means [Figure: the same scatter plot with two points marked X]

  10. K Nearest Neighbors [Figure: scatter plot; axes: tempo and harmonic complexity; query point: Blue or Red?]

  11. K Nearest Neighbors [Figure: same plot, K = 1]

  12. K Nearest Neighbors [Figure: same plot, K = 5]

  13. K Nearest Neighbors [Figure: same plot, no K shown]

  14-16. K Nearest Neighbors [Figure: same plot, K = 5]

  17. K Nearest Neighbors
      • Arguably the simplest ML algorithm
      • “Non-Parametric”: no assumptions about the form of the classification model
      • All the work is done at classification time
      • Works with tiny amounts of training data (single example per class)
      • The best classification model ever ???
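The point that all the work happens at classification time can be seen in a minimal sketch. This is an illustration only, not code from the lecture: the function name `knn_predict`, the toy (tempo, harmonic complexity) points, and the choice of Euclidean distance are all assumptions made here.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((tempo, harmonic_complexity), label) pairs.
    There is no training step: the data is only sorted at prediction time.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data (not taken from the slides).
train = [
    ((1.0, 1.0), "blue"),
    ((1.2, 0.8), "blue"),
    ((0.9, 1.3), "blue"),
    ((3.0, 3.0), "red"),
    ((3.2, 2.9), "red"),
    ((2.1, 2.0), "red"),
]
query = (1.9, 1.9)

print(knn_predict(train, query, k=1))  # red: the single closest point is red
print(knn_predict(train, query, k=5))  # blue: 3 of the 5 closest points are blue
```

With these made-up points the answer flips between K = 1 and K = 5, which is the sensitivity to K that the figures on the preceding slides illustrate.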

  23. Supervised Classification https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html

  35. Generative Models vs. Discriminative Models
      Generative Models
      • estimate P(X, Y) first
      • Can assign probability to observations, generate new observations
      • Often more parameters, but more flexible
      • Naive Bayes, Bayes Nets, VAEs, GANs
      Discriminative Models
      • estimate P(Y | X) directly / no explicit probability model
      • Only supports classification, less flexible
      • Often fewer parameters, better performance on small data
      • Logistic Regression, SVMs, Perceptrons
      KNN
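As a sketch of how the two columns relate: a generative model that has estimated the joint P(X, Y) can still recover the classification target P(Y|X) by normalizing over the labels, whereas a discriminative model parameterizes the left-hand side directly. The summation index y' is notation introduced here, not taken from the slides.

```latex
P(Y = y \mid X = x) \;=\; \frac{P(X = x,\, Y = y)}{\sum_{y'} P(X = x,\, Y = y')}
```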

  36. Supervised Classification

  37. Supervised Classification
      • Good if not dramatic fizz. ***
      • Rubbery - rather oxidised. *
      • Gamy, succulent tannins. Lovely. ****
      • Provence herbs, creamy, lovely. ****
      • Lovely mushroomy nose and good length. *****
      • Quite raw finish. A bit rubbery. **

  38. Supervised Classification
      • Lovely mushroomy nose and good length. 1
      • Gamy, succulent tannins. Lovely. 1
      • Provence herbs, creamy, lovely. 1
      • Good if not dramatic fizz. 0
      • Quite raw finish. A bit rubbery. 0
      • Rubbery - rather oxidised. 0

  42. Supervised Classification
      • Lovely mushroomy nose and good length. 1
      • Gamy, succulent tannins. Lovely. 1
      • Provence herbs, creamy, lovely. 1
      • Quite raw finish. A bit rubbery. 0
      • Good if not dramatic fizz. 0
      • Rubbery - rather oxidised. 0

      y      X
      Label  lovely  good  raw  rubbery  rather  mushroomy  gamy  …
      1      1       1     0    0        0       0          0     …
      1      1       0     0    0        0       0          1     …
      1      1       0     0    0        0       0          0     …
      0      0       0     1    1        0       0          0     …
      ???    0       1     1    0        1       0          1     …
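The step from review text to the binary feature matrix X can be sketched as follows. This is an assumed bag-of-words featurization, not the lecture's own code: the helper name `featurize`, the lowercase regex tokenizer, and the restriction to the seven visible vocabulary words are choices made here. Because the vectors are computed directly from the text, an entry can differ from the table on the slide (for example, mushroomy in the first review).

```python
import re

# The seven vocabulary words visible on the slide; the slide's "…" columns are omitted.
VOCAB = ["lovely", "good", "raw", "rubbery", "rather", "mushroomy", "gamy"]

# Reviews and binary labels as listed on the slide.
REVIEWS = [
    ("Lovely mushroomy nose and good length.", 1),
    ("Gamy, succulent tannins. Lovely.", 1),
    ("Provence herbs, creamy, lovely.", 1),
    ("Quite raw finish. A bit rubbery.", 0),
    ("Good if not dramatic fizz.", 0),
    ("Rubbery - rather oxidised.", 0),
]

def featurize(text, vocab=VOCAB):
    """Return a 0/1 vector marking which vocabulary words occur in `text`."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return [1 if word in tokens else 0 for word in vocab]

X = [featurize(text) for text, _ in REVIEWS]
y = [label for _, label in REVIEWS]

for features, label in zip(X, y):
    print(label, features)
```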

  43. Bayes Rule

  44. Bayes Rule
      P(Y|X) = P(X|Y) P(Y) / P(X)

  46. Bayes Rule

      Label  lovely  good  raw  rubbery  rather  mushroomy  gamy  …
      1      1       1     0    0        0       0          0     …
      1      1       0     0    0        0       0          1     …
      1      1       0     0    0        0       0          0     …
      0      0       0     1    1        0       0          0     …
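To make Bayes Rule concrete, the sketch below applies it to a single feature using only the four rows shown in the table, estimating each term by counting. The function name `posterior` and the choice of the "good" column are assumptions made here for illustration; with so few rows the numbers are only a toy, and the function would divide by zero for a label or feature value that never occurs.

```python
from fractions import Fraction

VOCAB = ["lovely", "good", "raw", "rubbery", "rather", "mushroomy", "gamy"]

# (label, feature vector) for the four rows visible on the slide.
ROWS = [
    (1, [1, 1, 0, 0, 0, 0, 0]),
    (1, [1, 0, 0, 0, 0, 0, 1]),
    (1, [1, 0, 0, 0, 0, 0, 0]),
    (0, [0, 0, 1, 1, 0, 0, 0]),
]

def posterior(rows, feature, value, label):
    """P(Y=label | feature=value) = P(feature=value | Y=label) P(Y=label) / P(feature=value)."""
    j = VOCAB.index(feature)
    n = len(rows)
    n_label = sum(1 for y, x in rows if y == label)
    n_value = sum(1 for y, x in rows if x[j] == value)
    n_both = sum(1 for y, x in rows if y == label and x[j] == value)
    likelihood = Fraction(n_both, n_label)  # P(X | Y)
    prior = Fraction(n_label, n)            # P(Y)
    evidence = Fraction(n_value, n)         # P(X)
    return likelihood * prior / evidence

# P(Y=1 | good=0) = (2/3 * 3/4) / (3/4)
print(posterior(ROWS, "good", 0, 1))  # 2/3
```

The result agrees with counting directly: of the three rows where good = 0, two have label 1.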
