Introduction to Machine Learning: Classification and The Noisy Channel Model


  1. Introduction to Machine Learning: Classification and The Noisy Channel Model. CMSC 473/673, UMBC. Some slides adapted from 3SLP.

  2. Outline: Classification; Why incorporate uncertainty; Classification with Bayes Rule; Example: Email Classifier; Evaluation.

  3. Probabilistic Classification. Discriminatively trained classifier: directly model the posterior, $p(Y \mid X) = h(X; Y)$. Generatively trained classifier: model the posterior with Bayes rule, $p(Y \mid X) \propto p(X \mid Y) \cdot p(Y)$.

  5. Classification. Candidate labels: POLITICS, TERRORISM, SPORTS, TECH, HEALTH, FINANCE. Document: “Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today against a community in Junin department, central Peruvian mountain region. …”

  7. Classification. Candidate labels: POLITICS, TERRORISM, SPORTS, TECH, HEALTH, FINANCE. Document: “Electronic alerts have been used to assist the authorities in moments of chaos and potential danger: after the Boston bombing in 2013, when the Boston suspects were still at large, and last month in Los Angeles, during an active shooter scare at the airport. …” Source: http://www.nytimes.com/2016/09/20/nyregion/cellphone-alerts-used-in-search-of-manhattan-bombing-suspect.html

  9. Classify with Uncertainty: use probabilities.

  10. Classify with Uncertainty: use probabilities.* (*There are non-probabilistic ways to handle uncertainty… but probabilities sure are handy!)

  11. Classification, with a probability for each label: POLITICS .05, TERRORISM .48, SPORTS .0001, TECH .39, HEALTH .0001, FINANCE .0002. Document: “Electronic alerts have been used to assist the authorities in moments of chaos and potential danger: after the Boston bombing in 2013, when the Boston suspects were still at large, and last month in Los Angeles, during an active shooter scare at the airport. …” Source: http://www.nytimes.com/2016/09/20/nyregion/cellphone-alerts-used-in-search-of-manhattan-bombing-suspect.html
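
As a rough illustration of classifying with uncertainty, the sketch below stores the label probabilities from slide 11 and ranks them. The helper names `most_probable` and `ranked` are made up for this example; only the numbers come from the slide.

```python
# Label probabilities copied from slide 11.
label_probs = {
    "POLITICS": 0.05,
    "TERRORISM": 0.48,
    "SPORTS": 0.0001,
    "TECH": 0.39,
    "HEALTH": 0.0001,
    "FINANCE": 0.0002,
}


def most_probable(probs):
    """Return the (label, probability) pair with the highest probability."""
    return max(probs.items(), key=lambda item: item[1])


def ranked(probs):
    """Return all (label, probability) pairs, most probable first."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


best_label, best_prob = most_probable(label_probs)
runner_up_label, runner_up_prob = ranked(label_probs)[1]

print(best_label, best_prob)            # the single "hard" decision
print(runner_up_label, runner_up_prob)  # a close second that a hard decision would hide
```

Keeping the whole distribution, rather than only the top label, is what lets a downstream user see that the second-ranked label is nearly as likely as the first.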

  12. Text Classification. Example tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; …

  13. Text Classification. Input: a document and a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$. Output: a predicted class $c$ from $C$.

  14. Text Classification. Input: a document (a linguistic blob) and a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$. Output: a predicted class $c$ from $C$.

  15. Text Classification: Hand-coded Rules? Rules based on combinations of words or other features, e.g. spam: black-list-address OR (“dollars” AND “have been selected”). Accuracy can be high if the rules are carefully refined by an expert. Building and maintaining these rules is expensive. Can humans faithfully assign uncertainty?
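
The spam rule on slide 15 could be written roughly as follows. This is a minimal sketch: the function name `is_spam`, the case-insensitive substring matching, and the example address in `BLACKLIST` are all assumptions made for illustration, not part of the slides.

```python
def is_spam(sender, text, blacklist):
    """Hand-coded rule: black-listed address OR ("dollars" AND "have been selected")."""
    body = text.lower()
    on_blacklist = sender.lower() in blacklist
    has_phrases = "dollars" in body and "have been selected" in body
    return on_blacklist or has_phrases


# Hypothetical black-list; entries are stored in lowercase.
BLACKLIST = {"promo@example.com"}

print(is_spam("promo@example.com", "Hello there", BLACKLIST))
print(is_spam("friend@example.org", "You have been selected to win 500 dollars", BLACKLIST))
print(is_spam("friend@example.org", "Can you lend me ten dollars?", BLACKLIST))
```

Note that the rule returns only True or False, so it gives no indication of how confident the decision is.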

  16. Text Classification: Supervised Machine Learning. Input: a document $d$; a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$; a training set of $m$ hand-labeled documents $(d_1, c_1), \ldots, (d_m, c_m)$. Output: a learned classifier $\gamma$ that maps documents to classes.

  17. Text Classification: Supervised Machine Learning. Input and output as on slide 16. Example classifiers: Naïve Bayes; logistic regression; support-vector machines; k-Nearest Neighbors; …
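
To make the input/output contract concrete, here is a small sketch of a learner that takes hand-labeled pairs $(d_i, c_i)$ and returns a classifier `gamma`. It uses a 1-nearest-neighbor rule with Jaccard word overlap, chosen only because it is short; the toy training set and the labels "spam" and "ham" are invented for this example.

```python
def tokens(document):
    """Represent a document as its set of lowercase whitespace-separated words."""
    return set(document.lower().split())


def train_1nn(training_set):
    """Return a classifier gamma mapping a document to the class of its
    most similar training document (1-nearest neighbor, Jaccard similarity)."""
    memory = [(tokens(d), c) for d, c in training_set]

    def gamma(document):
        query = tokens(document)

        def similarity(item):
            words, _ = item
            union = words | query
            return len(words & query) / len(union) if union else 0.0

        return max(memory, key=similarity)[1]

    return gamma


# Hypothetical hand-labeled training set of (document, class) pairs.
training_set = [
    ("win free dollars now", "spam"),
    ("you have been selected for a prize", "spam"),
    ("meeting notes for the project", "ham"),
    ("lunch with the project team tomorrow", "ham"),
]

gamma = train_1nn(training_set)
print(gamma("free dollars prize"))
print(gamma("project meeting tomorrow"))
```

Any of the classifiers listed on slide 17 could sit behind the same interface: training data in, a function from documents to classes out.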

  19. Multi-class Classification: given input $x$, predict discrete label $y$. Multi-label Classification.

  20. Multi-class Classification: given input $x$, predict discrete label $y$. If $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task. Multi-label Classification.

  21. Multi-class Classification: given input $x$, predict discrete label $y$. If $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task. If $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Q: What are some examples of multi-class classification? Multi-label Classification.

  22. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task. Multi-label Classification.

  23. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task. Multi-label Classification: given input $x$, predict multiple discrete labels $y = (y_1, \ldots, y_L)$.

  24. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task; each $y_l$ could be binary or multi-class. Multi-label Classification: given input $x$, predict multiple discrete labels $y = (y_1, \ldots, y_L)$.
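
The difference between a single output and multiple outputs can be sketched in code. The example assumes that some model has already produced a list of scores per label slot, and it treats each slot independently; both are simplifying assumptions, and the function names are hypothetical.

```python
def predict_multiclass(scores):
    """Single output: pick one label y from {0, ..., K-1}, given K scores."""
    return max(range(len(scores)), key=lambda k: scores[k])


def predict_multilabel(scores_per_slot):
    """Multi-output: predict y = (y_1, ..., y_L), one label per slot.
    Each slot may be binary (2 scores) or multi-class (K scores)."""
    return tuple(predict_multiclass(scores) for scores in scores_per_slot)


# Binary task: K = 2.
print(predict_multiclass([0.3, 0.7]))
# Multi-class task: K = 4.
print(predict_multiclass([0.1, 0.2, 0.6, 0.1]))
# Multi-label task: L = 2 slots, the first binary and the second 3-way.
print(predict_multilabel([[0.9, 0.1], [0.2, 0.3, 0.5]]))
```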

  26. Probabilistic Text Classification. $p(Y \mid X) = \frac{p(X \mid Y) \cdot p(Y)}{p(X)}$, where $Y$ is the class and $X$ is the observed data.

  27. Probabilistic Text Classification. $p(Y \mid X) = \frac{p(X \mid Y) \cdot p(Y)}{p(X)}$, where $Y$ is the class, $X$ is the observed data, $p(Y)$ is the prior probability of the class, $p(X \mid Y)$ is the class-based likelihood (language model), and $p(X)$ is the observation likelihood (averaged over all classes).
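
A small numeric sketch may help show how the three terms of Bayes rule fit together. The priors and likelihoods below are invented numbers for a hypothetical two-class email example, and `posterior` is an illustrative helper, not something defined in the slides.

```python
def posterior(prior, likelihood):
    """Apply Bayes rule for one fixed observation X.

    prior:      dict mapping each class Y to p(Y)
    likelihood: dict mapping each class Y to p(X | Y)
    returns:    dict mapping each class Y to p(Y | X)
    """
    joint = {y: likelihood[y] * prior[y] for y in prior}
    # p(X): the observation likelihood, summed over all classes.
    evidence = sum(joint.values())
    return {y: joint[y] / evidence for y in joint}


# Hypothetical numbers: most email is ham, but this message is
# ten times more likely under the spam language model.
prior = {"spam": 0.25, "ham": 0.75}
likelihood = {"spam": 0.08, "ham": 0.008}

post = posterior(prior, likelihood)
print(post)
```

Here the likelihood outweighs the prior, so the posterior favors spam even though ham is the more common class.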

  29. Classification with Bayes Rule: $\arg\max_Y \; p(Y \mid X)$

  30. Classification with Bayes Rule: $\arg\max_Y \; \frac{p(X \mid Y) \cdot p(Y)}{p(X)}$

  31. Classification with Bayes Rule: $\arg\max_Y \; \frac{p(X \mid Y) \cdot p(Y)}{p(X)}$, where the denominator $p(X)$ is constant with respect to $Y$

  32. Classification with Bayes Rule: $\arg\max_Y \; p(X \mid Y) \cdot p(Y)$

  33. Classification with Bayes Rule: $\arg\max_Y \; \log p(X \mid Y) + \log p(Y)$
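
One practical reason for the log form is numerical: a product of many small probabilities can underflow to zero in floating point, while a sum of logs does not. The sketch below assumes, purely for illustration, a 400-word document in which every word has the same per-class probability; the numbers and the `classify` helper are hypothetical.

```python
import math


def classify(log_likelihood, log_prior):
    """Return argmax over Y of log p(X | Y) + log p(Y)."""
    return max(log_prior, key=lambda y: log_likelihood[y] + log_prior[y])


# Hypothetical setup: 400 words, each with the same probability under a class.
n_words = 400
per_word_prob = {"spam": 0.010, "ham": 0.012}
prior = {"spam": 0.5, "ham": 0.5}

# Direct product p(X | Y) * p(Y): both values underflow to 0.0,
# so the two classes can no longer be told apart.
direct = {y: (per_word_prob[y] ** n_words) * prior[y] for y in prior}

# Log space: the same quantities remain finite and comparable.
log_likelihood = {y: n_words * math.log(per_word_prob[y]) for y in prior}
log_prior = {y: math.log(prior[y]) for y in prior}

print(direct)
print(classify(log_likelihood, log_prior))
```

Because the logarithm is monotonically increasing, the class that maximizes the log score is the same class that maximizes the original product.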
