Supervised Learning: The Setup



  1. Supervised Learning: The Setup
     Machine Learning

  2. Last lecture
     We saw
     – What is learning? Learning as generalization
     – The badges game

  3. This lecture
     • More badges
     • Formalizing supervised learning
       – Instance space and features: What are inputs to the learning problem?
       – Label space: What is the output of the learned function?
       – Hypothesis space: What is being learned?
     Some slides based on lectures from Tom Dietterich, Dan Roth

  4. The badges game

  5. Let’s play
     Name                    Label
     Claire Cardie           -
     Peter Bartlett          +
     Eric Baum               +
     Haym Hirsh              -
     Leslie Pack Kaelbling   +
     Yoav Freund             -
     (Full data on the class website; you can stare at it longer if you want.)

  6. Let’s play
     What is the label for Indiana Jones?

  7. Let’s play
     How were the labels generated?

  8. Let’s play
     How were the labels generated?
     If the last letter of the first name is before the last letter of the last name: label = +, else label = -
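
The labeling rule on this slide can be written as a short function. The following is a minimal sketch, not the lecturer's code: the function name `badge_label` is made up for illustration, and it assumes that "before" means alphabetical order and that middle names (such as "Pack") are ignored.

```python
def badge_label(full_name: str) -> str:
    """Return '+' if the last letter of the first name comes alphabetically
    before the last letter of the last name, otherwise '-'."""
    parts = full_name.lower().split()
    first, last = parts[0], parts[-1]  # assumption: middle names are ignored
    return "+" if first[-1] < last[-1] else "-"


# The six labeled examples shown on the slides.
examples = {
    "Claire Cardie": "-",
    "Peter Bartlett": "+",
    "Eric Baum": "+",
    "Haym Hirsh": "-",
    "Leslie Pack Kaelbling": "+",
    "Yoav Freund": "-",
}

# The rule reproduces every label on the slide.
for name, label in examples.items():
    assert badge_label(name) == label

print(badge_label("Indiana Jones"))  # prints "+" because 'a' comes before 's'
```

Under these assumptions the rule labels Indiana Jones as +. Agreeing with six examples does not by itself show that this is the function that generated the labels.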

  9. Questions to think about
     How could you be certain that you got the right function? How did you arrive at it?
     Learning issues:
     • Is this prediction or just modeling data? Is there a difference?
     • How did you know that you should look at the letters?
     • What background knowledge about letters did you use? How did you know that it is relevant?
     • What “learning algorithm” did you use?

  10. What is supervised learning?

  11. Instances and Labels
      Running example: Automatically tag news articles

  12. Instances and Labels
      Running example: Automatically tag news articles
      [Figure: an instance of a news article that needs to be classified, with a label]

  14. Instances and Labels
      Running example: Automatically tag news articles
      Instance Space: All possible news articles
      Label Space: All possible labels

  15. Instances and Labels
      $\mathcal{X}$: Instance Space
      The set of examples that need to be classified
      E.g.: The set of all possible names, documents, sentences, images, emails, etc.

  16. Instances and Labels
      $\mathcal{Y}$: Label Space
      The set of all possible labels
      E.g.: {Spam, Not-Spam}, {+, -}, etc.

  17. Instances and Labels
      Target function: $y = f(x)$

  18. Instances and Labels
      The goal of learning: Find this target function
      Learning is search over functions
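
One way to picture "learning is search over functions" is to enumerate a small set of candidate functions and keep those that agree with the labeled examples. The sketch below uses the six badge examples. The three candidates in `hypotheses` are hypothetical and chosen only for illustration, and real learning algorithms search much larger function classes far less naively.

```python
from typing import Callable, Dict, List, Tuple

# The six labeled examples shown in the badges game.
data: List[Tuple[str, str]] = [
    ("Claire Cardie", "-"),
    ("Peter Bartlett", "+"),
    ("Eric Baum", "+"),
    ("Haym Hirsh", "-"),
    ("Leslie Pack Kaelbling", "+"),
    ("Yoav Freund", "-"),
]


def first_and_last(name: str) -> Tuple[str, str]:
    """Lower-cased first and last name; middle names are ignored."""
    parts = name.lower().split()
    return parts[0], parts[-1]


def as_label(condition: bool) -> str:
    return "+" if condition else "-"


# Hypothetical candidate functions; this set is an illustration only.
hypotheses: Dict[str, Callable[[str], str]] = {
    "first name starts with a vowel":
        lambda n: as_label(first_and_last(n)[0][0] in "aeiou"),
    "last name is longer than first name":
        lambda n: as_label(len(first_and_last(n)[1]) > len(first_and_last(n)[0])),
    "last letter of first name precedes last letter of last name":
        lambda n: as_label(first_and_last(n)[0][-1] < first_and_last(n)[1][-1]),
}

# Search: keep every candidate that agrees with all labeled examples.
consistent = [
    description
    for description, h in hypotheses.items()
    if all(h(name) == label for name, label in data)
]
print(consistent)
# prints ['last letter of first name precedes last letter of last name']
```

With a richer candidate set, several functions could survive this filter, which is one reason the choice of candidate set matters.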

  19. Supervised learning
      $\mathcal{X}$: Instance Space, the set of examples that need to be classified
      $\mathcal{Y}$: Label Space, the set of all possible labels
      Target function: $y = f(x)$
      Learning algorithm only sees examples of the function $f$ in action

  20. Supervised learning
      Labeled training data: $(x_1, f(x_1)), (x_2, f(x_2)), (x_3, f(x_3)), \ldots, (x_N, f(x_N))$

  21. Supervised learning
      Labeled training data → Learning algorithm

  22. Supervised learning
      Labeled training data → Learning algorithm → A learned function $g: \mathcal{X} \to \mathcal{Y}$

  23. Supervised learning
      This is the training phase.

  24. Supervised learning
      Can you think of other training protocols?
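
The training phase described on these slides can be sketched end to end: a target function labels some inputs, and a learning algorithm turns the labeled pairs into a learned function. Everything concrete below is an assumption made for illustration: the instance space is the interval [0, 1), the function `target` is a made-up threshold rule, and `learn` is a simple threshold search rather than any particular algorithm from the course.

```python
import random
from typing import Callable, List, Tuple


def target(x: float) -> str:
    """Hypothetical target function; in practice it is unknown to the learner."""
    return "+" if x >= 0.5 else "-"


def learn(training_data: List[Tuple[float, str]]) -> Callable[[float], str]:
    """Pick the threshold, among the observed inputs, that makes the fewest
    mistakes on the training data, and return the resulting function."""
    def mistakes(threshold: float) -> int:
        return sum(("+" if x >= threshold else "-") != y for x, y in training_data)

    best = min((x for x, _ in training_data), key=mistakes)
    return lambda x: "+" if x >= best else "-"


rng = random.Random(0)
inputs = [rng.random() for _ in range(20)]

# The learner only sees the target function "in action" on these inputs.
training_data = [(x, target(x)) for x in inputs]

# Training phase: labeled data goes in, a learned function comes out.
learned = learn(training_data)
print(learned(0.1), learned(0.9))
```

The learner never calls `target` directly. It only receives the labeled pairs, which is the protocol the slides describe.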

  25. Supervised learning: Evaluation
      $\mathcal{X}$: Instance Space, the set of examples that need to be classified
      $\mathcal{Y}$: Label Space, the set of all possible labels
      Target function: $y = f(x)$
      Learned function: $y = g(x)$

  26. Supervised learning: Evaluation
      Draw test example $x \in \mathcal{X}$
      Compare $f(x)$ and $g(x)$: Are they different? How different?

  27. Supervised learning: Evaluation
      Apply the model to many test examples and compare to the target’s prediction
      Aggregate these results to get a quality measure

  28. Supervised learning: Evaluation
      Apply the model to many test examples and compare to the target’s prediction
      Can we use these test examples during the training phase?
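
The evaluation loop on these slides (draw test examples, compare the learned function with the target, aggregate) might look like the sketch below. Both functions are hypothetical stand-ins: `target` is a made-up threshold rule on [0, 1), and `learned` deliberately uses a slightly different threshold so that the two disagree on some inputs. Accuracy is used as the quality measure here, which is only one possible choice.

```python
import random


def target(x: float) -> str:
    """Hypothetical target function, used here only to label test examples."""
    return "+" if x >= 0.5 else "-"


def learned(x: float) -> str:
    """Hypothetical learned function with a slightly wrong threshold."""
    return "+" if x >= 0.6 else "-"


rng = random.Random(0)

# Draw test examples from the instance space (assumed here to be [0, 1)).
test_examples = [rng.random() for _ in range(1000)]

# Compare the two predictions on each test example, then aggregate.
agreements = [learned(x) == target(x) for x in test_examples]
accuracy = sum(agreements) / len(agreements)
print(f"accuracy on {len(test_examples)} test examples: {accuracy:.3f}")
```

In this sketch the test examples are drawn separately and play no role in choosing `learned`.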

  29. Supervised learning: General setting
      Given: Training examples that are pairs of the form $(x, f(x))$

  30. Supervised learning: General setting
      The function $f$ is unknown

  31. Supervised learning: General setting
      Typically the input $x$ is represented as feature vectors
      • Example: $x \in \{0,1\}^d$ or $x \in \mathbb{R}^d$ (d-dimensional vectors)
      • A deterministic mapping from instances in your problem (e.g., news articles) to features

  32. Supervised learning: General setting
      For a training example $(x, f(x))$, the value of $f(x)$ is called its label

  33. Supervised learning: General setting
      The goal of learning: Use the training examples to find a good approximation for $f$
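
A deterministic mapping from a news article to a binary feature vector can be as simple as recording which words from a fixed vocabulary appear in the text. This is a minimal sketch: the four-word `VOCABULARY` and the function name `featurize` are hypothetical, and practical feature extractors are usually far richer.

```python
from typing import List

# Hypothetical vocabulary; each entry is one dimension, so d = 4 here.
VOCABULARY = ["election", "goal", "market", "vaccine"]


def featurize(article: str) -> List[int]:
    """Map an article to a vector in {0,1}^d: 1 if the vocabulary word
    occurs in the article, 0 otherwise."""
    words = set(article.lower().split())
    return [1 if term in words else 0 for term in VOCABULARY]


x = featurize("Market rallies after election results")
print(x)  # prints [1, 0, 1, 0]
```

The same article always maps to the same vector, which is what makes the mapping deterministic.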
