Supervised Learning: The Setup
Machine Learning
Last lecture
We saw
– What is learning? Learning as generalization
– The badges game
This lecture
• More badges
• Formalizing supervised learning
  – Instance space and features: What are the inputs to the learning problem?
  – Label space: What is the output of the learned function?
  – Hypothesis space: What is being learned?
Some slides based on lectures from Tom Dietterich, Dan Roth
The badges game
Let’s play

Name                    Label
Claire Cardie           -
Peter Bartlett          +
Eric Baum               +
Haym Hirsh              -
Leslie Pack Kaelbling   +
Yoav Freund             -

(Full data on the class website, you can stare at it longer if you want)

What is the label for Indiana Jones?

How were the labels generated?
If the last letter of the first name comes alphabetically before the last letter of the last name: label = +
else: label = -
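The labeling rule can be written down as a short function. This is a minimal sketch, not code from the course: the name `badge_label` is made up for illustration, and it assumes that middle names are ignored and that the comparison is case-insensitive.

```python
def badge_label(full_name: str) -> str:
    """Label a name with the rule stated on the slide (illustrative helper)."""
    parts = full_name.split()
    first, last = parts[0], parts[-1]  # assumption: middle names are ignored
    # Compare the final letters alphabetically, ignoring case.
    return "+" if first[-1].lower() < last[-1].lower() else "-"


# The six labeled names shown on the slide.
examples = {
    "Claire Cardie": "-",
    "Peter Bartlett": "+",
    "Eric Baum": "+",
    "Haym Hirsh": "-",
    "Leslie Pack Kaelbling": "+",
    "Yoav Freund": "-",
}

predictions = {name: badge_label(name) for name in examples}
indiana = badge_label("Indiana Jones")
```

Under these assumptions the function reproduces all six labels, and Indiana Jones gets + because "a" comes before "s".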
Questions to think about
• How could you be certain that you got the right function?
• How did you arrive at it?
• Learning issues: Is this prediction or just modeling data? Is there a difference?
• How did you know that you should look at the letters?
• What background knowledge about letters did you use? How did you know that it is relevant?
• What “learning algorithm” did you use?
What is supervised learning?
Instances and Labels
Running example: Automatically tag news articles
• An instance: a news article that needs to be classified
• A label: the tag assigned to the article
• Instance Space: All possible news articles
• Label Space: All possible labels
Instances and Labels
𝒳: Instance Space. The set of examples that need to be classified
  E.g.: The set of all possible names, documents, sentences, images, emails, etc.
𝒴: Label Space. The set of all possible labels
  E.g.: {Spam, Not-Spam}, {+, -}, etc.
Target function: 𝑦 = 𝑓(𝑥)
The goal of learning: Find this target function
Learning is search over functions
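One way to picture "learning is search over functions" is to enumerate a tiny set of candidate functions and keep those that agree with the labeled examples. The sketch below is only an illustration under stated assumptions: the instance space is all pairs of bits, the labels are 0 and 1, and the three training examples are made up.

```python
from itertools import product

# Assumed toy instance space: all pairs of bits. Assumed label space: {0, 1}.
inputs = list(product([0, 1], repeat=2))

# Every possible function from instances to labels, stored as a truth table.
hypotheses = [
    dict(zip(inputs, outputs))
    for outputs in product([0, 1], repeat=len(inputs))
]

# Hypothetical labeled examples of the unknown target function.
training = [((0, 0), 0), ((0, 1), 1), ((1, 1), 1)]

# Search: keep only the candidates that agree with every training example.
consistent = [
    h for h in hypotheses
    if all(h[x] == y for x, y in training)
]

# Labels the surviving candidates assign to the one unseen instance.
unseen_labels = {h[(1, 0)] for h in consistent}
```

Of the 16 candidates, two survive, and they disagree on the unseen instance (1, 0). The training data alone cannot choose between them, which is why the choice of the space being searched matters.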
Supervised learning
𝒳: Instance Space. The set of examples that need to be classified
𝒴: Label Space. The set of all possible labels
Target function: 𝑦 = 𝑓(𝑥)
Learning algorithm only sees examples of the function 𝑓 in action
Labeled training data: (𝑥_1, 𝑓(𝑥_1)), (𝑥_2, 𝑓(𝑥_2)), (𝑥_3, 𝑓(𝑥_3)), …, (𝑥_N, 𝑓(𝑥_N))
Labeled training data → Learning algorithm → A learned function: 𝒳 → 𝒴
This is the training phase.
Can you think of other training protocols?
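The training phase can be sketched as a function that takes labeled examples and returns a learned function. The learner below is a deliberately naive stand-in, not an algorithm from the lecture: it memorizes the examples it has seen and falls back to the most common training label otherwise. The names `train` and `learned` are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple


def train(examples: Sequence[Tuple[Hashable, str]]) -> Callable[[Hashable], str]:
    """Build a learned function using only labeled examples (illustrative learner)."""
    memory = dict(examples)  # remember every labeled instance
    majority = Counter(label for _, label in examples).most_common(1)[0][0]

    def learned(x: Hashable) -> str:
        # Seen instances get their stored label; unseen ones get the majority label.
        return memory.get(x, majority)

    return learned


# A few labeled names from the badges game serve as training data.
training_data = [("Peter Bartlett", "+"), ("Eric Baum", "+"), ("Haym Hirsh", "-")]
learned = train(training_data)

seen_prediction = learned("Haym Hirsh")
unseen_prediction = learned("Indiana Jones")
```

The learner never sees the target function itself, only its outputs on the training instances. On an unseen name it can only guess, here with the majority label.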
Supervised learning: Evaluation
𝒳: Instance Space. The set of examples that need to be classified
𝒴: Label Space. The set of all possible labels
Target function: 𝑦 = 𝑓(𝑥)
Learned function: also maps 𝒳 to 𝒴
Draw a test example 𝑥 ∈ 𝒳 and compare 𝑓(𝑥) with the learned function's prediction on 𝑥. Are they different? How different?
Apply the model to many test examples and compare to the target’s prediction
Aggregate these results to get a quality measure
Can we use these test examples during the training phase?
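Aggregating the comparisons into a quality measure could look like the accuracy computation below. Everything here is assumed for illustration: the `target` and `learned` functions are stand-in threshold rules on integers, and in practice the target is known only through the labels of the test examples.

```python
from typing import Callable, Iterable


def target(x: int) -> int:
    # Stand-in target function (unknown in practice): label 1 when x >= 5.
    return 1 if x >= 5 else 0


def learned(x: int) -> int:
    # Stand-in learned function with a slightly wrong threshold.
    return 1 if x >= 7 else 0


def accuracy(
    model: Callable[[int], int],
    truth: Callable[[int], int],
    test_examples: Iterable[int],
) -> float:
    """Fraction of test examples on which the model agrees with the target."""
    examples = list(test_examples)
    agreements = sum(1 for x in examples if model(x) == truth(x))
    return agreements / len(examples)


test_examples = range(10)
acc = accuracy(learned, target, test_examples)
disagreements = [x for x in test_examples if learned(x) != target(x)]
```

Here the two functions disagree only on 5 and 6, so the accuracy on these ten test examples is 0.8. Accuracy is one possible aggregate; other measures answer "how different?" in other ways.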
Supervised learning: General setting
Given: Training examples that are pairs of the form (𝑥, 𝑓(𝑥))
The function 𝑓 is unknown
Typically the input 𝑥 is represented as feature vectors
• Example: 𝑥 ∈ {0,1}^d or 𝑥 ∈ ℜ^d (d-dimensional vectors)
• A deterministic mapping from instances in your problem (e.g., news articles) to features
For a training example (𝑥, 𝑓(𝑥)), the value of 𝑓(𝑥) is called its label
The goal of learning: Use the training examples to find a good approximation for 𝑓
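A deterministic mapping from a news article to a binary feature vector might look like the sketch below. The vocabulary and the `featurize` helper are hypothetical, chosen only to illustrate the idea; real feature sets are much larger and use more careful tokenization.

```python
from typing import List

# Hypothetical vocabulary: one feature per term, so d = 4 here.
VOCABULARY = ["election", "goal", "market", "vaccine"]


def featurize(article: str) -> List[int]:
    """Map an article to a vector in {0,1}^d: 1 if the term occurs, else 0."""
    words = set(article.lower().split())  # naive whitespace tokenization
    return [1 if term in words else 0 for term in VOCABULARY]


x = featurize("Market rallies after the election")
```

The same article always yields the same vector, here [1, 0, 1, 0], which is what makes the mapping deterministic.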