The Perceptron
  1. The Perceptron CMSC 422 Marine Carpuat marine@cs.umd.edu Credit: figures by Piyush Rai and Hal Daumé III

  2. This week • Project 1 posted – Form teams! – Due Wed March 2nd by 2:59pm • A new model/algorithm – the perceptron – and its variants: voted, averaged • Fundamental Machine Learning Concepts – Online vs. batch learning – Error-driven learning

  3. Geometry concept: Hyperplane • Separates a D-dimensional space into two half-spaces • Defined by an outward pointing normal vector w ∈ ℝ^D – w is orthogonal to any vector lying on the hyperplane • Hyperplane passes through the origin, unless we also define a bias term b

  4. Binary classification via hyperplanes • Let’s assume that the decision boundary is a hyperplane • Then, training consists in finding a hyperplane, i.e. a weight vector w, that separates positive from negative examples

  5. Binary classification via hyperplanes • At test time, we check on what side of the hyperplane examples fall: ŷ = sign(wᵀx + b)

  6. Function Approximation with Perceptron Problem setting • Set of possible instances X – Each instance x ∈ X is a feature vector x = [x_1, …, x_D] • Unknown target function f: X → Y – Y is binary valued {-1; +1} • Set of function hypotheses H = {h | h: X → Y} – Each hypothesis h is a hyperplane in D-dimensional space Input • Training examples {(x^(1), y^(1)), …, (x^(N), y^(N))} of unknown target function f Output • Hypothesis h ∈ H that best approximates target function f

  7. Perceptron: Prediction Algorithm
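The prediction algorithm on this slide is a figure that did not survive the transcript. A minimal NumPy sketch of the decision rule ŷ = sign(wᵀx + b), with illustrative names (`predict`, `w`, `b`; ties on the boundary broken toward +1 here):

```python
import numpy as np

def predict(w, b, x):
    """Classify x by which side of the hyperplane w.x + b = 0 it falls on."""
    activation = np.dot(w, x) + b
    # Return +1 on or above the hyperplane, -1 below it.
    return 1 if activation >= 0 else -1
```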

  8. Aside: biological inspiration Analogy: the perceptron as a neuron

  9. Perceptron Training Algorithm

  10. Properties of the Perceptron training algorithm • Online – We look at one example at a time, and update the model as soon as we make an error – As opposed to batch algorithms that update parameters after seeing the entire training set • Error-driven – We only update parameters/model if we make an error

  11. Perceptron update: geometric interpretation

  12. Practical considerations • The order of training examples matters! – Random is better • Early stopping – Good strategy to avoid overfitting • Simple modifications dramatically improve performance – voting or averaging
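The first point above can be sketched as follows: draw a fresh random permutation of the training set each epoch rather than visiting examples in a fixed order (assuming NumPy; the toy data and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1, 1, -1, -1])

w, b = np.zeros(1), 0.0
for epoch in range(3):
    order = rng.permutation(len(X))  # fresh random order each epoch
    for i in order:
        if y[i] * (np.dot(w, X[i]) + b) <= 0:  # error-driven update
            w += y[i] * X[i]
            b += y[i]
```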

  13. Predicting with • The voted perceptron • The averaged perceptron • Both require keeping track of the “survival time” of weight vectors

  14. How would you modify this algorithm for voted perceptron?
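One possible answer, as a hedged NumPy sketch: store every intermediate weight vector together with its survival count c, then let each stored hyperplane cast c votes at prediction time (names like `voted_perceptron_train` and `history` are illustrative):

```python
import numpy as np

def voted_perceptron_train(X, y, epochs=10):
    """Keep each intermediate (w, b) together with its survival count c."""
    n, d = X.shape
    w, b, c = np.zeros(d), 0.0, 0
    history = []  # list of (w, b, survival_count)
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (np.dot(w, X[i]) + b) <= 0:
                history.append((w.copy(), b, c))  # retire current weights
                w = w + y[i] * X[i]
                b = b + y[i]
                c = 1
            else:
                c += 1  # current weights survive one more example
    history.append((w.copy(), b, c))
    return history

def voted_predict(history, x):
    """Each stored hyperplane votes, weighted by how long it survived."""
    total = sum(c * np.sign(np.dot(w, x) + b) for w, b, c in history)
    return 1 if total >= 0 else -1
```

The memory cost is the main drawback: one weight vector per mistake must be stored, which motivates the averaged variant.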

  15. How would you modify this algorithm for averaged perceptron?

  16. Averaged perceptron decision rule can be rewritten as
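The rewritten rule on this slide is not in the transcript; the standard form, with c_k the survival time of the k-th weight vector (w_k, b_k), replaces the vote by a single averaged hyperplane:

```latex
\hat{y} = \mathrm{sign}\left( \Big(\sum_{k} c_k \, \mathbf{w}_k\Big)^{\top} \mathbf{x} \;+\; \sum_{k} c_k \, b_k \right)
```

Because sign is invariant to positive scaling, the weighted sum and the weighted average give the same prediction, so only the running sums need to be kept.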

  17. Averaged Perceptron Training
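The slide's pseudocode is missing from this transcript; a minimal sketch of one common formulation, which adds the current weights to a running sum at every step so that each weight vector is implicitly weighted by its survival time (names illustrative):

```python
import numpy as np

def averaged_perceptron_train(X, y, epochs=10):
    """Maintain running sums of (w, b) instead of storing every weight vector."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    w_sum, b_sum = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (np.dot(w, X[i]) + b) <= 0:
                w += y[i] * X[i]
                b += y[i]
            # Adding the current weights once per example weights each
            # weight vector by the number of steps it survives.
            w_sum += w
            b_sum += b
    # Dividing by the step count gives the average; scaling does not
    # change sign(w.x + b), so the sums alone would also work.
    return w_sum / (epochs * n), b_sum / (epochs * n)
```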

  18. Can the perceptron always find a hyperplane to separate positive from negative examples?
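A classic illustration of why the answer is "no": the XOR labeling of four points is not linearly separable, so the perceptron keeps making errors no matter how long it trains (toy sketch, assuming NumPy):

```python
import numpy as np

# XOR labels are not linearly separable: no hyperplane w.x + b = 0
# classifies all four points correctly, so the perceptron never converges.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

w, b = np.zeros(2), 0.0
for _ in range(1000):  # many epochs; still no separating hyperplane emerges
    for i in range(len(X)):
        if y[i] * (np.dot(w, X[i]) + b) <= 0:
            w += y[i] * X[i]
            b += y[i]

# Whatever (w, b) we end up with, at least one point is misclassified.
errors = sum(y[i] * (np.dot(w, X[i]) + b) <= 0 for i in range(len(X)))
```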

  19. This week • Project 1 posted – Form teams! – Due Wed March 2nd by 2:59pm • A new model/algorithm – the perceptron – and its variants: voted, averaged • Fundamental Machine Learning Concepts – Online vs. batch learning – Error-driven learning
