Lecture 3: Linear Classifiers
Justin Johnson
September 11, 2019

Reminder: Assignment 1
http://web.eecs.umich.edu/~justincj/teaching/eecs498/assignment1.html
Due Sunday September 15, 11:59pm EST
Image classification: the input is an image, and the output assigns that image to one label from a fixed set of categories.

Recall the challenges of image classification: viewpoint changes, illumination, deformation, occlusion, clutter, and intraclass variation.
Recall from last time: K-Nearest Neighbor classifiers (e.g. the decision boundaries of a 1-NN vs. a 5-NN classifier), and splitting data into train / validation / test sets to choose hyperparameters.

Recall CIFAR-10: 50,000 training images (each 32x32x3) and 10,000 test images.
Parametric approach: an input image x is an array of 32x32x3 numbers (3072 numbers total). A parametric classifier computes class scores as a function f(x, W) of the image x and learnable weights W; for a linear classifier, f(x, W) = Wx + b, producing one score per class.
Worked example: take a (2, 2) input image with pixel values 56, 231, 24, 2 and stretch the pixels into a column vector x of shape (4,). With a weight matrix W of shape (3, 4) and a bias vector b of shape (3,), the classifier f(x, W) = Wx + b produces a score vector of shape (3,), one score per class; in the slide's example two of the resulting scores are 437.9 and 61.95.
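The same computation in NumPy, as a minimal sketch. The W and b entries below are illustrative values, not necessarily the exact ones on the slide, but the shapes match the example:

```python
import numpy as np

x = np.array([56., 231., 24., 2.])        # (4,) flattened 2x2 image
W = np.array([[ 0.2, -0.5,  0.1,  2.0],   # (3, 4): one row of weights per class
              [ 1.5,  1.3,  2.1,  0.0],
              [ 0.0,  0.25, 0.2, -0.3]])
b = np.array([1.1, 3.2, -1.2])            # (3,): one bias per class

scores = W @ x + b                        # (3,): one score per class
print(scores)                             # [-96.8  437.9   60.75] for these values
```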
Bias trick: append an extra constant 1 to the data vector, so x has shape (5,); the bias is absorbed into the last column of the weight matrix, which becomes shape (3, 5). The classifier is then a single matrix-vector multiply that produces the same (3,) scores.
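A quick NumPy check of the bias trick, sketched with random values (`x_aug` and `W_aug` are names introduced here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))   # weights: 3 classes, 4 input values
b = rng.standard_normal(3)        # one bias per class
x = rng.standard_normal(4)        # flattened input image

x_aug = np.concatenate([x, [1.0]])   # (5,): append a constant 1 to the data vector
W_aug = np.hstack([W, b[:, None]])   # (3, 5): bias absorbed into the last column

# Single matrix multiply gives the same scores as Wx + b
assert np.allclose(W_aug @ x_aug, W @ x + b)
```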
Predictions are linear in the input: if we scale the image by 0.5, the class scores are also scaled by 0.5 (e.g. scores of 437.8 and 62.0 become 218.9 and 31.0). Halving the brightness of an image halves its scores, which may be counterintuitive.
Interpreting a linear classifier, algebraic viewpoint: f(x, W) = Wx + b is a matrix-vector multiply plus a bias, as in the worked example above.
Interpreting a linear classifier, visual viewpoint: instead of stretching the image into a column, reshape each row of W back to the shape of the input image. Each class score is then the inner product between the image and a per-class weight "template", plus a per-class bias.
Seen this way, a linear classifier has one "template" per category. A single template cannot capture multiple modes of the data: e.g. the learned horse template has two heads, because it has to average over left-facing and right-facing horses.
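A sketch of how such templates can be visualized, assuming a trained CIFAR-10 weight matrix W of shape (10, 3072); a random matrix is used below as a stand-in so the snippet runs:

```python
import numpy as np
import matplotlib.pyplot as plt

# Each row of W corresponds to one class and can be reshaped back to 32x32x3.
W = np.random.randn(10, 3072)   # stand-in for a trained weight matrix

templates = W.reshape(10, 32, 32, 3)
templates = (templates - templates.min()) / (templates.max() - templates.min())  # rescale to [0, 1] for display

fig, axes = plt.subplots(1, 10, figsize=(20, 2))
for k, ax in enumerate(axes):
    ax.imshow(templates[k])   # one "template" image per category
    ax.axis('off')
plt.show()
```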
Interpreting a linear classifier, geometric viewpoint: the image is an array of 32x32x3 numbers (3072 numbers total), i.e. a point in a 3072-dimensional space. If we plot a classifier score (airplane, car, deer, ...) against the value of a single pixel, say pixel (15, 8, 0), each score is a linear function of that pixel value.
Looking at a 2D slice of the space, with pixel (15, 8, 0) on one axis and pixel (11, 11, 0) on the other: the set of images where the car score equals 0 is a line, the car score increases as we move perpendicular to that line, and the car template lies along that perpendicular direction. The deer and airplane scores define their own lines in the same way. In the full input space, the linear classifier corresponds to hyperplanes carving up a high-dimensional space. (Plot created using Wolfram Cloud.)
Hard cases for a linear classifier (in each case no hyperplane can separate the two classes):
Class 1: first and third quadrants; Class 2: second and fourth quadrants.
Class 1: 1 <= L2 norm <= 2; Class 2: everything else.
Class 1: three modes; Class 2: everything else.
Recall that a perceptron could not learn XOR:
x  y  F(x, y)
0  0  0
0  1  1
1  0  1
1  1  0
No linear function of x and y can produce this table.
So far: three ways to think about a linear classifier f(x, W) = Wx. Algebraic viewpoint: a matrix-vector multiply. Visual viewpoint: one template per class. Geometric viewpoint: hyperplanes cutting up space.
So far we have defined a linear score function, but we have not said how to choose W. Given some W, the classifier assigns a score to each class for every training image (the slides show example scores for cat, car, and frog images).

TODO:
1. Use a loss function to quantify how good a value of W is.
2. Find a W that minimizes the loss function (optimization).
A loss function tells how good our current classifier is: low loss = good classifier, high loss = bad classifier. (Also called: objective function, cost function.) The negative of a loss function is sometimes called a reward function, profit function, utility function, fitness function, etc.

Given a dataset of examples {(x_i, y_i)} for i = 1..N, where x_i is an image and y_i is an (integer) label, the loss for a single example is L_i(f(x_i, W), y_i), and the loss for the dataset is the average of the per-example losses:

L = (1/N) Σ_i L_i(f(x_i, W), y_i)
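As a minimal sketch of this pattern (the `per_example_loss` argument is a placeholder for any of the per-example losses defined below, such as the SVM or cross-entropy loss):

```python
# Dataset loss = mean of per-example losses.
def dataset_loss(per_example_loss, scores, labels):
    N = len(labels)
    return sum(per_example_loss(scores[i], labels[i]) for i in range(N)) / N
```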
Multiclass SVM loss: "The score of the correct class should be higher than all the other scores." Plotting the loss against the score for the correct class: the loss is zero once the correct-class score exceeds the highest score among the other classes by a margin, and grows linearly below that point; this shape is called the "hinge loss".

Given an example (x_i, y_i) (x_i is the image, y_i is the label), let s = f(x_i, W) be the scores. Then the SVM loss has the form:

L_i = Σ_{j ≠ y_i} max(0, s_j − s_{y_i} + 1)
Example with three training images (cat, car, frog) and three classes, with scores:
cat image:  cat 3.2, car 5.1, frog -1.7
car image:  cat 1.3, car 4.9, frog 2.0
frog image: cat 2.2, car 2.5, frog -3.1

Cat image loss:  L = max(0, 5.1 - 3.2 + 1) + max(0, -1.7 - 3.2 + 1) = max(0, 2.9) + max(0, -3.9) = 2.9 + 0 = 2.9
Car image loss:  L = max(0, 1.3 - 4.9 + 1) + max(0, 2.0 - 4.9 + 1) = max(0, -2.6) + max(0, -1.9) = 0 + 0 = 0
Frog image loss: L = max(0, 2.2 - (-3.1) + 1) + max(0, 2.5 - (-3.1) + 1) = max(0, 6.3) + max(0, 6.6) = 6.3 + 6.6 = 12.9

Loss over the dataset is the average: L = (2.9 + 0.0 + 12.9) / 3 = 5.27
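A minimal NumPy sketch of this loss; the score matrix below is the cat/car/frog example from above, and the result reproduces the 5.27:

```python
import numpy as np

def svm_loss(scores, y, margin=1.0):
    """Multiclass SVM (hinge) loss, averaged over examples.
    scores: (N, C) array of class scores; y: (N,) integer labels."""
    N = scores.shape[0]
    correct = scores[np.arange(N), y][:, None]           # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + margin)   # (N, C) hinge terms
    margins[np.arange(N), y] = 0                         # do not count j == y_i
    return margins.sum() / N

scores = np.array([[3.2, 5.1, -1.7],    # cat image
                   [1.3, 4.9,  2.0],    # car image
                   [2.2, 2.5, -3.1]])   # frog image
y = np.array([0, 1, 2])                 # correct classes: cat, car, frog
print(svm_loss(scores, y))              # ≈ 5.27
```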
Some questions to think about for the SVM loss:
Q1: What happens to the loss if the scores for the car image change a bit?
Q2: What are the min and max possible loss?
Q3: If all the scores were random, what loss would we expect?
Q4: What would happen if the sum were over all classes (including j = y_i)?
Q5: What if the loss used a mean instead of a sum over classes?
Q6: What if we used a squared hinge, L_i = Σ_{j ≠ y_i} max(0, s_j − s_{y_i} + 1)^2, instead?
Original W gives zero loss on the car image: L = max(0, 1.3 - 4.9 + 1) + max(0, 2.0 - 4.9 + 1) = max(0, -2.6) + max(0, -1.9) = 0 + 0 = 0. Using 2W instead also gives zero loss: L = max(0, 2.6 - 9.8 + 1) + max(0, 4.0 - 9.8 + 1) = max(0, -6.2) + max(0, -4.8) = 0 + 0 = 0.
How should we choose between W and 2W if they both perform the same on the training data?
Answer: add a regularization term to the loss that expresses a preference among weights:

L(W) = (1/N) Σ_i L_i(f(x_i, W), y_i) + λ R(W)

Data loss: model predictions should match the training data. Regularization: prevent the model from doing too well on the training data. λ = regularization strength (a hyperparameter).

Simple examples of R(W):
L2 regularization: R(W) = Σ_k Σ_l W_{k,l}^2
L1 regularization: R(W) = Σ_k Σ_l |W_{k,l}|
Elastic net (L1 + L2): R(W) = Σ_k Σ_l (β W_{k,l}^2 + |W_{k,l}|)
More complex: Dropout, Batch normalization, Cutout, Mixup, Stochastic depth, etc.

Purpose of regularization: express preferences over the weights beyond "minimize training error", and prefer simpler models that generalize better to unseen data.
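A sketch of these regularizers in NumPy (the function names and the default λ are illustrative choices, not prescribed by the lecture):

```python
import numpy as np

def l2_reg(W):
    return np.sum(W * W)              # sum of squared weights

def l1_reg(W):
    return np.sum(np.abs(W))          # sum of absolute weights

def elastic_net_reg(W, beta=0.5):
    return np.sum(beta * W * W + np.abs(W))

def regularized_loss(data_loss, W, reg_fn=l2_reg, lam=1e-4):
    # Full loss L(W) = data loss + lambda * R(W)
    return data_loss + lam * reg_fn(W)
```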
L2 regularization likes to "spread out" the weights: among weight vectors that produce the same scores, it prefers the one whose weight is distributed across many dimensions rather than concentrated in a few.
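A tiny illustration of this preference (the particular vectors are chosen for illustration): both weight vectors give the same score on x, but L2 regularization penalizes the concentrated one more.

```python
import numpy as np

x  = np.array([1.0, 1.0, 1.0, 1.0])
w1 = np.array([1.0, 0.0, 0.0, 0.0])        # all weight on one input
w2 = np.array([0.25, 0.25, 0.25, 0.25])    # weight spread across all inputs

print(w1 @ x, w2 @ x)                      # same score: 1.0 1.0
print(np.sum(w1**2), np.sum(w2**2))        # L2 penalty: 1.0 vs 0.25
```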
Example: the model f1 fits the training data perfectly, while the model f2 has some training error but is simpler. Regularization pushes against fitting the data too well, so we don't fit noise in the data, and the simpler f2 may generalize better to new points. (f1 here is not a linear model; it could be polynomial regression, etc.)

Regularization is important! You should (usually) use it.
Cross-entropy loss (multinomial logistic regression): we want to interpret the raw classifier scores as probabilities.

The scores s = f(x_i; W) are unnormalized log-probabilities (logits). Probabilities must be >= 0, so exponentiate the scores to get unnormalized probabilities; probabilities must sum to 1, so then normalize. Together these two steps are the softmax function:

P(Y = k | X = x_i) = exp(s_k) / Σ_j exp(s_j)

This corresponds to maximum likelihood estimation: choose the weights to maximize the likelihood of the observed data (see EECS 445 or EECS 545).

To turn this into a loss, compare the predicted probabilities against the correct probabilities, which put all of their mass on the correct class. The comparison can be phrased as a Kullback–Leibler divergence or, equivalently for a one-hot target, as the cross-entropy between the two distributions.
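A minimal NumPy sketch of the softmax function; subtracting the max before exponentiating is a common numerical-stability trick and does not change the output:

```python
import numpy as np

def softmax(s):
    """Map raw scores (logits) to probabilities that are >= 0 and sum to 1."""
    s = s - np.max(s)            # stability shift; softmax is invariant to it
    exp_s = np.exp(s)
    return exp_s / np.sum(exp_s)

print(softmax(np.array([3.2, 5.1, -1.7])))   # probabilities summing to 1
```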
Putting it all together: to maximize the probability of the correct class, minimize its negative log probability. The cross-entropy (softmax) loss for one example is:

L_i = -log P(Y = y_i | X = x_i) = -log( exp(s_{y_i}) / Σ_j exp(s_j) )
Summary: we defined a linear score function f(x, W) = Wx, which can be interpreted algebraically (a matrix-vector multiply), visually (one template per class), or geometrically (hyperplanes cutting up space). To quantify how good a value of W is, we defined two loss functions, the Softmax (cross-entropy) loss and the SVM (hinge) loss, and a full loss that adds a regularization term R(W) to the data loss.
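Putting the data loss and the regularizer together, a sketch of a full training loss using the softmax data loss and L2 regularization (the W, X, y below are random placeholders just to show the call):

```python
import numpy as np

def training_loss(W, X, y, lam=1e-4):
    """Full loss: mean cross-entropy data loss + lambda * L2 regularization.
    W: (C, D) weights, X: (N, D) flattened images, y: (N,) integer labels."""
    scores = X @ W.T                                          # (N, C) class scores
    shifted = scores - scores.max(axis=1, keepdims=True)      # stability shift
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    data_loss = -log_probs[np.arange(len(y)), y].mean()       # mean -log P(correct)
    return data_loss + lam * np.sum(W * W)                    # add L2 penalty

# Tiny random example just to show the call
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3072)) * 0.01
X = rng.standard_normal((5, 3072))
y = rng.integers(0, 10, size=5)
print(training_loss(W, X, y))
```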