
Lecture 3: Loss functions and Optimization
Fei-Fei Li & Andrej Karpathy & Justin Johnson
Lecture 3, 11 Jan 2016

Administrative
A1 is due Jan 20 (Wednesday). ~9 days left.
Warning: Jan 18 (Monday) is a holiday (no class/office hours).

Recall from last time… Challenges in Visual Recognition: deformation, camera pose, illumination, occlusion, intraclass variation, background clutter.

Recall from last time… data-driven approach, kNN. (Figure panels: the data, NN classifier, 5-NN classifier.)

Recall from last time… Linear classifier: f(x, W) takes the image x, a [32x32x3] array of numbers 0...1 (3072 numbers total), together with the parameters W, and outputs 10 numbers indicating class scores.
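As a rough illustration of the shapes involved, the sketch below assumes the linear form f(x, W) = W x. The slide only names f(x, W); the matrix product, the random values, and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one image: 32x32x3 values in [0, 1).
image = rng.random((32, 32, 3))
x = image.reshape(-1)  # flatten to 3072 numbers

# Assumed linear form f(x, W) = W @ x, with one row of W per class.
W = 0.01 * rng.standard_normal((10, 3072))


def f(x, W):
    """Return one score per class for a flattened image x."""
    return W @ x


scores = f(x, W)
print(x.shape, scores.shape)  # (3072,) (10,)
```

Each row of W acts as a template for one class, so the 10 outputs are 10 dot products against the same 3072-number input.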

Recall from last time… Going forward: Loss function/Optimization
TODO:
1. Define a loss function that quantifies our unhappiness with the scores across the training data.
2. Come up with a way of efficiently finding the parameters that minimize the loss function. (optimization)
(Figure: table of example class scores.)

Suppose: 3 training examples, 3 classes. With some W the scores are:

         cat image   car image   frog image
cat         3.2         1.3         2.2
car         5.1         4.9         2.5
frog       -1.7         2.0        -3.1

Multiclass SVM loss: Given an example $(x_i, y_i)$ where $x_i$ is the image and $y_i$ is the (integer) label, and using the shorthand $s = f(x_i, W)$ for the scores vector, the SVM loss has the form:

$L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + 1)$

First example (cat image, correct-class score 3.2):
L_1 = max(0, 5.1 - 3.2 + 1) + max(0, -1.7 - 3.2 + 1)
    = max(0, 2.9) + max(0, -3.9)
    = 2.9 + 0
    = 2.9

Second example (car image, correct-class score 4.9):
L_2 = max(0, 1.3 - 4.9 + 1) + max(0, 2.0 - 4.9 + 1)
    = max(0, -2.6) + max(0, -1.9)
    = 0 + 0
    = 0

Third example (frog image, correct-class score -3.1):
L_3 = max(0, 2.2 - (-3.1) + 1) + max(0, 2.5 - (-3.1) + 1)
    = max(0, 6.3) + max(0, 6.6)
    = 6.3 + 6.6
    = 12.9

The full training loss is the mean over all examples in the training data:

$L = \frac{1}{N} \sum_{i=1}^{N} L_i$

With per-example losses 2.9, 0, and 12.9 (the third being max(0, 6.3) + max(0, 6.6)):
L = (2.9 + 0 + 12.9)/3 ≈ 5.27
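The arithmetic for the three examples can be checked with a few lines of plain Python. This is a minimal sketch: the helper name `svm_loss` and the list layout are hypothetical, and the margin of 1 is taken from the worked sums in the example.

```python
def svm_loss(scores, correct, margin=1.0):
    """Multiclass SVM loss for one example: sum over classes j != correct."""
    return sum(
        max(0.0, s - scores[correct] + margin)
        for j, s in enumerate(scores)
        if j != correct
    )


# Each entry: ([cat, car, frog] scores, index of the correct class).
examples = [
    ([3.2, 5.1, -1.7], 0),  # cat image
    ([1.3, 4.9, 2.0], 1),   # car image
    ([2.2, 2.5, -3.1], 2),  # frog image
]

losses = [svm_loss(s, y) for s, y in examples]
mean_loss = sum(losses) / len(losses)
print([round(l, 1) for l in losses], round(mean_loss, 2))
```

Under these assumptions the per-example losses come out to roughly 2.9, 0, and 12.9, for a mean of about 5.27.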

Q: what if the sum was instead over all classes (including j = y_i)?

Q2: what if we used a mean instead of a sum here?

Q3: what if we used [equation missing from the extracted slide]

Q4: what is the min/max possible loss?

Q5: usually at initialization W are small numbers, so all s ~= 0. What is the loss?
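One way to explore Q5 numerically is to evaluate the loss on an all-zero scores vector. The sketch below assumes the sum-over-incorrect-classes form with a margin of 1; the helper name `svm_loss` is hypothetical.

```python
def svm_loss(scores, correct, margin=1.0):
    """Multiclass SVM loss for one example: sum over classes j != correct."""
    return sum(
        max(0.0, s - scores[correct] + margin)
        for j, s in enumerate(scores)
        if j != correct
    )


num_classes = 3
zero_scores = [0.0] * num_classes

# Every incorrect class contributes max(0, 0 - 0 + 1) = 1.
loss_at_init = svm_loss(zero_scores, correct=0)
print(loss_at_init)  # 2.0, i.e. num_classes - 1
```

This makes a handy sanity check: a freshly initialized classifier whose loss is far from (number of classes - 1) probably has a bug somewhere.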

Example numpy code:
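A minimal numpy sketch of a vectorized per-example loss follows. It is an illustration rather than the code shown in lecture: the function name, the assumption that scores are computed as `W.dot(x)`, and the toy inputs are all hypothetical.

```python
import numpy as np


def svm_loss_vectorized(x, y, W):
    """Multiclass SVM loss for one example (x, y), assuming scores = W.dot(x)."""
    scores = W.dot(x)
    # Margin of every class against the correct class, clipped at zero.
    margins = np.maximum(0, scores - scores[y] + 1)
    # The correct class would otherwise contribute max(0, 1) = 1.
    margins[y] = 0
    return np.sum(margins)


# Toy check: with W as the identity, the scores equal x itself,
# so this reproduces the cat example (scores 3.2, 5.1, -1.7, label 0).
x = np.array([3.2, 5.1, -1.7])
W = np.eye(3)
loss = svm_loss_vectorized(x, 0, W)
print(round(float(loss), 1))  # 2.9
```

Zeroing `margins[y]` after the fact avoids an explicit loop over the incorrect classes while still excluding j = y_i from the sum.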


There is a bug with the loss: E.g. suppose that we found a W such that L = 0. Is this W unique?
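One way to probe the uniqueness question: assuming linear scores s = W x, scaling W by a constant scales every score by the same constant. The sketch below applies that to the car example's scores, which already have zero loss. The helper name `svm_loss` and the chosen scale factors are hypothetical.

```python
def svm_loss(scores, correct, margin=1.0):
    """Multiclass SVM loss for one example: sum over classes j != correct."""
    return sum(
        max(0.0, s - scores[correct] + margin)
        for j, s in enumerate(scores)
        if j != correct
    )


car_scores = [1.3, 4.9, 2.0]  # car image; correct class index is 1

# Scaling W by k (assumed linear classifier) scales the scores by k.
scaled_losses = {}
for k in (1, 2, 10):
    scaled = [k * s for s in car_scores]
    scaled_losses[k] = svm_loss(scaled, correct=1)

print(scaled_losses)
```

Every scale factor of 1 or more keeps the loss at zero here, because stretching the scores only widens margins that were already satisfied. The loss alone therefore cannot single out one W.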
