  1. Administrative: how is the assignment going? btw, the notes get updated all the time based on your feedback; no lecture on Monday.

  2. Lecture 4: Optimization

  3. Image Classification: assume a given set of discrete labels {dog, cat, truck, plane, ...} (example image: cat)

  4. Data-driven approach

  5. 1. Score function
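
The score function in these lectures is the linear map f(x, W) = Wx + b. A minimal numpy sketch, assuming CIFAR-10-sized inputs (the names and shapes are illustrative, not from the slides):

```python
import numpy as np

def scores(x, W, b):
    """Linear score function f(x, W) = W x + b.
    x: [3072] flattened 32x32x3 image, W: [10 x 3072], b: [10] -> 10 class scores."""
    return W.dot(x) + b

x = np.random.randn(3072)               # stand-in for a flattened image
W = np.random.randn(10, 3072) * 0.0001  # small random weights
b = np.zeros(10)
print(scores(x, W, b).shape)            # (10,)
```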

  6. [figure]

  7. 1. Score function; 2. Two loss functions
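
A minimal sketch of the two loss functions the course pairs with the score function, the multiclass SVM (hinge) loss and the softmax cross-entropy loss, for a single example (names are illustrative):

```python
import numpy as np

def svm_loss(s, y, delta=1.0):
    """Multiclass SVM loss for one example: s = class scores, y = correct class index."""
    margins = np.maximum(0, s - s[y] + delta)
    margins[y] = 0                       # the correct class contributes no margin term
    return margins.sum()

def softmax_loss(s, y):
    """Softmax cross-entropy loss for one example."""
    s = s - s.max()                      # shift scores for numeric stability
    p = np.exp(s) / np.exp(s).sum()
    return -np.log(p[y])

s = np.array([3.2, 5.1, -1.7])           # example scores; correct class is 0
print(svm_loss(s, 0), softmax_loss(s, 0))
```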

  8. [figure]

  9. Three key components to training Neural Nets: 1. Score function, 2. Loss function, 3. Optimization

  10–11. Brief aside: Image Features. In practice, it is very rare to see Computer Vision applications that train linear classifiers directly on pixel values.

  12. Example: Color (Hue) Histogram: each pixel casts a +1 vote into one of the hue bins.
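
A sketch of such a hue-histogram feature (the bin count and names are assumptions, not from the slides):

```python
import colorsys
import numpy as np

def hue_histogram(img_rgb, bins=16):
    """img_rgb: [H x W x 3] array with values in [0, 1].
    Each pixel adds +1 to the bin of its hue; returns a fixed-size feature vector."""
    H, W, _ = img_rgb.shape
    hues = np.array([colorsys.rgb_to_hsv(*img_rgb[i, j])[0]
                     for i in range(H) for j in range(W)])
    hist, _ = np.histogram(hues, bins=bins, range=(0.0, 1.0))
    return hist

img = np.random.rand(32, 32, 3)          # stand-in image
print(hue_histogram(img))                 # 16 bin counts
```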

  13. Example: HOG features: over each 8x8 pixel region, quantize the edge orientation into 9 bins (images from vlfeat.org).
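
A toy version of the idea; real HOG adds block normalization and interpolation (see vlfeat.org), while this sketch only does the 9-bin orientation quantization per 8x8 cell:

```python
import numpy as np

def hog_like_features(gray, cell=8, bins=9):
    """For each cell x cell region of a grayscale image, build a bins-bin
    histogram of gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    H, W = gray.shape
    feats = []
    for i in range(0, H - cell + 1, cell):
        for j in range(0, W - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

print(hog_like_features(np.random.rand(32, 32)).shape)  # 16 cells * 9 bins = (144,)
```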

  14–15. Example: Bag of Words: 1. Resize each patch to a fixed size (e.g. 32x32 pixels); 2. Extract HOG on the patch (get 144 numbers); repeat for each detected feature, giving a matrix of size [number_of_features x 144]. Problem: different images will have different numbers of features, but we need fixed-size vectors for linear classification.

  16. Example: Bag of Words: learn k-means centroids, a "vocabulary" of visual words (e.g. 1000 centroids), over the 144-d descriptors; each image is then encoded as a fixed-size 1000-d histogram of visual words. A sketch follows:
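
A minimal sketch of this vocabulary-plus-histogram step (the k-means here is a toy implementation and the function names are illustrative; a small k is used for the demo):

```python
import numpy as np

def nearest_word(descriptors, centroids):
    """Index of the nearest centroid for each descriptor (squared Euclidean)."""
    d = ((descriptors ** 2).sum(1)[:, None]
         + (centroids ** 2).sum(1)[None, :]
         - 2.0 * descriptors @ centroids.T)
    return d.argmin(1)

def build_vocabulary(descriptors, k=1000, iters=10):
    """Toy k-means over the pooled [num_features x 144] descriptors;
    the k centroids form the "vocabulary" of visual words."""
    centroids = descriptors[np.random.choice(len(descriptors), k, replace=False)].copy()
    for _ in range(iters):
        assign = nearest_word(descriptors, centroids)
        for c in range(k):
            if (assign == c).any():
                centroids[c] = descriptors[assign == c].mean(0)
    return centroids

def bow_histogram(descriptors, centroids):
    """Encode one image's descriptors as a fixed-size k-d histogram of visual words."""
    hist = np.bincount(nearest_word(descriptors, centroids), minlength=len(centroids))
    return hist / max(hist.sum(), 1)

vocab = build_vocabulary(np.random.randn(5000, 144), k=100)    # small k for the demo
print(bow_histogram(np.random.randn(80, 144), vocab).shape)    # (100,)
```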

  17. Brief aside: Image Features

  18. Most recognition systems are built on the same architecture (slide from Yann LeCun).

  19. Most recognition systems are built on the same architecture; CNNs are end-to-end models (slide from Yann LeCun).

  20. Visualizing the loss function

  21–23. Visualizing the (SVM) loss function

  24. Visualizing the (SVM) loss function: the full data loss:
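
For reference, the multiclass SVM full data loss used in this course (with margin Δ, typically 1) averages the per-example hinge losses over all N training examples:

```latex
L = \frac{1}{N} \sum_{i=1}^{N} \sum_{j \neq y_i}
    \max\!\left(0,\; f(x_i; W)_j - f(x_i; W)_{y_i} + \Delta\right)
```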

  25–27. Visualizing the (SVM) loss function. Suppose there are 3 examples with 3 classes (class 0, 1, 2 in sequence); then this becomes:
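
Writing s_j^(i) for the j-th class score of example i (an assumption about the slide's notation), with margin 1 the loss expands to:

```latex
L = \tfrac{1}{3}\Big[ \max(0,\, s^{(1)}_1 - s^{(1)}_0 + 1) + \max(0,\, s^{(1)}_2 - s^{(1)}_0 + 1)
  + \max(0,\, s^{(2)}_0 - s^{(2)}_1 + 1) + \max(0,\, s^{(2)}_2 - s^{(2)}_1 + 1)
  + \max(0,\, s^{(3)}_0 - s^{(3)}_2 + 1) + \max(0,\, s^{(3)}_1 - s^{(3)}_2 + 1) \Big]
```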

  28. Visualizing the (SVM) loss function. Question: CIFAR-10 has 50,000 training images (5,000 per class, 10 labels); how many occurrences of one classifier row are there in the full data loss?

  29. Optimization

  30–31. Strategy #1: A first very bad idea solution: Random search

  32. Strategy #1: A first very bad idea solution: Random search. What's up with 0.0001?
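
A sketch close to the course notes' random search; the names L, X_train, Y_train are assumptions (a full-data loss function and CIFAR-10 arrays), and 0.0001 sets the scale of the random weights:

```python
import numpy as np

# assume X_train is [3073 x 50000] (CIFAR-10 with a bias row), Y_train the labels,
# and L(X, y, W) evaluates the full data loss -- these names are assumptions
bestloss = float("inf")
for num in range(1000):
    W = np.random.randn(10, 3073) * 0.0001   # generate random parameters
    loss = L(X_train, Y_train, W)            # loss over the whole training set
    if loss < bestloss:                      # keep track of the best W seen
        bestloss = loss
        bestW = W
    print('in attempt %d the loss was %f, best %f' % (num, loss, bestloss))
```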

  33. Let's see how well this works on the test set...

  34. Fun aside: When W = 0, what is the CIFAR-10 loss for SVM and Softmax?
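
A quick check, assuming margin 1 and the 10 CIFAR-10 classes: with W = 0 every score is 0, so each example contributes

```latex
L^{\text{SVM}}_i = \sum_{j \neq y_i} \max(0,\, 0 - 0 + 1) = 9,
\qquad
L^{\text{softmax}}_i = -\log\frac{e^0}{\sum_{j=1}^{10} e^0} = \log 10 \approx 2.30
```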

  35–36. Strategy #2: A better but still very bad idea solution: Random local search. Gives 21.4%!
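
A sketch in the spirit of the course notes (same assumed L, X_train, Y_train as above): start from a random W and accept a perturbation only if it lowers the loss:

```python
import numpy as np

W = np.random.randn(10, 3073) * 0.001        # a random starting point
bestloss = float("inf")
for i in range(1000):
    step_size = 0.0001
    Wtry = W + np.random.randn(10, 3073) * step_size   # perturb W locally
    loss = L(X_train, Y_train, Wtry)
    if loss < bestloss:                       # keep the perturbation only if it helps
        W = Wtry
        bestloss = loss
```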

  37. [figure]

  38. [figure]

  39. Strategy #3: Following the gradient. In 1 dimension, the derivative of a function is defined below; in multiple dimensions, the gradient is the vector of partial derivatives.
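
The standard definitions the slide refers to:

```latex
\frac{df(x)}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h},
\qquad
\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)
```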

  40. Evaluating the gradient numerically

  41. Evaluating the gradient numerically: the "finite difference approximation"

  42. Evaluating the gradient numerically: in practice, the "centered difference formula"

  43. Evaluating the gradient numerically
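
A helper in the style of the course notes, using the centered difference formula (f is any function of a numpy array x; the name is illustrative):

```python
import numpy as np

def eval_numerical_gradient(f, x, h=1e-5):
    """Numerical gradient of f at x via the centered difference (f(x+h) - f(x-h)) / 2h,
    evaluated one coordinate at a time."""
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old = x[ix]
        x[ix] = old + h
        fxph = f(x)              # f(x + h) in coordinate ix
        x[ix] = old - h
        fxmh = f(x)              # f(x - h) in coordinate ix
        x[ix] = old              # restore the original value
        grad[ix] = (fxph - fxmh) / (2 * h)
        it.iternext()
    return grad
```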

  44–45. Performing a parameter update
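
The update steps along the negative gradient. A toy end-to-end run using the helper above (the quadratic loss and step size are illustrative):

```python
import numpy as np

def loss_fun(w):
    return float(((w - 3.0) ** 2).sum())     # toy loss, minimized at w = 3

w = np.zeros(5)
step_size = 0.1
for _ in range(100):
    grad = eval_numerical_gradient(loss_fun, w)
    w += -step_size * grad                   # parameter update: follow the -gradient
print(w)                                     # approaches [3. 3. 3. 3. 3.]
```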

  46. [figure: original W and the negative gradient direction]
