ECE 5984: Introduction to Machine Learning


  1. ECE 5984: Introduction to Machine Learning Topics: – Supervised Learning – Measuring performance – Nearest Neighbour Readings: Barber 14 (kNN) Dhruv Batra Virginia Tech

  2. TA: Qing Sun • PhD candidate in the ECE department • Research work/interests: – Diverse outputs based on structured probabilistic models – Structured-output prediction (C) Dhruv Batra 2

  3. Recap from last time (C) Dhruv Batra 3

  4. (C) Dhruv Batra 4 Slide Credit: Yaser Abu-Mostafa

  5. Nearest Neighbour • Demo 1 – http://cgm.cs.mcgill.ca/~soss/cs644/projects/perrier/Nearest.html • Demo 2 – http://www.cs.technion.ac.il/~rani/LocBoost/ (C) Dhruv Batra 5

  6. Spring 2013 Projects • Gender Classification from body proportions – Igor Janjic & Daniel Friedman, Juniors (C) Dhruv Batra 6

  7. Plan for today • Supervised/Inductive Learning – (A bit more on) Loss functions • Nearest Neighbour – Common Distance Metrics – Kernel Classification/Regression – Curse of Dimensionality (C) Dhruv Batra 7

  8. Loss/Error Functions • How do we measure performance? • Regression: – L 2 error • Classification: – #misclassifications – Weighted misclassification via a cost matrix – For 2-class classification: • True Positive, False Positive, True Negative, False Negative – For k-class classification: • Confusion Matrix • ROC curves – http://psych.hanover.edu/JavaTest/SDT/ROC.html (C) Dhruv Batra 8
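A minimal NumPy sketch of the error measures listed on this slide (variable names and toy values are mine, not from the lecture): the L2 regression error, the TP/FP/TN/FN counts for 2-class classification, and a k-class confusion matrix.

```python
# Sketch of the loss/error functions named on the slide (toy data is illustrative).
import numpy as np

# Regression: L2 (squared) error between predictions and targets.
y_true = np.array([1.0, 2.5, 0.3])
y_pred = np.array([0.8, 2.0, 0.5])
l2_error = np.sum((y_true - y_pred) ** 2)

# 2-class classification: TP / FP / TN / FN counts from 0/1 labels.
t = np.array([1, 0, 1, 1, 0])
p = np.array([1, 0, 0, 1, 1])
tp = np.sum((p == 1) & (t == 1))
fp = np.sum((p == 1) & (t == 0))
tn = np.sum((p == 0) & (t == 0))
fn = np.sum((p == 0) & (t == 1))

# k-class classification: confusion matrix C[i, j] = # examples of class i predicted as class j.
k = 3
t_k = np.array([0, 1, 2, 2, 1])
p_k = np.array([0, 2, 2, 0, 1])
confusion = np.zeros((k, k), dtype=int)
for ti, pi in zip(t_k, p_k):
    confusion[ti, pi] += 1

print(l2_error, tp, fp, tn, fn)
print(confusion)
```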

  9. Nearest Neighbours (C) Dhruv Batra Image Credit: Wikipedia 9

  10. Instance/Memory-based Learning Four things make a memory based learner: • A distance metric • How many nearby neighbors to look at? • A weighting function (optional) • How to fit with the local points? (C) Dhruv Batra Slide Credit: Carlos Guestrin 10

  11. 1-Nearest Neighbour Four things make a memory based learner: • A distance metric – Euclidean (and others) • How many nearby neighbors to look at? – 1 • A weighting function (optional) – unused • How to fit with the local points? – Just predict the same output as the nearest neighbour. (C) Dhruv Batra Slide Credit: Carlos Guestrin 11

  12. k-Nearest Neighbour Four things make a memory based learner: • A distance metric – Euclidean (and others) • How many nearby neighbors to look at? – k • A weighting function (optional) – unused • How to fit with the local points? – Just predict the average output among the nearest neighbours. (C) Dhruv Batra Slide Credit: Carlos Guestrin 12
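The following sketch instantiates the four ingredients above for k-NN; with k = 1 it reduces to the 1-NN rule on the previous slide. Function and variable names are mine. For classification one would replace the mean with a majority vote over the k neighbours' labels.

```python
# Minimal k-NN sketch (function and variable names are mine, not from the slides).
import numpy as np

def knn_predict(X_train, y_train, x_query, k=1):
    """Predict for one query point: Euclidean distance, k neighbours,
    no weighting, average of the neighbours' outputs (k=1 gives 1-NN)."""
    dists = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))  # distance metric
    nearest = np.argsort(dists)[:k]                            # how many neighbours
    return np.mean(y_train[nearest])                           # how to fit locally

# Toy usage: 2-D inputs, scalar outputs.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
y = np.array([0.0, 1.0, 4.0])
print(knn_predict(X, y, np.array([0.9, 1.1]), k=1))  # nearest neighbour's output
print(knn_predict(X, y, np.array([0.9, 1.1]), k=3))  # average over 3 neighbours
```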

  13. 1-NN for Regression [figure: y vs. x; the closest datapoint to the query determines the prediction] (C) Dhruv Batra Figure Credit: Carlos Guestrin 13

  14. Multivariate distance metrics Suppose the input vectors x_1, x_2, …, x_N are two-dimensional: x_1 = (x_11, x_12), x_2 = (x_21, x_22), …, x_N = (x_N1, x_N2). One can draw the nearest-neighbour regions in input space. Dist(x_i, x_j) = (x_i1 − x_j1)^2 + (x_i2 − x_j2)^2 versus Dist(x_i, x_j) = (x_i1 − x_j1)^2 + (3x_i2 − 3x_j2)^2. The relative scalings in the distance metric affect region shapes. Slide Credit: Carlos Guestrin
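A small numeric illustration (my own toy points, not from the slide) of why the relative scalings matter: rescaling the second coordinate by 3, as in the second metric above, can flip which training point is nearest to a query.

```python
# Toy illustration of how rescaling a coordinate changes the nearest neighbour.
import numpy as np

def dist(a, b, scale=(1.0, 1.0)):
    """Squared weighted Euclidean distance with per-coordinate scales."""
    s = np.asarray(scale)
    return np.sum((s * (a - b)) ** 2)

query = np.array([0.0, 0.0])
a = np.array([2.0, 0.1])   # close in coordinate 2, far in coordinate 1
b = np.array([0.5, 1.0])   # close in coordinate 1, far in coordinate 2

# Unscaled metric: b is nearer to the query.
print(dist(query, a), dist(query, b))                  # 4.01 vs 1.25
# Scale coordinate 2 by 3 (as on the slide): now a is nearer.
print(dist(query, a, (1, 3)), dist(query, b, (1, 3)))  # 4.09 vs 9.25
```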

  15. Euclidean distance metric D(x, x′) = sqrt( Σ_i σ_i^2 (x_i − x′_i)^2 ), or equivalently D(x, x′) = sqrt( (x − x′)^T A (x − x′) ), where A is a diagonal matrix with entries σ_i^2. Slide Credit: Carlos Guestrin

  16. Notable distance metrics (and their level sets) Scaled Euclidean (L2) Mahalanobis (non-diagonal A) Slide Credit: Carlos Guestrin

  17. Minkowski distance Image Credit: By Waldir (Based on File:MinkowskiCircles.svg) (C) Dhruv Batra 17 [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

  18. Notable distance metrics (and their level sets) Scaled Euclidean (L2) L1 norm (absolute) Mahalanobis (non-diagonal A) L∞ (max) norm Slide Credit: Carlos Guestrin
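For concreteness, here is a hedged NumPy sketch computing the metrics named on the last few slides: Euclidean, L1, L∞, Minkowski with a general p, scaled Euclidean with a diagonal A, and Mahalanobis with a non-diagonal positive-definite A. The specific points and matrices are illustrative assumptions.

```python
# Sketch of the distance metrics named above (toy values are illustrative).
import numpy as np

x  = np.array([1.0, 2.0])
xp = np.array([3.0, -1.0])
d  = x - xp

l2        = np.sqrt(np.sum(d ** 2))              # Euclidean (L2)
l1        = np.sum(np.abs(d))                    # L1 (absolute) norm
linf      = np.max(np.abs(d))                    # L-infinity (max) norm
p         = 3
minkowski = np.sum(np.abs(d) ** p) ** (1.0 / p)  # Minkowski with parameter p

# Scaled Euclidean: diagonal A with per-feature weights sigma_i^2.
sigma2 = np.array([1.0, 4.0])
scaled = np.sqrt(d @ np.diag(sigma2) @ d)

# Mahalanobis: non-diagonal positive-definite A (e.g. an inverse covariance).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
mahalanobis = np.sqrt(d @ A @ d)

print(l2, l1, linf, minkowski, scaled, mahalanobis)
```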

  19. Parametric vs Non-Parametric Models • Does the capacity (size of hypothesis class) grow with size of training data? – Yes = Non-Parametric Models – No = Parametric Models • Example – http://www.theparticle.com/applets/ml/nearest_neighbor/ (C) Dhruv Batra 19

  20. Weighted k-NNs • Neighbors are not all the same
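One common way to realise "neighbors are not all the same" is to weight each of the k neighbours by a decreasing function of its distance; the inverse-distance weights below are my own illustrative choice (the slide does not fix a particular scheme for weighted k-NN).

```python
# Distance-weighted k-NN sketch (inverse-distance weighting is an assumed choice).
import numpy as np

def weighted_knn_predict(X_train, y_train, x_query, k=3, eps=1e-8):
    """Closer neighbours get larger weights; predict their weighted average."""
    dists = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))
    nearest = np.argsort(dists)[:k]
    w = 1.0 / (dists[nearest] + eps)     # neighbours are not all the same
    return np.sum(w * y_train[nearest]) / np.sum(w)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
print(weighted_knn_predict(X, y, np.array([1.2]), k=3))
```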

  21. 1 vs k Nearest Neighbour (C) Dhruv Batra Image Credit: Ying Wu 21

  22. 1 vs k Nearest Neighbour (C) Dhruv Batra Image Credit: Ying Wu 22

  23. 1-NN for Regression [figure: y vs. x; the closest datapoint to the query determines the prediction] (C) Dhruv Batra Figure Credit: Carlos Guestrin 23

  24. 1-NN for Regression • Often bumpy (overfits) (C) Dhruv Batra Figure Credit: Andrew Moore 24

  25. 9-NN for Regression • Often bumpy (overfits) (C) Dhruv Batra Figure Credit: Andrew Moore 25

  26. Kernel Regression/Classification Four things make a memory based learner: • A distance metric – Euclidean (and others) • How many nearby neighbors to look at? – All of them • A weighting function (optional) – w_i = exp(−d(x_i, query)^2 / σ^2) – Nearby points to the query are weighted strongly, far points weakly. The σ parameter is the Kernel Width. Very important. • How to fit with the local points? – Predict the weighted average of the outputs: prediction = Σ_i w_i y_i / Σ_i w_i (C) Dhruv Batra Slide Credit: Carlos Guestrin 26
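A short sketch of kernel regression following the weights and prediction rule on this slide: w_i = exp(−d(x_i, query)^2 / σ^2) and prediction = Σ_i w_i y_i / Σ_i w_i. Only the toy data and function name are mine; note how a larger kernel width σ gives a smoother prediction.

```python
# Kernel regression sketch using the weighting and prediction rule from the slide.
import numpy as np

def kernel_regression_predict(X_train, y_train, x_query, sigma=1.0):
    """All training points contribute; w_i = exp(-d(x_i, query)^2 / sigma^2),
    prediction = sum(w_i * y_i) / sum(w_i). sigma is the kernel width."""
    d2 = np.sum((X_train - x_query) ** 2, axis=1)   # squared Euclidean distances
    w = np.exp(-d2 / sigma ** 2)                    # nearby points weighted strongly
    return np.sum(w * y_train) / np.sum(w)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
print(kernel_regression_predict(X, y, np.array([1.5]), sigma=0.5))
print(kernel_regression_predict(X, y, np.array([1.5]), sigma=5.0))  # wider kernel: smoother
```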
