Nearest Neighbor Classification
Machine Learning
This lecture
• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects
• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?
• The Curse of Dimensionality
How would you color the blank circles?
[Figure: a scatter of colored points with three blank circles labeled A, B, and C]
How would you color the blank circles?
If we based it on the color of their nearest neighbors, we would get:
A: Blue, B: Red, C: Red
Training data partitions the entire instance space (using the labels of nearest neighbors).
Nearest Neighbors: The basic version
• Training examples are vectors x_i associated with a label y_i
  – E.g., x_i = a feature vector for an email, y_i = SPAM
• Learning: just store all the training examples
• Prediction for a new example x:
  – Find the training example x_i that is closest to x
  – Predict the label of x to be the label y_i associated with x_i
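As a concrete illustration, here is a minimal sketch of the 1-nearest-neighbor procedure above. It assumes numeric feature vectors and uses Euclidean distance (distance choices are discussed later in the lecture); the function name `nn_predict`, the toy data, and the use of NumPy are illustrative, not part of the slides.

```python
import numpy as np

def nn_predict(X_train, y_train, x):
    """Predict the label of x as the label of its single nearest training example."""
    # Learning was just "store X_train and y_train"; all work happens here, at prediction time.
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every stored example
    return y_train[np.argmin(dists)]

# Tiny usage example with made-up data
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y_train = np.array(["NOT-SPAM", "NOT-SPAM", "SPAM"])
print(nn_predict(X_train, y_train, np.array([4.0, 4.5])))  # -> "SPAM"
```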
K-Nearest Neighbors
• Training examples are vectors x_i associated with a label y_i
  – E.g., x_i = a feature vector for an email, y_i = SPAM
• Learning: just store all the training examples
• Prediction for a new example x:
  – Find the k closest training examples to x
  – Construct the label of x using these k points. How?
  – For classification: every neighbor votes on the label; predict the most frequent label among the neighbors
  – For regression: predict the mean of the neighbors' values
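A small sketch of both prediction rules, extending the 1-NN code above to k neighbors; the name `knn_predict` and the `regression` flag are my own choices, not from the slides.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3, regression=False):
    """k-NN prediction: majority vote for classification, mean for regression."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance from x to every training example
    nearest = np.argsort(dists)[:k]               # indices of the k closest training examples
    neighbors = y_train[nearest]
    if regression:
        return neighbors.mean()                   # regression: mean of the neighbors' values
    # classification: every neighbor votes; return the most frequent label
    return Counter(neighbors.tolist()).most_common(1)[0][0]
```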
Instance based learning
• A class of learning methods
  – Learning: storing examples with labels
  – Prediction: when presented with a new example, predict its label using similar stored examples
• The K-nearest neighbors algorithm is an example of this class of methods
• Also called lazy learning, because most of the computation (in the simplest case, all of it) is performed only at prediction time
Distance between instances
• In general, a good place to inject knowledge about the domain
• The behavior of nearest-neighbor methods depends heavily on the choice of distance
• How do we measure distances between instances?
Distance between instances
Numeric features, represented as n-dimensional vectors:
• Euclidean distance
• Manhattan distance
• L_p norm
  – Euclidean = L_2
  – Manhattan = L_1
• Exercise: What is the L_1 distance?
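For reference, the L_p distance between vectors x and z is (Σ_j |x_j − z_j|^p)^(1/p); p = 2 gives the Euclidean distance and p = 1 gives the Manhattan distance. A small sketch with made-up vectors (the function name `lp_distance` is mine):

```python
import numpy as np

def lp_distance(x, z, p=2):
    """L_p distance: (sum_j |x_j - z_j|**p) ** (1/p)."""
    return float(np.sum(np.abs(x - z) ** p) ** (1.0 / p))

x = np.array([1.0, 2.0, 3.0])
z = np.array([4.0, 0.0, 3.0])

print(lp_distance(x, z, p=2))  # Euclidean (L_2): sqrt(9 + 4 + 0) ≈ 3.606
print(lp_distance(x, z, p=1))  # Manhattan (L_1): 3 + 2 + 0 = 5.0
```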
Distance between instances
What about symbolic/categorical features?
Distance between instances
Symbolic/categorical features: the most common distance is the Hamming distance
• Number of bits that are different
  – Or: the number of features that have a different value
  – Also called the overlap
• Example:
  X_1: {Shape=Triangle, Color=Red, Location=Left, Orientation=Up}
  X_2: {Shape=Triangle, Color=Blue, Location=Left, Orientation=Down}
  Hamming distance = 2
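A minimal sketch of this computation on the example above; representing each instance as a Python dict is my own choice, not something the slides specify.

```python
def hamming_distance(a, b):
    """Count the features whose values differ between two instances."""
    return sum(a[feature] != b[feature] for feature in a)

x1 = {"Shape": "Triangle", "Color": "Red", "Location": "Left", "Orientation": "Up"}
x2 = {"Shape": "Triangle", "Color": "Blue", "Location": "Left", "Orientation": "Down"}

print(hamming_distance(x1, x2))  # 2 (Color and Orientation differ)
```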
Advantages
• Training is very fast
  – Just adding labeled instances to a list
  – More complex indexing methods can be used, which slow down learning slightly to make prediction faster
• Can learn very complex functions
• We always have the training data
  – For other learning algorithms, after training, we don't store the data anymore. What if we want to do something with it later…
Disadvantages
• Needs a lot of storage
  – Is this really a problem now?
• Prediction can be slow!
  – Naïvely: O(dN) for N training examples in d dimensions
  – More data will make it slower
  – Compare to other classifiers, where prediction is very fast
• Nearest neighbors are fooled by irrelevant attributes
  – Important and subtle
Summary: K-Nearest Neighbors
• Probably the first "machine learning" algorithm
  – Guarantee: if there are enough training examples, the error of the nearest neighbor classifier converges to at most twice the error of the optimal (i.e., best possible) predictor
• In practice, use an odd K. Why?
  – To break ties
• How to choose K? Using a held-out set or by cross-validation
• Feature normalization can be important
  – Often a good idea to center and scale the features to zero mean and unit standard deviation. Why?
  – Because different features could have different scales (weight, height, etc.), but the distance weights them equally
• Variants exist
  – Neighbors' labels could be weighted by their distance
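A small sketch of the feature normalization mentioned in the summary, on made-up data; computing the statistics on the training set and reusing them for new points is standard practice, though the slides do not spell this out.

```python
import numpy as np

# Made-up data: two features on very different scales (e.g., weight in kg, height in m)
X_train = np.array([[70.0, 1.75],
                    [60.0, 1.60],
                    [90.0, 1.90]])
x_test = np.array([80.0, 1.80])

# Standardize each feature to zero mean and unit standard deviation,
# so both features contribute comparably to the distance.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-12      # small constant guards against zero-variance features
X_train_scaled = (X_train - mean) / std
x_test_scaled = (x_test - mean) / std  # reuse the training statistics for new points
```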
Where are we?
• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects
• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?
• The Curse of Dimensionality
The decision boundary for KNN
Is the K-nearest neighbors algorithm explicitly building a function?