Lecture 6 – KNN and Decision Trees
CS 335 – Dan Sheldon

Nearest Neighbor Classification

◮ Seed classification by area and compactness
◮ What should we predict for unlabeled test points (stars)?
◮ Nearest neighbor classification: predict the label of the nearest training example
◮ k-nearest neighbor: predict the consensus of the k nearest training examples

k-Nearest Neighbor Classification

◮ Training: store the training data D = {(x^(1), y^(1)), ..., (x^(m), y^(m))} (trivial!)
◮ Prediction: for a new instance x, predict the label that is most frequent among the k training examples closest to x
◮ KNN can work with any distance function and any value of k. We need to choose these.

Distance and Similarity

◮ KNN can use any distance function to determine the k nearest neighbors. A distance function d(x, x') takes two data points and returns a distance. It should satisfy:
  ◮ d(x, x') ≥ 0 (non-negativity)
  ◮ d(x, x) = 0 (the distance from a point to itself is zero)
◮ Or you can use a similarity function s:
  ◮ s(x, x') ≥ 0
  ◮ s(x, x) ≥ s(x, x') for all other x' (x is more similar to itself than to any other point)

Euclidean Distance

◮ We've already seen one distance function, the Euclidean distance:

  d(x, x') = ‖x − x'‖

◮ Length of the straight line between x and x' (= the vector norm of x − x')

Minkowski Distance

◮ A more general class of distance functions comes from the Minkowski distance:

  d_p(x, x') := ‖x − x'‖_p,   where   ‖r‖_p := ( ∑_{i=1}^{n} |r_i|^p )^{1/p}

◮ p = 2 is the Euclidean distance (verify on your own)
◮ p = 1 is called the "Manhattan distance"
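The following is a minimal NumPy sketch of the Minkowski distance defined above; the function name minkowski_distance and the example feature vectors are illustrative assumptions, not code from the course demos.

```python
import numpy as np

def minkowski_distance(x, x_prime, p=2):
    """Minkowski distance ||x - x'||_p between two feature vectors.

    p = 2 gives the Euclidean distance, p = 1 the Manhattan distance.
    """
    r = np.abs(np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float))
    return np.sum(r ** p) ** (1.0 / p)

# Example: two made-up seed feature vectors (area, compactness)
x, x_prime = [15.3, 0.87], [14.1, 0.91]
print(minkowski_distance(x, x_prime, p=2))  # Euclidean distance
print(minkowski_distance(x, x_prime, p=1))  # Manhattan distance
```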

Examples

◮ Jupyter Demo 1: different distance functions

KNN Implementation

◮ The "brute force" version of KNN is very straightforward:
  ◮ Given a test point x, compute the distances d^(i) := d(x, x^(i)) to each training example
  ◮ Sort the training examples by distance
  ◮ The k nearest neighbors are the first k examples in this sorted list
  ◮ Now, making the prediction is straightforward (a code sketch appears after these slides)
◮ Running time: O(m log m) for one prediction
◮ In practice, clever data structures (e.g., KD-trees) can be constructed to find the k nearest neighbors and make predictions more quickly

KNN Trade-Offs

◮ Strengths
  ◮ Simple
  ◮ Converges to the correct decision surface as the amount of data goes to infinity
◮ Weaknesses
  ◮ Lots of variability in the decision surface when the amount of data is low
  ◮ Curse of dimensionality: everything is far from everything else in high dimensions
  ◮ Running time and memory usage: store all training data and perform a neighbor search for every prediction → uses a lot of memory / time
◮ Jupyter Demo 2: KNN in action
  ◮ Effect of k
  ◮ KNN convergence as the amount of data goes to infinity

Decision Trees

◮ Example: flu decision tree (a routing sketch appears after these slides)
◮ [Figure: the root tests Temp > 100; the false branch tests Sore Throat = Y, with leaves Healthy and Cold; the true branch tests Runny Nose = Y, with leaves Other and Flu]

Decision Trees

◮ Classical model for making a decision or classification using "splitting rules" organized into a tree data structure
◮ A data instance x is routed from the root to a leaf
◮ Nodes = "splitting rules"
  ◮ Continuous variables: test whether (x_j < c) or (x_j ≥ c) (2 branches)
  ◮ Discrete variables: test (x_j = 1), (x_j = 2), ..., for the k possible values of x_j (k branches)
  ◮ x goes down the branch corresponding to the result of the test
◮ Leaf nodes are assigned labels → the prediction for x

Decision Tree Intuition

◮ Board work
◮ Geometric illustration of a decision tree: recursive axis-aligned partitioning
◮ Intuition for how to partition to fit a dataset (= learning a decision tree)
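As referenced on the KNN Implementation slide above, here is a minimal brute-force KNN sketch in NumPy; the function name knn_predict, the Euclidean distance choice, and the majority vote via Counter are illustrative assumptions rather than the course demo code.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Brute-force KNN: compute d(i) = d(x, x(i)) for every training example,
    sort by distance, and return the most frequent label among the k nearest."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each x(i)
    nearest = np.argsort(dists)[:k]              # indices of the k nearest neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy example with made-up seed features (area, compactness)
X_train = np.array([[14.0, 0.88], [15.2, 0.85], [18.9, 0.90], [19.1, 0.92]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([18.5, 0.91]), k=3))  # -> 1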
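```

As referenced on the flu decision tree slide, the next sketch routes an instance from the root to a leaf; the tree structure is reconstructed from the slide figure, and the dictionary-based patient record is an assumed input format.

```python
def classify_flu(patient):
    """Route a patient through the flu decision tree: the root tests Temp > 100,
    the false branch tests Sore Throat, the true branch tests Runny Nose."""
    if patient["temp"] > 100:  # root splitting rule
        return "Flu" if patient["runny_nose"] else "Other"
    return "Cold" if patient["sore_throat"] else "Healthy"

# Example instance (made-up values)
print(classify_flu({"temp": 101.5, "runny_nose": True, "sore_throat": False}))  # -> Flu
```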

Decision Tree Learning

◮ How do we fit a decision tree to training data? We won't give details here, just some intuition...
◮ Idea: recursive splitting of the training set
  ◮ Start with all training examples at the root of the tree
  ◮ Find the "best" splitting rule at the root
  ◮ Recurse on each branch

Decision Tree Learning

◮ "Best" splitting rule? Which of these is better?
◮ [Figure: two candidate splits of the restaurant examples — one on Patrons? (None / Some / Full), one on Type? (French / Italian / Thai / Burger)]
◮ Ideally, split the examples into subsets that are all the same class
◮ Design heuristics based on this principle to choose the best split
◮ When to stop? Recursively split the training examples until:
  ◮ All examples have the same class
  ◮ There are too few training examples
  ◮ The maximum depth is exceeded

Decision Tree Learning

◮ Jupyter Demo 3: visualize decision trees fit to the seeds dataset (a scikit-learn sketch appears after these slides)

Decision Tree Trade-Offs

◮ Strengths
  ◮ Interpretability: the learned model is easy to understand
  ◮ Running time for predictions: shallow trees can be extremely fast classifiers
◮ Weaknesses
  ◮ Running time for learning: finding the optimal tree is computationally intractable (NP-complete), so we need to design greedy heuristics
  ◮ Representation: we may need very large trees to accurately model the geometry of our problem with axis-aligned splits
◮ General advice: decision trees are very competitive "out-of-the-box" machine learning models for lots of problems!
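In the spirit of Jupyter Demo 3, the sketch below fits and prints a small decision tree with scikit-learn's greedy learner; the use of scikit-learn, the synthetic two-feature data, and the max_depth setting are assumptions for illustration, not the actual demo.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for two seed features (area, compactness) -- not the real seeds dataset
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([14.0, 0.86], 0.5, size=(20, 2)),
               rng.normal([19.0, 0.91], 0.5, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

# Greedy, axis-aligned splitting; max_depth caps the tree (one of the stopping rules)
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["area", "compactness"]))
print(tree.predict([[18.5, 0.90]]))  # predict a class for a new point
```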
