
Introduction to Machine Learning (COMPSCI 371D, Machine Learning)



  1. Introduction to Machine Learning (COMPSCI 371D, Machine Learning)

  2. Outline
     1. Nearest Neighbor Prediction
     2. Complexity Considerations
     3. The Voronoi Diagram
     4. Overfitting and k Nearest Neighbors

  3. Nearest Neighbor Prediction
     • NN is very simple: this is why we start here
     • NN is very unusual:
       • No training!
       • Slow inference (using the predictor)
       • Y can be anything
       • Almost no difference between regression and classification
       • Hypothesis space hard to define

  4. Nearest Neighbor Prediction: How it Works
     • Given $T = \{(x_1, y_1), \ldots, (x_N, y_N)\}$
     • Just store $T$ (memorization)
     • Need a distance in the data space $X$
     • Perhaps $\Delta(x, x') = \|x - x'\|^2$
     • Then, $h(x) = y_{\nu(x)}$ where $\nu(x) \in \arg\min_{n = 1, \ldots, N} \Delta(x, x_n)$
     • Return the value $y_n$ for the training point $x_n$ that is nearest to $x$
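
The rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `squared_distance` and `nn_predict` and the toy training set are hypothetical and not from the slides, the distance is the squared Euclidean norm that the slide offers as one option, and ties in the arg min are broken by taking the first index.

```python
from typing import Any, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Delta(x, x') = ||x - x'||^2, the squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nn_predict(T: Sequence[Tuple[Point, Any]], x: Point) -> Any:
    """Return y_n for the stored training point x_n that is nearest to x."""
    if not T:
        raise ValueError("training set T must be non-empty")
    # nu(x): index of a minimizer of Delta(x, x_n); min() keeps the first one on ties.
    nu = min(range(len(T)), key=lambda n: squared_distance(x, T[n][0]))
    return T[nu][1]


# "Training" is just storing T. The labels can be any Python object.
T = [((0.0, 0.0), "red"), ((1.0, 1.0), "blue"), ((4.0, 0.0), "green")]
print(nn_predict(T, (0.9, 0.8)))  # the nearest stored point is (1.0, 1.0)
```

Because the predictor only copies a stored value, the same function serves classification (labels) and regression (numbers) unchanged.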

  5. Nearest Neighbor Prediction

  6. Complexity Considerations: How to find $\nu(x)$?
     $\nu(x) = \arg\min_{n = 1, \ldots, N} \Delta(x, x_n)$
     • Compute all $\Delta(x, x_n)$ and find the smallest
     • $O(Nd)$ (where $x \in \mathbb{R}^d$)
     • Cannot do better exactly
     • Can do better if we accept $\Delta(x, x_{\nu(x)}) < (1 + \epsilon)\,\Delta(x, x_{\nu^*(x)})$ for some $\epsilon > 0$
     • “Approximate NN” uses k-d trees, R-trees, locality sensitive hashing
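
To make the trade-off concrete, here is a hedged sketch that puts the exhaustive $O(Nd)$ scan next to one possible k-d tree search with a $(1 + \epsilon)$ pruning rule. Everything in it is an assumption made for illustration, not the course's implementation: the names `brute_force_nn`, `build_kd_tree`, `approx_nn`, and `eps`, the median-split tree layout, and the choice to prune a subtree when $(1 + \epsilon)$ times its squared distance to the splitting plane exceeds the best $\Delta$ found so far. Here $\nu^*(x)$ is taken to be the index of the true nearest neighbor.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def brute_force_nn(points: List[Point], x: Point) -> int:
    """Exact search: N distance evaluations of cost O(d) each, so O(Nd)."""
    return min(range(len(points)), key=lambda n: sq_dist(x, points[n]))


def build_kd_tree(points, indices=None, depth=0):
    """Node = (point index, split axis, left subtree, right subtree), or None."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])  # cycle through the d coordinates
    indices = sorted(indices, key=lambda n: points[n][axis])
    mid = len(indices) // 2  # median point becomes the node
    return (indices[mid], axis,
            build_kd_tree(points, indices[:mid], depth + 1),
            build_kd_tree(points, indices[mid + 1:], depth + 1))


def approx_nn(points, tree, x, eps=0.0):
    """Return an index whose Delta is at most (1 + eps) times the exact minimum."""
    best = [-1, float("inf")]  # [index, squared distance]

    def visit(node):
        if node is None:
            return
        n, axis, left, right = node
        d = sq_dist(x, points[n])
        if d < best[1]:
            best[0], best[1] = n, d
        gap = x[axis] - points[n][axis]
        near, far = (left, right) if gap < 0 else (right, left)
        visit(near)
        # Every point in the far subtree has Delta >= gap**2, so the subtree
        # is skipped when even that lower bound cannot beat best / (1 + eps).
        if (1.0 + eps) * gap * gap <= best[1]:
            visit(far)

    visit(tree)
    return best[0]


random.seed(0)
points = [tuple(random.random() for _ in range(3)) for _ in range(500)]
tree = build_kd_tree(points)
query = (0.3, 0.6, 0.1)
exact = sq_dist(query, points[brute_force_nn(points, query)])
approx = sq_dist(query, points[approx_nn(points, tree, query, eps=0.5)])
print(exact, approx)
```

With `eps=0.0` the pruning rule never discards a subtree that could hold a closer point, so the search stays exact; larger values of `eps` allow more subtrees to be skipped at the price of a possibly farther neighbor.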

  7. The Voronoi Diagram
     • Only conceptual, or for $d = 2, 3$, maybe 4
     • $\Theta(N \log N + N^{\lceil d/2 \rceil})$
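
A quick numeric look at the bound on this slide suggests why the diagram is practical only in very low dimension. The sketch below simply evaluates $N \log N + N^{\lceil d/2 \rceil}$; the function name `voronoi_cost`, the base-2 logarithm, and the sample values of $N$ and $d$ are assumptions for illustration, and the constant hidden by $\Theta$ is ignored.

```python
import math


def voronoi_cost(N: int, d: int) -> float:
    """Evaluate N log N + N^ceil(d/2), ignoring the constant hidden by Theta."""
    return N * math.log2(N) + N ** math.ceil(d / 2)


# For fixed N, the second term grows by a factor of N every two dimensions.
for d in (2, 3, 4, 10):
    print(d, f"{voronoi_cost(1000, d):.3g}")
```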

  8. The Voronoi Diagram: Decision Boundary

  9. Overfitting and k Nearest Neighbors: Overfitting

  10. Overfitting and k Nearest Neighbors: k Nearest Neighbors
     • Retrieve the $k$ nearest neighbors $x_1, \ldots, x_k$ of $x$
     • Return a summary of the corresponding $y_1, \ldots, y_k$
     • Classification summary: majority
     • Regression summary: mean, median
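
The two steps on this slide, retrieve and summarize, can be sketched as a single function that takes the summary as a parameter. The names `knn_predict`, `majority`, and `summarize`, the squared Euclidean distance, and the toy data are assumptions made for illustration. Majority ties go to the label met first among the neighbors sorted by distance, which is a choice of this sketch and not something the slide specifies.

```python
from collections import Counter
from statistics import mean, median
from typing import Any, Callable, List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def majority(ys: List[Any]) -> Any:
    """Classification summary: the most frequent label among the k neighbors."""
    return Counter(ys).most_common(1)[0][0]


def knn_predict(T: Sequence[Tuple[Point, Any]], x: Point, k: int,
                summarize: Callable[[List[Any]], Any]) -> Any:
    """Retrieve the k nearest neighbors of x and summarize their y values."""
    if not 1 <= k <= len(T):
        raise ValueError("k must be between 1 and the number of training points")
    nearest = sorted(T, key=lambda pair: sq_dist(x, pair[0]))[:k]
    return summarize([y for _, y in nearest])


# Classification: majority vote over the k nearest labels.
T_cls = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "b"), ((3.0,), "b"), ((4.0,), "b")]
print(knn_predict(T_cls, (1.4,), 3, majority))

# Regression: mean or median of the k nearest values.
T_reg = [((0.0,), 10.0), ((1.0,), 20.0), ((2.0,), 60.0)]
print(knn_predict(T_reg, (0.9,), 3, mean), knn_predict(T_reg, (0.9,), 3, median))
```

Setting `k=1` recovers plain nearest neighbor prediction, whichever summary is used.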

  11. Overfitting and k Nearest Neighbors: Less Overfitting ($k = 9$)

  12. Overfitting and k Nearest Neighbors: A Simple Regression Example, $\mathbb{R} \to \mathbb{R}$
     [Figure: regression plots for k = 1, k = 10, and k = 100]
