Learning From Data, Lecture 17: Memory and Efficiency in Nearest Neighbor
M. Magdon-Ismail, CSCI 4100/6100
Recap: Similarity and Nearest Neighbor

1. Simple: $d(x, x') = \|x - x'\|$.
2. No training.
3. Near optimal $E_{\text{out}}$: $k \to \infty$, $k/N \to 0 \;\Rightarrow\; E_{\text{out}} \to E^*_{\text{out}}$.
4. Good ways to choose $k$: $k = 3$; $k = \sqrt{N}$; validation/cross validation.
5. Easy to justify the classification to a customer.
6. Can easily do multi-class.
7. Can easily adapt to regression or logistic regression (see the sketch below):
   $g(x) = \frac{1}{k}\sum_{i=1}^{k} y_{[i]}(x)$ (regression); $\quad g(x) = \frac{1}{k}\sum_{i=1}^{k} [\![\, y_{[i]}(x) = +1 \,]\!]$ (logistic).
8. Computationally demanding.

[Figures: 1-NN rule and 21-NN rule decision boundaries.]
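Point 7 translates almost line for line into code. The following minimal sketch is my own illustration, not code from the lecture; the function name and interface are assumptions. It returns the sign of the average neighbor label for classification, or the average itself for regression.

```python
import numpy as np

def knn_predict(X, y, x, k=3, mode="classification"):
    """k-NN prediction at test point x; X is (N, d), y is (N,) with +/-1 labels."""
    dists = np.linalg.norm(X - x, axis=1)     # O(Nd): distance to every data point
    nearest = np.argsort(dists)[:k]           # indices of the k nearest neighbors
    avg = y[nearest].mean()                   # (1/k) * sum of neighbor labels
    # sign of the average vote for classification; the average itself for regression
    return np.sign(avg) if mode == "classification" else avg
```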
Computational Demands of Nearest Neighbor

Memory: need to store all the data, $O(Nd)$ memory.
  $N = 10^6$, $d = 100$, double precision: $\approx 1$ GB.

Finding the nearest neighbor of a test point: need to compute the distance to every data point, $O(Nd)$.
  $N = 10^6$, $d = 100$, 3 GHz processor: $\approx 3$ ms to compute $g(x)$; $\approx 1$ hr to compute the CV error; $> 1$ month to choose the best $k$ from among 1000 using CV.
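To see where these figures come from, here is an illustrative back-of-envelope script (not from the lecture) that allocates the slide's scenario and times a single brute-force query. Actual timings depend on the machine and the linear-algebra backend.

```python
import time
import numpy as np

N, d = 10**6, 100
X = np.random.randn(N, d)        # N*d*8 bytes = 0.8 GB of doubles (the slide's ~1 GB)
x = np.random.randn(d)

t0 = time.perf_counter()
j = int(np.argmin(np.linalg.norm(X - x, axis=1)))   # one O(Nd) brute-force query
print(f"nearest neighbor index {j}; query took {time.perf_counter() - t0:.3f} s")
```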
Two Basic Approaches

Reduce the amount of data. A five-year-old does not remember every horse he has seen, only a few representative horses.

Store the data in a specialized data structure. Developing geometric data structures that make finding nearest neighbors fast is an ongoing research field.
Throw Away Irrelevant Data

[Figure: the data set before and after discarding irrelevant points, $k = 1$.]
Decision Boundary Consistent

[Figure: condensing the data so that $g(x)$ is unchanged for every test point $x$.]
Training Set Consistent

[Figure: condensing the data so that $g(x_n)$ is unchanged on every training point $x_n$.]
Decision Boundary vs. Training Set Consistent

[Figures: decision-boundary-consistent condensing ($g(x)$ unchanged everywhere) versus training-set-consistent condensing ($g(x_n)$ unchanged on the training points).]
Consistent Does Not Mean $g(x_n) = y_n$

[Figures: decision-boundary- and training-set-consistent condensed sets for $k = 3$. Consistency means the condensed set reproduces the original $g(x_n)$, not that $g(x_n) = y_n$.]
Training Set Consistent ($k = 3$)

[Figure: a training-set-consistent condensed set for $k = 3$; $g(x_n)$ unchanged.]
CNN: Condensed Nearest Neighbor ($k = 3$)

1. Randomly select $k$ data points into $S$.
2. Classify all data according to $S$.
3. Let $x^*$ be an inconsistent point and $y^*$ its class w.r.t. $\mathcal{D}$.
4. Add the closest point to $x^*$ not in $S$ that has class $y^*$.
5. Iterate until $S$ classifies all points consistently with $\mathcal{D}$.

[Figure, illustrating one step: the solid blue point is blue w.r.t. the selected points but red w.r.t. $\mathcal{D}$, so we add a red point that is not already selected and is closest to the inconsistent point.]

What about the minimum consistent set (MCS)? Finding it is NP-hard; CNN is a heuristic. A sketch of the condensing loop follows below.
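The sketch below is a hypothetical implementation of the five steps on this slide, reusing the `knn_predict` sketch from the recap. The function name is mine, and it assumes `X` and `y` are numpy arrays with $\pm 1$ labels.

```python
import numpy as np

def condense(X, y, k=3, rng=np.random.default_rng(0)):
    """Return indices of a training-set-consistent subset S (CNN heuristic)."""
    S = list(rng.choice(len(X), size=k, replace=False))            # step 1
    while True:
        # steps 2-3: look for a point that S misclassifies
        bad = [n for n in range(len(X))
               if knn_predict(X[S], y[S], X[n], k) != y[n]]
        if not bad:                                                # step 5: S is
            return np.array(S)                                     # consistent with D
        n_star = bad[0]                                            # inconsistent x*
        # step 4: closest point to x* not in S whose class is y* = y[n_star]
        cand = [m for m in range(len(X)) if m not in S and y[m] == y[n_star]]
        dists = np.linalg.norm(X[cand] - X[n_star], axis=1)
        S.append(cand[int(np.argmin(dists))])
```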
Nearest Neighbor on Digits Data

[Figures: 1-NN rule and 21-NN rule decision regions on the digits data.]
Condensing the Digits Data

[Figures: 1-NN rule and 21-NN rule decision regions after condensing the digits data.]
Finding the Nearest Neighbor

1. $S_1, S_2$ are 'clusters' with centers $\mu_1, \mu_2$ and radii $r_1, r_2$.
2. [Branch] Search $S_1$ first $\to \hat{x}^{[1]}$.
3. The distance from $x$ to any point in $S_2$ is at least $\|x - \mu_2\| - r_2$.
4. [Bound] So we are done if $\|x - \hat{x}^{[1]}\| \le \|x - \mu_2\| - r_2$.

A branch and bound algorithm; it can be applied recursively.
When Does the Bound Hold?

Bound condition: $\|x - \hat{x}^{[1]}\| \le \|x - \mu_2\| - r_2$.

Since $\|x - \hat{x}^{[1]}\| \le \|x - \mu_1\| + r_1$, it suffices that
$r_1 + r_2 \le \|x - \mu_2\| - \|x - \mu_1\|$.

When $\|x - \mu_1\| \approx 0$, we have $\|x - \mu_2\| \approx \|\mu_2 - \mu_1\|$, so it suffices that
$r_1 + r_2 \le \|\mu_2 - \mu_1\|$:
the within-cluster spread should be less than the between-cluster spread. A sketch of the resulting query follows below.
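Here is a minimal sketch of the branch-and-bound query, generalized from two clusters to a list of clusters. It is my own illustration; the data layout and names are assumptions, not the lecture's code.

```python
import numpy as np

def bb_nearest(x, clusters):
    """clusters: list of (points, center, radius) triples; points is (n_j, d)."""
    # [Branch] visit clusters in order of distance from x to their centers
    order = sorted(clusters, key=lambda c: np.linalg.norm(x - c[1]))
    best, best_d = None, np.inf
    for pts, mu, r in order:
        # [Bound] every point of this cluster is at least ||x - mu|| - r from x,
        # so skip the whole cluster if that lower bound cannot beat the best so far
        if np.linalg.norm(x - mu) - r >= best_d:
            continue
        d = np.linalg.norm(pts - x, axis=1)
        j = int(np.argmin(d))
        if d[j] < best_d:
            best, best_d = pts[j], d[j]
    return best
```

Applied recursively to a tree of clusters, the same prune gives the classic branch-and-bound nearest neighbor search.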
Finding Clusters – Lloyd's Algorithm

1. Pick well separated centers for each cluster (furthest-away point, next furthest-away point, and so on until all centers are picked).
2. Compute the Voronoi regions as the clusters.
3. Update the centers.
4. Update the Voronoi regions.
5. Iterate, then compute centers and radii (a code sketch follows below):
   $$\mu_j = \frac{1}{|S_j|} \sum_{x_n \in S_j} x_n; \qquad r_j = \max_{x_n \in S_j} \|x_n - \mu_j\|.$$

[Figures: the animation steps: furthest-away point, next furthest-away point, all centers picked, construct Voronoi regions, update centers, update Voronoi regions.]
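A minimal sketch of the iteration, under assumptions stated in the comments: random initialization stands in for the slide's furthest-point picks, and no cluster is assumed to go empty.

```python
import numpy as np

def nearest_center(X, centers):
    """Voronoi assignment: index of the nearest center for every point."""
    return np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)

def lloyd(X, M, iters=20, rng=np.random.default_rng(0)):
    centers = X[rng.choice(len(X), size=M, replace=False)]     # step 1 (simplified)
    for _ in range(iters):                                     # steps 2-4
        labels = nearest_center(X, centers)                    # Voronoi regions
        centers = np.array([X[labels == j].mean(axis=0)        # new centers
                            for j in range(M)])
    labels = nearest_center(X, centers)                        # final regions
    # step 5: radii r_j = max_{x_n in S_j} ||x_n - mu_j||
    radii = np.array([np.linalg.norm(X[labels == j] - centers[j], axis=1).max()
                      for j in range(M)])
    return centers, radii, labels
```

The centers and radii feed directly into the branch-and-bound query sketched earlier.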