Automatic Sample-by-Sample Model Selection Between Two Off-the-shelf Classifiers
Steve P. Chadwick
University of Texas at Dallas
Model Selection by Predicting the Better Classifier

Idea:
- Two classifiers, "primary" and "secondary"
- Use confidence to predict which one is expected to perform best

Pima Indian Diabetes
-----------------
Primary classifier: Fisher LD
[Figure: percentage of error within equal-sized bins, bins 1 to 10]
Fisher LD classifies over 70% of the data before half the total error is accumulated.

Secondary classifier: 1-Nearest Neighbor
[Figure: percentage of error within equal-sized bins, bins 1 to 10]
1-Nearest Neighbor classifies about 50% of the data when half the total error is accumulated.
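The binned error profiles above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's code: it assumes samples are sorted from most to least confident, that each bin's value is its share of the classifier's total error, and that "data classified before half the total error is accumulated" means the fraction of samples in the bins preceding the one where the cumulative share reaches 50%. The function names `binned_error_profile` and `coverage_at_half_error` are hypothetical.

```python
def binned_error_profile(confidences, is_error, n_bins=10):
    """Share of total error (in percent) falling in each equal-sized bin.

    Assumption: higher confidence value means more confident, and bins are
    ordered from most confident (bin 1) to least confident (bin n_bins).
    """
    n = len(confidences)
    # Indices sorted from most to least confident.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    total_errors = sum(1 for e in is_error if e)
    profile = []
    for b in range(n_bins):
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        errors_in_bin = sum(1 for i in order[start:stop] if is_error[i])
        share = 100.0 * errors_in_bin / total_errors if total_errors else 0.0
        profile.append(share)
    return profile


def coverage_at_half_error(profile):
    """Fraction of data in the bins before half the total error is reached."""
    cumulative = 0.0
    for b, share in enumerate(profile):
        cumulative += share
        if cumulative >= 50.0:
            return b / len(profile)
    return 1.0


# Toy data (made up for illustration): errors concentrate at low confidence.
conf = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
errs = [False] * 7 + [True] * 3
toy_profile = binned_error_profile(conf, errs, n_bins=5)
toy_coverage = coverage_at_half_error(toy_profile)
```

A useful confidence measure pushes most of the error into the last bins, so the coverage value is high; a measure unrelated to correctness spreads error evenly and gives a coverage near 0.5.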
Ljubljana Breast Cancer
-----------------
Primary classifier: Fisher LD
[Figure: percentage of error within equal-sized bins, bins 1 to 10]
Fisher LD classifies over 60% of the data before half the total error is accumulated.

Secondary classifier: Nearest Unlike Neighbor (NUN)
[Figure: percentage of error within equal-sized bins, bins 1 to 10]
NUN classifies about 60% of the data before half the total error is accumulated.
Confidence measure profiles
-----------------
(Ljubljana Breast Cancer)
[Figure: five error profiles over bins 1 to 10, one per confidence measure]
- Fisher LD using 1/(w^T x + s)
- 1-Nearest Neighbor using 1-neighbor distance
- MSE using Q
- 1-Nearest Neighbor using distance from centers
- 1-Nearest Neighbor using nearest unlike neighbor ratio
Differential error
-----------------
(Ljubljana Breast Cancer) 30% class A, 70% class B
[Figure: Fisher LD, bins 1 to 10]
[Figure: Nearest Unlike Neighbor, bins 1 to 10]

Differential error
-----------------
(Pima Indian Diabetes) 35% class A, 65% class B
[Figure: Fisher LD, bins 1 to 10]
[Figure: 1-Nearest Neighbor, bins 1 to 10]

Differential error
-----------------
(Synthetic Data) 50% class A, 50% class B
[Figure: Fisher LD, bins 1 to 10]
[Figure: 1-Nearest Neighbor, bins 1 to 10]
Obstacles
- About 25% of the training data contributes to calculating the selection LD when combining linear discriminant and nearest neighbor classifiers.
  [Figure: selection LD and data in q-space]
- The different confidence measures have different ranges, which makes them difficult to compare with each other.
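One common way to put confidence measures with different ranges on a shared scale is to replace each value with its normalized rank. The sketch below is offered only as an illustration of the range problem; it is not described in the slides and is not the author's selection method. The function name `rank_normalize` is hypothetical, and ties are assumed to share their average rank.

```python
def rank_normalize(scores):
    """Map raw confidence values to [0, 1] by rank (ties share the mean rank).

    Assumption: only the ordering of the confidence values matters, so two
    measures with very different ranges become directly comparable.
    """
    n = len(scores)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank / (n - 1)
        i = j + 1
    return ranks


# Toy values (made up): one measure spans tens, the other spans fractions.
ld_conf = rank_normalize([10, 30, 20])
nn_conf = rank_normalize([0.1, 0.1, 5.0])
```

After this rescaling both measures lie in [0, 1], at the cost of discarding the spacing between raw values.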