Ensemble methods CS 446
Why ensembles?
Standard machine learning setup:
◮ We have some data.
◮ We train 10 predictors (3-NN, least squares, SVM, ResNet, ...).
◮ We output the best on a validation set.
Question: can we do better than the best? What if we use an ensemble/aggregate/combination?
We'll consider two approaches: boosting and bagging.
Bagging
Bagging?
This first approach is based on a simple idea:
◮ If the predictors have independent errors, a majority vote of their outputs should be good.
Let's first check this.
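A quick Monte Carlo sketch of this claim. The classifier count, error rate, and trial count below are illustrative choices, not from the slides:

```python
import random

random.seed(0)  # reproducible sketch
n_classifiers, p_err, trials = 25, 0.4, 100_000

majority_wrong = 0
for _ in range(trials):
    # each classifier errs independently with probability p_err
    n_wrong = sum(random.random() < p_err for _ in range(n_classifiers))
    if n_wrong > n_classifiers // 2:  # a strict majority errs => the vote errs
        majority_wrong += 1

vote_error = majority_wrong / trials
print(vote_error)  # well below the individual error rate of 0.4
```

Even though every individual classifier errs 40% of the time, the majority vote errs far less often, provided the errors really are independent.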
Combining classifiers
Suppose we have n classifiers, each wrong independently with probability 0.4.
Model classifier errors as random variables (Z_i)_{i=1}^n (thus E(Z_i) = 0.4).
We can model the number of wrong classifiers with Binom(n, 0.4).
[Figure: Binom(n, 0.4) error distributions for n = 10, 20, 30, 40, 50, 60.]
Red: all classifiers wrong. This fraction is 0.4^n, shrinking from 0.16 (n = 2) down to 0.0016384 (n = 7).
Green: at least half the classifiers wrong, i.e., the majority vote errs. This fraction also shrinks with n: 0.366897 (n = 10), 0.244663 (n = 20), 0.175369 (n = 30), 0.129766 (n = 40), 0.0978074 (n = 50), 0.0746237 (n = 60).
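The green fractions are exactly the upper tail of Binom(n, 0.4). A small sketch that reproduces them (the helper name `majority_error` is ours):

```python
from math import comb

def majority_error(n, p=0.4):
    """P(at least half of n independent classifiers err),
    i.e., the probability that a majority vote is wrong."""
    k_min = (n + 1) // 2  # smallest integer k with k >= n/2
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

for n in (10, 20, 30, 40, 50, 60):
    print(n, round(majority_error(n), 6))
# matches the slide's green fractions, e.g. 0.366897 at n = 10
```

The same function with the sum taken only at k = n gives the red fractions, p^n, which vanish even faster.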