
Bias-Variance Tradeoff (Machine Learning): PowerPoint PPT Presentation



  1. Bias-Variance Tradeoff (Machine Learning)

  2. Bias and variance
     Every learning algorithm requires assumptions about the hypothesis space. E.g., “My hypothesis space is…
     – …linear”
     – …decision trees with 5 nodes”
     – …a three-layer neural network with rectifier hidden units”
     • Bias is the true error (loss) of the best predictor in the hypothesis set.
     • What will the bias be if the hypothesis set cannot represent the target function? (High or low?)
       – The bias will be non-zero, possibly high.
     • Underfitting: when bias is high.
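The bias of a too-small hypothesis space can be made concrete. Below is a hedged sketch (an assumed example, not from the slides): the target is quadratic but the hypothesis space is linear, so even the globally best linear fit keeps a non-zero error. That residual error is the bias.

```python
# Assumed setup: noiseless quadratic target, linear hypothesis space.
xs = [i / 20 for i in range(-20, 21)]   # inputs in [-1, 1]
ys = [x * x for x in xs]                # true target f(x) = x^2

# Closed-form least-squares line y = a*x + b.
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx

# The best line is essentially flat, and its error cannot be driven to zero:
# no amount of extra data fixes it, because the model class is too rigid.
mse = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
print(f"best linear fit: y = {a:.3f}x + {b:.3f}, MSE = {mse:.3f}")
```

The mean squared error stays bounded away from zero however many points are sampled, which is exactly the "hypothesis set cannot represent the target function" case on this slide.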


  6. Bias and variance
     • The performance of a classifier depends on the specific training set we have.
       – The model may change if we slightly change the training set.
     • Variance: describes how much the best classifier depends on the training set.
     • Overfitting: high variance.
     • Variance
       – increases when the classifiers become more complex
       – decreases with larger training sets
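The "how much the classifier depends on the training set" idea can be measured directly. A hedged sketch (an assumed example, not from the slides): refit two learners on many freshly drawn noisy training sets and watch how much each one's prediction at a fixed point moves. The flexible 1-nearest-neighbor regressor varies far more than the rigid constant predictor.

```python
import random
import statistics

random.seed(1)

def sample_training_set(n=20, noise=0.3):
    # Assumed data model: y = x^2 plus Gaussian noise.
    data = []
    for _ in range(n):
        x = random.uniform(-1, 1)
        data.append((x, x * x + random.gauss(0, noise)))
    return data

def predict_1nn(data, x):
    # 1-nearest-neighbor regression: copy the label of the closest point.
    return min(data, key=lambda p: abs(p[0] - x))[1]

preds_1nn, preds_const = [], []
for _ in range(500):
    data = sample_training_set()
    preds_1nn.append(predict_1nn(data, 0.0))
    preds_const.append(statistics.mean(y for _, y in data))  # rigid learner

sd_1nn = statistics.stdev(preds_1nn)
sd_const = statistics.stdev(preds_const)
print(f"stdev of prediction at x=0: 1-NN={sd_1nn:.3f}, constant={sd_const:.3f}")
```

The spread of the 1-NN predictions is dominated by the noise on a single training label, while the constant predictor averages the whole training set, which is why the more complex learner shows the higher variance here.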


  10. Let’s play darts
      Suppose the true concept is the center; each dot is a model that is learned from a different dataset.
      [Figure: dartboards illustrating the four combinations of high/low bias and high/low variance]


  15. Bias-variance tradeoffs
      • Error = bias + variance (+ noise)
      • High bias ⇒ both training and test error can be high.
        – Arises when the classifier cannot represent the data.
      • High variance ⇒ training error can be low, but test error will be high.
        – Arises when the learner overfits the training set.
      • The bias-variance tradeoff has been studied extensively in the context of regression and has been generalized to classification (Domingos, 2000).
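The decomposition on this slide can be checked numerically for squared loss. A hedged sketch (assumed setup, not from the slides): at a single test point x0, the expected error equals bias² + variance + noise. Using a deliberately simple learner (predict the mean training label), each term can be estimated by refitting on many training sets.

```python
import random
import statistics

random.seed(2)
SIGMA = 0.3            # assumed label-noise standard deviation
X0 = 0.8               # assumed test point

def f(x):
    return x * x       # assumed true target

def fit_constant(n=20):
    # Simple learner: predict the mean label of a fresh noisy training set.
    labels = [f(random.uniform(-1, 1)) + random.gauss(0, SIGMA) for _ in range(n)]
    return statistics.mean(labels)

preds = [fit_constant() for _ in range(5000)]
bias2 = (statistics.mean(preds) - f(X0)) ** 2   # squared bias at X0
var = statistics.pvariance(preds)               # variance of the learner
noise = SIGMA ** 2                              # irreducible noise
total = bias2 + var + noise

# Direct Monte-Carlo estimate of the expected test error, for comparison.
err = statistics.mean((p - (f(X0) + random.gauss(0, SIGMA))) ** 2 for p in preds)
print(f"bias^2={bias2:.3f} variance={var:.3f} noise={noise:.3f} "
      f"sum={total:.3f} direct={err:.3f}")
```

The directly estimated test error and the bias² + variance + noise sum agree up to Monte-Carlo error, which is the "Error = bias + variance (+ noise)" claim for squared loss.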

  16. Managing bias and variance
      • Ensemble methods reduce variance.
        – Multiple classifiers are combined, e.g. bagging and boosting.
      • Decision trees of a given depth
        – Increasing depth decreases bias, increases variance.
      • SVMs
        – Higher-degree polynomial kernels decrease bias, increase variance.
        – Stronger regularization increases bias, decreases variance.
      • Neural networks
        – Deeper models can decrease bias, but increase variance.
      • K-nearest neighbors
        – Increasing k generally increases bias, reduces variance.
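The k-nearest-neighbor claim at the end of this slide is easy to demonstrate. A hedged sketch (assumed target, noise level, and test point, not from the slides): raising k averages more neighbors, which lowers variance but drags the prediction toward a regional mean, increasing bias.

```python
import random
import statistics

random.seed(3)

def f(x):
    return x * x       # assumed true target

def sample_data(n=40, noise=0.3):
    data = []
    for _ in range(n):
        x = random.uniform(-1, 1)
        data.append((x, f(x) + random.gauss(0, noise)))
    return data

def knn_predict(data, x, k):
    # Average the labels of the k training points closest to x.
    nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
    return statistics.mean(y for _, y in nearest)

results = {}
for k in (1, 5, 20):
    preds = [knn_predict(sample_data(), 0.9, k) for _ in range(300)]
    bias = statistics.mean(preds) - f(0.9)
    results[k] = (bias, statistics.pvariance(preds))
    print(f"k={k:2d}  bias={bias:+.3f}  variance={results[k][1]:.4f}")
```

At k=1 the prediction tracks the nearest (noisy) label, so bias is small and variance is large; at k=20 half the training set is averaged in, so variance collapses but the prediction is pulled toward the mean of a wide neighborhood, inflating bias.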

  17. Summary
      • Bias and variance
        – Rich exploration in statistics
        – Provides a different view of learning criteria
