Bias-Variance Tradeoff (Machine Learning) – PowerPoint presentation


slide-1
SLIDE 1

Machine Learning

Bias-Variance Tradeoff

1

slide-2
SLIDE 2

Bias and variance

Every learning algorithm requires assumptions about the hypothesis space. E.g.: “My hypothesis space is…

– …linear”
– …decision trees with 5 nodes”
– …a three-layer neural network with rectifier hidden units”

Bias is the true error (loss) of the best predictor in the hypothesis set.

  • What will the bias be if the hypothesis set cannot represent the target function? (high or low?)

– Bias will be non-zero, possibly high

  • Underfitting: When bias is high

2
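The underfitting claim can be sketched numerically: if the target function is quadratic but the hypothesis space contains only linear functions, even the best linear predictor keeps a large error. The snippet below is a minimal numpy illustration (not from the slides); `poly_mse` is a toy helper, and the noise level and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = x ** 2 + rng.normal(0.0, 0.05, 200)  # target is quadratic plus small noise

def poly_mse(x, y, degree):
    """Least-squares polynomial fit; returns mean squared error on (x, y)."""
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    return np.mean((y - pred) ** 2)

mse_linear = poly_mse(x, y, 1)     # hypothesis set cannot represent the target
mse_quadratic = poly_mse(x, y, 2)  # hypothesis set contains the target

# The linear model's error stays high no matter how well we fit: that is bias.
print(mse_linear, mse_quadratic)
```

Enlarging the hypothesis space to include the target (degree 2) drives the error down to the noise floor, while the linear model's error is dominated by its bias.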


slide-6
SLIDE 6

Bias and variance

  • The performance of a classifier depends on the specific training set we have

– The learned model may change if we slightly change the training set

  • Variance: describes how much the learned classifier depends on the training set

  • Overfitting: high variance
  • Variance

– Increases when the classifiers become more complex
– Decreases with larger training sets

6
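One way to see variance concretely is to retrain the same model on many independently drawn training sets and measure how much its prediction at a fixed point fluctuates. This is a rough numpy sketch (not from the slides); the target function, noise level, and polynomial degrees are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def prediction_at(x0, degree, n_train=30):
    """Draw a fresh random training set, fit a polynomial, predict at x0."""
    x = rng.uniform(-1.0, 1.0, n_train)
    y = np.sin(3 * x) + rng.normal(0.0, 0.2, n_train)
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x0)

# Spread of the prediction at one test point across 300 random training sets.
var_simple = np.var([prediction_at(0.5, degree=1) for _ in range(300)])
var_complex = np.var([prediction_at(0.5, degree=9) for _ in range(300)])

print(var_simple, var_complex)  # the complex model fluctuates far more
```

The degree-9 model tracks the noise in each particular training set, so its prediction swings from one dataset to the next; the degree-1 model barely moves. Increasing `n_train` shrinks both variances, matching the last bullet above.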


slide-10
SLIDE 10

Let’s play darts

[Figure: a 2×2 grid of dartboards illustrating the four combinations of high/low bias and high/low variance. Suppose the true concept is the center; each dot is a model learned from a different dataset.]

10


slide-15
SLIDE 15

Bias variance tradeoffs

  • Error = bias + variance (+ noise)
  • High bias ⇒ both training and test error can be high

– Arises when the classifier cannot represent the data

  • High variance ⇒ training error can be low, but test error will be high

– Arises when the learner overfits the training set

The bias-variance tradeoff has been studied extensively in the context of regression; it was generalized to classification by Domingos (2000).

15
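For squared loss, the decomposition Error = bias² + variance + noise can be checked empirically: estimate each term at a single test point by retraining on many random datasets. The following is a simulation sketch under assumed choices of target function, noise level, and model class (none of these come from the slides).

```python
import numpy as np

rng = np.random.default_rng(2)

f = lambda x: np.sin(3 * x)  # assumed true function
sigma = 0.2                  # assumed label-noise standard deviation
x0 = 0.5                     # fixed test point

def train_and_predict(degree, n=30):
    """Fit a polynomial on a fresh noisy training set; predict at x0."""
    x = rng.uniform(-1.0, 1.0, n)
    y = f(x) + rng.normal(0.0, sigma, n)
    return np.polyval(np.polyfit(x, y, degree), x0)

preds = np.array([train_and_predict(degree=1) for _ in range(2000)])

bias_sq = (preds.mean() - f(x0)) ** 2  # (E[h(x0)] - f(x0))^2
variance = preds.var()                 # E[(h(x0) - E[h(x0)])^2]
noise = sigma ** 2

# Expected test error at x0 against independent noisy labels.
y0 = f(x0) + rng.normal(0.0, sigma, 2000)
test_err = np.mean((preds - y0) ** 2)

print(test_err, bias_sq + variance + noise)  # the two should roughly agree
```

Repeating this with a higher-degree model shifts error from the bias² term into the variance term, which is exactly the tradeoff named in the slide title.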

slide-16
SLIDE 16

Managing bias and variance

  • Ensemble methods reduce variance

– Multiple classifiers are combined
– E.g.: bagging, boosting

  • Decision trees of a given depth

– Increasing depth decreases bias, increases variance

  • SVMs

– Higher-degree polynomial kernels decrease bias, increase variance
– Stronger regularization increases bias, decreases variance

  • Neural networks

– Deeper models can increase variance, but decrease bias

  • K nearest neighbors

– Increasing k generally increases bias, reduces variance

16
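The k-nearest-neighbors bullet is easy to verify directly: averaging over more neighbors smooths the prediction (lower variance) but pulls it toward far-away points (higher bias). This is a toy 1-D kNN regression in numpy; `knn_predict` and all the constants are illustrative assumptions, not material from the slides.

```python
import numpy as np

rng = np.random.default_rng(3)

def knn_predict(x_train, y_train, x0, k):
    """k-nearest-neighbor regression: average the k closest targets."""
    idx = np.argsort(np.abs(x_train - x0))[:k]
    return y_train[idx].mean()

f = lambda x: np.sin(3 * x)  # assumed true function
x0 = 0.5

def prediction(k, n=50):
    """Draw a fresh noisy training set; return the kNN prediction at x0."""
    x = rng.uniform(-1.0, 1.0, n)
    y = f(x) + rng.normal(0.0, 0.3, n)
    return knn_predict(x, y, x0, k)

p_small = np.array([prediction(k=1) for _ in range(500)])
p_large = np.array([prediction(k=25) for _ in range(500)])

var_k1, var_k25 = p_small.var(), p_large.var()
bias_k1 = abs(p_small.mean() - f(x0))
bias_k25 = abs(p_large.mean() - f(x0))

print(var_k1, var_k25)    # variance shrinks as k grows
print(bias_k1, bias_k25)  # bias grows as k grows
```

With k = 25 the prediction averages away the label noise (low variance) but also averages the target over half the input range (high bias), matching the slide's rule of thumb.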

slide-17
SLIDE 17

Summary

Bias and Variance

– Rich exploration in statistics
– Provides a different view of learning criteria

17