Lecture 20: AdaBoost
Aykut Erdem
December 2017, Hacettepe University

Last time: Bias/Variance Tradeoff
Graphical illustration of bias and variance: http://scott.fortmann-roe.com/docs/BiasVariance.html
(slides by David Sontag)
Boosting
[Figure: trees learned at rounds t = 1, 2, 3; from the book of Hastie, Friedman and Tibshirani]
(slide by Nando de Freitas)
Idea: train weak classifiers on weighted versions of the training data, then let the learned classifiers vote, weighted by their strength. A toy sketch of such a vote follows.
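As an illustration of the weighted vote (a minimal Python sketch; the stumps and strengths below are invented for illustration, not taken from the lecture):

import numpy as np

# Three made-up weak classifiers (decision stumps) on 1-D inputs.
stumps = [lambda x: np.sign(x - 1.0),      # votes +1 to the right of 1
          lambda x: np.sign(3.0 - x),      # votes +1 to the left of 3
          lambda x: -np.ones_like(x)]      # always votes -1
alphas = [1.0, 1.0, 0.5]                   # voting strengths

def vote(x):
    # Strength-weighted majority vote: H(x) = sign(sum_t alpha_t * h_t(x)).
    return np.sign(sum(a * h(x) for a, h in zip(alphas, stumps)))

print(vote(np.array([0.0, 2.0, 4.0])))     # -> [-1.  1. -1.]

Note that this vote is positive only on the interval (1, 3), a decision rule that no single stump could produce on its own.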
The AdaBoost algorithm (slides by Jiri Matas and Jan Šochman):

Given: (x1, y1), ..., (xm, ym); xi ∈ X, yi ∈ {−1, +1}
Initialize weights D1(i) = 1/m
For t = 1, ..., T:
  • Find ht = argmin_{hj ∈ H} εj, where εj = Σ_{i=1}^{m} Dt(i) ⟦yi ≠ hj(xi)⟧
  • If εt ≥ 1/2, then stop
  • Set αt = (1/2) log((1 − εt) / εt)
  • Update Dt+1(i) = Dt(i) exp(−αt yi ht(xi)) / Zt, where Zt is a normalization factor
Output the final classifier: H(x) = sign(Σ_{t=1}^{T} αt ht(x))

The slides step through rounds t = 1, 2, ..., 7 and t = 40, showing the classifier after each round. [Plot per slide: training error vs. step t; x-axis ticks 5–40, y-axis ticks 0.05–0.35.]
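A minimal NumPy sketch of the loop above, assuming axis-aligned decision stumps as the weak-learner class H (the helper names fit_stump, stump_predict, adaboost, and predict are mine, not from the slides; labels y are assumed to be a ±1 array):

import numpy as np

def fit_stump(X, y, D):
    # Exhaustively pick h(x) = s * sign(x_j - thr) minimizing the
    # weighted error  eps = sum_i D(i) * [y_i != h(x_i)].
    best_err, best_params = np.inf, None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (+1, -1):
                pred = s * np.where(X[:, j] >= thr, 1, -1)
                err = D[pred != y].sum()
                if err < best_err:
                    best_err, best_params = err, (j, thr, s)
    return best_err, best_params

def stump_predict(params, X):
    j, thr, s = params
    return s * np.where(X[:, j] >= thr, 1, -1)

def adaboost(X, y, T):
    m = X.shape[0]
    D = np.full(m, 1.0 / m)                 # D_1(i) = 1/m
    ensemble = []                           # pairs (alpha_t, h_t)
    for t in range(T):
        eps, params = fit_stump(X, y, D)
        if eps >= 0.5:                      # no better than chance: stop
            break
        eps = max(eps, 1e-12)               # guard against log(1/0)
        alpha = 0.5 * np.log((1 - eps) / eps)
        pred = stump_predict(params, X)
        D = D * np.exp(-alpha * y * pred)   # up-weight the mistakes
        D /= D.sum()                        # divide by Z_t (renormalize)
        ensemble.append((alpha, params))
    return ensemble

def predict(ensemble, X):
    # H(x) = sign( sum_t alpha_t * h_t(x) )
    f = sum(alpha * stump_predict(params, X) for alpha, params in ensemble)
    return np.where(f >= 0, 1, -1)

For intuition about αt: a stump with εt = 0.1 gets αt = (1/2) log(0.9/0.1) ≈ 1.10, while one barely better than chance (εt = 0.45) gets αt ≈ 0.10. The update then multiplies each misclassified example's weight by e^{αt} and each correctly classified one's by e^{−αt} before renormalizing, so the next weak learner concentrates on the mistakes.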
[Schapire, 1989]
[Plot: training error and test error vs. # rounds (10 to 1000, log scale); error axis ticks 5–20.]
(slide by Carlos Guestrin)
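A sketch of how one could reproduce this kind of error-vs-rounds curve with scikit-learn on synthetic data (an assumption for illustration; the original plot comes from Schapire's experiments, not from this code):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, half held out for testing.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=0)

clf = AdaBoostClassifier(n_estimators=1000).fit(Xtr, ytr)

# staged_predict yields the ensemble's predictions after each round t.
train_err = [np.mean(p != ytr) for p in clf.staged_predict(Xtr)]
test_err = [np.mean(p != yte) for p in clf.staged_predict(Xte)]

plt.semilogx(np.arange(1, len(train_err) + 1), train_err, label="training error")
plt.semilogx(np.arange(1, len(test_err) + 1), test_err, label="test error")
plt.xlabel("# rounds"); plt.ylabel("error"); plt.legend(); plt.show()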
Application: face detection [Viola & Jones]
(slides by Rob Schapire)
Note: the final classifier H(x) = sign(Σ_{t=1}^{T} αt ht(x)) is a thresholded weighted vote of weak learners (not a linear classifier).
(slides by Aarti Singh)