SLIDE 1
Minimax Fixed-Design Linear Regression
Peter L. Bartlett, Wouter M. Koolen, Alan Malek, Eiji Takimoto, Manfred Warmuth
Conference on Learning Theory, Paris, France, July 5th, 2015
SLIDE 2
SLIDE 3
Context: Linear regression
◮ We have data (x_1, y_1), …, (x_T, y_T).
◮ Offline linear regression: predict ŷ = θ⊺x, where θ = (X⊺X)⁻¹ X⊺Y.
◮ Online fixed-design linear regression:
  1. Covariates x_1, …, x_T are fixed at the start.
  2. Need to predict ŷ_t before seeing y_t.
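The offline comparator above is ordinary least squares; a minimal NumPy sketch on hypothetical toy data (the data values are ours, for illustration only):

```python
import numpy as np

# Hypothetical toy data: T = 5 rounds, d = 2 covariates (intercept + time).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
Y = np.array([0.1, 1.1, 1.9, 3.2, 3.9])

# Offline linear regression: theta = (X^T X)^{-1} X^T Y,
# computed by solving the normal equations rather than inverting.
theta = np.linalg.solve(X.T @ X, X.T @ Y)

# Predictions y_hat = theta^T x for each covariate row x.
Y_hat = X @ theta
```

In the online fixed-design setting this θ is only available in hindsight, which is exactly why it serves as the comparator in the regret below.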
SLIDE 4
Protocol
Given: x_1, …, x_T ∈ ℝ^d.
For t = 1, 2, …, T:
◮ Learner predicts ŷ_t ∈ ℝ,
◮ Adversary reveals y_t ∈ ℝ,
◮ Learner incurs loss (ŷ_t − y_t)².
Figure: Fixed-design protocol
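The protocol can be written as a driver loop; a minimal sketch, where `predict` and `reveal` are hypothetical callbacks standing in for the learner and the adversary:

```python
def play_fixed_design(xs, predict, reveal):
    """Fixed-design protocol: covariates xs are fixed and known upfront;
    each round the learner must predict before y_t is revealed."""
    total_loss = 0.0
    history = []                         # (x_q, y_q) pairs revealed so far
    for t, x in enumerate(xs, start=1):
        y_hat = predict(t, x, history)   # learner commits to y_hat_t first
        y = reveal(t, x)                 # adversary then reveals y_t
        total_loss += (y_hat - y) ** 2   # squared loss (y_hat_t - y_t)^2
        history.append((x, y))
    return total_loss
```

For example, a learner that always predicts 0 against an adversary that always reveals 1 incurs cumulative loss T.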
SLIDE 7
Minimax
Our goal is to find a strategy that achieves the minimax regret:

min_{ŷ_1} max_{y_1} ⋯ min_{ŷ_T} max_{y_T} [ Σ_{t=1}^T (ŷ_t − y_t)² − min_{θ∈ℝ^d} Σ_{t=1}^T (θ⊺x_t − y_t)² ]

Here Σ_{t=1}^T (ŷ_t − y_t)² is the algorithm's loss, and min_{θ∈ℝ^d} Σ_{t=1}^T (θ⊺x_t − y_t)² is the loss of the best linear predictor in hindsight.
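The quantity inside the min/max alternation is just loss minus comparator loss; a small NumPy sketch (the helper name `regret_vs_best_linear` is ours):

```python
import numpy as np

def regret_vs_best_linear(X, Y, Y_hat):
    """Regret of a prediction sequence Y_hat on labels Y: the algorithm's
    cumulative squared loss minus that of the best fixed linear predictor
    theta in hindsight (the offline least-squares comparator)."""
    X, Y, Y_hat = np.asarray(X, float), np.asarray(Y, float), np.asarray(Y_hat, float)
    alg_loss = float(np.sum((Y_hat - Y) ** 2))
    theta = np.linalg.lstsq(X, Y, rcond=None)[0]   # comparator theta
    best_loss = float(np.sum((X @ theta - Y) ** 2))
    return alg_loss - best_loss
```

For instance, if the labels are exactly linear in the covariates the comparator's loss is zero, and the regret reduces to the algorithm's own loss.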
SLIDE 9
The Minimax Strategy
◮ Is linear:

  ŷ_t = s_{t−1}⊺ P_t x_t,  where s_t = Σ_{q=1}^t x_q y_q,

◮ with coefficients:

  P_t⁻¹ = Σ_{q=1}^t x_q x_q⊺ + Σ_{q=t+1}^T [x_q⊺ P_q x_q / (1 + x_q⊺ P_q x_q)] x_q x_q⊺,

  where the first sum is the least-squares term and the second sum re-weights the future instances.
◮ Cheap recursive calculation, can be done before seeing the y_t's.
◮ Minimax under an alignment condition and |y_t| ≤ B.
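The recursive calculation can be sketched directly: P_T⁻¹ = Σ_{q=1}^T x_q x_q⊺, and stepping back from t to t−1 swaps the weight on x_t x_t⊺ from 1 to x_t⊺P_t x_t / (1 + x_t⊺P_t x_t). A minimal NumPy sketch (function names are ours; we assume Σ_q x_q x_q⊺ is invertible):

```python
import numpy as np

def minimax_coefficients(xs):
    """Backward recursion for P_1..P_T, computable before any label is seen.
    Returns a 1-indexed list with P[t] = P_t."""
    T = xs.shape[0]
    P = [None] * (T + 1)
    Pinv = xs.T @ xs                   # P_T^{-1} = sum_{q=1}^T x_q x_q^T
    P[T] = np.linalg.inv(Pinv)
    for t in range(T, 1, -1):
        x = xs[t - 1]
        a = x @ P[t] @ x               # x_t^T P_t x_t
        # Replace the unit weight on x_t x_t^T by the shrunk weight a/(1+a):
        Pinv = Pinv - np.outer(x, x) + (a / (1.0 + a)) * np.outer(x, x)
        P[t - 1] = np.linalg.inv(Pinv)
    return P

def minimax_predict(xs, ys):
    """Run the strategy: y_hat_t = s_{t-1}^T P_t x_t with s_t = sum_q x_q y_q."""
    T, d = xs.shape
    P = minimax_coefficients(xs)
    s = np.zeros(d)                    # s_0 = 0, so y_hat_1 = 0
    y_hats = []
    for t in range(1, T + 1):
        y_hats.append(s @ P[t] @ xs[t - 1])
        s = s + ys[t - 1] * xs[t - 1]  # label y_t is revealed after predicting
    return np.array(y_hats)
```

Each backward step costs one rank-one update plus an inverse, and the whole table of P_t's is fixed once the covariates are, matching the "can be done before seeing the y_t's" remark.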
SLIDE 10
Guarantees
◮ If the adversary plays y_t's with

  Σ_{t=1}^T y_t² x_t⊺ P_t x_t = R,

  we are minimax against all y_t's in this set.
◮ Explains the re-weighting: in

  P_t⁻¹ = Σ_{q=1}^t x_q x_q⊺ + Σ_{q=t+1}^T [x_q⊺ P_q x_q / (1 + x_q⊺ P_q x_q)] x_q x_q⊺,

  the second sum is the future regret potential.
◮ The minimax strategy does not depend on R.
◮ We achieve regret exactly R = O(log T).
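The budget R spent by a given label sequence is a plain weighted sum; a small sketch (the function name is ours, and the list P of matrices P_t is assumed available, e.g. from the backward recursion on the previous slide, 1-indexed as P[t] = P_t):

```python
import numpy as np

def adversary_budget(xs, ys, P):
    """R = sum_t y_t^2 * x_t^T P_t x_t: the 'budget' spent by the label
    sequence ys. The strategy is simultaneously minimax against every
    label sequence with the same value of R."""
    return sum(y ** 2 * (x @ P[t] @ x)
               for t, (x, y) in enumerate(zip(xs, ys), start=1))
```

Since the strategy itself never uses R, one fixed algorithm is minimax at every budget level at once, which is what makes the exact-regret statement possible.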