Minimax Fixed-Design Linear Regression (PowerPoint presentation)



SLIDE 1

Minimax Fixed-Design Linear Regression

Peter L. Bartlett, Wouter M. Koolen, Alan Malek, Eiji Takimoto, Manfred Warmuth
Conference on Learning Theory, Paris, France, July 5th, 2015

SLIDE 2

Context: Linear regression

◮ We have data $(x_1, y_1), \ldots, (x_T, y_T)$
◮ Offline linear regression: predict $\hat{y} = \theta^\top x$, where $\theta = (X^\top X)^{-1} X^\top Y$.

SLIDE 3

Context: Linear regression

◮ We have data $(x_1, y_1), \ldots, (x_T, y_T)$
◮ Offline linear regression: predict $\hat{y} = \theta^\top x$, where $\theta = (X^\top X)^{-1} X^\top Y$.
◮ Online fixed-design linear regression:

  1. Covariates $x_1, \ldots, x_T$ are fixed at the start
  2. Need to predict $\hat{y}_t$ before seeing $y_t$
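The offline solution above can be sketched numerically (a minimal illustration with made-up synthetic data, not from the slides; the pseudo-inverse guards against a singular $X^\top X$):

```python
import numpy as np

# Offline least squares: theta = (X^T X)^{-1} X^T Y, computed with a
# pseudo-inverse for numerical robustness. Data here are synthetic.
rng = np.random.default_rng(0)
T, d = 20, 3
X = rng.normal(size=(T, d))                      # rows are x_1, ..., x_T
Y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=T)

theta = np.linalg.pinv(X.T @ X) @ X.T @ Y        # closed-form solution
# Agrees with the standard least-squares solver:
assert np.allclose(theta, np.linalg.lstsq(X, Y, rcond=None)[0])
```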

SLIDE 4

Protocol

Given: $x_1, \ldots, x_T \in \mathbb{R}^d$. For $t = 1, 2, \ldots, T$:

◮ Learner predicts $\hat{y}_t \in \mathbb{R}$,
◮ Adversary reveals $y_t \in \mathbb{R}$,
◮ Learner incurs loss $(\hat{y}_t - y_t)^2$.

Figure: Fixed-design protocol
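The protocol can be written as a short loop (an illustrative simulation; the `predict` placeholder and the adversary's rule here are arbitrary stand-ins, not the minimax strategy):

```python
import numpy as np

# Fixed-design protocol: covariates known up front, labels revealed
# one at a time, squared loss accrued per round.
rng = np.random.default_rng(1)
T, d = 10, 2
xs = rng.normal(size=(T, d))             # fixed design, known in advance

def predict(t, xs, past_ys):
    return 0.0                           # placeholder learner

loss, ys = 0.0, []
for t in range(T):
    y_hat = predict(t, xs, ys)           # predict before seeing y_t
    y_t = float(np.sign(xs[t, 0]))       # some adversary choice
    loss += (y_hat - y_t) ** 2
    ys.append(y_t)
```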

SLIDE 5

Minimax

Our goal is to find a strategy that achieves the minimax regret:

$$\min_{\hat{y}_1} \max_{y_1} \cdots \min_{\hat{y}_T} \max_{y_T} \left[ \sum_{t=1}^{T} (\hat{y}_t - y_t)^2 - \min_{\theta \in \mathbb{R}^d} \sum_{t=1}^{T} (\theta^\top x_t - y_t)^2 \right]$$

SLIDE 6

Minimax

Our goal is to find a strategy that achieves the minimax regret:

$$\min_{\hat{y}_1} \max_{y_1} \cdots \min_{\hat{y}_T} \max_{y_T} \Bigg[ \underbrace{\sum_{t=1}^{T} (\hat{y}_t - y_t)^2}_{\text{algorithm}} - \min_{\theta \in \mathbb{R}^d} \sum_{t=1}^{T} (\theta^\top x_t - y_t)^2 \Bigg]$$

SLIDE 7

Minimax

Our goal is to find a strategy that achieves the minimax regret:

$$\min_{\hat{y}_1} \max_{y_1} \cdots \min_{\hat{y}_T} \max_{y_T} \Bigg[ \underbrace{\sum_{t=1}^{T} (\hat{y}_t - y_t)^2}_{\text{algorithm}} - \underbrace{\min_{\theta \in \mathbb{R}^d} \sum_{t=1}^{T} (\theta^\top x_t - y_t)^2}_{\text{best linear predictor}} \Bigg]$$
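The two terms of the regret can be computed directly (a toy check with arbitrary stand-in numbers for the design, labels, and predictions):

```python
import numpy as np

# Regret = cumulative squared loss of the online predictions minus the
# loss of the best fixed linear predictor chosen in hindsight.
xs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # fixed design
ys = np.array([1.0, -1.0, 0.5])                       # adversary labels
y_hat = np.array([0.5, -0.5, 0.0])                    # learner predictions

theta_star = np.linalg.lstsq(xs, ys, rcond=None)[0]   # best in hindsight
regret = np.sum((y_hat - ys) ** 2) - np.sum((xs @ theta_star - ys) ** 2)
```

Here the algorithm's loss is $0.75$ and the hindsight loss is $1/12$, so the regret is $2/3$.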
SLIDE 8

The Minimax Strategy

◮ Is linear:

$$\hat{y}_t = s_{t-1}^\top P_t x_t, \qquad \text{where } s_t = \sum_{q=1}^{t} x_q y_q,$$

◮ with coefficients:

$$P_t^{-1} = \sum_{q=1}^{t} x_q x_q^\top + \sum_{q=t+1}^{T} \frac{x_q^\top P_q x_q}{1 + x_q^\top P_q x_q}\, x_q x_q^\top.$$

◮ Cheap recursive calculation, which can be done before seeing the $y_t$s.
◮ Minimax under an alignment condition and the label bound $|y_t| \le B$.

SLIDE 9

The Minimax Strategy

◮ Is linear:

$$\hat{y}_t = s_{t-1}^\top P_t x_t, \qquad \text{where } s_t = \sum_{q=1}^{t} x_q y_q,$$

◮ with coefficients:

$$P_t^{-1} = \underbrace{\sum_{q=1}^{t} x_q x_q^\top}_{\text{least squares}} + \underbrace{\sum_{q=t+1}^{T} \frac{x_q^\top P_q x_q}{1 + x_q^\top P_q x_q}\, x_q x_q^\top}_{\text{re-weighted future instances}}.$$

◮ Cheap recursive calculation, which can be done before seeing the $y_t$s.
◮ Minimax under an alignment condition and the label bound $|y_t| \le B$.
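The recursive calculation can be sketched as follows (my reconstruction, assuming a full-rank design): the $P_t$ are computed by a backward pass before any labels are seen, since for $t = T$ the future sum is empty and each earlier $P_t^{-1}$ re-weights one more future term; the predictions then follow from the forward recursion for $s_t$.

```python
import numpy as np

# Sketch of the minimax strategy (assumes a full-rank design).
# Backward pass: P_T^{-1} = sum_q x_q x_q^T (the future sum is empty),
# and P_t^{-1} is obtained from P_{t+1}^{-1} by replacing the term
# x_{t+1} x_{t+1}^T with its re-weighted copy c_{t+1} x_{t+1} x_{t+1}^T,
# where c_q = x_q^T P_q x_q / (1 + x_q^T P_q x_q).
rng = np.random.default_rng(2)
T, d = 8, 2
xs = rng.normal(size=(T, d))             # fixed design: xs[t-1] is x_t

gram = xs.T @ xs                         # sum_{q=1}^T x_q x_q^T
P = [None] * (T + 1)                     # P[t] holds P_t (1-indexed)
Pinv = gram.copy()
P[T] = np.linalg.inv(Pinv)
for t in range(T - 1, 0, -1):
    x = xs[t]                            # x_{t+1} in 1-indexed notation
    a = x @ P[t + 1] @ x
    Pinv = Pinv - np.outer(x, x) + (a / (1.0 + a)) * np.outer(x, x)
    P[t] = np.linalg.inv(Pinv)

# Forward pass: y_hat_t = s_{t-1}^T P_t x_t with s_t = sum_{q<=t} x_q y_q,
# so each prediction uses only past labels.
def predictions(ys):
    s = np.zeros(d)
    y_hat = []
    for t in range(1, T + 1):
        y_hat.append(s @ P[t] @ xs[t - 1])
        s = s + xs[t - 1] * ys[t - 1]
    return np.array(y_hat)
```

For instance, `predictions(np.ones(T))` returns the strategy's predictions against all-ones labels; the first prediction is always $0$ since $s_0 = 0$.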

SLIDE 10

Guarantees

◮ If the adversary plays $y_t$ with

$$\sum_{t=1}^{T} y_t^2\, x_t^\top P_t x_t = R,$$

we are minimax against all $y_t$s in this set.
◮ This explains the re-weighting:

$$P_t^{-1} = \sum_{q=1}^{t} x_q x_q^\top + \sum_{q=t+1}^{T} \underbrace{\frac{x_q^\top P_q x_q}{1 + x_q^\top P_q x_q}}_{\text{future regret potential}}\, x_q x_q^\top$$

◮ The minimax strategy does not depend on $R$.
◮ We achieve regret exactly $R$, and $R = O(\log T)$.

SLIDE 11

Guarantees

◮ If the adversary plays $y_t$ with

$$\sum_{t=1}^{T} y_t^2\, x_t^\top P_t x_t = R,$$

we are minimax against all $y_t$s in this set.
◮ This explains the re-weighting:

$$P_t^{-1} = \sum_{q=1}^{t} x_q x_q^\top + \sum_{q=t+1}^{T} \underbrace{\frac{x_q^\top P_q x_q}{1 + x_q^\top P_q x_q}}_{\text{future regret potential}}\, x_q x_q^\top$$

◮ The minimax strategy does not depend on $R$.
◮ We achieve regret exactly $R$, and $R = O(\log T)$.
◮ Thanks!