Minimax Fixed-Design Linear Regression
Peter L. Bartlett, Wouter M. Koolen, Alan Malek, Eiji Takimoto, Manfred Warmuth
Conference on Learning Theory, Paris, France, July 5th, 2015
Context: Linear regression
◮ We have data $(x_1, y_1), \ldots, (x_T, y_T)$.
◮ Offline linear regression: predict $\hat{y} = \theta^\intercal x$, where $\theta = (X^\intercal X)^{-1} X^\intercal Y$.
◮ Online fixed-design linear regression:
  1. The covariates $x_1, \ldots, x_T$ are fixed at the start.
  2. We must predict $\hat{y}_t$ before seeing $y_t$.
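For concreteness, a minimal numpy sketch of the offline solution; the function name and the full-rank assumption on $X$ are ours, not from the talk:

```python
import numpy as np

def offline_least_squares(X, Y):
    """Offline linear regression: theta = (X^T X)^{-1} X^T Y.

    X: (T, d) design matrix, assumed to have full column rank.
    Y: (T,) vector of labels.
    """
    # Solving the normal equations is numerically preferable
    # to forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ Y)
```

The offline predictor then forecasts $\hat{y} = \theta^\intercal x$ for any covariate $x$.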
Protocol
Given: $x_1, \ldots, x_T \in \mathbb{R}^d$.
For $t = 1, 2, \ldots, T$:
◮ Learner predicts $\hat{y}_t \in \mathbb{R}$,
◮ Adversary reveals $y_t \in \mathbb{R}$,
◮ Learner incurs loss $(\hat{y}_t - y_t)^2$.
Figure: Fixed-design protocol
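A minimal sketch of this interaction loop; the `learner` and `adversary` callables are hypothetical stand-ins for the two players, not part of the talk:

```python
def play_protocol(xs, learner, adversary):
    """Run the fixed-design protocol on covariates xs = [x_1, ..., x_T].

    learner(t) -> prediction y_hat_t (it may use all of xs and past labels);
    adversary(t, y_hat) -> label y_t, revealed only after the prediction.
    """
    total_loss = 0.0
    for t in range(len(xs)):
        y_hat = learner(t)        # predict before seeing y_t
        y = adversary(t, y_hat)   # label revealed afterwards
        total_loss += (y_hat - y) ** 2
    return total_loss
```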
Minimax
Our goal is to find a strategy that achieves the minimax regret:
$$\min_{\hat{y}_1} \max_{y_1} \cdots \min_{\hat{y}_T} \max_{y_T} \; \underbrace{\sum_{t=1}^{T} (\hat{y}_t - y_t)^2}_{\text{algorithm}} \;-\; \underbrace{\min_{\theta \in \mathbb{R}^d} \sum_{t=1}^{T} (\theta^\intercal x_t - y_t)^2}_{\text{best linear predictor}}$$
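For reference, the comparator term has the familiar closed form (assuming $X^\intercal X$ is invertible, with $X$ stacking the $x_t^\intercal$ as rows and $Y = (y_1, \ldots, y_T)^\intercal$):
$$\min_{\theta \in \mathbb{R}^d} \sum_{t=1}^{T} (\theta^\intercal x_t - y_t)^2 = Y^\intercal Y - Y^\intercal X (X^\intercal X)^{-1} X^\intercal Y.$$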
The Minimax Strategy
◮ It is linear:
$$\hat{y}_t = s_{t-1}^\intercal P_t x_t, \qquad \text{where } s_{t-1} = \sum_{q=1}^{t-1} x_q y_q,$$
◮ with coefficients:
$$P_t^{-1} = \underbrace{\sum_{q=1}^{t} x_q x_q^\intercal}_{\text{least squares}} \;+\; \underbrace{\sum_{q=t+1}^{T} x_q x_q^\intercal \, \frac{x_q^\intercal P_q x_q}{1 + x_q^\intercal P_q x_q}}_{\text{re-weighted future instances}}.$$
◮ Cheap recursive calculation, which can be done before seeing the $y_t$s (see the sketch below).
◮ Minimax under an alignment condition and the box constraint $|y_t| \leq B$.
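A minimal numpy sketch of the backward recursion for the $P_t$ and the resulting predictions, assuming $\sum_q x_q x_q^\intercal$ is invertible; function names are ours, not the authors' code:

```python
import numpy as np

def minimax_coefficients(xs):
    """Backward recursion for P_1, ..., P_T (stored 0-indexed as P[0..T-1]).

    xs: (T, d) array of fixed covariates; sum_q x_q x_q^T is assumed
    invertible so that P_T exists.
    """
    T = xs.shape[0]
    P = [None] * T
    P_inv = xs.T @ xs                      # P_T^{-1}: plain least-squares term
    P[T - 1] = np.linalg.inv(P_inv)
    for t in range(T - 1, 0, -1):          # fill in P_t for t = T-1, ..., 1
        x = xs[t]
        w = x @ P[t] @ x                   # x_q^T P_q x_q for q = t+1 (1-indexed)
        # Passing from P_q^{-1} to P_{q-1}^{-1}: instance x leaves the
        # least-squares sum and re-enters with weight w / (1 + w) < 1.
        P_inv = P_inv - np.outer(x, x) + np.outer(x, x) * (w / (1 + w))
        P[t - 1] = np.linalg.inv(P_inv)
    return P

def minimax_predict(xs, ys):
    """Run the minimax strategy: y_hat_t = s_{t-1}^T P_t x_t."""
    P = minimax_coefficients(xs)
    s = np.zeros(xs.shape[1])              # s_0 = 0
    preds = []
    for t, (x, y) in enumerate(zip(xs, ys)):
        preds.append(s @ P[t] @ x)         # prediction made before y_t arrives
        s = s + x * y                      # s_t = s_{t-1} + x_t y_t
    return np.array(preds)
```

Since the $x_t$ are known up front, the whole list of $P_t$ can be precomputed before the game starts; only $s_{t-1}$ is updated online.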
Guarantees
◮ If the adversary plays labels $y_t$ with
$$\sum_{t=1}^{T} y_t^2 \, x_t^\intercal P_t x_t = R,$$
then we are minimax against all label sequences in this set.
◮ This explains the re-weighting:
$$P_t^{-1} = \sum_{q=1}^{t} x_q x_q^\intercal \;+\; \underbrace{\sum_{q=t+1}^{T} x_q x_q^\intercal \, \frac{x_q^\intercal P_q x_q}{1 + x_q^\intercal P_q x_q}}_{\text{future regret potential}}$$
◮ The minimax strategy does not depend on $R$.
◮ We achieve regret exactly $R$, with $R = O(\log T)$.
◮ Thanks!
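As a quick numerical probe of this guarantee, reusing the sketch above (the exact equality between regret and $R$ is claimed only under the talk's conditions, so treat any gap as those conditions failing):

```python
import numpy as np

def regret_vs_budget(xs, ys):
    """Compare realized regret with the budget R = sum_t y_t^2 x_t^T P_t x_t."""
    P = minimax_coefficients(xs)           # from the earlier sketch
    preds = minimax_predict(xs, ys)
    theta = np.linalg.solve(xs.T @ xs, xs.T @ ys)      # offline comparator
    regret = np.sum((preds - ys) ** 2) - np.sum((xs @ theta - ys) ** 2)
    R = sum(y**2 * (x @ P[t] @ x) for t, (x, y) in enumerate(zip(xs, ys)))
    return regret, R
```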