Adversarial Online Learning with Noise
Adversarial Online Learning with Noise, by Alon Resler and Yishay Mansour


  1. Adversarial Online Learning with Noise. Alon Resler and Yishay Mansour, Tel Aviv University (TAU). Jun 13, 2019.

  2. Adversarial bandits. A $T$-round game between a learner and an adversary, over a set of $K$ actions $A = \{1, \ldots, K\}$. On round $t$:
     ◮ The adversary selects a loss vector $\ell_t \in \{0, 1\}^K$, where $\ell_{i,t}$ is the loss associated with action $i$ at round $t$.
     ◮ The learner chooses an action $I_t$ (usually at random).
     ◮ The learner incurs the loss $\ell_{I_t,t}$.
     ◮ Finally, the learner observes feedback.

  3. Feedback Types and Regret.
     Full information feedback: the learner observes the whole loss vector $\ell_t$.
     Bandit feedback: the learner observes only $\ell_{I_t,t}$.
     The learner's goal is to minimize the expected regret:
     $$\mathrm{Regret}(T) = \mathbb{E}\left[ \sum_{t=1}^{T} \ell_{I_t,t} - \min_{i \in A} \sum_{t=1}^{T} \ell_{i,t} \right]$$
     We say that the algorithm has vanishing regret if $\mathrm{Regret}(T) = o(T)$.

  4. Our work.
     ◮ We study online learning settings in which the feedback is corrupted by random noise.
     ◮ We consider binary losses XORed with the noise, which is a Bernoulli random variable.
     ◮ We consider both feedback settings: bandit feedback and full information feedback.

  5. Results Summary.

     | Feedback type \ Noise model | Constant noise | Variable noise (Uniform) |
     | --- | --- | --- |
     | Full information (known noise) | $\Theta\left(\frac{1}{\epsilon}\sqrt{T \ln K}\right)$ | $\Theta\left(T^{2/3} \ln^{1/3} K\right)$ |
     | Full information (unknown noise) | $\Theta\left(\frac{1}{\epsilon}\sqrt{T \ln K}\right)$ | $\Theta(T)$ |
     | Bandit (known noise) | $\tilde{\Theta}\left(\frac{1}{\epsilon}\sqrt{T K}\right)$ | $\tilde{\Theta}\left(T^{2/3} K^{1/3}\right)$ |
     | Bandit (unknown noise) | $\tilde{\Theta}\left(\frac{1}{\epsilon}\sqrt{T K}\right)$ | $\Theta(T)$ |

     Poster @ Pacific Ballroom #156
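
The setting described on the slides (binary losses, feedback XORed with Bernoulli noise, regret measured against the best fixed action) can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' algorithm: the learner is the standard exponential weights (Hedge) method under full information feedback, the noise is assumed to flip each observed loss independently with a known probability `flip_prob` below 1/2 (the slides do not say how this relates to their parameter ǫ), and the names `xor_noise`, `debias`, and `run_hedge` are hypothetical. The function returns the realized regret of one run, whereas the slides define regret in expectation.

```python
import math
import random


def xor_noise(loss_vector, flip_prob, rng):
    """XOR each binary loss with an independent Bernoulli(flip_prob) bit."""
    return [l ^ (1 if rng.random() < flip_prob else 0) for l in loss_vector]


def debias(noisy_vector, flip_prob):
    """Unbiased loss estimate for a known flip_prob < 1/2.

    For a binary loss l, E[noisy] = flip_prob + (1 - 2 * flip_prob) * l,
    so inverting this affine map gives an estimate whose mean is l.
    """
    return [(z - flip_prob) / (1.0 - 2.0 * flip_prob) for z in noisy_vector]


def run_hedge(losses, flip_prob, eta, seed=0):
    """Play exponential weights on noisy full-information feedback.

    losses: list of T binary loss vectors of length K, fixed in advance
            (a stand-in for the adversary's choices).
    Returns the realized regret: learner's total true loss minus the
    total loss of the best fixed action in hindsight.
    """
    rng = random.Random(seed)
    num_actions = len(losses[0])
    cum_est = [0.0] * num_actions  # cumulative estimated loss per action
    learner_loss = 0
    for loss_vector in losses:
        # Weights proportional to exp(-eta * estimated loss); subtracting
        # the minimum keeps the exponentials in (0, 1] for stability.
        m = min(cum_est)
        weights = [math.exp(-eta * (c - m)) for c in cum_est]
        action = rng.choices(range(num_actions), weights=weights)[0]
        learner_loss += loss_vector[action]  # true loss is incurred
        noisy = xor_noise(loss_vector, flip_prob, rng)  # only this is observed
        estimate = debias(noisy, flip_prob)
        cum_est = [c + e for c, e in zip(cum_est, estimate)]
    best_fixed = min(sum(l[i] for l in losses) for i in range(num_actions))
    return learner_loss - best_fixed


if __name__ == "__main__":
    # Hypothetical loss sequence: action 0 is slightly better on average.
    gen = random.Random(1)
    T, K = 2000, 5
    losses = [[int(gen.random() < (0.3 if i == 0 else 0.5)) for i in range(K)]
              for _ in range(T)]
    eta = math.sqrt(math.log(K) / T)  # a common default, not tuned for noise
    print("realized regret:", run_hedge(losses, flip_prob=0.2, eta=eta))
```

With `flip_prob=0.0` the feedback is noiseless and the sketch reduces to plain Hedge. A bandit variant would additionally need importance-weighted estimates of the single observed loss, which is left out here.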
