  1. On Compiling (Online) Combinatorial Learning Problems. Frédéric Koriche, CRIL - CNRS UMR 8188, Univ. Artois, koriche@cril.fr. Dagstuhl'17, New Trends in Knowledge Compilation

  2. Outline
     1 Online Learning
     2 The Convex Case
     3 The Combinatorial Case
     4 Compiling Hedge

  3. Online Learning
     Online learning is a zero-sum repeated game between a learning algorithm and its environment. The components of the game are:
     - A class H of hypotheses (the learner's moves)
     - A space Z of instances (the environment's moves)
     - A loss function ℓ : H × Z → R (the game "matrix")

  13. Online Learning
     [Diagram: the learner plays h_1, h_2, h_3, ..., h_T against the environment's moves z_1, z_2, z_3, ..., z_T, accumulating the losses ℓ(h_1, z_1) + ℓ(h_2, z_2) + ... + ℓ(h_T, z_T).]
     During each round t of the game:
     - The learner plays a hypothesis h_t ∈ H
     - Simultaneously, the environment plays an instance z_t ∈ Z
     - Then z_t is revealed to the learner, which incurs the loss ℓ(h_t, z_t)
     The goal for the learner is to minimize its cumulative loss over the course of the game.
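
     To make the protocol concrete, here is a minimal Python sketch of the repeated game (my own illustration, not part of the slides); learner, environment, and loss are hypothetical placeholders for a concrete hypothesis class H, instance space Z, and loss function ℓ.

        # Minimal sketch of the online learning protocol (illustrative only).
        def online_learning_game(learner, environment, loss, T):
            """Run T rounds of the repeated game and return the cumulative loss."""
            cumulative_loss = 0.0
            for t in range(1, T + 1):
                h_t = learner.play()          # the learner picks a hypothesis h_t in H
                z_t = environment.play(t)     # simultaneously, the environment picks z_t in Z
                loss_t = loss(h_t, z_t)       # z_t is revealed; the learner incurs l(h_t, z_t)
                learner.update(z_t, loss_t)   # the learner may adapt using the feedback
                cumulative_loss += loss_t
            return cumulative_loss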

  16. Online Learning
     Example: Online Linear Classification. On each round t,
     - The learner plays a separating hyperplane h_t = sign⟨w_t, ·⟩
     - Simultaneously, the environment plays a labeled example z_t = (x_t, y_t)
     - Then, the learner incurs the hinge loss ℓ(h_t, z_t) = max(0, 1 − y_t⟨x_t, w_t⟩)
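
     A small numerical sketch of this example (my addition, with arbitrary values; assumes numpy is available):

        import numpy as np

        def hinge_loss(w_t, x_t, y_t):
            # l(h_t, z_t) = max(0, 1 - y_t * <x_t, w_t>)
            return max(0.0, 1.0 - y_t * np.dot(x_t, w_t))

        w_t = np.array([0.5, -1.0])           # current hyperplane (learner's move)
        x_t, y_t = np.array([1.0, 0.2]), +1   # labeled example (environment's move)

        prediction = np.sign(np.dot(w_t, x_t))        # h_t = sign<w_t, .> applied to x_t
        print(prediction, hinge_loss(w_t, x_t, y_t))  # 1.0 and 0.7: correct sign, but a margin violation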

  19. Online Learning
     [Plot: a probability distribution h_t over the instances z_1, ..., z_8, with the played instance z_t highlighted.]
     Example: Online Density Estimation. On each round t,
     - The learner plays a probability distribution h_t over Z
     - Simultaneously, the environment plays an instance z_t
     - Then, the learner incurs the log loss ℓ(h_t, z_t) = −ln h_t(z_t)
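
     The corresponding sketch for density estimation (again my addition, with an arbitrary distribution and instance):

        import math

        # Log loss over a finite instance space Z = {z1, ..., z4} (illustrative values only).
        h_t = {"z1": 0.5, "z2": 0.25, "z3": 0.125, "z4": 0.125}  # learner's distribution over Z
        z_t = "z3"                                               # environment's move

        log_loss = -math.log(h_t[z_t])   # l(h_t, z_t) = -ln h_t(z_t)
        print(log_loss)                  # -ln(0.125) ≈ 2.079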

  22. Online Learning
     [Diagram: the same repeated game, with the environment's moves z_1, ..., z_T drawn from distributions D_1, ..., D_T.]
     Online learning can be applied to a wide range of tasks, ranging from statistical learning, where the environment is an oblivious player modelled by a fixed probability distribution D over Z, to adversarial learning, where the environment is an active player who changes its distribution at each iteration in response to the learner's moves.

  23. Online Learning
     In a nutshell, online learning is particularly suited to:
     - Adaptive environments, where the data distribution can change over time
     - Streaming applications, where all the data is not available in advance
     - Large-scale datasets, by processing only one instance at a time

  26. Online Learning
     The performance of an online learning algorithm A is measured according to two metrics:
     Minimax Regret. Defined by the maximum, over every sequence z_{1:T} = (z_1, ..., z_T) ∈ Z^T, of the cumulative relative loss between A and the best hypothesis in H:
        max_{z_{1:T} ∈ Z^T} [ Σ_{t=1}^T ℓ(h_t, z_t) − min_{h ∈ H} Σ_{t=1}^T ℓ(h, z_t) ]
     A is Hannan consistent if its minimax regret is sublinear in T.
     Per-round Complexity. Given by the amount of computational operations spent by A at each trial t, for choosing a hypothesis h_t in H, and evaluating its loss ℓ(h_t, z_t).
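
     The regret of a single run can be computed directly from its loss sequence; the minimax regret then takes the maximum of this quantity over all sequences z_{1:T}. A sketch for a finite hypothesis class (my own illustration; the toy 0-1 loss and data are made up):

        def regret(played, instances, hypotheses, loss):
            """Sum_t loss(h_t, z_t) - min over h in H of Sum_t loss(h, z_t)."""
            learner_loss = sum(loss(h_t, z_t) for h_t, z_t in zip(played, instances))
            best_fixed = min(sum(loss(h, z_t) for z_t in instances) for h in hypotheses)
            return learner_loss - best_fixed

        # Toy usage: H = {0, 1}, Z = {0, 1}, 0-1 loss.
        zero_one = lambda h, z: float(h != z)
        print(regret(played=[0, 0, 1], instances=[1, 1, 1],
                     hypotheses=[0, 1], loss=zero_one))   # 2.0 - 0.0 = 2.0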

  27. Outline
     1 Online Learning
     2 The Convex Case
     3 The Combinatorial Case
     4 Compiling Hedge

  28. The Convex Case
     [Plot: the hinge loss ℓ(h_t, z_t) = max(0, 1 − y_t⟨w_t, x_t⟩) (in blue) as a function of y_t⟨w_t, x_t⟩, for H = {w ∈ R^n : ‖w‖ ≤ b}.]
     Online Convex Learning. An online learning problem (H, Z, ℓ) is convex if:
     - H is a closed convex subset of R^d
     - ℓ is convex in its first argument, i.e. ℓ(·, z) is convex for all z ∈ Z
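
     For instance, the online linear classification example above is convex: the ball H = {w : ‖w‖ ≤ b} is closed and convex, and the hinge loss is a maximum of two affine functions of w, hence convex in its first argument. A quick numerical check of the convexity inequality along random segments (my addition; assumes numpy):

        import numpy as np

        def hinge(w, x, y):
            return max(0.0, 1.0 - y * np.dot(x, w))

        rng = np.random.default_rng(0)
        x, y = rng.normal(size=3), 1
        for _ in range(1000):
            w1, w2 = rng.normal(size=3), rng.normal(size=3)
            a = rng.uniform()
            lhs = hinge(a * w1 + (1 - a) * w2, x, y)
            rhs = a * hinge(w1, x, y) + (1 - a) * hinge(w2, x, y)
            assert lhs <= rhs + 1e-9   # l(a*w1 + (1-a)*w2, z) <= a*l(w1, z) + (1-a)*l(w2, z)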

  30. The Convex Case
     Online Gradient Descent (OGD). Start with the vector w_1 = 0. During each round t:
     - Play with w_t
     - Observe z_t and incur loss ℓ(w_t, z_t)
     - Update the hypothesis as follows:
        w_{t+1} = argmin_{w ∈ H} ‖w − w′_t‖², where w′_t = w_t − η_t ∇ℓ(w_t, z_t)
     The regret of OGD with respect to any w* ∈ H is bounded by
        ‖w*‖²/η_T + (1/2) Σ_{t=1}^T η_t ‖∇ℓ(w_t, z_t)‖²
     Thus, if ℓ is L-Lipschitz and H is D-bounded, then using η_t = D/(L√t), OGD is Hannan consistent with regret 2DL√T.
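
     A compact sketch of OGD for the hinge-loss example with the ball constraint, following the update rule above (my own illustration; the data, radius b, diameter D, and Lipschitz bound L are made-up toy values):

        import numpy as np

        def project_ball(w, b):
            """Euclidean projection onto H = {w : ||w|| <= b}."""
            norm = np.linalg.norm(w)
            return w if norm <= b else (b / norm) * w

        def hinge_subgradient(w, x, y):
            """A subgradient of w -> max(0, 1 - y<x, w>)."""
            return -y * x if 1.0 - y * np.dot(x, w) > 0 else np.zeros_like(w)

        def ogd(stream, b, D, L):
            """Run OGD on a stream of labeled examples (x_t, y_t); return the final iterate and cumulative hinge loss."""
            w = np.zeros(len(stream[0][0]))                      # w_1 = 0
            total_loss = 0.0
            for t, (x, y) in enumerate(stream, start=1):
                total_loss += max(0.0, 1.0 - y * np.dot(x, w))   # play w_t, incur the hinge loss
                eta = D / (L * np.sqrt(t))                       # step size eta_t = D / (L sqrt(t))
                w = project_ball(w - eta * hinge_subgradient(w, x, y), b)  # gradient step, then projection onto H
            return w, total_loss

        # Toy usage: a linearly separable stream, ball radius b = 1 (so D = 2); L upper-bounds ||x_t||.
        stream = [(np.array([1.0, 0.5]), +1), (np.array([-0.8, -1.0]), -1)] * 5
        w_T, loss = ogd(stream, b=1.0, D=2.0, L=2.0)
        print(w_T, loss)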

  31. Outline
     1 Online Learning
     2 The Convex Case
     3 The Combinatorial Case
     4 Compiling Hedge
