

  1. Stochastic Hamiltonian Gradient Methods for Smooth Games
Nicolas Loizou, joint work with Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent†, Simon Lacoste-Julien†, and Ioannis Mitliagkas†. ICML 2020.
† Canada CIFAR AI Chair

  2. Overview
1. Min-Max Optimization Problem: Motivation; Related Work; Main Contributions
2. Classes of Stochastic Games and Hamiltonian Viewpoint
3. Stochastic Hamiltonian Gradient Methods: Stochastic Hamiltonian Gradient Descent; Stochastic Variance Reduced Hamiltonian Gradient Method; Convergence Guarantees
4. Numerical Experiments
5. Conclusion & Future Directions of Research

  3. The Min-Max Optimization Problem
Problem: stochastic smooth game,
    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} g_i(x_1, x_2),    (1)
where g : \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \to \mathbb{R} is a smooth objective.
Goal: find a min-max solution / Nash equilibrium, i.e. a point x^* = (x_1^*, x_2^*) \in \mathbb{R}^d such that, for every x_1 \in \mathbb{R}^{d_1} and x_2 \in \mathbb{R}^{d_2},
    g(x_1^*, x_2) \le g(x_1^*, x_2^*) \le g(x_1, x_2^*).
Appears in many applications: domain generalization (Albuquerque et al., 2019), generative adversarial networks (GANs) (Goodfellow et al., 2014), and formulations in reinforcement learning (Pfau & Vinyals, 2016).
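
A minimal illustrative example (not from the slides): the unconstrained scalar bilinear game \min_{x_1 \in \mathbb{R}} \max_{x_2 \in \mathbb{R}} g(x_1, x_2) = x_1 x_2 has the unique Nash equilibrium x^* = (0, 0), since g(0, x_2) = 0 \le g(0, 0) = 0 \le g(x_1, 0) = 0 for all x_1, x_2. Plain simultaneous gradient descent-ascent is known to cycle or diverge on such games, which is part of the motivation for the Hamiltonian viewpoint below.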

  4. Related Work
Deterministic games: last-iterate convergence guarantees. Classic results (Korpelevich, 1976; Nemirovski, 2004) and recent results (Mescheder et al., 2017; Daskalakis et al., 2017; Gidel et al., 2018b; Azizian et al., 2019).
Stochastic games: convergent methods rely on iterate averaging over compact domains (Nemirovski, 2004). Palaniappan & Bach (2016) and Chavdarova et al. (2019) proposed methods with last-iterate convergence guarantees over a non-compact domain, but under a strong-monotonicity assumption.
Second-order methods: the consensus optimization method (Mescheder et al., 2017) and Hamiltonian gradient descent (Balduzzi et al., 2018; Abernethy et al., 2019). No analysis is available for the stochastic problem.

  5. Main Contributions
1. First global non-asymptotic last-iterate convergence guarantees in the stochastic setting (without assuming strong monotonicity or a bounded domain), including a class of non-convex non-concave games.
2. First convergence analysis of stochastic Hamiltonian methods for solving min-max problems; existing papers on these methods are purely empirical (Mescheder et al., 2017; Balduzzi et al., 2018).
3. A novel unbiased estimator of the Hamiltonian gradient, the crucial ingredient for proving convergence of the proposed methods (existing methods use biased estimators).
4. First stochastic Hamiltonian variance-reduced method, with linear convergence guarantees.
Hamiltonian perspective: popular stochastic optimization algorithms can be used as methods for solving stochastic min-max problems.

  6. Smooth Games and Hamiltonian Gradient Descent
    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2)    (2)
For x = (x_1, x_2)^\top \in \mathbb{R}^d, define the game vector field and its Jacobian,
    \xi(x) = \begin{pmatrix} \nabla_{x_1} g \\ -\nabla_{x_2} g \end{pmatrix},
    \qquad
    J = \nabla \xi = \begin{pmatrix} \nabla^2_{x_1, x_1} g & \nabla^2_{x_1, x_2} g \\ -\nabla^2_{x_2, x_1} g & -\nabla^2_{x_2, x_2} g \end{pmatrix}.
A vector x^* \in \mathbb{R}^d is a stationary point when \xi(x^*) = 0.
Key assumption: all stationary points of the objective g are global min-max solutions.
Hamiltonian Gradient Descent (HGD) (Balduzzi et al., 2018):
    \min_x H(x) = \tfrac{1}{2} \| \xi(x) \|^2.    (3)
Each HGD step can be expressed using a Jacobian-vector product:
    x_{k+1} = x_k - \eta_k \nabla H(x_k) = x_k - \eta_k \left[ J^\top \xi \right].
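
To make the HGD update concrete, here is a minimal sketch (not from the slides) in JAX for a toy deterministic bilinear game; the payoff matrix A, the step size eta, and the iteration count are illustrative assumptions, and \nabla H(x) = J^\top \xi(x) is computed with a vector-Jacobian product via jax.vjp.

# Minimal Hamiltonian Gradient Descent (HGD) sketch in JAX (illustrative only).
# Toy deterministic bilinear game g(x1, x2) = x1^T A x2; A, eta, and the loop length are assumptions.
import jax
import jax.numpy as jnp

A = jnp.array([[1.0, 2.0], [0.0, 1.0]])  # toy payoff matrix (assumed)

def xi(x):
    # Game vector field xi(x) = (grad_{x1} g, -grad_{x2} g) for g = x1^T A x2.
    x1, x2 = x[:2], x[2:]
    return jnp.concatenate([A @ x2, -(A.T @ x1)])

def hamiltonian(x):
    # H(x) = 1/2 ||xi(x)||^2.
    return 0.5 * jnp.sum(xi(x) ** 2)

def hgd_step(x, eta=0.1):
    # grad H(x) = J(x)^T xi(x): a (transposed) Jacobian-vector product, computed with jax.vjp.
    xi_x, vjp_fn = jax.vjp(xi, x)
    return x - eta * vjp_fn(xi_x)[0]

x = jnp.array([1.0, -1.0, 0.5, 2.0])  # x = (x1, x2), stacked
for _ in range(200):
    x = hgd_step(x)
print(x, hamiltonian(x))  # the iterate approaches the stationary point xi(x) = 0

Using jax.vjp avoids forming the d-by-d Jacobian explicitly, mirroring the Jacobian-vector-product form of the update above.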

  7. Stochastic Hamiltonian Function
    \min_{x_1 \in \mathbb{R}^{d_1}} \max_{x_2 \in \mathbb{R}^{d_2}} g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} g_i(x_1, x_2)    (4)
    \xi_i(x) = \begin{pmatrix} \nabla_{x_1} g_i \\ -\nabla_{x_2} g_i \end{pmatrix},
    \qquad
    J_i = \begin{pmatrix} \nabla^2_{x_1, x_1} g_i & \nabla^2_{x_1, x_2} g_i \\ -\nabla^2_{x_2, x_1} g_i & -\nabla^2_{x_2, x_2} g_i \end{pmatrix},
    \qquad
    J = \frac{1}{n} \sum_{i=1}^{n} J_i.
Finite-sum structure of the Hamiltonian function:
    H(x) = \frac{1}{n^2} \sum_{i, j = 1}^{n} H_{i,j}(x), \quad \text{where} \quad H_{i,j}(x) = \tfrac{1}{2} \langle \xi_i(x), \xi_j(x) \rangle.    (5)
The algorithms use the gradient of only one component function H_{i,j}(x):
    \nabla H_{i,j}(x) = \tfrac{1}{2} \left[ J_i^\top \xi_j + J_j^\top \xi_i \right].    (6)
This is an unbiased estimator of \nabla H(x), that is, \mathbb{E}_{i,j} [ \nabla H_{i,j}(x) ] = \nabla H(x).
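
A hedged sketch of stochastic Hamiltonian gradient descent built on the unbiased estimator (6); the per-sample bilinear games g_i, their random data (A_i, b_i, c_i), and the constant step size are illustrative assumptions rather than the paper's experimental setup.

# Stochastic Hamiltonian gradient descent with the unbiased estimator of Eq. (6) (sketch only).
# Assumed toy data: per-sample bilinear games g_i(x1, x2) = x1^T b_i + x1^T A_i x2 + c_i^T x2.
import jax
import jax.numpy as jnp

n, d = 8, 2
A = jax.random.normal(jax.random.PRNGKey(0), (n, d, d))
b = jax.random.normal(jax.random.PRNGKey(1), (n, d))
c = jax.random.normal(jax.random.PRNGKey(2), (n, d))

def xi_i(x, i):
    # xi_i(x) = (grad_{x1} g_i, -grad_{x2} g_i) for the i-th bilinear component.
    x1, x2 = x[:d], x[d:]
    return jnp.concatenate([b[i] + A[i] @ x2, -(A[i].T @ x1 + c[i])])

def full_hamiltonian(x):
    # H(x) = 1/2 ||(1/n) sum_i xi_i(x)||^2, used only to monitor progress.
    xi_bar = jnp.mean(jnp.stack([xi_i(x, i) for i in range(n)]), axis=0)
    return 0.5 * jnp.sum(xi_bar ** 2)

def grad_H_ij(x, i, j):
    # Unbiased estimator: grad H_{i,j}(x) = 1/2 (J_i^T xi_j + J_j^T xi_i), via two vjp calls.
    xi_i_val, vjp_i = jax.vjp(lambda z: xi_i(z, i), x)  # vjp_i(v) = J_i(x)^T v
    xi_j_val, vjp_j = jax.vjp(lambda z: xi_i(z, j), x)  # vjp_j(v) = J_j(x)^T v
    return 0.5 * (vjp_i(xi_j_val)[0] + vjp_j(xi_i_val)[0])

x, eta, rng = jnp.zeros(2 * d), 0.05, jax.random.PRNGKey(3)
for k in range(2000):
    rng, sub = jax.random.split(rng)
    i, j = jax.random.randint(sub, (2,), 0, n)  # sample i, j independently and uniformly
    x = x - eta * grad_H_ij(x, i, j)
print(full_hamiltonian(x))

Sampling i and j independently is what makes (6) unbiased for the double sum in (5); reusing a single index would generally introduce bias, since \mathbb{E}_i [ J_i^\top \xi_i ] \neq J^\top \xi in general.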

  8. Classes of Stochastic Smooth Games
Stochastic bilinear games:
    g(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} \left[ x_1^\top b_i + x_1^\top A_i x_2 + c_i^\top x_2 \right].    (7)
Proposition: for a stochastic bilinear game (7), the stochastic Hamiltonian function (5) is a smooth, quadratic, quasi-strongly convex function.
Stochastic sufficiently bilinear games (Abernethy et al., 2019): games for which
    (\delta^2 + \rho^2)(\delta^2 + \beta^2) - 4 L^2 \Delta^2 > 0,    (8)
where 0 < \delta \le \sigma_i \left[ \nabla^2_{x_1, x_2} g \right] \le \Delta, \quad \rho^2 = \min_{x_1, x_2} \lambda_{\min} \left[ \nabla^2_{x_1, x_1} g(x_1, x_2) \right]^2, \quad \beta^2 = \min_{x_1, x_2} \lambda_{\min} \left[ \nabla^2_{x_2, x_2} g(x_1, x_2) \right]^2.
Proposition: for a stochastic sufficiently bilinear game, the stochastic Hamiltonian function (5) is smooth and satisfies the PL condition.
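
A short worked check (my own derivation, not on the slides) behind the first proposition: for each bilinear component in (7),
    \nabla_{x_1} g_i = b_i + A_i x_2, \qquad \nabla_{x_2} g_i = A_i^\top x_1 + c_i,
so every \xi_i(x) = \begin{pmatrix} b_i + A_i x_2 \\ -(A_i^\top x_1 + c_i) \end{pmatrix} is affine in x, and so is \xi(x) = \frac{1}{n} \sum_i \xi_i(x). Hence H(x) = \tfrac{1}{2} \| \xi(x) \|^2 is a quadratic function of x with constant positive semi-definite Hessian J^\top J, which is consistent with the stated smoothness, quadratic structure, and quasi-strong convexity.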
