  1. Learning Step Size Controllers for Robust Neural Network Training Christian Daniel et al. Recent Trends in Automated Machine Learning Abeeha Shafiq 18.07.2019

  2. Motivation
  • Optimizers are sensitive to the initial learning rate
  • A good learning rate is problem-specific
  • Manual search is required
  Image taken from an I2DL lecture slide
  Abeeha Shafiq | Recent Trends in Automated Machine Learning

  3. Previous Work
  • Waterfall scheme
  • Exponential/power scheme
  • TONGA

  4. Goal
  Use Reinforcement Learning to develop an adaptive controller for the learning rate of training algorithms such as Stochastic Gradient Descent (SGD)

  5. Contributions
  • Identifying informative features for the controller
  • Proposing a learning setup for the controller
  • Showing that the resulting controller generalizes across different tasks and architectures

  6. Problem statement for the controller
  • Find the minimizer ω* = argmin_ω F(ω), where F(ω) = Σᵢ f(ω; xᵢ) sums over the function values induced by the individual inputs
  • T(·) is an optimization operator which yields a weight update vector used to find ω*
  • SGD weight update: ω_{t+1} = ω_t − η ∇F(ω_t)
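The update on this slide can be sketched as a plain SGD loop in which the step size is supplied by a controller at every iteration; the `controller` stub and the toy quadratic objective below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sgd_step(w, grad, lr):
    """SGD weight update: w_{t+1} = w_t - lr * grad F(w_t)."""
    return w - lr * grad

def controller(step):
    """Stub for the step size controller; a learned policy would map
    state features to a step size instead of returning a constant."""
    return 0.1

# Toy quadratic F(w) = 0.5 * ||w||^2, so grad F(w) = w.
w = np.array([2.0, -1.0])
for t in range(10):
    w = sgd_step(w, grad=w, lr=controller(t))
```

Each iteration shrinks `w` by a factor of (1 − lr), so the iterates contract toward the minimizer ω* = 0.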

  7. Learning a Controller
  • Relative Entropy Policy Search (REPS)
  • Concept similar to Proximal Policy Optimization
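The core of a REPS-style update can be sketched for a one-dimensional Gaussian policy: sample parameters, weight each sample by the exponentiated reward, and refit the Gaussian to the weighted samples. In full REPS the temperature η comes from minimizing a dual under a KL bound; fixing it below is a deliberate simplification, and the reward function is a toy stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def reps_update(mean, std, reward_fn, n_samples=100, eta=1.0):
    """One simplified REPS-style update for a 1-D Gaussian policy.
    Weights samples by exp(R / eta) and refits the Gaussian.
    (Full REPS obtains eta from a dual problem; fixed here for brevity.)"""
    thetas = rng.normal(mean, std, n_samples)
    rewards = np.array([reward_fn(t) for t in thetas])
    w = np.exp((rewards - rewards.max()) / eta)  # shift for numerical stability
    w /= w.sum()
    new_mean = np.sum(w * thetas)
    new_std = np.sqrt(np.sum(w * (thetas - new_mean) ** 2)) + 1e-6
    return new_mean, new_std

# Toy reward peaking at theta = 0.5 (e.g. a good log step size).
reward = lambda t: -(t - 0.5) ** 2
mean, std = 0.0, 1.0
for _ in range(20):
    mean, std = reps_update(mean, std, reward)
```

The exponential weighting keeps the new policy close to the old one, which is the same trust-region idea that motivates the comparison to Proximal Policy Optimization.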

  8. Features
  • Informative about the current state
  • Generalize across different tasks and architectures
  • Constrained by computation and memory limits

  9. Features
  • Predictive change in function value
  • Disagreement of function values
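A minimal sketch of these two features, under the assumption that the predictive change is a first-order Taylor estimate of the loss change for an SGD step and the disagreement is the spread of per-example losses in the mini-batch; the function name and exact definitions are illustrative, not the paper's.

```python
import numpy as np

def step_size_features(per_example_losses, grad, lr):
    """Two candidate controller features (illustrative definitions).
    - predicted_change: first-order estimate of the loss change for an
      SGD step, f(w - lr*g) - f(w) ≈ -lr * ||g||^2
    - disagreement: variance of per-example losses in the mini-batch
    """
    predicted_change = -lr * np.dot(grad, grad)
    disagreement = np.var(per_example_losses)
    return predicted_change, disagreement

losses = np.array([0.9, 1.1, 1.0, 1.2])
g = np.array([0.5, -0.3])
pc, dis = step_size_features(losses, g, lr=0.1)
```

Both quantities are cheap to compute from values the optimizer already has, which matches the computation and memory constraints on the previous slide.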

  10. Mini-Batch Setting
  • Discounted average: smooths outliers and serves as memory
  • Uncertainty estimate: estimate of the noise in the system
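The discounted statistics on this slide can be sketched as an exponential running mean and variance of a noisy feature; the class name and the discount factor are illustrative choices, not taken from the paper.

```python
class DiscountedStats:
    """Discounted (exponential) running mean and variance of a feature.
    The mean smooths outliers and acts as memory; the variance serves
    as an uncertainty estimate of the noise in the signal."""
    def __init__(self, gamma=0.9):
        self.gamma = gamma  # discount factor (illustrative choice)
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        # Blend the new observation into the running statistics.
        self.mean = self.gamma * self.mean + (1 - self.gamma) * x
        self.var = self.gamma * self.var + (1 - self.gamma) * (x - self.mean) ** 2
        return self.mean, self.var

stats = DiscountedStats()
for x in [1.0, 1.1, 0.9, 5.0, 1.0]:  # one outlier at 5.0
    m, v = stats.update(x)
```

The outlier moves the discounted mean only modestly while inflating the variance, which is exactly the behavior a noise-aware controller needs.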

  11. Experimental Setup
  • Datasets: MNIST, CIFAR-10
  • Learning algorithms: SGD and RMSProp
  • Model: CNN
  • For learning the controller parameters:
  • Subset of MNIST
  • Small CNN architecture
  • π(θ) set to a Gaussian with isotropic covariance

  12. Results
  • Overhead of 36% for controller training
  • Generalized to different CNN variants
  • Did not generalize to different training methods

  13. Static RMSProp vs. Controlled RMSProp

  14. Static SGD vs. Controlled SGD

  15. Discussion
  • Strengths:
  • Informative features
  • Not sensitive to the initial learning rate
  • Effort to generalize
  • Weaknesses:
  • Tested on only 2 datasets
  • CNNs only
  • Lacks comparison with learning rate decay techniques and with grid search for the initial learning rate
  This approach predates work on learning the complete optimizer

  16. Questions?
