Using Learning Dynamics to Understand Neural Network Generalisation

Arnu Pretorius ∗
Supervisor: Dr. Steve Kroon (Computer Science ∗)
Co-supervisor: Dr. Herman Kamper (EE Engineering +)
20 October 2017
Maties Machine Learning, Stellenbosch University

The success of Deep Learning
[Figure: Scale of deep learning deployment at Google over time.]
Large-scale deep learning for intelligent computer systems, Dean. BayLearn keynote speech, 2015.

The mysteries of Deep Learning
• Generalisation
• The role of explicit versus implicit regularisation
• Non-convex optimisation
• Adversarial examples
• and more ...
Understanding deep learning requires rethinking generalization, Zhang, Bengio, Hardt, Recht, Vinyals. ICLR, 2017.
On large-batch training for deep learning: generalization gap and sharp minima, Keskar, Mudigere, Smelyanskiy, Tang. ICLR, 2017.
Intriguing Properties of Neural Networks, Szegedy, Zaremba, Sutskever, Bruna, Erhan, Goodfellow, Fergus. ICLR, 2014.

Approaches Towards Understanding Neural Networks
[Figure: Approaches towards understanding neural networks.]
[1] Understanding the difficulty of training deep feedforward neural networks, Glorot, Bengio. AISTATS, 2010.
[2] Sharp minima can generalize for deep nets, Dinh, Pascanu, Bengio, Bengio. arXiv, 2017.
[3] Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, Saxe, McClelland, Ganguli. ICLR, 2014.
[4] On the Expressive Power of Deep Neural Networks, Raghu, Poole, Kleinberg, Ganguli, Dickstein. ICML, 2017.

Scalar linear neural network
Let $w_2, w_1, x \in \mathbb{R}$,
\[ \hat{y} = w_2 w_1 x \tag{1} \]
[Figure: Scalar linear neural network.]

Scalar learning dynamics
[Figure: Loss surface $(1 - w_2 w_1)^2$, GD path (orange), learning dynamics (green).]

Exact solutions for scalar learning dynamics
The learning dynamics for a scalar linear neural network starting from small initial values $u_0 \equiv w_2(0) w_1(0)$ is given by
\[ u_f(t) = \frac{yE}{x(E - 1) + y/u_0}, \tag{2} \]
where $E = e^{2yxt\alpha}$, $\alpha$ is the learning rate and $t$ is measured in epochs.
[Figure: Simulated versus theoretical learning dynamics.]
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, Saxe, McClelland, Ganguli. ICLR, 2014.
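The comparison between simulated and theoretical dynamics can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the slides: it assumes a loss of ½(y − w₂w₁x)², balanced initial weights w₁(0) = w₂(0), and hypothetical values for x, y, the learning rate and the initial weights. The function names `theory` and `simulate` are made up for this sketch.

```python
import math

def theory(t, x, y, u0, alpha):
    """Equation (2): closed-form mode strength u_f(t) after t epochs."""
    E = math.exp(2.0 * y * x * t * alpha)
    return y * E / (x * (E - 1.0) + y / u0)

def simulate(x, y, w1, w2, alpha, epochs):
    """Plain gradient descent on the assumed loss 0.5 * (y - w2*w1*x)**2.

    Returns the product w2*w1 before training and after every epoch.
    """
    path = [w2 * w1]
    for _ in range(epochs):
        err = y - w2 * w1 * x
        # Both updates use the weights from the start of the epoch.
        step1 = alpha * w2 * x * err
        step2 = alpha * w1 * x * err
        w1, w2 = w1 + step1, w2 + step2
        path.append(w2 * w1)
    return path

# Hypothetical settings, chosen only for illustration.
x, y = 1.0, 1.0
alpha = 0.01
w_init = 0.1                 # w1(0) = w2(0), so u0 = 0.01
epochs = 1000

simulated = simulate(x, y, w_init, w_init, alpha, epochs)
predicted = [theory(t, x, y, w_init * w_init, alpha) for t in range(epochs + 1)]

# Largest gap between the discrete simulation and the continuous-time theory.
max_gap = max(abs(s - p) for s, p in zip(simulated, predicted))
print("largest |simulated - theory| over training:", max_gap)
print("final simulated mode strength:", simulated[-1])
```

With a small learning rate the discrete updates stay close to the continuous-time solution, and both curves settle near y/x, the value at which the loss vanishes.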

Learning dynamics for deep linear networks
[Figure: Linear neural network learning dynamics.]
Learning hierarchical category structure in deep neural networks, Saxe, McClelland, Ganguli. CogSci, 2013.

Our current focus
• The role of regularisation
  – Weight decay
    \[ u(t, s, u_0, \lambda^*) = \frac{(s - \lambda^*)\, e^{2(s - \lambda^*) t/\tau}}{e^{2(s - \lambda^*) t/\tau} - 1 + (s - \lambda^*)/u_0} \]
    [Figure: Mode strength against epoch (t). Legend: No regularisation, Simulated, Theory.]
  – Dropout
    \[ \mathbb{E}_{d \sim \mathrm{Bern}(p)}[t] \le \frac{\tau}{sp} \ln\left(\frac{s}{\epsilon}\right) = O\left(\frac{\tau}{sp}\right) \]
    [Figure: Mode strength against epoch (t). Legend: No regularisation, Dropout.]
• Autoencoder networks
• Generalisation
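To see what the weight decay expression implies, it can be evaluated directly. The sketch below only evaluates the closed-form formula and does not simulate training. The function name `mode_strength` and the values chosen for s, u₀, λ* and τ are hypothetical, and it assumes s > λ* so that the effective strength s − λ* is positive.

```python
import math

def mode_strength(t, s, u0, lam=0.0, tau=100.0):
    """Closed-form mode strength u(t, s, u0, lam) under weight decay.

    lam stands for lambda* on the slide. Assumes s > lam.
    Setting lam = 0 gives the unregularised curve.
    """
    s_eff = s - lam                      # weight decay shifts s to s - lam
    E = math.exp(2.0 * s_eff * t / tau)
    return s_eff * E / (E - 1.0 + s_eff / u0)

# Hypothetical values, chosen only for illustration.
s, u0, lam, tau = 3.0, 0.01, 0.5, 100.0

for t in (0, 100, 200, 400, 1000):
    plain = mode_strength(t, s, u0, 0.0, tau)
    decayed = mode_strength(t, s, u0, lam, tau)
    print(t, round(plain, 4), round(decayed, 4))
```

Under these assumptions the curve starts at u₀ and saturates at s − λ* rather than s, so weight decay both lowers the final mode strength and slows the transition.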

Summary
• Deep learning has been hugely successful in solving large and complex machine learning tasks, but many mysteries remain.
• A better understanding of deep neural networks might be reached via several different routes.
• Studying the learning dynamics of neural networks may help us understand how they learn.
• We hope to use this learning dynamics approach to study generalisation in deep neural networks.

Thank you for listening!
