Deep Random Neural Field

  1. Deep Learning and Physics -- 2019. Deep Random Neural Field. Shun-ichi Amari, RIKEN Center for Brain Science; Araya.

  2. Brief History of AI and NN. First Boom: starts 1956~. AI (Dartmouth Conf.): symbols, universal computation, logic. Neural networks (perceptron): learning machines. Dark period (late 1960s~1970s); stochastic gradient descent learning for MLP (1967).

  3. Perceptron. F. Rosenblatt, Principles of Neurodynamics, 1961. Input $x$ → output $z$; McCulloch-Pitts neurons ({0,1} binary); learning; multilayer, with lateral & feedback connections.

  4. Rosenblatt: multilayer perceptron → Deep Neural Networks. Output $z = f(x, W)$; loss $L(W, x) = (y - f(W, x))^2$; learning of hidden neurons by $w \to w + \Delta w$, $\Delta w = -c\,\partial L(W, x)/\partial W$. Differentiable analog neurons make stochastic gradient learning possible. Amari, Tsypkin, 1966~67; error back-prop, 1976.
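
The update rule above is plain stochastic gradient descent. A minimal NumPy sketch, assuming a one-hidden-layer network with differentiable tanh ("analog") neurons and squared loss; the sizes, seed, and learning rate are illustrative, not from the slides:

```python
# Minimal sketch (not Amari's original code): one SGD step for a one-hidden-layer
# network z = f(x, W) with analog (tanh) neurons and squared loss L = (y - z)^2.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W1 = rng.normal(0, 1 / np.sqrt(n_in), (n_hid, n_in))   # hidden weights
w2 = rng.normal(0, 1 / np.sqrt(n_hid), n_hid)          # output weights
c = 0.1                                                # learning rate

def sgd_step(x, y):
    global W1, w2
    h = np.tanh(W1 @ x)                 # differentiable analog neurons
    z = w2 @ h                          # network output f(x, W)
    err = z - y
    grad_w2 = 2 * err * h                               # ∂L/∂w2
    grad_W1 = 2 * err * np.outer(w2 * (1 - h**2), x)    # backpropagated ∂L/∂W1
    w2 -= c * grad_w2                   # Δw = -c ∂L/∂w
    W1 -= c * grad_W1

x, y = rng.normal(size=n_in), 1.0
sgd_step(x, y)
```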

  5. First stochastic descent learning of MLP (1967; 1968). Shun-ichi Amari, Information Theory II -- Geometrical Theory of Information, University of Tokyo; Kyoritu Press, Tokyo, 1968.

  6. $f(x, \theta) = v_1 \max\{w_1 \cdot x,\ w_2 \cdot x\} + v_2 \min\{w_3 \cdot x,\ w_4 \cdot x\}$. [Figure: network with input $x$, max/min hidden gates weighted by $w_1, \dots, w_4$, and output $y$ combined through $v_1, v_2$.]
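
This min/max unit translates directly into code. A small sketch; the concrete weight values below are arbitrary:

```python
# Direct transcription (a sketch) of the min/max unit above; weights are arbitrary.
import numpy as np

def minmax_unit(x, w1, w2, w3, w4, v1, v2):
    """f(x, θ) = v1·max{w1·x, w2·x} + v2·min{w3·x, w4·x}."""
    return v1 * max(w1 @ x, w2 @ x) + v2 * min(w3 @ x, w4 @ x)

x = np.array([1.0, -0.5])
ws = [np.random.default_rng(i).normal(size=2) for i in range(4)]
print(minmax_unit(x, *ws, v1=1.0, v2=-1.0))
```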

  7. Second Boom: AI from 1970~, neural networks from 1980~. AI: expert systems (MYCIN), stochastic inference (Bayes), chess (1997). NN: MLP (backprop), associative memory.

  8. Third Boom 2010~: Deep learning. Stochastic inference (graphical models; Bayesian; WATSON). Deep learning for pattern recognition: vision, audition, sentence analysis, machine translation; AlphaGo. Language processing, sequences and dynamics (word2vec, deep learning with recurrent nets). Integration of (symbol, logic) vs (pattern, dynamics).

  9. Deep Learning = Self-Organization + Supervised Learning. RBM: Restricted Boltzmann Machine; Auto-Encoder; Recurrent Net; Dropout; Contrastive divergence; Convolution; ResNet; ReLU; Adversarial net.

  10. Victory of Deep Neural Networks. Hinton 2005, 2006 ~ 2012, and many others: visual patterns, auditory patterns, the game of Go, sentence analysis, machine translation, adversarial networks, pattern generation.

  11. Mathematical Neuroscience searches for the principles: mathematical studies using simple, idealized models (not realistic). Compare: computational neuroscience; AI as technological realization.

  12. Mathematical Neuroscience and the Brain. The brain has found and implemented the principles through evolution (random search), under historical and material restrictions. Very complex (not smartly designed).

  13. Theoretical Problems of Learning: 1. Local vs global solutions of the loss $L(\Theta)$ over parameters $\Theta$; simulated annealing, quantum annealing.

  14. Theoretical Problems of Learning: 2. Model $y = f(x, \theta) + \varepsilon$. Training loss $L_{\mathrm{emp}} = \frac{1}{N}\sum_i |y_i - f(x_i, \theta)|^2$ vs generalization loss $L_{\mathrm{gen}} = E_P[|y - f(x, \theta)|^2]$: overtraining. $L_{\mathrm{gen}} \approx L_{\mathrm{emp}} + O(1/N)$.
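
A toy illustration of the $L_{\mathrm{emp}}$ vs $L_{\mathrm{gen}}$ gap, using linear regression on synthetic Gaussian data (the model, noise level, and sample sizes are our choices, not from the slides):

```python
# A sketch illustrating L_emp vs L_gen for linear regression on y = f(x, θ*) + ε.
import numpy as np

rng = np.random.default_rng(0)
P, N, N_test = 10, 50, 10_000
theta_star = rng.normal(size=P)

X = rng.normal(size=(N, P))
y = X @ theta_star + 0.3 * rng.normal(size=N)          # y = f(x, θ*) + ε
theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]       # minimize L_emp

L_emp = np.mean((y - X @ theta_hat) ** 2)              # training loss
Xt = rng.normal(size=(N_test, P))
yt = Xt @ theta_star + 0.3 * rng.normal(size=N_test)
L_gen = np.mean((yt - Xt @ theta_hat) ** 2)            # generalization loss
print(L_emp, L_gen)   # L_gen exceeds L_emp by an O(1/N) gap
```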

  15. Extremely wide networks, $P \to \infty$, $P \gg N$: local minimum = global minimum. Kawaguchi, 2019.

  16. Learning curve for $P \gg N$: double descent. Belkin et al., 2019; Hastie et al., 2019.

  17. Random Neural Networks. Random is excellent!! Random is magic!! Statistical dynamics; random codes.

  18. Random Deep Networks. Poole et al., 2016; Schoenholz et al., 2017; ... Signal propagation; error back-propagation.

  19. Jacot et al.: neural tangent kernel. $y = f(x, \theta)$; loss $l(x, \theta) = \frac{1}{2}(y - f(x, \theta))^2 = \frac{1}{2}\{f(x, \theta) - f(x, \theta^*)\}^2$; error $e(x, \theta) = f(x, \theta) - f(x, \theta^*)$. Gradient descent gives $\dot{\theta}_t = -\eta\,\partial_\theta l = -\eta\,\partial_\theta f(x')\,(f(x', \theta) - f(x', \theta^*))$, hence $\dot{f}(x, \theta_t) = \partial_\theta f(x) \cdot \dot{\theta}_t = -\eta\,\partial_\theta f(x) \cdot \partial_\theta f(x')\, e(x', \theta)$. $K$: Gaussian kernel.

  20. $K(x, x'; \theta) = \partial_\theta f(x) \cdot \partial_\theta f(x')$; $\dot{f}(x, \theta_t) = -\eta\, \langle K(x, x'; \theta)\, e(x', \theta) \rangle$. At initialization $K(x, x'; \theta_{\mathrm{init}}) \approx K(x, x')$, a Gaussian kernel, and $\theta_t \approx \theta_{\mathrm{init}}$.
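
A sketch of the empirical neural tangent kernel for a toy two-layer scalar network, with $\partial_\theta f$ approximated by finite differences; the architecture and sizes are illustrative assumptions:

```python
# Sketch: empirical NTK K(x, x'; θ) = ∂θ f(x)·∂θ f(x') for a tiny two-layer network,
# with gradients taken by central finite differences (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n = 100
theta = rng.normal(0, 1 / np.sqrt(n), size=2 * n)       # [w_in; w_out] flattened

def f(x, th):
    w_in, w_out = th[:n], th[n:]
    return w_out @ np.tanh(w_in * x)                    # scalar input, scalar output

def grad_f(x, th, eps=1e-5):
    g = np.zeros_like(th)
    for i in range(th.size):                            # finite-difference ∂θ f(x)
        d = np.zeros_like(th); d[i] = eps
        g[i] = (f(x, th + d) - f(x, th - d)) / (2 * eps)
    return g

K = lambda x, xp: grad_f(x, theta) @ grad_f(xp, theta)  # one NTK entry
print(K(0.3, 0.7), K(0.3, 0.3))
```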

  21. Theorem ($P \gg N$): the optimal solution lies near a random network. Bailey et al., 2019. $w_{ij} = O(1/\sqrt{n})$, $\Delta w_{ij} = O(1/n)$.

  22. Random Neural Field. $u^l(z') = \int w(z, z')\, x^{l-1}(z)\, dz + b(z')$, $x^l(z') = \varphi(u^l(z'))$; $w(z, z')$: random (0 mean Gaussian; correlated).
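
A discretized sketch of one layer of such a field, assuming a Gaussian correlation kernel over $z$ and a grid approximation of the integral (all concrete choices are ours):

```python
# Sketch of one layer of a random neural field: u(z') = ∫ w(z, z') x(z) dz + b(z'),
# with w a zero-mean Gaussian field correlated over z (discretized grid).
import numpy as np

rng = np.random.default_rng(3)
m = 200                                          # grid points on [0, 1]
z = np.linspace(0, 1, m)
dz = z[1] - z[0]

# Correlated Gaussian weights: color white noise with a Gaussian kernel in z.
corr = np.exp(-((z[:, None] - z[None, :]) ** 2) / (2 * 0.05 ** 2))
L = np.linalg.cholesky(corr + 1e-8 * np.eye(m))
W = L @ rng.normal(size=(m, m))                  # w(z, z'), correlated along z

x = np.sin(2 * np.pi * z)                        # input field x(z)
b = rng.normal(0, 0.1, m)
u = W.T @ x * dz + b                             # u(z') = ∫ w(z, z') x(z) dz + b(z')
x_next = np.tanh(u)                              # x(z') = φ(u(z'))
```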

  23. Statistical Neurodynamics. Microdynamics: $x(t+1) = \mathrm{sgn}(W x(t)) = T_W x(t)$. Macrodynamics: $X_{t+1} = F(X_t)$, $X$: macrostate. $X_2 = X(x_2) = X(T_W x_1)$; $X_3 = X(x_3) = X(T_W T_W x_1)$?
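
A numerical sketch of the micro-to-macro passage: iterate a threshold microdynamics and record one macrovariable, here the fraction of active neurons; the 0/1 neuron model, threshold, and weight scale are illustrative choices:

```python
# Sketch: microdynamics x_{t+1} = 1[W x_t > h] for a random W, tracking the
# macrostate X_t = fraction of active neurons, which follows X_{t+1} ≈ F(X_t).
import numpy as np

rng = np.random.default_rng(0)
n, h = 2000, 0.2
W = rng.normal(0, 1 / np.sqrt(n), (n, n))           # random connections
x = (rng.random(n) < 0.5).astype(float)             # random initial 0/1 state

for t in range(8):
    x = (W @ x > h).astype(float)                   # microdynamics on n neurons
    print(t, x.mean())                              # macrostate X_t settles to a fixed point
```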

  24. Statistical Neurodynamics: Rozonoer (1969); Amari (1969, 1971, 1973); Sompolinsky; Amari et al. (2013); Toyoizumi et al. (2015); Poole, ..., Ganguli (2016); Schoenholz et al. (2017); Yang & Schoenholz (2017); Karakida et al. (2019); Jacot et al. (2019); ... With $w_{ij} \sim N(0, 1)$: macroscopic behaviors common to almost all (typical) networks.

  25. Random Deep Networks. $x_i^{l+1} = \varphi\big(\sum_j w_{ij}^l x_j^l + w_{0i}^l\big)$, with $w_{ij}^l \sim N(0, \sigma_w^2/n)$ and bias $w_{0i}^l = b_i \sim N(0, \sigma_b^2)$. Activity $A^l = \frac{1}{n}\sum_i (x_i^l)^2$ obeys $A^{l+1} = F(A^l)$.
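
A Monte Carlo sketch of this layer map and the activity recursion $A^{l+1} = F(A^l)$, taking $\varphi = \tanh$ and illustrative values for $n$, $\sigma_w$, $\sigma_b$:

```python
# Sketch: propagate a signal through a random deep net and record the activity
# A^l = (1/n) Σ (x_i^l)²; σ_w, σ_b follow the slide's setup, values are ours.
import numpy as np

rng = np.random.default_rng(0)
n, depth, sigma_w, sigma_b = 1000, 10, 1.5, 0.1
x = rng.normal(size=n)

for l in range(depth):
    W = rng.normal(0, sigma_w / np.sqrt(n), (n, n))     # w_ij ~ N(0, σ_w²/n)
    b = rng.normal(0, sigma_b, n)                       # bias ~ N(0, σ_b²)
    x = np.tanh(W @ x + b)                              # x^{l+1} = φ(W x^l + b)
    print(l, np.mean(x ** 2))                           # A^{l+1} = F(A^l)
```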

  26. Macroscopic variables. Activity: $A = \frac{1}{n}\sum x_i^2$; distance: $D = D[x : x']$; metric, curvature & Fisher information. $A^{l+1} = F(A^l)$, $D^{l+1} = K(D^l)$.

  27. Dynamics of Activity: law of large numbers. $\tilde{x}_i = \varphi\big(\sum_k w_{ik} x_k + b_i\big) = \varphi(u_i)$, i.e. $\tilde{x} = \varphi(Wx + b)$, with $u_i \sim N(0, A)$. Then $A^{l+1} = \frac{1}{n}\sum \tilde{x}_i^2 = E[\varphi(u)^2] = \chi_0(A)$, where $\chi_0(A) = \int \varphi^2(\sqrt{A}\, v)\, Dv$, $v \sim N(0, 1)$.

  28. When $\chi_0'(0) > 1$ there is a stable fixed point $\bar{A} = \chi_0(\bar{A})$, so the activity $\frac{1}{n}\sum x_i^2$ converges.
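
A sketch of the fixed-point iteration $A \leftarrow \chi_0(A)$ by Gauss-Hermite quadrature, with $\varphi = \tanh$ and a gain $\sigma_w = 2$ inserted so that $\chi_0'(0) = \sigma_w^2 > 1$ (both choices are ours):

```python
# Sketch: iterate A ← χ0(A) = ∫ φ(σ_w √A v)² Dv (φ = tanh, Dv the N(0,1) measure)
# to find the fixed point Ā = χ0(Ā). Quadrature and σ_w are illustrative choices.
import numpy as np

v, w = np.polynomial.hermite_e.hermegauss(101)          # nodes/weights for N(0,1)
w = w / np.sqrt(2 * np.pi)                              # weights now sum to 1

def chi0(A, sigma_w=2.0):
    u = sigma_w * np.sqrt(A) * v                        # u ~ N(0, σ_w² A)
    return np.sum(w * np.tanh(u) ** 2)                  # E[φ(u)²]

A = 0.5
for _ in range(100):
    A = chi0(A)                                         # A^{l+1} = χ0(A^l)
print(A)                                                # fixed point Ā; χ0'(0) = σ_w² > 1
```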

  29. Pullback Metric & Curvature. $\tilde{x} = \varphi(Wx)$; $ds_l^2 = \sum g_{ij}\, dx^i dx^j = \frac{1}{n}\, dx^l \cdot dx^l$.

  30. Basis vectors. $x^l = \varphi(W x^{l-1})$; $dx_i^l = \varphi'(u_i^l) \sum_{i'} W_{ii'} dx_{i'}^{l-1} = \sum_{i'} B_{ii'}^l\, dx_{i'}^{l-1}$, with Jacobian $B_{ii'}^l = \varphi'(u_i^l)\, W_{ii'}^l$. Hence $dx^l = B^l dx^{l-1} = (B^l B^{l-1} \cdots)\, dx^m$ and $e_a^l = B^l e_a^{l-1} = B^l B^{l-1} \cdots e_a^m$.

  31. $g_{ab} = \frac{1}{n}\, e_a^l \cdot e_b^l$.

  32. Dynamics of Metric. $d\tilde{x}^a = \sum_k B_k^a\, dx^k$, $\tilde{e}_a = B e_a$, $B_k^a = \varphi'(u^a)\, w_k^a$; $\tilde{g}_{ab} = \sum_{k,j} B_k^a B_j^b\, g_{kj}$. Mean field approximation: $E[\varphi'(u^a)^2\, w_k^a w_j^a] = E[\varphi'(u^a)^2]\, E[w_k^a w_j^a]$, with $\chi_1(A) = \int \varphi'^2(\sqrt{A}\, v)\, Dv$.

  33. Metric: law of large numbers. $g_{ab}^l = e_a^l \cdot e_b^l = (B\, g^{l-1} B^{\top})_{ab}$, $ds_l^2 = \sum g_{ab}^l\, dx^a dx^b$. $(B^{\top} B)_{i'i''} = \sum_i \varphi'(u_i^l)^2\, w_{ii'} w_{ii''} \approx \sigma^2\, E\big[\varphi'(u_i^l)^2\big]\, \delta_{i'i''}$, so $\chi_1 = \sigma^2\, E\big[\varphi'(u_i^l)^2\big]$.
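
A Monte Carlo check of this law: propagate a small tangent vector through random tanh layers and compare the squared-length growth per layer with $\chi_1 = \sigma_w^2\, E[\varphi'(u)^2]$ (all concrete values are illustrative):

```python
# Sketch: compare the layer-wise growth ds²_{l+1}/ds²_l of a small perturbation
# against χ1(A) = σ_w² E[φ'(u)²], for φ = tanh and illustrative sizes.
import numpy as np

rng = np.random.default_rng(1)
n, sigma_w = 2000, 2.0
x = rng.normal(size=n)
dx = 1e-4 * rng.normal(size=n)                   # tangent vector

for l in range(5):
    W = rng.normal(0, sigma_w / np.sqrt(n), (n, n))
    u = W @ x
    phi_prime = 1 - np.tanh(u) ** 2              # φ'(u) for φ = tanh
    dx_new = phi_prime * (W @ dx)                # dx^{l+1} = B dx^l, B = φ'(u) W
    chi1 = sigma_w ** 2 * np.mean(phi_prime ** 2)
    print(l, (dx_new @ dx_new) / (dx @ dx), chi1)  # empirical growth ≈ χ1
    x, dx = np.tanh(u), dx_new
```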

  34. $g_{ij}^l(x) = \big(\prod \chi_1(A(x))\big)\, g_{ij}(x)$: conformal geometry.

  35. $\tilde{g}_{ij}(x) = \chi_1(A)\, g_{ij}(x)$: rotation and expansion, a conformal transformation! $g_{ij}^1 = \chi_1\, \delta_{ij} \ \Rightarrow\ g_{ij}^l = \chi_1^l\, \delta_{ij}$.

  36. Domino Theorem. $\frac{\partial x^l}{\partial x^m} = B\,\frac{\partial x^{l-1}}{\partial x^m} = BB\,\frac{\partial x^{l-2}}{\partial x^m} = \cdots$, and likewise $\frac{\partial x^l}{\partial W^m} = B\,\frac{\partial x^{l-1}}{\partial W^m} = BB\,\frac{\partial x^{l-2}}{\partial W^m} = \cdots$. Layer by layer, $\sum_i B_{ii'}^L B_{ii''}^L = \chi_1\, \delta_{i'i''}$, so the whole product satisfies $\sum_i (B^L \cdots B^1)_{ii'}\, (B^L \cdots B^1)_{ii''} = \chi_1 \chi_1 \cdots \chi_1\, \delta_{i'i''}$.

  37. Dynamics of Curvature. $H_{ab}^i = \nabla_a e_b^i = \partial_a \partial_b x^i$; $\tilde{H}_{ab} = \varphi''(u)(w \cdot e_a)(w \cdot e_b) + \varphi'(u)\, w \cdot \partial_a e_b$; decompose $\tilde{H}_{ab} = H_{ab}^{\perp} + H_{ab}^{\parallel}$, with $H^2 = \sum |H_{ab}|^2$.

  38. $\chi_2(A) = \int \varphi''(\sqrt{A}\, v)^2\, Dv$; $(H_{ab}^{l+1})^2 = \frac{\chi_2(A)}{n}\,(2\delta_{ab} + 1) + \chi_1(A)\,(H_{ab}^l)^2$. Since $\chi_1 > 1$: exponential expansion! Creation is small!

  39. Poole et al. (2016): Deep neural networks.

  40. Distance: $D[x, y] = \frac{1}{n}\sum (x_i - y_i)^2$.

  41. Dynamics of Distance (Amari, 1974). $D(x, x') = \frac{1}{n}\sum (x_i - x_i')^2$, $C(x, x') = \frac{1}{n}\sum x_i x_i' = \frac{1}{n}\, x \cdot x'$, so $D = A + A' - 2C$. The preactivations $u_i = \sum_k w_{ik} y_k$ and $u_i' = \sum_k w_{ik} y_k'$ are jointly Gaussian, $\sim N(0, V)$ with $V = \begin{pmatrix} A & C \\ C & A' \end{pmatrix}$, giving $\tilde{C} = E\big[\varphi(\sqrt{A - C}\,\varepsilon + \sqrt{C}\,\nu)\ \varphi(\sqrt{A' - C}\,\varepsilon' + \sqrt{C}\,\nu)\big]$.

  42. $D^{l+1} = K(D^l)$; $\frac{d\tilde{D}}{dD} = \chi_1 > 1$.
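
A one-layer Monte Carlo check that the distance map has slope $\chi_1$ near $D = 0$: push two nearby inputs through a random tanh layer and compare output and input distances (all concrete values are illustrative):

```python
# Sketch: verify that D^{l+1} = K(D^l) has slope χ1 at D = 0 by propagating two
# nearby inputs through one random layer (φ = tanh; sizes and σ_w are ours).
import numpy as np

rng = np.random.default_rng(2)
n, sigma_w = 2000, 2.0
x = rng.normal(size=n)
for eps in (1e-2, 1e-3):
    xp = x + eps * rng.normal(size=n)                   # nearby input x'
    W = rng.normal(0, sigma_w / np.sqrt(n), (n, n))
    y, yp = np.tanh(W @ x), np.tanh(W @ xp)
    D_in = np.mean((x - xp) ** 2)                       # D(x, x')
    D_out = np.mean((y - yp) ** 2)                      # D(x̃, x̃')
    print(D_out / D_in)                                 # → χ1 as D → 0
```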
