Monte Carlo Methods and Neural Networks
Alexander Keller, partially joint work with Noah Gamboa
Artificial Neural Networks in a Nutshell

Supervised learning of high dimensional function approximation

• input layer $a_0$, $L-1$ fully connected hidden layers, and output layer $a_L$

[Figure: fully connected network with input units $a_{0,0}, \dots, a_{0,n_0-1}$, hidden units $a_{l,0}, \dots, a_{l,n_l-1}$ in layer $l$, and output units $a_{L,0}, \dots, a_{L,n_L-1}$]

– $n_l$ rectified linear units (ReLU) $a_{l,i} = \max\{0, \sum_j w_{l,j,i} \cdot a_{l-1,j}\}$ in layer $l$
– backpropagating the error $\delta_{l-1,i} = \sum_{a_{l,j} > 0} \delta_{l,j} \cdot w_{l,j,i}$ and updating the weights $w'_{l,j,i} = w_{l,j,i} - \lambda \cdot \delta_{l,j} \cdot a_{l-1,i}$ if $a_{l,j} > 0$
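A minimal NumPy sketch of the three formulas on this slide (forward pass, backpropagated error, and weight update); layer widths, initialization, and the learning rate are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch: forward pass, backpropagation, and weight update for a
# fully connected ReLU network, following
#   a_{l,i} = max{0, sum_j w_{l,j,i} a_{l-1,j}},
#   delta_{l-1,i} = sum_{a_{l,j}>0} delta_{l,j} w_{l,j,i},
#   w'_{l,j,i} = w_{l,j,i} - lambda * delta_{l,j} * a_{l-1,i} if a_{l,j} > 0.
# Widths and learning rate are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)
widths = [4, 8, 8, 2]                      # n_0, n_1, n_2, n_L (assumed)
W = [rng.normal(0, 1 / np.sqrt(widths[l]), (widths[l], widths[l + 1]))
     for l in range(len(widths) - 1)]      # w_{l,j,i} stored as W[l][j, i]
lam = 1e-2                                 # learning rate lambda (assumed)

def forward(a0):
    a = [a0]
    for Wl in W:
        a.append(np.maximum(0.0, a[-1] @ Wl))   # ReLU unit
    return a

def backward(a, delta_L):
    delta = delta_L
    for l in reversed(range(len(W))):
        active = (a[l + 1] > 0).astype(float)   # units with a_{l,j} > 0
        masked = delta * active
        grad = np.outer(a[l], masked)           # delta_{l,j} * a_{l-1,i}
        delta = masked @ W[l].T                 # backpropagated error
        W[l] -= lam * grad                      # weight update
    return delta

a = forward(rng.normal(size=widths[0]))
target = np.array([1.0, 0.0])
backward(a, a[-1] - target)                     # squared-error gradient at the output
```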
Artificial Neural Networks in a Nutshell

Convolutional neural networks: similarity measures

• convolutional layer: feature map defined by a convolution kernel
  – identical weights across all neural units of one feature map
• max pooling layer: maximum over a tile of neurons in a feature map for subsampling

◮ Gradient-based learning applied to document recognition
◮ Quasi-Monte Carlo feature maps for shift-invariant kernels
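A small sketch of one feature map with shared weights followed by 2×2 max pooling; kernel size, input size, and values are illustrative assumptions.

```python
# Minimal sketch: one convolutional feature map (identical weights across the
# map) followed by 2x2 max pooling for subsampling.  Sizes are assumed.
import numpy as np

rng = np.random.default_rng(1)
image = rng.normal(size=(8, 8))     # single-channel input (assumed)
kernel = rng.normal(size=(3, 3))    # shared convolution kernel

def conv2d_valid(x, k):
    kh, kw = k.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def max_pool2x2(x):
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))

feature_map = np.maximum(0.0, conv2d_valid(image, kernel))  # ReLU feature map
pooled = max_pool2x2(feature_map)                           # subsampled output
print(feature_map.shape, pooled.shape)                      # (6, 6) (3, 3)
```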
Relations to Mathematical Objects
Relations to Mathematical Objects

Maximum pooling layers

• rectified linear unit $\mathrm{ReLU}(x) := \max\{0, x\}$ as a basic non-linearity
• for example, the leaky ReLU is $\mathrm{ReLU}(x) - \alpha \cdot \mathrm{ReLU}(-x)$, which for $\alpha = -1$ yields the absolute value $|x| = \mathrm{ReLU}(x) + \mathrm{ReLU}(-x)$
• hence the maximum of two values is
  $\max\{x, y\} = \frac{x+y}{2} + \left|\frac{x-y}{2}\right| = \frac{1}{2}\big(x + y + \mathrm{ReLU}(x-y) + \mathrm{ReLU}(y-x)\big)$,
  which allows one to represent maximum pooling by ReLU functions and introduces skip links
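A quick numerical check of the two identities on this slide; the random test values are arbitrary.

```python
# Minimal sketch: verify  |x| = ReLU(x) + ReLU(-x)  and
#   max{x, y} = (x + y + ReLU(x - y) + ReLU(y - x)) / 2,
# which let max pooling be expressed with ReLU units and skip links.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(2)
x, y = rng.normal(size=1000), rng.normal(size=1000)

assert np.allclose(np.abs(x), relu(x) + relu(-x))
assert np.allclose(np.maximum(x, y), 0.5 * (x + y + relu(x - y) + relu(y - x)))
```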
Relations to Mathematical Objects

Residual layers look like projections onto half spaces

• halfspace $H^+$ with the weights $\hat{\omega}$ as normal and the bias $b$ as distance from the origin $O$
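The slide only states the analogy; the following small sketch is my own illustration of it, assuming the halfspace $H^+ = \{x : \hat\omega \cdot x \geq b\}$ with unit normal $\hat\omega$, so that the projection takes the residual form $x + \mathrm{ReLU}(b - \hat\omega \cdot x)\,\hat\omega$.

```python
# Minimal sketch (illustration, not from the slide): projection onto the
# halfspace H+ = {x : w_hat . x >= b}, written as a residual update
#   x + ReLU(b - w_hat . x) * w_hat.
# The halfspace and test points are assumptions.
import numpy as np

def project_halfspace(x, w_hat, b):
    # zero update if x already lies in H+, otherwise move x onto the boundary
    return x + np.maximum(0.0, b - w_hat @ x) * w_hat

w_hat = np.array([1.0, 0.0])          # unit normal (assumed)
b = 1.0                               # distance of the boundary from the origin
print(project_halfspace(np.array([-0.5, 2.0]), w_hat, b))  # -> [1.0, 2.0]
print(project_halfspace(np.array([3.0, 2.0]), w_hat, b))   # already inside, unchanged
```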
Relations to Mathematical Objects

Residual layers as differential equations

• relation to a differential equation by introducing a step size $h$:
  $a_l = a_{l-1} + h \cdot W_l^{(2)} \max\{0, W_l^{(1)} \cdot a_{l-1}\}$
  resembles the explicit Euler method
  $\Leftrightarrow \frac{a_l - a_{l-1}}{h} = W_l^{(2)} \max\{0, W_l^{(1)} \cdot a_{l-1}\}$,
  where the left-hand side for $h \to 0$ becomes the derivative $\dot{a}$, i.e. an ordinary differential equation
  – select your favorite ordinary differential equation to determine $W_l^{(1)}$ and $W_l^{(2)}$
  ◮ Neural networks motivated by partial differential equations
  – use your favorite ordinary differential equation solver for both inference and training
  ◮ A radical new neural network design could overcome big challenges in AI
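A minimal sketch of a residual block as one explicit Euler step of the ODE above; width, depth, weights, and step size are illustrative assumptions.

```python
# Minimal sketch: a residual block as one explicit Euler step of
#   da/dt = W2(t) max{0, W1(t) a(t)}.
# Width, depth, step size, and weights are assumed for illustration.
import numpy as np

rng = np.random.default_rng(3)
n, L, h = 16, 8, 0.25                          # width, depth, step size (assumed)
W1 = [rng.normal(0, 1 / np.sqrt(n), (n, n)) for _ in range(L)]
W2 = [rng.normal(0, 1 / np.sqrt(n), (n, n)) for _ in range(L)]

def residual_block(a, l):
    # a_l = a_{l-1} + h * W2_l max{0, W1_l a_{l-1}}
    return a + h * W2[l] @ np.maximum(0.0, W1[l] @ a)

a = rng.normal(size=n)
for l in range(L):                             # forward pass = Euler integration
    a = residual_block(a, l)
```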
Relations to Mathematical Objects

Learning integral operator kernels

• neural unit with ReLU
  $a_{l,j} := \max\Big\{0, \sum_{i=0}^{n_{l-1}-1} w_{l,j,i}\, a_{l-1,i}\Big\} \;\to\; a_{l,j} := \sum_{i=0}^{n_{l-1}-1} w_{l,j,i} \max\{0, a_{l-1,i}\}$,
  written in continuous form
  $a_l(y) := \int_0^1 w_l(x, y) \max\{0, a_{l-1}(x)\}\, dx$,
  relates to high-dimensional integro-approximation
• a recurrent neural network layer in continuous form alludes to an integral equation
  $a'_l(y) := \int_0^1 w_l(x, y) \max\{0, a_{l-1}(x)\}\, dx + \int_0^1 w^h_l(x, y) \max\{0, a_l(x)\}\, dx$
  – the weights $w^h$ establish the recurrence, e.g. for processing sequences of data
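A sketch of estimating the continuous layer $a_l(y) = \int_0^1 w_l(x,y)\max\{0,a_{l-1}(x)\}\,dx$ by Monte Carlo sampling; the kernel $w_l$ and the previous-layer function $a_{l-1}$ are illustrative assumptions.

```python
# Minimal sketch: Monte Carlo estimate of the continuous layer
#   a_l(y) = int_0^1 w_l(x, y) max{0, a_{l-1}(x)} dx.
# The kernel w_l and the function a_{l-1} are assumed for illustration.
import numpy as np

def w_l(x, y):                 # assumed smooth kernel on [0,1]^2
    return np.cos(2 * np.pi * (x - y))

def a_prev(x):                 # assumed previous-layer activation function
    return np.sin(2 * np.pi * x) - 0.25

def a_l(y, n_samples=4096, seed=4):
    rng = np.random.default_rng(seed)
    x = rng.random(n_samples)                     # uniform samples on [0,1]
    return np.mean(w_l(x, y) * np.maximum(0.0, a_prev(x)))

print(a_l(0.3))                # one point of the integro-approximation
```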
Monte Carlo Methods and Neural Networks

Explore algorithms linear in time and space

• structural equivalence of integral equations and reinforcement learning
• learning integro-approximation from noisy/sampled data
• examples of random sampling
  – pseudo-random initialization
  – training by stochastic gradient descent
  – regularization by dropout and drop-connect
  – random binarization
  – sampling by generative adversarial networks
  – fixed pseudo-random matrices for direct feedback alignment

◮ Learning light transport the reinforced way
◮ Machine learning and integral equations
◮ Noise2Noise: Learning image restoration without clean data
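Of the sampling examples listed, direct feedback alignment is simple to sketch: the backpropagated error is replaced by fixed pseudo-random matrices applied to the output error. The widths, matrices, and learning rate below are illustrative assumptions.

```python
# Minimal sketch: direct feedback alignment.  Hidden-layer errors are obtained
# by fixed pseudo-random matrices B_l applied to the output error e instead of
# backpropagation.  Sizes and learning rate are assumed.
import numpy as np

rng = np.random.default_rng(5)
widths = [4, 8, 8, 2]
W = [rng.normal(0, 1 / np.sqrt(widths[l]), (widths[l], widths[l + 1]))
     for l in range(len(widths) - 1)]
B = [rng.normal(size=(widths[-1], widths[l + 1]))       # fixed random feedback
     for l in range(len(widths) - 2)]
lam = 1e-2

def train_step(a0, target):
    a = [a0]
    for Wl in W:
        a.append(np.maximum(0.0, a[-1] @ Wl))           # ReLU forward pass
    e = a[-1] - target                                   # output error
    for l in range(len(W)):
        fb = e if l == len(W) - 1 else e @ B[l]          # output layer sees e directly
        delta = fb * (a[l + 1] > 0)
        W[l] -= lam * np.outer(a[l], delta)
    return e

train_step(rng.normal(size=widths[0]), np.array([1.0, 0.0]))
```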
Partition instead of Dropout
Partition instead of Dropout

Guaranteeing coverage of neural units

• drop a neuron if the threshold $\frac{1}{P} > \xi$
  – $\xi$ by a linear feedback shift register generator (for example)

[Figure: the fully connected network from before, with the dropped units removed from each layer]
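One way to read this slide, sketched under my own assumptions: instead of dropping each unit independently, partition the units of a layer into $P$ blocks with a pseudo-random permutation, so that over $P$ training steps every unit is guaranteed to be covered. The permutation source and the per-step block selection below are illustrative, not taken from the slides.

```python
# Minimal sketch (assumption, not the slides' exact scheme): partition the
# units of a layer into P blocks; in step t the block t mod P is dropped, so
# every unit is dropped exactly once per P steps and all units are covered,
# unlike independent dropout where a unit may be dropped or kept many times
# in a row.
import numpy as np

def partition_masks(n_units, P, seed=6):
    perm = np.random.default_rng(seed).permutation(n_units)
    blocks = np.array_split(perm, P)                # P disjoint blocks of units
    masks = np.ones((P, n_units))
    for t, block in enumerate(blocks):
        masks[t, block] = 0.0                       # block t is dropped in step t mod P
    return masks

masks = partition_masks(n_units=8, P=4)
assert np.allclose(masks.sum(axis=0), 3.0)          # every unit dropped exactly once
# usage during training (illustrative): activations *= masks[step % 4]
```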