  1. Neural Networks Stefan Edelkamp

  2. 1 Overview
     - Introduction
     - Perceptron
     - Hopfield Nets
     - Self-Organizing Maps
     - Feed-Forward Neural Networks
     - Backpropagation

  3. 2 Introduction
     Idea: mimic the principle of biological neural networks with artificial neural networks
     [Figure: small example network of nine numbered neurons]
     - adopt solutions that have proven themselves in nature
     - parallelization ⇒ high performance
     - redundancy ⇒ fault tolerance
     - enable learning with little programming effort

  4. Ingredients
     Needed for an artificial neural network:
     • behavior of the artificial neurons
     • order of computation
     • activation function
     • structure of the net (topology)
       - recurrent nets
       - feed-forward nets
     • integration into the environment
     • learning algorithm

  5. Perceptron Learning
     ... a very simple network with no hidden neurons
     Inputs: x, weighted with w, the weighted inputs summed up
     Activation function: Θ
     Output: z, determined by computing Θ(w^T x)
     In addition: a weighted input representing the constant 1

  6. Training
     Net function f: M ⊂ ℝ^d → {0, 1}
     1. initialize the counter i and the initial weight vector w_0 to 0
     2. as long as there are vectors x with w_i^T x ≤ 0, set w_{i+1} to w_i + x and increase i by 1
     3. return w_{i+1}
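
A minimal sketch of this training loop in Python/NumPy; the toy data and the convention that each input is already extended by the constant 1 (and negated for negative examples, so every row should end up with w^T x > 0) are illustrative assumptions:

```python
import numpy as np

def perceptron_train(X, max_iter=1000):
    """Perceptron learning: each row of X is an input vector extended by the
    constant 1 (and multiplied by -1 for negative examples), so training is
    done once w @ x > 0 holds for every row."""
    w = np.zeros(X.shape[1])              # step 1: i = 0, w_0 = 0
    for _ in range(max_iter):
        bad = [x for x in X if w @ x <= 0]
        if not bad:                       # step 2 done: no violating vector left
            break
        w = w + bad[0]                    # w_{i+1} = w_i + x, i = i + 1
    return w                              # step 3

# Hypothetical, linearly separable toy data.
pos = np.array([[2.0, 1.0, 1.0], [1.5, 2.0, 1.0]])
neg = -np.array([[-1.0, -2.0, 1.0], [-2.0, -1.5, 1.0]])
print(perceptron_train(np.vstack([pos, neg])))
```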

  7. Termination on the Training Data
     Assume the vectors to be normalized and w* to be a final (separating) weight vector with ‖w*‖ = 1,
     i.e. f = Θ((x, 1)^T w*), and constants δ and γ with |(x, 1)^T w*| ≥ δ and ‖(x, 1)‖ ≤ γ for all training vectors
     - for the angle α_i between w_i and w* we have 1 ≥ cos α_i = w_i^T w* / ‖w_i‖
     - w_{i+1}^T w* = (w_i + x_i)^T w* = w_i^T w* + x_i^T w* ≥ w_i^T w* + δ ⇒ w_{i+1}^T w* ≥ δ (i + 1)
     - ‖w_{i+1}‖ = √((w_i + x_i)^T (w_i + x_i)) = √(‖w_i‖² + ‖x_i‖² + 2 w_i^T x_i) ≤ √(‖w_i‖² + γ²) ≤ γ √(i + 1)
       (induction: ‖w_i‖ ≤ γ √i, using that an updated vector satisfies w_i^T x_i ≤ 0)
     ⇒ cos α_{i+1} ≥ δ (i + 1) / (γ √(i + 1)) = δ √(i + 1) / γ, which would grow beyond 1 as i → ∞;
     hence only finitely many updates are possible and the training terminates

  8. 3 Hopfield Nets
     Neurons: 1, 2, ..., d
     Activations: x_1, x_2, ..., x_d with x_i ∈ {0, 1}
     Connections: w_ij ∈ ℝ (1 ≤ i, j ≤ d) with w_ii = 0 and w_ij = w_ji ⇒ W := (w_ij)_{d×d}
     Update (asynchronous & stochastic):
       x'_j := 0    if ∑_{i=1}^d x_i w_ij < 0
       x'_j := 1    if ∑_{i=1}^d x_i w_ij > 0
       x'_j := x_j  otherwise
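
A direct transcription of this update rule; choosing which neuron j to update at random is one common way to realize the stochastic, asynchronous schedule:

```python
import numpy as np

def hopfield_update(x, W, j):
    """Asynchronously update neuron j of a Hopfield net with symmetric
    weight matrix W (zero diagonal) and binary activations x_i in {0, 1}."""
    s = x @ W[:, j]                  # sum_i x_i w_ij
    if s < 0:
        x[j] = 0.0
    elif s > 0:
        x[j] = 1.0
    return x                         # s == 0: x_j stays unchanged
```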

  9. Example
     [Figure: Hopfield net on three neurons x_1, x_2, x_3]
     W = (  0  1 −2
            1  0  3
           −2  3  0 )
     Uses:
     • associative memory
     • computing Boolean functions
     • combinatorial optimization

  10. Energy of a Hopfield Net
     For x = (x_1, x_2, ..., x_d)^T let E(x) := −½ x^T W x = −∑_{i<j} x_i w_ij x_j be the energy of the Hopfield net
     Theorem: Every update that changes the state of the Hopfield net reduces the energy.
     Proof: Assume the update changes x_k into x'_k (all other x_j stay unchanged) ⇒
       E(x) − E(x') = −∑_{i<j} x_i w_ij x_j + ∑_{i<j} x'_i w_ij x'_j
                    = −∑_{j≠k} x_k w_kj x_j + ∑_{j≠k} x'_k w_kj x_j
                    = −(x_k − x'_k) ∑_{j≠k} w_kj x_j > 0
     (if x'_k = 1 > x_k the sum is positive, if x'_k = 0 < x_k it is negative, so the product is positive in both cases)
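
The energy and the theorem can be checked numerically, here with the example weight matrix from the previous slide; the starting activation is an arbitrary choice:

```python
import numpy as np

def energy(x, W):
    """E(x) = -1/2 x^T W x for a Hopfield net with zero diagonal."""
    return -0.5 * x @ W @ x

W = np.array([[ 0.0, 1.0, -2.0],
              [ 1.0, 0.0,  3.0],
              [-2.0, 3.0,  0.0]])
x = np.array([1.0, 0.0, 1.0])        # arbitrary starting activation
print(energy(x, W))                  # 2.0
s = x @ W[:, 2]                      # net input of neuron x_3: -2 < 0
x[2] = 0.0 if s < 0 else (1.0 if s > 0 else x[2])
print(energy(x, W))                  # 0.0 -- the update changed x_3, so the energy dropped
```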

  11. Solving a COP
     Input: a combinatorial optimization problem (COP)
     Output: a solution for the COP
     Algorithm:
     • select a Hopfield net whose weights encode the parameters of the COP so that solutions lie at minima of the energy
     • start the net with a random activation
     • compute a sequence of updates until stabilization (see the sketch below)
     • read off the parameters
     • test feasibility and optimality of the solution
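
A sketch of the "compute updates until stabilization" step; stopping after one full sweep without a state change is my choice of stopping criterion, and the problem-specific feasibility test is omitted:

```python
import numpy as np

def run_to_stable(W, seed=0):
    """Start a Hopfield net with a random binary activation and apply
    asynchronous updates until a full sweep leaves the state unchanged."""
    rng = np.random.default_rng(seed)
    d = W.shape[0]
    x = rng.integers(0, 2, size=d).astype(float)   # random start activation
    changed = True
    while changed:
        changed = False
        for j in rng.permutation(d):               # stochastic update order
            s = x @ W[:, j]
            new = 0.0 if s < 0 else (1.0 if s > 0 else x[j])
            if new != x[j]:
                x[j] = new
                changed = True
    return x                                       # read off the solution from x
```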

  12. Multi-Flop Problem
     Problem instance: k, n ∈ ℕ with k < n
     Feasible solutions: x̃ = (x_1, ..., x_n) ∈ {0, 1}^n
     Objective function: P(x̃) = ∑_{i=1}^n x_i
     Optimal solution: a feasible x̃ with P(x̃) = k
     As a minimization problem: d = n + 1, x_d = 1, x = (x_1, x_2, ..., x_n, x_d)^T ⇒
       E(x) = ( ∑_{i=1}^d x_i − (k + 1) )²
            = ∑_{i=1}^d x_i² + ∑_{i≠j} x_i x_j − 2 (k + 1) ∑_{i=1}^d x_i + (k + 1)²        (with x_i² = x_i)
            = ∑_{i≠j} x_i x_j − (2k + 1) ∑_{i=1}^{d−1} x_i x_d + k²
            = −½ ∑_{i<j} x_i (−4) x_j − ½ ∑_{i<d} x_i (4k + 2) x_d + k²
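
Combining the two sums and reading off Hopfield weights gives w_ij = −2 between the input neurons and w_id = 2k − 1 on the edges to the constant neuron; that reading-off is my own step, but it can be checked exhaustively for small n against the minima of the objective:

```python
import numpy as np
from itertools import product

def multiflop_weights(n, k):
    """Hopfield weights for the multi-flop problem: n input neurons plus
    one constant neuron x_d = 1; -2 between inputs, 2k - 1 towards x_d."""
    d = n + 1
    W = -2.0 * (np.ones((d, d)) - np.eye(d))
    W[:n, n] = W[n, :n] = 2.0 * k - 1.0
    return W

n, k = 3, 1
W = multiflop_weights(n, k)
for bits in product([0.0, 1.0], repeat=n):
    x = np.array(bits + (1.0,))                  # x_d is clamped to 1
    print(bits, -0.5 * x @ W @ x)                # minimal exactly when sum(bits) == k
```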

  13. Example (n = 3, k = 1):
     [Figure: Hopfield net with input neurons x_1, x_2, x_3 and the constant neuron x_4;
      weight −2 on every edge between x_1, x_2, x_3 and weight 1 on every edge to x_4]

  14. Traveling Salesperson Problem (TSP)
     Problem instance: cities 1, 2, ..., n and distances d_ij ∈ ℝ⁺ (1 ≤ i, j ≤ n) with d_ii = 0
     Feasible solution: a permutation π of (1, 2, ..., n)
     Objective function: P(π) = ∑_{i=1}^n d_{π(i), π(i mod n + 1)}
     Optimal solution: a feasible solution π with minimal P(π)
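
The objective function written as code; the distance matrix is a made-up symmetric example and the permutation is 0-based:

```python
import numpy as np

def tour_length(pi, D):
    """P(pi) = sum_i D[pi(i), pi(i mod n + 1)], written with 0-based indices."""
    n = len(pi)
    return sum(D[pi[i], pi[(i + 1) % n]] for i in range(n))

D = np.array([[ 0.0, 2.0, 9.0, 10.0],
              [ 2.0, 0.0, 6.0,  4.0],
              [ 9.0, 6.0, 0.0,  3.0],
              [10.0, 4.0, 3.0,  0.0]])
print(tour_length([0, 1, 3, 2], D))    # 2 + 4 + 3 + 9 = 18
```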

  15. Encoding
     Idea: Hopfield net with d = n² + 1 neurons
     [Figure: encoding sketch with one neuron per pair of city i and tour position π(i),
      edges weighted by the negated distances −d_12, −d_21, −d_23, −d_32, ...]
     Problem: "size" of the weights to allow both feasible and good solutions
     Trick: transition to a continuous Hopfield net with modified weights ⇒ good solutions of the TSP

  16. 4 Self-Organizing Maps (SOM)
     Neurons:
       Input: 1, 2, ..., d for the components x_i
       Map: 1, 2, ..., m on a regular (linear, rectangular, or hexagonal) grid of positions r_i, storing pattern vectors µ_i ∈ ℝ^d
       Output: 1, 2, ..., d for µ_c
     Update: L ⊂ ℝ^d is the learning set; at time t ∈ ℕ an x ∈ L is chosen at random ⇒
       the winner c ∈ {1, ..., m} is determined by ‖x − µ_c‖ ≤ ‖x − µ_i‖ (∀ i ∈ {1, ..., m})
       and every pattern is adapted towards the input:
         µ'_i := µ_i + h(c, i, t) (x − µ_i)   ∀ i ∈ {1, ..., m}
       with a time-dependent neighborhood relation h(c, i, t) → 0 for t → ∞,
       e.g. h(c, i, t) = α(t) · exp( −‖r_c − r_i‖² / (2 σ(t)²) )
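
One training step of this rule as a sketch; the rectangular 5 × 5 grid, the exponentially decaying α(t) and σ(t), and the 2-D learning set are illustrative assumptions:

```python
import numpy as np

def som_step(mu, grid, x, t, alpha0=0.5, sigma0=2.0, tau=200.0):
    """One SOM update: pick the winner c for input x, then move every
    pattern mu_i towards x, weighted by the neighborhood h(c, i, t)."""
    c = np.argmin(np.linalg.norm(mu - x, axis=1))                # winner neuron
    alpha, sigma = alpha0 * np.exp(-t / tau), sigma0 * np.exp(-t / tau)
    h = alpha * np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2 * sigma ** 2))
    return mu + h[:, None] * (x - mu)                            # mu'_i = mu_i + h (x - mu_i)

rng = np.random.default_rng(0)
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
mu = rng.random((25, 2))                        # map patterns in the unit square
for t in range(1000):
    mu = som_step(mu, grid, rng.random(2), t)   # learning set: uniform random points
```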

  17. Applications of SOM include: visualization and interpretation, dimension reduction schemes, clustering and classification, COPs, ...

  18. A size-50 map adapts to a triangle

  19. A 15 × 15 grid adapts to a triangle

  20. SOM for Combinatorial Optimization: Δ-TSP
     Idea: use a growing ring (elastic band) of neurons
     Tests with up to n = 2392 cities show that the running time scales linearly and the tour length deviates from the optimum by less than 9 %

  21. SOM for Combinatorial Optimization

  22. [Figure: the neuron ring after 10, 50, 500, and 2000 neurons]

  23. SOM for Combinatorial Optimization
     [Figure: tour with 2526 neurons]

  24. 5 Layered Feed-Forward Nets (MLP)
     [Figure: a feed-forward net with layers 1, 2, 3]

  25. Formalization
     An L-layered MLP (multi-layer perceptron)
     Layers: S_0, S_1, ..., S_{L−1}, S_L
     Connections: from each neuron i in S_ℓ to each neuron j in S_{ℓ+1} with weight w_ij, except for the constant 1-neurons
     Update: layer-wise synchronous
       x'_j := φ( ∑_{i ∈ V(j)} x_i w_ij )
       with φ differentiable, e.g. φ(a) = σ(a) = 1 / (1 + exp(−a))
     [Plot: σ(a) for a ∈ [−5, 5]]
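
A layer-wise forward pass with the logistic activation as a sketch; representing the constant 1-neuron of each layer by a separate bias vector, and the concrete layer sizes, are assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights, biases):
    """Propagate x layer by layer: x'_j = phi(sum_i x_i w_ij), computed
    synchronously for all neurons j of a layer."""
    for W, b in zip(weights, biases):
        x = sigmoid(x @ W + b)           # b plays the role of the 1-neuron's weights
    return x

rng = np.random.default_rng(0)
sizes = [3, 4, 2]                        # 3 inputs, 4 hidden neurons, 2 outputs
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
print(forward(np.array([0.5, -1.0, 2.0]), weights, biases))
```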

  26. Layered Feed-Forward Nets
     Applications: function approximation, classification
     Theorem: every Boolean function can be computed by a 2-layered MLP (no proof)
     Theorem: continuous real functions and their derivatives can be jointly approximated to arbitrary precision on compact sets (no proof)

  27. Learning Parameters in an MLP
     Given: x_1, ..., x_N ∈ ℝ^d and t_1, ..., t_N ∈ ℝ^c, an MLP with d input and c output neurons;
     w = (w_1, ..., w_M) contains all weights, f(x, w) is the net function
     Task: find an optimal w* that minimizes the error
       E(w) := ½ ∑_{n=1}^N ∑_{k=1}^c ( f_k(x_n, w) − t_k^n )²
     The partial derivatives of f with respect to the inputs and the parameters exist ⇒ any gradient-based optimization method can be used (conjugate gradient, ...)
       ∇_w E(w) = ∑_{n=1}^N ∑_{k=1}^c ( f_k(x_n, w) − t_k^n ) ∇_w f_k(x_n, w)
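
A sketch of this error function together with a generic gradient step; the gradient here is estimated by finite differences only to illustrate that any gradient-based optimizer can be plugged in, while backpropagation (next slides) computes the same gradient far more efficiently. The toy net function and data are assumptions:

```python
import numpy as np

def error(w, f, X, T):
    """E(w) = 1/2 sum_n sum_k (f_k(x_n, w) - t_nk)^2 for a net function f."""
    return 0.5 * sum(np.sum((f(x, w) - t) ** 2) for x, t in zip(X, T))

def numerical_gradient(w, f, X, T, eps=1e-6):
    """Finite-difference estimate of grad_w E(w)."""
    g = np.zeros_like(w)
    for m in range(len(w)):
        dw = np.zeros_like(w)
        dw[m] = eps
        g[m] = (error(w + dw, f, X, T) - error(w - dw, f, X, T)) / (2 * eps)
    return g

f = lambda x, w: np.array([w @ x])       # toy net function: one linear output neuron
X = [np.array([1.0, 2.0]), np.array([-1.0, 0.5])]
T = [np.array([1.0]), np.array([0.0])]
w = np.zeros(2)
for _ in range(200):                     # plain gradient descent on E(w)
    w -= 0.1 * numerical_gradient(w, f, X, T)
print(w, error(w, f, X, T))
```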

  28. Backpropagation
     Basic calculus (chain rule):
       ∂/∂t f(g(t)) |_{t=t_0} = ( ∂/∂s f(s) |_{s=g(t_0)} ) · ( ∂/∂t g(t) |_{t=t_0} )
     Example: φ(a) := 9 − a², x = (1, 2)^T, w = (1, 1)^T, t = 2
     [Figure: computation graph multiplying x_1 w_1 and x_2 w_2, summing them, applying φ, subtracting t, and halving the square to obtain E]

  29. To evaluate ∇_w E(w) |_{w=(1,1)^T}, the local derivatives of the building blocks are needed:
       h(x, y) = x · y   ⇒  ∂/∂x h(x, y) = y
       h(x, y) = x + y   ⇒  ∂/∂x h(x, y) = 1
       h(x, y) = x − y   ⇒  ∂/∂x h(x, y) = 1
       φ(x)   = 9 − x²   ⇒  ∂/∂x φ(x)   = −2x
       h(x)   = x² / 2   ⇒  ∂/∂x h(x)   = x
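
Chaining these local derivatives through the example graph evaluates the gradient at w = (1, 1)^T; the resulting value (12, 24)^T is my own evaluation, not stated on the slide:

```python
import numpy as np

x = np.array([1.0, 2.0])
w = np.array([1.0, 1.0])
t = 2.0

# Forward pass: compute and keep every intermediate value.
a = w @ x                  # w1*x1 + w2*x2 = 3
f = 9.0 - a ** 2           # phi(a) = 9 - a^2 = 0
E = 0.5 * (f - t) ** 2     # (f - t)^2 / 2 = 2

# Backward pass: multiply the local derivatives listed above.
dE_df = f - t              # d(s^2/2)/ds at s = f - t, times d(f - t)/df = 1
df_da = -2.0 * a           # d(9 - a^2)/da
grad_w = dE_df * df_da * x # d(w1*x1 + w2*x2)/dw_i = x_i
print(grad_w)              # [12. 24.]
```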

  30. Backpropagation
     Theorem: ∇_w E(w) can be computed in time O(N · M) if the network is of size O(M)
     Algorithm: for all n ∈ {1, ..., N}
     • compute the net function f(x_n, w) and the associated error E in the forward direction and store all intermediate values in the net
     • compute the partial derivatives of E with respect to all intermediate values in the backward direction and add up all parts to obtain the total gradient
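
A compact sketch of this forward/backward scheme for the sigmoid MLP from above with the squared error of slide 27; the 2-3-1 architecture and the data are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop(x, t, weights, biases):
    """Gradient of E = 1/2 ||f(x, w) - t||^2 for one training pair:
    the forward pass stores all activations, the backward pass reuses them."""
    acts = [x]
    for W, b in zip(weights, biases):                    # forward, store values
        acts.append(sigmoid(acts[-1] @ W + b))
    delta = (acts[-1] - t) * acts[-1] * (1 - acts[-1])   # error signal at the output
    grads_W, grads_b = [], []
    for l in range(len(weights) - 1, -1, -1):            # backward direction
        grads_W.insert(0, np.outer(acts[l], delta))
        grads_b.insert(0, delta)
        if l > 0:                                        # push the error one layer back
            delta = (delta @ weights[l].T) * acts[l] * (1 - acts[l])
    return grads_W, grads_b

rng = np.random.default_rng(0)
weights = [rng.normal(size=(2, 3)), rng.normal(size=(3, 1))]
biases = [np.zeros(3), np.zeros(1)]
gW, gb = backprop(np.array([0.5, -1.0]), np.array([1.0]), weights, biases)
print([g.shape for g in gW])   # [(2, 3), (3, 1)]; summing over all N pairs gives the total gradient
```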
