Introduction to (shallow) Neural Networks
Pr. Fabien MOUTARDE, Center for Robotics, MINES ParisTech, PSL Université Paris
Fabien.Moutarde@mines-paristech.fr
http://people.mines-paristech.fr/fabien.moutarde

Neural Networks: from biology to engineering
• Understanding and modelling of the brain
• Imitation to reproduce high-level functions
• Mathematical tool for engineers
Application domains
Modelling any input-output function by "learning" from examples:
• Pattern recognition
• Voice recognition
• Classification, diagnosis
• Identification
• Forecasting
• Control, regulation

Biological neurons
Cell body, dendrites, axon, synapses
• Electric signal path: dendrites → cell body → axon → synapses
Empirical model of neurons
Each input i contributes C_i·f_i to the (electric) membrane potential θ; the output firing frequency saturates around f ~ 500 Hz and follows a sigmoid of θ.
→ Neuron output = periodic signal with frequency f ≈ sigmoid(θ) = sigmoid(Σ_i C_i·f_i)

"Birth" of the formal Neuron
• McCulloch & Pitts (1943)
  - Simple model of a neuron
  - Goal: model the brain
[Figure: formal neuron with inputs x_1 … x_D, weights W_1 … W_D, threshold/bias W_0, and binary output y]
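As a hedged illustration of the two models above, here is a minimal NumPy sketch (the function names are illustrative, and the normalized output of the rate model is an assumption; the slide only states that the firing rate saturates around ~500 Hz and follows a sigmoid of the potential):

```python
import numpy as np

def sigmoid(x):
    """Logistic squashing function."""
    return 1.0 / (1.0 + np.exp(-x))

def empirical_rate_neuron(input_rates, synaptic_coeffs):
    """Empirical rate model: output frequency f ~ sigmoid(theta),
    with membrane potential theta = sum_i C_i * f_i.
    (Output left normalized to [0, 1] instead of Hz.)"""
    theta = np.dot(synaptic_coeffs, input_rates)
    return sigmoid(theta)

def mcculloch_pitts_neuron(x, w, w0):
    """McCulloch & Pitts formal neuron: binary output,
    fires (1) iff the weighted sum of inputs exceeds the threshold w0."""
    return 1 if np.dot(w, x) > w0 else 0
```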
Linear separation by a single neuron
A single neuron computes W.X and thresholds it: the decision boundary is the hyperplane W.X – W_0 = 0, i.e. a linear separation of the input space.

Theoretical model for learning
• Hebb rule (1949): "Cells that fire together wire together", i.e. the synaptic weight increases between neurons that activate simultaneously:
  W_ij(t+dt) = W_ij(t) + λ · y_i(t) · y_j(t)
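A minimal sketch of the Hebb rule as written above, assuming the activities are stored in a vector y and the weights in a matrix W (the learning-rate value is an arbitrary illustration):

```python
import numpy as np

def hebbian_update(W, y, lam=0.1):
    """One Hebb-rule step: W_ij(t+dt) = W_ij(t) + lam * y_i(t) * y_j(t).
    W : (n, n) matrix of synaptic weights, y : (n,) vector of activities."""
    return W + lam * np.outer(y, y)

# Neurons 0 and 2 fire together -> the weight between them increases.
W = np.zeros((3, 3))
y = np.array([1.0, 0.0, 1.0])
W = hebbian_update(W, y)   # W[0, 2] and W[2, 0] are now 0.1
```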
First formal Neural Networks (en français : Réseaux de Neurones)
• PERCEPTRON (Rosenblatt, 1957)
• ADALINE (Widrow, 1962)
Formal neuron of McCulloch & Pitts + Hebb rule for learning
→ Possible to "learn" Boolean functions by training from examples

Training of the Perceptron
[Figure: the neuron classifies X according to whether W.X > W_0 or W.X < W_0] → linear separation
Training algorithm:
  W_{k+1} = W_k + vX  if X is incorrectly classified (v: target value)
  W_{k+1} = W_k       if X is correctly classified
• Convergence guaranteed if the problem is linearly separable
• What if NOT linearly separable? (no convergence guarantee)
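A minimal sketch of the perceptron training rule above, assuming targets v in {-1, +1} and absorbing the threshold W_0 into the weight vector via a constant input (helper names are illustrative):

```python
import numpy as np

def train_perceptron(X, targets, n_epochs=100):
    """Rosenblatt rule: W_{k+1} = W_k + v*X if X is misclassified, else W_{k+1} = W_k."""
    X = np.hstack([X, np.ones((len(X), 1))])    # constant input so the threshold W_0 is learned too
    W = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        n_errors = 0
        for x, v in zip(X, targets):            # v is the target value, in {-1, +1}
            if v * np.dot(W, x) <= 0:           # X is on the wrong side of the hyperplane
                W = W + v * x
                n_errors += 1
        if n_errors == 0:                       # converged (linearly separable case)
            break
    return W

# Learns the (linearly separable) AND function; it never converges on XOR
# and simply stops after n_epochs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([-1, -1, -1, +1])
W = train_perceptron(X, targets)
```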
Limits of first models, necessity of "hidden" neurons
• PERCEPTRONS, book by Minsky & Papert (1969)
Detailed study of Perceptrons and their intrinsic limits:
  - they can NOT learn some types of Boolean functions (even simple ones like XOR)
  - they can do ONLY LINEAR separations
But many classes cannot be linearly separated (by a single hyperplane).
[Figure: two interleaved classes, CLASS 1 and CLASS 2, that no single hyperplane can separate]
→ Necessity of several layers in the Neural Network
→ Requires a new training algorithm

1st revival of Neural Nets
USE OF DIFFERENTIABLE NEURONS + GRADIENT DESCENT METHOD
• GRADIENT BACK-PROPAGATION (Rumelhart 1986, Le Cun 1985) (en français : rétro-propagation du gradient)
  → Overcomes the Credit Assignment Problem by training Neural Networks with HIDDEN layers
• Empirical solutions for MANY real-world applications
• Some strong theoretical results: Multi-Layer Perceptrons are UNIVERSAL (and parsimonious) approximators
• Around the 2000s: still used, but much less popular than SVMs and boosting
2nd recent "revival": Deep Learning
• Since ~2006, rising interest for, and excellent results with, "deep" neural networks consisting of MANY layers:
  – Unsupervised "intelligent" initialization of weights
  – Standard gradient descent, and/or fine-tuning from the initial values of the weights
  – Hidden layers → a learnt hierarchy of features
• In particular, since ~2013, dramatic progress in visual recognition (and voice recognition) with deep Convolutional Neural Networks

What is a FORMAL neuron?
DEFINITIONS OF FORMAL NEURONS
In general: a processing "unit" applying a simple operation to its inputs, which can be "connected" to others to build a network able to realize any input-output function.
"Usual" definition: a "unit" computing a weighted sum of its inputs, and then applying some non-linearity (sigmoid, ReLU, Gaussian, …).
General formal neuron
e_i : inputs of the neuron
s_j : potential of the neuron
O_j : output of the neuron
W_ij : (synaptic) weights
h : input function (computation of the potential: weighted sum Σ, distance, kernel, …)
f : activation (or transfer) function
  s_j = h(e_i, {W_ij, i = 0 … k_j})
  O_j = f(s_j)
The combination of particular h and f functions defines the type of formal neuron.

Summating artificial "neurons"
PRINCIPLE:
  O_j = f( W_0j + Σ_{i=1..n_j} W_ij·e_i ),  with W_0j = "bias"
ACTIVATION FUNCTIONS:
• Threshold (Heaviside or sign) → binary neurons
• Sigmoid (logistic or tanh) → most common for MLPs
• Identity → linear neurons
• ReLU (Rectified Linear Unit)
• Saturation
• Gaussian
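A minimal sketch of a summating neuron with a few of the activation functions listed above (the dictionary layout and example values are illustrative):

```python
import numpy as np

ACTIVATIONS = {
    "heaviside": lambda s: 1.0 if s > 0 else 0.0,    # binary neuron
    "logistic":  lambda s: 1.0 / (1.0 + np.exp(-s)), # sigmoid
    "tanh":      np.tanh,
    "identity":  lambda s: s,                        # linear neuron
    "relu":      lambda s: max(0.0, s),              # Rectified Linear Unit
}

def summating_neuron(e, W, W0, activation="logistic"):
    """O_j = f(W_0j + sum_i W_ij * e_i): weighted sum followed by a non-linearity."""
    s = W0 + np.dot(W, e)          # potential of the neuron
    return ACTIVATIONS[activation](s)

# Same inputs and weights, different activation functions:
e, W, W0 = np.array([0.5, -1.0]), np.array([2.0, 1.0]), 0.3
print(summating_neuron(e, W, W0, "logistic"), summating_neuron(e, W, W0, "relu"))
```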
"Distance" formal neurons
Input function: h((e_1, …, e_k), (W_1j, …, W_kj)) = Σ_i (e_i – W_ij)²
Activation function = Identity or Gaussian
The potential of these neurons is the (squared) Euclidean DISTANCE between the input vector (e_i)_i and the weight vector (W_ij)_i.

Kernel-type formal neurons
Input function: h(e, w) = K(e, w), with K symmetric and "positive" in the Mercer sense:
  for all ψ such that ∫ψ²(x)dx < ∞,  ∫∫ K(u,v)·ψ(u)·ψ(v) du dv ≥ 0
Activation function = Identity
Examples of possible kernels:
  – Polynomial: K(u,v) = [u.v + 1]^p
  – Radial Basis Function: K(u,v) = exp(–||u–v||² / 2σ²)  → equivalent to distance-neuron + Gaussian activation
  – Sigmoid: K(u,v) = tanh(u.v + θ)  → equivalent to summating-neuron + sigmoid activation
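A hedged sketch of the distance-type and kernel-type potentials described above (parameter values are illustrative defaults, not from the slides):

```python
import numpy as np

def distance_neuron(e, w, sigma=1.0):
    """Distance-type neuron: potential = squared Euclidean distance between
    the input vector e and the weight vector w, here with a Gaussian activation."""
    s = np.sum((e - w) ** 2)
    return np.exp(-s / (2.0 * sigma ** 2))

# Kernel-type neurons: potential = K(e, w), identity activation.
def polynomial_kernel(u, v, p=2):
    return (np.dot(u, v) + 1.0) ** p

def rbf_kernel(u, v, sigma=1.0):
    # Equivalent to a distance neuron followed by a Gaussian activation.
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2))

def sigmoid_kernel(u, v, theta=0.0):
    # Equivalent to a summating neuron followed by a tanh activation.
    return np.tanh(np.dot(u, v) + theta)
```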
Networks of formal neurons
TWO FAMILIES OF NETWORKS
• FEED-FORWARD NETWORKS (en français : "réseaux non bouclés"): NO feedback connection; the output depends only on the current input (NO memory)
• FEEDBACK OR RECURRENT NETWORKS (en français : "réseaux bouclés"): some internal feedback/backwards connections → the output depends on the current input AND ON ALL PREVIOUS INPUTS (some memory inside!)

Feed-forward networks (en français : réseaux "non bouclés")
Neurons can be ordered so that there is NO "backwards" connection.
Time is NOT a functional variable, i.e. there is NO MEMORY, and the output depends only on the current input.
[Figure: small feed-forward network with inputs X1…X4, neurons 1–5 and outputs Y1, Y2; neurons 1, 3 and 4 are said to be "hidden" neurons]
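A hedged sketch contrasting the two families: a feed-forward unit has no state, while a recurrent unit also depends on its previous state (the tanh non-linearity and the weight shapes are illustrative assumptions, not from the slides):

```python
import numpy as np

def feedforward_step(x, W, b):
    """Feed-forward: the output depends only on the current input (no memory)."""
    return np.tanh(W @ x + b)

def recurrent_step(x, h_prev, W_in, W_rec, b):
    """Recurrent: the output depends on the current input AND on the previous
    state h_prev, so all previous inputs influence it (memory inside)."""
    return np.tanh(W_in @ x + W_rec @ h_prev + b)
```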
Feed-forward Multi-Layer Neural Networks
[Figure: input layer (X1, X2, X3), hidden layers, output layer (Y1, Y2), with weighted connections between successive layers]
Hidden layers: 0, 1 or more.
For a "Multi-Layer Perceptron" (MLP), the neurons are generally of the "summating with sigmoid activation" type.
[terme français pour MLP : "Réseau Neuronal à couches"]

Recurrent Neural Networks
[Figure: recurrent network and its equivalent unfolded form, with delayed values x_2(t-1), x_2(t-2), x_3(t-1) fed back as inputs]
A time-delay is associated with each connection.
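A minimal sketch of an MLP forward pass with one hidden layer of summating neurons with sigmoid activation, as described above (layer sizes, names and the random weights are illustrative):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, W1, b1, W2, b2):
    """Input -> hidden layer -> output layer, all summating + sigmoid neurons."""
    h = sigmoid(W1 @ x + b1)   # hidden layer outputs
    return sigmoid(W2 @ h + b2)

# 3 inputs (X1..X3), 4 hidden neurons, 2 outputs (Y1, Y2), random weights:
rng = np.random.default_rng(0)
y = mlp_forward(rng.random(3), rng.random((4, 3)), rng.random(4),
                rng.random((2, 4)), rng.random(2))
```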