Introduction to Neural Networks and Deep Learning
Ling Guan

References:
1. S. Haykin, Neural Networks, 2nd Edition, New Jersey: Prentice Hall, 2004.
2. C. M. Bishop, Neural Networks for Pattern Recognition, New York: Oxford University Press, 1995.
3. I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, Massachusetts: MIT Press, 2016.

Outline
Evolution of neurocomputing
Artificial neural networks
Feedforward neural networks
Radial basis functions (RBF): a feedforward neural net for classification
Modular neural net (MNN): divide and conquer
General regression neural network (GRNN): optimum feature selection
Self-organizing map (SOM): unsupervised data classification
Deep neural network
Evolution of NeuroComputing
Image courtesy of DeView

Evolution of NeuroComputing (2)
Artificial Neural Networks
Nonlinear computing machines that mimic the functions of the human brain.
Machine learning perspective: a learning process.
Signal processing perspective: nonlinear filters.
Pattern recognition perspective: universal classifiers.
Statistical analysis perspective: non-parametric modeling and estimation.
Operational research perspective: generalized optimization procedures.
…

The Human Brain
Structure: massive, highly sparse, and modularized.
Key aspects in the study of neural nets: learning and architecture.
Learning in Neural Networks
The property of primary significance for a neural network is its ability to learn from its environment, and to improve its performance through learning.
Learning: a process by which the free parameters of a NN are adapted through stimulation by the environment in which the network is embedded.
The type of learning is determined by the manner in which the parameter changes take place.

Feedforward Neural Networks
Structurally, feedforward neural networks (FFNNs) have clearly defined inputs and outputs.
With hidden layers and hidden units with nonlinear activation functions (e.g., sigmoids), an FFNN can perform nonlinear information processing; a minimal sketch follows below.
An FFNN is a universal approximation machine trained by supervised learning using numerical examples.
FFNNs are popularly used in non-parametric statistical pattern recognition.
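A minimal NumPy sketch of a one-hidden-layer FFNN forward pass, illustrating the linear-map / nonlinear-activation / linear-output structure described above. All names and sizes are illustrative, not from the slides:

```python
import numpy as np

def ffnn_forward(x, W1, b1, W2, b2):
    """One-hidden-layer FFNN: linear map, nonlinear activation (tanh),
    then a linear output layer."""
    h = np.tanh(W1 @ x + b1)  # hidden units with nonlinear activation
    return W2 @ h + b2        # linear output layer

# Toy example: 3 inputs, 5 hidden units, 2 outputs, random (untrained) weights.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((5, 3)), np.zeros(5)
W2, b2 = rng.standard_normal((2, 5)), np.zeros(2)
y = ffnn_forward(rng.standard_normal(3), W1, b1, W2, b2)
```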
Radial-Basis Function Networks
A radial-basis function (RBF) network consists of three layers.
The input layer is made up of source nodes connecting the RBF network to its environment.
The hidden layer provides a set of functions (radial-basis functions) that transform the feature vectors in the input space to the hidden space.
The output layer is linear, supplying the response of the network as a weighted sum of the hidden-layer outputs.

RBF Network Cont.
The Gaussian-shaped RBF function is given by
$$G(\mathbf{x}, \mathbf{x}_m, r_m) = \exp\!\left(-\sum_{p=1}^{P} \frac{(x_p - x_{mp})^2}{2 r_m^2}\right)$$
The summation of M Gaussian units yields a similarity function for the input vector $\mathbf{x}$:
$$S(\mathbf{x}) = \sum_{m=1}^{M} w_m\, G(\mathbf{x}, \mathbf{x}_m, r_m)$$
A code transcription of these two equations follows below.
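A direct transcription of the two equations above into code; a hedged sketch in which the centers, radii, and weights are assumed given (in practice they would be fitted to data):

```python
import numpy as np

def gaussian_rbf(x, center, r):
    """G(x, x_m, r_m) = exp(-sum_p (x_p - x_mp)^2 / (2 r_m^2))."""
    return np.exp(-np.sum((x - center) ** 2) / (2.0 * r ** 2))

def rbf_output(x, centers, radii, weights):
    """S(x) = sum_{m=1}^{M} w_m G(x, x_m, r_m): the linear output layer
    summing the M Gaussian hidden units."""
    return sum(w * gaussian_rbf(x, c, r)
               for w, c, r in zip(weights, centers, radii))
```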
Modular Networks
Hierarchical structure.
Each sub-network is an expert system.
The decision module classifies the input vector as a particular class when
$$Y_{\mathrm{net}} = \arg\max_j\, y_j,$$
or takes a linear combination of the $y_i$'s; see the sketch after the next slide.
[Figure: expert sub-networks $E_1, E_2, \dots, E_r$ process the input in parallel; their outputs $y_1, y_2, \dots, y_r$ feed a decision module that produces $Y_{\mathrm{net}}$.]

Sub-network Architecture
Feedforward architecture.
Backward propagation of errors in learning.
Each sub-network is specialized in one particular class.
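A minimal sketch of the decision module described above: each expert sub-network scores the input, and the module takes either the arg-max or a linear combination. The expert callables here are toy placeholders, not the trained sub-networks from the slides:

```python
import numpy as np

def modular_decision(x, experts, weights=None):
    """Fuse expert outputs: Y_net = arg max_j y_j when weights is None,
    otherwise a linear combination of the y_i's."""
    y = np.array([expert(x) for expert in experts])  # one score per expert
    if weights is None:
        return int(np.argmax(y))
    return float(np.dot(weights, y))

# Toy experts: each scores by (negative) distance to its class prototype.
# The c=c default argument freezes each prototype inside its lambda.
prototypes = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
experts = [lambda x, c=c: -np.linalg.norm(x - c) for c in prototypes]
label = modular_decision(np.array([4.5, 5.2]), experts)  # -> 1
```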
Feature Selection
Sequential forward selection (SFS) for feature subset construction.
General regression neural network (GRNN) for evaluating the relevancy of each subset.
The approach is at least piece-wise linear.

Sequential Forward Selection
SFS generates a sequence of nested feature subsets
$$F_1 \subset F_2 \subset \cdots \subset F_m \subset \cdots \subset F_N.$$
Construct the reduced-dimension training set $(\mathbf{x}_{F_m,p}, y_p)$, $\mathbf{x}_{F_m,p} \in \mathbb{R}^m$, from the original training set $(\mathbf{x}_p, y_p)$, $\mathbf{x}_p \in \mathbb{R}^N$, $p = 1, \dots, P$.
Adopt a suitable error measure, here the mean squared error of the GRNN estimate $g(\cdot)$:
$$E_{F_m} = \frac{1}{P} \sum_{p=1}^{P} \left[\, y_p - g(\mathbf{x}_{F_m,p}) \,\right]^2$$
New Subset Construction
Construct $F_{m+1}$ from $F_m = \{i_1, \dots, i_m\}$ and the remaining candidate features $\{i_{m+1}, \dots, i_N\}$.
Form the new candidate subsets as follows:
$$G_{m+1,j} = F_m \cup \{i_j\}, \quad j = m+1, \dots, N.$$
The subset $F_{m+1}$ is selected as follows:
$$F_{m+1} = G_{m+1,j^*}, \quad j^* = \arg\min_j E_{G_{m+1,j}}.$$

General Regression Neural Network (GRNN)
GRNN is a special example of a radial basis function (RBF) network.
No iterative training procedures are required.
Each Gaussian kernel is associated with a training pattern $(\mathbf{x}_p, y_p)$, $p = 1, \dots, P$.
The input vector $\mathbf{x}_p$ is assigned as the center of the kernel.
Award-winning work!
A sketch of the GRNN estimator and the full SFS loop follows below.
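A sketch of the GRNN estimator and the SFS loop under the definitions above. The leave-one-out evaluation is an assumption added here (a GRNN scored on its own training patterns would report a near-zero error), and the bandwidth sigma is an illustrative free parameter:

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma=0.5):
    """GRNN estimate: one Gaussian kernel per training pattern (x_p, y_p),
    centered on x_p; no iterative training is required."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.dot(k, y_train) / (np.sum(k) + 1e-12)

def sfs(X, y, n_select, sigma=0.5):
    """Sequential forward selection: grow F_{m+1} = F_m U {i_j*},
    where j* minimizes the mean squared GRNN error E."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select:
        errors = {}
        for j in remaining:
            cols = selected + [j]
            # Leave-one-out GRNN error on the reduced-dimension training set.
            preds = [grnn_predict(X[p, cols],
                                  np.delete(X[:, cols], p, axis=0),
                                  np.delete(y, p), sigma)
                     for p in range(len(y))]
            errors[j] = np.mean((y - np.array(preds)) ** 2)
        best = min(errors, key=errors.get)  # j* = arg min_j E
        selected.append(best)
        remaining.remove(best)
    return selected
```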
Self-Organizing Map (SOM)
A self-organizing map is a neural network that learns without a teacher.
Self-organized learning tends to follow neurobiological structure to a much greater extent than supervised learning.
Self-organized learning consists of repeatedly modifying the weights in response to activation patterns, in accordance with prescribed rules, until a final configuration develops; a minimal training sketch follows after the next slide.
Essence of self-organization: global order arises from local interaction (Turing).
Three self-organization models:
Principal components analysis networks (linear).
Self-organizing maps (SOMs).
Information-theoretic models.

Self-Organizing Tree Map
The Self-Organizing Tree Map (SOTM) is a tree-structured self-organizing map.
It offers:
Independent learning based on the competitive learning technique.
A unique feature map that preserves topological ordering.
SOTM is more suitable than the conventional SOM and k-means when the input feature space is of high dimensionality.
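A minimal sketch of SOM training on a planar grid, assuming the usual competitive rule (find the best-matching unit, then pull it and its grid neighbours toward the input) with linearly decaying learning rate and neighbourhood width; all hyperparameters are illustrative:

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Unsupervised SOM training: weights are repeatedly modified in
    response to activation patterns until a configuration develops."""
    rng = np.random.default_rng(seed)
    H, W = grid
    weights = rng.standard_normal((H, W, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(H), np.arange(W),
                                  indexing="ij"), axis=-1)
    T, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            lr = lr0 * (1 - t / T)               # decaying learning rate
            sigma = sigma0 * (1 - t / T) + 1e-3  # shrinking neighbourhood
            d2 = np.sum((weights - x) ** 2, axis=2)
            bmu = np.unravel_index(np.argmin(d2), d2.shape)  # winner node
            g = np.exp(-np.sum((coords - bmu) ** 2, axis=2)
                       / (2 * sigma ** 2))       # neighbourhood function
            weights += lr * g[..., None] * (x - weights)
            t += 1
    return weights
```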
Self-Organizing Tree Map (SOTM): a specialized tool for data/pattern analysis
[Figure: side-by-side comparison. SOTM: no nodes converge to areas of zero data density. SOM: nodes converge to areas of zero data density.]

SOHVM vs. Fixed and Fuzzy Techniques
SOHVM: a special form of SOTM.
K-means.
FCM: Fuzzy C-Means.
GK: Gustafson-Kessel.
GG: Gath-Geva.
SOHVM self-determines the centres (N = 9); the other methods were initialised with N = 9.
2D Mapping of SOTM
[Figure, panels (A)-(F): DSOTM classification in a two-dimensional feature space.]
Spherical SOM [6]
Closed structure, no boundary problem.
3D: provides a step towards 3D analysis and visualization in an immersive environment (e.g., in a CAVE).
[Figure: the conventional SOM has open boundaries; the SSOM, being closed, does not.]

SSOM for Rendering
Use a combination of spatial features and local features (intensity, gradient magnitude, X, Y, Z location).
Train the spherical SOM.
Visualize the SOM with colors mapped to cluster densities (U-Matrix); a U-Matrix sketch follows below.
The user interacts with the map via simple selection/de-selection of nodes.
An RGBA texture is generated accordingly for immediate volume rendering.
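A sketch of the U-Matrix computation mentioned above, written for a planar 4-connected grid for simplicity; on a spherical SOM the neighbour set would follow the sphere's topology instead. High values mark cluster boundaries, low values dense clusters:

```python
import numpy as np

def u_matrix(weights):
    """U-Matrix: mean distance between each node's codebook vector and
    those of its 4-connected grid neighbours."""
    H, W, _ = weights.shape
    u = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            dists = [np.linalg.norm(weights[i, j] - weights[ni, nj])
                     for ni, nj in ((i - 1, j), (i + 1, j),
                                    (i, j - 1), (i, j + 1))
                     if 0 <= ni < H and 0 <= nj < W]
            u[i, j] = np.mean(dists)
    return u
```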
Sample Results
[Figure: full render, map selection, and the corresponding render.]

Deep Learning
Architecture: a feedforward NN with more than three layers (deep); massively complicated.
Motivation:
In theory, a three-layer FFNN can approximate any nonlinear function.
In practice, this is impossible due to multiple factors: architecture, size of training samples, etc.
More layers and more hidden neurons compensate for the abovementioned shortcomings.
The pros and cons:
Performance is extremely impressive in numerous applications, e.g., image recognition.
Lack of theoretical justification.
Architecture design and training of the free parameters are very difficult.
Why Deep Architectures
Inspired by nature: the mammalian visual cortex is hierarchical.
Deep machines are more efficient for representing certain classes of functions, particularly those involved in visual recognition.
The recognition pathway in the visual cortex has multiple stages, with lots of intermediate representations.
Image courtesy of Simon Thorpe.

A Popular Architecture of Deep Learning
Learning hierarchical feature representations.
Deep: more than one stage of non-linear feature transformation, as sketched below.
[Figure: a pipeline of trainable non-linear transformations yielding low-level, mid-level, and high-level features, followed by a trainable classifier. Feature visualization of a convolutional net trained on ImageNet, from Zeiler & Fergus 2013.]
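A minimal sketch of "more than one stage of non-linear feature transformation": each stage is a trainable linear map followed by a nonlinearity, and the successive intermediate representations play the role of low-, mid-, and high-level features. The weights here are random and untrained, purely for illustration:

```python
import numpy as np

def deep_features(x, layers):
    """Apply a stack of (W, b) stages; return every intermediate
    representation, i.e. the hierarchy of learned features."""
    feats, h = [], x
    for W, b in layers:
        h = np.maximum(0.0, W @ h + b)  # linear map + ReLU nonlinearity
        feats.append(h)
    return feats

# Three stages on a toy 8-dimensional input.
rng = np.random.default_rng(1)
dims = [8, 16, 16, 4]
layers = [(rng.standard_normal((dims[i + 1], dims[i])), np.zeros(dims[i + 1]))
          for i in range(3)]
low, mid, high = deep_features(rng.standard_normal(8), layers)
```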
Deep Learning: Feature Discovery

Deep Network Models
Energy-based model (EBM)
Restricted Boltzmann machine (RBM)
Deep belief network (DBN)
Higher-order Boltzmann machines (HOBM)
Recurrent neural network (RNN)
Convolutional neural network (CNN)
Image courtesy of Simon Thorpe.