
Introduction to Neural Networks and Deep Learning



1. Introduction to Neural Networks and Deep Learning
   Ling Guan

   References:
   1. S. Haykin, Neural Networks, 2nd Edition, New Jersey: Prentice Hall, 2004.
   2. C. M. Bishop, Neural Networks for Pattern Recognition, New York: Oxford University Press, 1995.
   3. I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, Massachusetts: MIT Press, 2016.

   Outline
   • Evolution of neurocomputing
   • Artificial neural networks
   • Feed forward neural networks
   • Radial basis functions (RBF): a feed forward neural net for classification
   • Modular neural net (MNN): divide and conquer
   • General regression neural network (GRNN): optimum feature selection
   • Self-organizing map (SOM): unsupervised data classification
   • Deep neural network

2. Evolution of NeuroComputing
   (Two timeline figures; image courtesy of DeView.)

3. Artificial Neural Networks
   • Nonlinear computing machines that mimic the functions of the human brain.
   • Machine learning perspective: learning processes.
   • Signal processing perspective: nonlinear filters.
   • Pattern recognition perspective: universal classifiers.
   • Statistical analysis perspective: non-parametric modeling and estimation.
   • Operational research perspective: generalized optimization procedures.
   • ...

   The Human Brain
   • Structure: massive, highly sparse, and modularized.
   • Key aspects in the study of neural nets: learning and architecture.

4. Learning in Neural Networks
   • The property of primary significance for a neural network is its ability
     - to learn from its environment, and
     - to improve its performance through learning.
   • Learning: a process by which the free parameters of a NN are adapted through stimulation by the environment in which the network is embedded.
   • The type of learning is determined by the manner in which the parameter changes take place.

   Feed Forward Neural Networks
   • Structurally, feed forward neural networks (FFNNs) have clearly defined inputs and outputs.
   • With hidden layers of units that use nonlinear activation functions, an FFNN can perform nonlinear information processing.
   • An FFNN is a universal approximation machine trained by supervised learning on numerical examples (a minimal forward-pass sketch follows this slide).
   • FFNNs are widely used in non-parametric statistical pattern recognition.
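The nonlinear-processing point above can be made concrete with a one-hidden-layer forward pass. This is only an illustrative NumPy sketch; the function name, layer sizes and tanh activation are assumptions, not taken from the slides:

```python
import numpy as np

def ffnn_forward(x, W1, b1, W2, b2):
    """One-hidden-layer feed-forward pass: nonlinear hidden layer, linear output.

    Illustrative sketch only; tanh is one possible nonlinear activation.
    """
    h = np.tanh(W1 @ x + b1)   # hidden layer with nonlinear activation
    return W2 @ h + b2         # linear output layer

# Toy usage: 4 inputs -> 8 hidden units -> 2 outputs (random, untrained weights)
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((2, 8)), np.zeros(2)
print(ffnn_forward(rng.standard_normal(4), W1, b1, W2, b2))
```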

5. Radial-Basis Function Networks
   A radial-basis function (RBF) network consists of three layers:
   • The input layer is made up of source nodes connecting the RBF network to its environment.
   • The hidden layer provides a set of functions (radial-basis functions) that transform the feature vectors from the input space to the hidden space.
   • The output layer is linear, supplying the response of the network as a weighted sum of the hidden-layer outputs.

   RBF Network (cont.)
   The Gaussian-shaped RBF function is given by
   G_m(\mathbf{x}, \mathbf{x}_m^{(r)}, \sigma_m) = \exp\left( -\frac{\sum_{p=1}^{P} \left( x_p - x_{mp}^{(r)} \right)^2}{2 \sigma_m^2} \right)
   The summation of M Gaussian units yields a similarity function for the input vector x:
   S(\mathbf{x}) = \sum_{m=1}^{M} w_m \, G_m(\mathbf{x}, \mathbf{x}_m^{(r)}, \sigma_m)
   (A small numerical sketch follows this slide.)
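As a check on the two formulas above, here is a minimal NumPy sketch evaluating the Gaussian units G_m and their weighted sum S(x). The centres, widths and weights are made-up illustrative values, not taken from the slides:

```python
import numpy as np

def rbf_output(x, centers, sigmas, weights):
    """Evaluate S(x) = sum_m w_m * exp(-||x - c_m||^2 / (2 sigma_m^2)).

    centers, sigmas and weights would normally come from training;
    here they are illustrative placeholders.
    """
    d2 = np.sum((centers - x) ** 2, axis=1)    # squared distance to each centre
    g = np.exp(-d2 / (2.0 * sigmas ** 2))      # hidden-layer Gaussian activations G_m(x)
    return float(weights @ g)                  # linear output layer S(x)

# Toy usage with M = 3 Gaussian units in a 2-D input space
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
sigmas  = np.array([0.5, 0.5, 1.0])
weights = np.array([1.0, -0.5, 0.3])
print(rbf_output(np.array([0.8, 0.9]), centers, sigmas, weights))
```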

6. Modular Networks
   • Hierarchical structure: the input vector is dispatched to a set of expert sub-networks E_1, E_2, ..., E_r, whose outputs y_1, y_2, ..., y_r feed a decision module.
   • Each sub-network E_j is an expert system.
   • The decision module classifies the input vector as a particular class when
     y_net = \arg\max_j y_j,
     or uses a linear combination of the y_j's.
   (A small decision-module sketch follows this slide.)

   Sub-network Architecture
   • Feedforward architecture.
   • Backward propagation of errors in learning.
   • Each sub-network is specialized in one particular class.
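A minimal sketch of the decision module described above: each expert produces a score y_j and the module either takes arg max_j y_j or a linear combination of the scores. The expert functions here are toy stand-ins, not trained sub-networks:

```python
import numpy as np

def modular_decision(x, experts, combine="argmax", weights=None):
    """Decision module of a modular network (illustrative sketch).

    Each expert maps the input vector to a score y_j; the module either
    picks the winning class, y_net = argmax_j y_j, or forms a linear
    combination of the y_j.
    """
    y = np.array([expert(x) for expert in experts])   # one score per sub-network
    if combine == "argmax":
        return int(np.argmax(y))                      # winning class index
    return float(weights @ y)                         # linear combination of expert outputs

# Toy usage: three "experts", each just a dot product with a class prototype
prototypes = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.7, 0.7])]
experts = [lambda x, p=p: float(p @ x) for p in prototypes]
print(modular_decision(np.array([0.9, 0.2]), experts))   # -> 0
```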

7. Feature Selection
   • Sequential forward selection (SFS) for feature subset construction.
   • General regression neural network (GRNN) for evaluating the relevancy of each subset.
   • The approach is at least piece-wise linear.

   Sequential Forward Selection
   • SFS generates a sequence of nested subsets
     F_1 \subset \cdots \subset F_m \subset \cdots \subset F_N.
   • Construct the reduced-dimension training set \{(\mathbf{x}_{F_m,p}, y_p)\}, \mathbf{x}_{F_m,p} \in \mathbb{R}^m, from the original training set \{(\mathbf{x}_p, y_p)\}, \mathbf{x}_p \in \mathbb{R}^N.
   • Adopt a suitable error measure
     E_{F_m} = \frac{1}{P} \sum_{p=1}^{P} \left[ y_p - g(\mathbf{x}_{F_m,p}) \right]^2.
   (A procedural sketch of SFS follows this slide.)
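A procedural sketch of SFS as described above. The error function E_F is left as a caller-supplied callable (in the slides it would be the GRNN error); the toy error below is purely illustrative:

```python
def sequential_forward_selection(n_features, error_of_subset, n_select):
    """Sequential forward selection (sketch of the procedure above).

    error_of_subset(F) returns the error E_F of a candidate subset F;
    it is an assumed callable supplied by the caller (e.g. a GRNN error).
    """
    selected, remaining = [], list(range(n_features))
    for _ in range(n_select):
        # Try adding each remaining feature; keep the one with the lowest error
        j_best = min(remaining, key=lambda j: error_of_subset(selected + [j]))
        selected.append(j_best)
        remaining.remove(j_best)
    return selected

# Toy usage: pretend features 2 and 0 are the informative ones
toy_error = lambda F: 1.0 - 0.4 * (2 in F) - 0.3 * (0 in F) + 0.01 * len(F)
print(sequential_forward_selection(5, toy_error, 2))   # -> [2, 0]
```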

8. New Subset Construction
   • Construct F_{m+1} from F_m = \{i_1, \ldots, i_m\} and its complement \bar{F}_m = F - F_m = \{i_{m+1}, \ldots, i_N\}.
   • Form the new candidate subsets as follows:
     G_{m+1,j} = F_m \cup \{i_{m+j}\}, \quad j = 1, \ldots, N - m.
   • The subset F_{m+1} is selected as follows:
     F_{m+1} = G_{m+1,j^*}, \quad j^* = \arg\min_j E_{G_{m+1,j}}.

   General Regression Neural Network (GRNN)
   • GRNN is a special example of a radial basis function (RBF) network.
   • No iterative training procedures are required.
   • Each Gaussian kernel is associated with a training pattern (\mathbf{x}_p, y_p), p = 1, \ldots, P.
   • The input vector \mathbf{x}_p is assigned as the center of the kernel.
   • Award winning work!
   (A minimal GRNN prediction sketch follows this slide.)
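A minimal GRNN prediction sketch consistent with the bullets above: one Gaussian kernel per training pattern, centred at x_p, a single smoothing width, and no iterative training. The data and the sigma value are illustrative assumptions:

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma=0.5):
    """General regression neural network estimate at a query point x.

    One Gaussian kernel per training pattern (x_p, y_p), centred at x_p;
    sigma is the single smoothing parameter (0.5 is only an example default).
    """
    d2 = np.sum((X_train - x) ** 2, axis=1)    # squared distance to every kernel centre
    k = np.exp(-d2 / (2.0 * sigma ** 2))       # kernel activations
    return float(k @ y_train / np.sum(k))      # kernel-weighted average of the targets

# Toy usage: noisy samples of y = x^2 on [0, 1]
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(50, 1))
y = X[:, 0] ** 2 + 0.01 * rng.standard_normal(50)
print(grnn_predict(np.array([0.5]), X, y, sigma=0.1))   # close to 0.25
```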

9. Self-Organizing Map (SOM)
   • A self-organizing map is a neural network that learns without a teacher.
   • Self-organized learning tends to follow neurobiological structure to a much greater extent than supervised learning.
   • Self-organized learning consists of repeatedly modifying the weights
     - in response to activation patterns, and
     - in accordance with prescribed rules,
     until a final configuration develops (a one-step update sketch follows this slide).
   • Essence of self-organization: global order arises from local interaction (Turing).
   • Three self-organization models:
     - Principal component analysis networks (linear).
     - Self-organizing maps (SOMs).
     - Information-theoretic models.

   Self-Organizing Tree Map
   • The Self-Organizing Tree Map (SOTM) is a tree-structured self-organizing map.
   • It offers:
     - independent learning based on a competitive learning technique;
     - a unique feature map that preserves topological ordering.
   • SOTM is more suitable than the conventional SOM and k-means when the input feature space is of high dimensionality.
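A one-step sketch of the "repeatedly modifying the weights" rule for a SOM: find the winning node for an input, then pull the winner and its map neighbours toward that input. Grid shape, learning rate and neighbourhood width are illustrative choices, not values from the slides:

```python
import numpy as np

def som_step(weights, grid, x, lr=0.1, radius=1.0):
    """One competitive-learning update of a self-organizing map.

    weights: (n_nodes, dim) codebook vectors; grid: (n_nodes, 2) node
    coordinates on the map. Local interaction (winner and neighbours move
    toward the input) is what gradually produces the global ordering.
    """
    winner = np.argmin(np.sum((weights - x) ** 2, axis=1))    # best-matching unit
    grid_d2 = np.sum((grid - grid[winner]) ** 2, axis=1)      # distances on the map lattice
    h = np.exp(-grid_d2 / (2.0 * radius ** 2))                # neighbourhood function
    return weights + lr * h[:, None] * (x - weights)          # move codebooks toward x

# Toy usage: a 3x3 map trained on uniform random 2-D points, with shrinking radius
rng = np.random.default_rng(2)
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
W = rng.uniform(size=(9, 2))
for t in range(1000):
    W = som_step(W, grid, rng.uniform(size=2), lr=0.1, radius=max(0.3, 1.5 * (1 - t / 1000)))
```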

10. Self-Organizing Tree Map (SOTM): a specialized tool for data/pattern analysis
   (Comparison figure.)
   • SOTM: no nodes converge to areas of zero data density.
   • SOM: nodes converge to areas of zero data density.

   SOHVM vs. Fixed and Fuzzy Techniques
   • SOHVM: a special form of SOTM.
   • Compared methods: k-means; FCM (fuzzy c-means); GK (Gustafson-Kessel); GG (Gath-Geva).
   • SOHVM determines the centres itself (N = 9); the other methods are initialised with N = 9.

11. 2D Mapping of SOTM
   (Figure, panels A-F: DSOTM classification in a two-dimensional feature space.)

12. Spherical SOM [6]
   • Closed structure, so there is no boundary problem (the conventional SOM has open boundaries).
   • 3D: provides a step towards 3D analysis and visualization in an immersive environment (e.g. in a CAVE).

   SSOM for Rendering
   • Use a combination of spatial and local features (intensity, gradient magnitude, X, Y, Z location).
   • Train the spherical SOM.
   • Visualize the SOM with colors mapped to cluster densities (U-Matrix).
   • The user interacts with the map by simple selection/de-selection of nodes.
   • An RGBA texture is generated accordingly for immediate volume rendering.

13. Sample Results
   (Figures: full render; map selection; corresponding render.)

   Deep Learning
   • Architecture: a feedforward NN of more than three layers (deep), massively complicated.
   • Motivation:
     - In theory, a three-layer FFNN can approximate any nonlinear function.
     - In practice this is impossible due to multiple factors: architecture, size of the training set, etc.
     - More layers and more hidden neurons compensate for the abovementioned shortcomings (a deep forward-pass sketch follows this slide).
   • Pros and cons:
     - Performance is extremely impressive in numerous applications, e.g. image recognition.
     - Lack of theoretical justification.
     - Architecture design and training of the free parameters are very difficult.
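To make the "more layers and more hidden neurons" point concrete, here is a sketch of a forward pass through a deep stack of fully connected layers. The layer count, widths and ReLU activation are illustrative assumptions:

```python
import numpy as np

def deep_forward(x, layers):
    """Forward pass through a deep stack of (W, b) layers.

    Each hidden stage applies a ReLU nonlinearity (one possible choice);
    the final layer is a linear read-out.
    """
    for W, b in layers[:-1]:
        x = np.maximum(0.0, W @ x + b)   # nonlinear feature transformation
    W, b = layers[-1]
    return W @ x + b                     # linear output layer

# Toy usage: a 5-layer network with widths 16 -> 32 -> 32 -> 32 -> 10
rng = np.random.default_rng(3)
widths = [16, 32, 32, 32, 10]
layers = [(rng.standard_normal((o, i)) * 0.1, np.zeros(o))
          for i, o in zip(widths[:-1], widths[1:])]
print(deep_forward(rng.standard_normal(16), layers).shape)   # (10,)
```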

14. Why Deep Architectures
   • Inspired by nature: the mammalian visual cortex is hierarchical.
   • Deep machines are more efficient for representing certain classes of functions, particularly those involved in static visual recognition:
     - the recognition pathway in the visual cortex has multiple stages;
     - lots of intermediate representations.
   (Image courtesy of Simon Thorpe.)

   A Popular Architecture of Deep Learning
   • Learning hierarchical feature representations.
   • Deep: more than one stage of non-linear feature transformation.
   • Pipeline: Low-Level Feature -> Mid-Level Feature -> High-Level Feature -> Trainable Classifier, with a trainable transformation between successive stages.
   (Feature visualization of a convolutional net trained on ImageNet [Zeiler & Fergus 2013].)

15. Deep Learning: Feature Discovery
   (Figure.)

   Deep Network Models
   • Energy-based model (EBM)
   • Restricted Boltzmann machine (RBM)
   • Deep belief network (DBN)
   • Higher-order Boltzmann machines (HOBM)
   • Recurrent neural network (RNN)
   • Convolutional neural network (CNN)
   (Image courtesy of Simon Thorpe.)
