Neural Networks: Introduction (Machine Learning)
Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others
Where are we?
Learning algorithms:
• Decision Trees
• Perceptron
• AdaBoost
• Support Vector Machines
• Naïve Bayes
• Logistic Regression
  (these produce linear classifiers)
• Bayesian Learning
General learning principles:
• Overfitting
• Mistake-bound learning
• PAC learning, sample complexity
• Hypothesis choice & VC dimensions
• Training and generalization errors
• Regularized Empirical Loss Minimization
Neural Networks
• What is a neural network?
• Predicting with a neural network
• Training neural networks
• Practical concerns
This lecture
• What is a neural network?
  – The hypothesis class
  – Structure, expressiveness
• Predicting with a neural network
• Training neural networks
• Practical concerns
We have seen linear threshold units
Prediction: sgn(wᵀx + b) = sgn(∑ᵢ wᵢxᵢ + b), i.e. a threshold applied to the dot product of the weights with the input features
Learning: various algorithms (perceptron, SVM, logistic regression, …); in general, minimize a loss
But where do these input features come from? What if the features were outputs of another classifier?
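As a concrete illustration of the prediction rule (a minimal NumPy sketch; the weights, bias, and input below are made-up values):

```python
import numpy as np

def linear_threshold_unit(w, x, b):
    """Prediction of a linear threshold unit: sgn(w^T x + b)."""
    return np.sign(np.dot(w, x) + b)

# Made-up weights, bias, and input features
w = np.array([0.5, -1.0, 2.0])
b = 0.1
x = np.array([1.0, 0.0, 1.0])

print(linear_threshold_unit(w, x, b))  # 1.0, since 0.5*1 + (-1)*0 + 2*1 + 0.1 > 0
```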
Features from classifiers
Each of these connections has its own weights as well.
This is a two-layer feed-forward neural network, with an input layer, a hidden layer, and an output layer.
Think of the hidden layer as learning a good representation of the inputs.
The dot product followed by the threshold constitutes a neuron. There are five neurons in this picture: four in the hidden layer and one in the output layer.
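A minimal sketch of the forward pass of such a network, assuming NumPy and sign activations; the layer sizes mirror the picture (four hidden neurons, one output neuron), but the specific weights are arbitrary:

```python
import numpy as np

def forward(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass of a two-layer feed-forward network with sign activations.

    Every neuron computes sgn(w^T input + b): four hidden neurons first,
    then one output neuron applied to their outputs.
    """
    h = np.sign(W_hidden @ x + b_hidden)   # hidden layer: 4 neurons
    return np.sign(w_out @ h + b_out)      # output layer: 1 neuron

# Arbitrary parameters: 3 inputs, 4 hidden neurons, 1 output neuron
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(4, 3))
b_hidden = rng.normal(size=4)
w_out = rng.normal(size=4)
b_out = 0.0

print(forward(np.array([1.0, -2.0, 0.5]), W_hidden, b_hidden, w_out, b_out))
```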
But where do the inputs to the input layer come from? What if the inputs were themselves the outputs of a classifier? Then we can make a three-layer network, and so on.
Let us try to formalize this
Neural networks
• A robust approach for approximating real-valued, discrete-valued, or vector-valued functions
• Among the most effective general-purpose supervised learning methods currently known, especially for complex and hard-to-interpret data such as real-world sensory data
• The backpropagation algorithm for neural networks has been shown to be successful in many practical problems, across various application domains
Artificial neurons
Functions that very loosely mimic a biological neuron.
A neuron accepts a collection of inputs (a vector x) and produces an output by:
1. Applying a dot product with weights w and adding a bias b
2. Applying a (possibly non-linear) transformation called an activation
output = activation(wᵀx + b)
In the picture, the activation applied to the dot product wᵀx + b is a threshold; other activations are possible.
Activation functions
Also called transfer functions: output = activation(wᵀx + b)
Name of the neuron and its activation function activation(z):
– Linear unit: z
– Threshold/sign unit: sgn(z)
– Sigmoid unit: 1 / (1 + exp(−z))
– Rectified linear unit (ReLU): max(0, z)
– Tanh unit: tanh(z)
Many more activation functions exist (sinusoid, sinc, gaussian, polynomial, …)
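These activations are straightforward to write down; here is a minimal NumPy sketch (the function names and example values are mine, not from the slides):

```python
import numpy as np

def linear(z):
    return z

def threshold(z):
    return np.sign(z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def neuron(w, x, b, activation=sigmoid):
    """output = activation(w^T x + b); swap in any of the activations above."""
    return activation(np.dot(w, x) + b)

# Example: the same pre-activation passed through different activations
w, b, x = np.array([1.0, -1.0]), 0.5, np.array([2.0, 1.0])
for act in (linear, threshold, sigmoid, relu, np.tanh):
    print(act.__name__, neuron(w, x, b, activation=act))
```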
A neural network
A function that converts inputs to outputs, defined by a directed acyclic graph:
– Nodes, organized in layers (input, hidden, output), correspond to neurons
– Edges carry the output of one neuron to another; each edge is associated with a weight
To define a neural network, we need to specify:
– The structure of the graph (how many nodes, the connectivity): called the architecture of the network, typically predefined as part of the design of the classifier
– The activation function on each node
– The edge weights: learned from data
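Putting the pieces together, a minimal sketch of how an architecture (layer sizes and activation) plus edge weights determine the input-to-output function; the class name, layer sizes, and random initialization here are illustrative assumptions, and the weights would in practice be learned from data:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FeedForwardNetwork:
    """Fully connected feed-forward network.

    The architecture (layer sizes, activation) is fixed by design;
    the weights and biases are the parameters to be learned from data.
    """
    def __init__(self, layer_sizes, activation=sigmoid, seed=0):
        rng = np.random.default_rng(seed)
        self.activation = activation
        # One weight matrix and bias vector per layer of edges in the graph
        self.weights = [rng.normal(scale=0.1, size=(m, n))
                        for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
        self.biases = [np.zeros(m) for m in layer_sizes[1:]]

    def forward(self, x):
        for W, b in zip(self.weights, self.biases):
            x = self.activation(W @ x + b)
        return x

# Hypothetical architecture: 3 inputs, one hidden layer of 4 units, 1 output
net = FeedForwardNetwork([3, 4, 1])
print(net.forward(np.array([1.0, -2.0, 0.5])))
```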
A very brief history of neural networks
• 1943: McCulloch and Pitts showed how linear threshold units can compute logical functions
• 1949: Hebb suggested a learning rule that has some physiological plausibility
• 1950s: Rosenblatt proposed the Perceptron algorithm for a single threshold neuron
• 1969: Minsky and Papert studied the neuron from a geometrical perspective
• 1980s: Convolutional neural networks (Fukushima, LeCun), the backpropagation algorithm (various)
• Early 2000s to today: More compute, more data, deeper networks
See also: http://people.idsia.ch/~juergen/deep-learning-overview.html
What functions do neural networks express?
A single neuron with threshold activation
Prediction = sgn(b + w₁x₁ + w₂x₂)
The decision boundary b + w₁x₁ + w₂x₂ = 0 is a line separating the positively and negatively labeled points.
Two layers, with threshold activations: in general, convex polygons (intersections of halfspaces).
Figure from Shai Shalev-Shwartz and Shai Ben-David, 2014
Three layers, with threshold activations: in general, unions of convex polygons.
Figure from Shai Shalev-Shwartz and Shai Ben-David, 2014
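A small sketch of why threshold layers give convex polygons and unions of them, assuming 0/1 threshold units (equivalent in expressiveness to sgn) and made-up halfspaces: the first layer tests halfspaces, a second-layer unit ANDs them into a polygon indicator, and the output unit ORs the polygons.

```python
import numpy as np

def step(z):
    """0/1 threshold unit."""
    return 1.0 if z > 0 else 0.0

def in_union_of_polygons(x, polygons):
    """Three-layer threshold network.

    Each polygon is a list of halfspaces (w, b); a point is inside the
    polygon iff w^T x + b > 0 for all of them (an AND of threshold units),
    and the output is an OR over the polygon indicators.
    """
    polygon_bits = []
    for halfspaces in polygons:
        # Layer 1: one threshold unit per halfspace
        fired = [step(np.dot(w, x) + b) for w, b in halfspaces]
        # Layer 2: AND of k units -> weights 1, bias -(k - 0.5)
        polygon_bits.append(step(sum(fired) - (len(fired) - 0.5)))
    # Layer 3: OR of the polygon indicators -> weights 1, bias -0.5
    return step(sum(polygon_bits) - 0.5)

# Made-up example: the unit square as an intersection of 4 halfspaces
square = [(np.array([1.0, 0.0]), 0.0),   # x1 > 0
          (np.array([-1.0, 0.0]), 1.0),  # x1 < 1
          (np.array([0.0, 1.0]), 0.0),   # x2 > 0
          (np.array([0.0, -1.0]), 1.0)]  # x2 < 1

print(in_union_of_polygons(np.array([0.5, 0.5]), [square]))  # 1.0 (inside)
print(in_union_of_polygons(np.array([2.0, 0.5]), [square]))  # 0.0 (outside)
```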
Neural networks are universal function approximators
• Any continuous function can be approximated to arbitrary accuracy using one hidden layer of sigmoid units [Cybenko 1989]
• The approximation error is insensitive to the choice of activation function [DasGupta et al 1993]
• Two-layer threshold networks can express any Boolean function
  – Exercise: Prove this
• VC dimension of a threshold network with edges E: VC = O(|E| log |E|)
• VC dimension of sigmoid networks with nodes V and edges E:
  – Upper bound: O(|V|² |E|²)
  – Lower bound: Ω(|E|²)
• Exercise: Show that if we have only linear units, then multiple layers do not change the expressiveness (see the illustration below)
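A quick numerical illustration related to the last exercise (not a proof; the matrices below are arbitrary): stacking purely linear layers collapses into a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # first linear layer
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)   # second linear layer

x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2        # two stacked linear layers
W, b = W2 @ W1, W2 @ b1 + b2                # equivalent single linear layer
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))   # True
```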