Neural Net Backpropagation 3/20/17
Recall: Limitations of Perceptrons
• AND and OR are linearly separable; XOR isn't.
What is the output of the network? It depends on the node's activation function f:
• Step: f(x) = 0 if x < 0; 1 if x ≥ 0
• Sigmoid: f(x) = 1 / (1 + e^{−x})
• ReLU: f(x) = 0 if x < 0; x if x ≥ 0
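A minimal sketch of computing one node's output under each of these activation functions. The inputs and weights here are hypothetical values for illustration; they are not the network from the slide's figure.

```python
import math

def step(x):
    return 0.0 if x < 0 else 1.0        # threshold: 0 for x < 0, 1 for x >= 0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))   # smooth squash into (0, 1)

def relu(x):
    return 0.0 if x < 0 else x          # 0 for x < 0, identity for x >= 0

# Hypothetical inputs and weights (not from the slides); x[0] = 1 acts as a bias input.
x = [1.0, 2.0, 0.5]
w = [0.5, -1.0, 2.0]

net = sum(wi * xi for wi, xi in zip(w, x))   # weighted sum w . x
for f in (step, sigmoid, relu):
    print(f.__name__, f(net))
```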
How can we train these networks?
Two reasons the perceptron algorithm won't work:
1. Non-threshold activation functions.
2. Multiple layers (what's the correction for hidden nodes?).
Key idea: stochastic gradient descent (SGD), sketched below.
• Compute the error on a random training example.
• Compute the derivative of the error with respect to each weight.
• Update weights in the direction that reduces error.
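A minimal sketch of the SGD loop just described. The function `loss_gradient(w, x, t)` is a hypothetical placeholder that returns ∂E/∂w for one example; the following slides work out what that gradient actually is.

```python
import random

def sgd(weights, data, loss_gradient, alpha=0.5, training_runs=10):
    """Generic SGD: update the weights one random example at a time."""
    for _ in range(training_runs):
        random.shuffle(data)
        for x, t in data:                          # one (input, target) pair
            grad = loss_gradient(weights, x, t)    # dE/dw_i for this example
            for i in range(len(weights)):
                weights[i] -= alpha * grad[i]      # step against the gradient
    return weights
```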
Problem: SGD on threshold functions
• The derivative of the threshold function is 0 everywhere it is defined (and undefined at 0).
• We can't "move in the direction of the gradient".
Better Activation Functions
• Sigmoid: σ(x) = 1 / (1 + e^{−x})
• Tanh: tanh(x) = (1 − e^{−2x}) / (1 + e^{−2x})
• ReLU: ReLU(x) = 0 if x < 0; x if x ≥ 0
Derivatives of Activation Functions
• Sigmoid: σ(x) = 1 / (1 + e^{−x}), so dσ(x)/dx = σ(x)(1 − σ(x))
• Tanh: tanh(x) = (1 − e^{−2x}) / (1 + e^{−2x}), so d tanh(x)/dx = 1 − tanh²(x)
• ReLU: ReLU(x) = 0 if x < 0; x if x ≥ 0, so d ReLU(x)/dx = 0 if x ≤ 0; 1 if x > 0
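A short sketch of these activations and their derivatives, matching the formulas above. The convention that dReLU/dx = 0 at x = 0 follows the slide.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)                 # sigma(x) * (1 - sigma(x))

def tanh(x):
    return (1.0 - math.exp(-2 * x)) / (1.0 + math.exp(-2 * x))

def d_tanh(x):
    return 1.0 - tanh(x) ** 2            # 1 - tanh^2(x)

def relu(x):
    return 0.0 if x < 0 else x

def d_relu(x):
    return 0.0 if x <= 0 else 1.0        # slide's convention: derivative 0 at x = 0
```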
Error Gradient
• Define the training error on one example as the squared difference between the node's output o and the target t:
E(w, x) = (t − o)²
• Compute the gradient of the error with respect to the weights. For a sigmoid output node, o = 1 / (1 + e^{−w·x}) with w·x = Σ_i w_i x_i. Working through the chain rule (shown below; the constant factor 2 gets absorbed into the learning rate):
∂E/∂w_i = −o(1 − o)(t − o) x_i
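The "algebra ensues" step written out. This is the standard chain-rule derivation; the factor of 2 is dropped at the end since it can be folded into the learning rate α.

```latex
\begin{align*}
E &= (t - o)^2, \qquad o = \sigma(\mathbf{w}\cdot\mathbf{x}), \qquad
   \mathbf{w}\cdot\mathbf{x} = \sum_i w_i x_i \\
\frac{\partial E}{\partial w_i}
  &= 2(t - o)\,\frac{\partial (t - o)}{\partial w_i}
   = -2(t - o)\,\frac{\partial o}{\partial w_i} \\
  &= -2(t - o)\,\sigma'(\mathbf{w}\cdot\mathbf{x})\,
     \frac{\partial(\mathbf{w}\cdot\mathbf{x})}{\partial w_i}
   = -2(t - o)\,o(1 - o)\,x_i \\
  &\propto -\,o(1 - o)(t - o)\,x_i
\end{align*}
```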
Output Node Gradient Descent Step (sigmoid node, α = 0.5)
w_i += −α ∂E/∂w_i, i.e. w_i += α o(1 − o)(t − o) x_i
With o = 0.7, t = 0.9, x_0 = 2, x_1 = 1.2:
w_0 += 0.5 · 0.7(1 − 0.7)(0.9 − 0.7) · 2 = 0.042 → w_0 = 1.04
w_1 += 0.5 · 0.7(1 − 0.7)(0.9 − 0.7) · 1.2 = 0.025 → w_1 = −0.97
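A quick numeric check of this update. The starting weights w_0 = 1.0 and w_1 = −1.0 are an assumption consistent with the slide's results; the actual values come from a figure that isn't in the text.

```python
alpha, o, t = 0.5, 0.7, 0.9
x = [2.0, 1.2]
w = [1.0, -1.0]     # assumed starting weights (from the slide's figure, not the text)

delta = o * (1 - o) * (t - o)            # output-node error term
for i in range(len(w)):
    w[i] += alpha * delta * x[i]         # w_i += alpha * o(1-o)(t-o) * x_i

print([round(wi, 2) for wi in w])        # [1.04, -0.97], matching the slide
```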
What about hidden layers?
• Use the chain rule to compute error derivatives for previous layers.
• This turns out to be much easier than it sounds.
Let δ_k be the error term we computed for output node k (sigmoid): δ_k = o_k(1 − o_k)(t_k − o_k)
The error for hidden node h comes from summing its contributions to the errors of the output nodes: Σ_{k ∈ output} w_hk δ_k
Hidden Node Gradient Descent Step
• Compute the contribution to next-layer errors:
δ_h = o_h(1 − o_h) Σ_{k ∈ next layer} w_hk δ_k
• Update incoming weights using δ_h as the error term:
w_i += α δ_h x_i
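A minimal sketch of this step for a single hidden node. The node output, inputs, weights, and next-layer deltas below are hypothetical values chosen only to illustrate the two formulas.

```python
alpha = 0.5
o_h = 0.6                      # hidden node's output (hypothetical)
x = [1.0, 0.4]                 # inputs feeding the hidden node (hypothetical)
w_in = [0.2, -0.3]             # incoming weights to the hidden node (hypothetical)
w_hk = [0.8, -0.5]             # weights from hidden node h to output nodes k (hypothetical)
delta_k = [0.042, -0.01]       # error terms already computed for the output nodes

# delta_h = o_h (1 - o_h) * sum_k w_hk * delta_k
delta_h = o_h * (1 - o_h) * sum(w * d for w, d in zip(w_hk, delta_k))

# Update each incoming weight: w_i += alpha * delta_h * x_i
for i in range(len(w_in)):
    w_in[i] += alpha * delta_h * x[i]
```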
Backpropagation Algorithm

for run in 1:training_runs:
    for example in shuffled training data:
        run example through network
        compute error for each output node
        for each layer (starting from output):
            for each node in layer:
                gradient descent update on incoming weights
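A compact, runnable sketch of this algorithm for a network with one hidden layer of sigmoid nodes. The layer sizes, learning rate, random initialization, and XOR-style training data are assumptions for illustration, not from the slides.

```python
import random
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(data, n_in, n_hidden, n_out, alpha=0.5, training_runs=5000, seed=0):
    rng = np.random.default_rng(seed)
    # Weight matrices; the extra column in each holds a bias weight.
    W_h = rng.uniform(-1, 1, (n_hidden, n_in + 1))    # input  -> hidden
    W_o = rng.uniform(-1, 1, (n_out, n_hidden + 1))   # hidden -> output

    for _ in range(training_runs):
        random.shuffle(data)                          # shuffled training data
        for x, t in data:
            # Run the example through the network (forward pass).
            x_b = np.append(x, 1.0)                   # append bias input
            o_h = sigmoid(W_h @ x_b)                  # hidden-layer outputs
            o_h_b = np.append(o_h, 1.0)               # append bias for next layer
            o = sigmoid(W_o @ o_h_b)                  # output-layer outputs

            # Error term for each output node: o(1-o)(t-o).
            delta_o = o * (1 - o) * (t - o)
            # Error term for each hidden node: o_h(1-o_h) * sum_k w_hk * delta_k.
            delta_h = o_h * (1 - o_h) * (W_o[:, :-1].T @ delta_o)

            # Gradient descent update on incoming weights, layer by layer.
            W_o += alpha * np.outer(delta_o, o_h_b)   # w_i += alpha * delta * x_i
            W_h += alpha * np.outer(delta_h, x_b)
    return W_h, W_o

# Hypothetical usage: learn XOR (data and sizes are not from the slides).
xor_data = [(np.array([0., 0.]), np.array([0.])),
            (np.array([0., 1.]), np.array([1.])),
            (np.array([1., 0.]), np.array([1.])),
            (np.array([1., 1.]), np.array([0.]))]
W_h, W_o = train(xor_data, n_in=2, n_hidden=2, n_out=1)
```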
Example Backpropagation Update
σ(x) = 1 / (1 + e^{−x})
Output node: w_i += α o(1 − o)(t − o) x_i
Hidden node: δ_h = o_h(1 − o_h) Σ_{k ∈ next layer} w_hk δ_k