Neural Networks with Google’s TensorFlow Shuo Zhang Computational discourse analysis 11/22/16
Overview 1. Neural Networks basics 2. Neural Networks specifics 3. Neural Networks with Google's TensorFlow 4. Coreference: Singleton classification example
Resources • Deep Learning course (Google) @ Udacity • Machine Learning course (Stanford, Andrew Ng) @ Coursera • Neural Networks course (Geoffrey Hinton) @ Coursera
1. NN basics
From linear to non-linear classifier
Pros and cons of linear models Pros: • Fast • Numerically stable • Derivative is constant Cons: • Limited to modeling additive features • Multiplicative or higher-order features lead to a huge parameter space, not suitable for non-linear mapping Conclusion: We want to keep the parameters within linear functions but still be able to do non-linear mapping efficiently.
From logistic regression to neural networks
Inserting a non-linear layer: Rectified Linear Unit (ReLU)
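A minimal NumPy sketch (with hypothetical layer sizes, not values from the talk) of what inserting the non-linearity does: a linear layer, then a ReLU, then another linear layer.

import numpy as np

def relu(z):
    # keep positive values, zero out negatives
    return np.maximum(0, z)

# hypothetical sizes: 784 input features, 128 hidden units, 10 classes
W1 = 0.01 * np.random.randn(784, 128); b1 = np.zeros(128)
W2 = 0.01 * np.random.randn(128, 10);  b2 = np.zeros(10)

def forward(x):
    h = relu(x @ W1 + b1)   # linear layer followed by the non-linear ReLU
    return h @ W2 + b2      # second linear layer produces the logits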
Intuition: how NN makes non-linear mapping possible
Types of neural networks • Feed-forward • Feedback • Self-Organizing Map (SOM) • …
2. NN specifics
Multinomial logistic regression as the basic unit in NN
Softmax – turn outputs of linear functions into probability vectors
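A minimal NumPy sketch of the softmax (the example logits are made up):

import numpy as np

def softmax(logits):
    # subtract the max for numerical stability, then normalize to sum to 1
    z = logits - np.max(logits)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.659 0.242 0.099]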
One-hot encoding
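A tiny sketch of one-hot encoding (the class index and number of classes are made up):

import numpy as np

def one_hot(label, num_classes):
    # a vector of zeros with a single 1 at the position of the gold class
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

print(one_hot(2, 4))  # [0. 0. 1. 0.]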
Cross entropy – measuring similarity between prediction and gold label
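A minimal NumPy sketch of the cross-entropy between a predicted probability vector and a one-hot gold label (the values are made up):

import numpy as np

def cross_entropy(prediction, gold_one_hot):
    # D(S, L) = -sum_i L_i * log(S_i); small only when the predicted
    # probability mass sits on the gold class
    return -np.sum(gold_one_hot * np.log(prediction + 1e-12))

print(cross_entropy(np.array([0.7, 0.2, 0.1]), np.array([1.0, 0.0, 0.0])))  # about 0.357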
Putting it together again
MLR to NN
ReLU – a non-linear activation function to put in the hidden layer ReLU is one of many choices of a non-linear activation function. https://en.wikipedia.org/wiki/ Activation_function
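A few common choices side by side (NumPy sketch; ReLU is only one option among these):

import numpy as np

relu    = lambda z: np.maximum(0, z)            # max(0, z)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))    # squashes to (0, 1)
tanh    = np.tanh                               # squashes to (-1, 1)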
Training a neural network • Basically similar to training a linear model: optimize a cost function using a method like gradient descent (though for a NN the cost is no longer convex) • Example cost function for a logistic-based activation (see the formula below)
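The formula on the slide is not reproduced in this transcript; assuming it is the standard logistic (cross-entropy) cost over m training examples, it can be written as:

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right], \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^\top x}}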
Cost function – this is universal for linear classifiers and NNs • The cost function is a function of the parameters that captures the difference between the predicted and gold labels, so we want to minimize it. • How to minimize? Use gradient descent: at each iteration, adjust the weights. • How to adjust the weights? Subtracting the gradient (scaled by a learning rate) moves you toward a minimum.
Gradient descent • Keep in mind that W is a matrix, so we need to compute the partial derivative of the cost with respect to each element of W, summed over the training examples (see the sketch below).
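A sketch of the update rule only; the shapes, the learning rate, and the gradient values below are hypothetical stand-ins, not values from the talk:

import numpy as np

# hypothetical shapes; grad_W / grad_b stand in for the computed derivatives
W = np.zeros((784, 10)); b = np.zeros(10)
grad_W = np.ones_like(W); grad_b = np.ones_like(b)

alpha = 0.5                    # learning rate
W = W - alpha * grad_W         # the same rule is applied element-wise to every entry of W
b = b - alpha * grad_b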
Gradient Descent flavors • Batch GD: classic approach, sums the derivatives over all training examples at each iteration to perform one update to the weights; very slow but more stable, almost never used today • Stochastic GD: takes only one example at each iteration and uses the gradient computed from that example to adjust the weights; fast, but less stable behavior • Mini-batch GD (in between): takes a mini-batch of examples (such as 100 to 2000) and sums up their derivatives to perform the update; balances stability and speed (also good results), the most used today (see the sketch below)
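A minimal mini-batch loop sketch. The toy data, the linear softmax model, and the hyperparameter values are all hypothetical, just to make the loop self-contained:

import numpy as np

# toy data and a linear softmax model
num_features, num_classes = 20, 3
train_X = np.random.randn(1000, num_features)
train_y = np.eye(num_classes)[np.random.randint(num_classes, size=1000)]  # one-hot labels
W = np.zeros((num_features, num_classes)); b = np.zeros(num_classes)

def compute_gradients(W, b, X, Y):
    # average the softmax cross-entropy gradient over the mini-batch
    logits = X @ W + b
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    d_logits = (probs - Y) / len(X)
    return X.T @ d_logits, d_logits.sum(axis=0)

learning_rate, batch_size, num_steps = 0.5, 128, 1001   # hypothetical hyperparameters
for step in range(num_steps):
    idx = np.random.choice(len(train_X), batch_size, replace=False)   # random mini-batch
    grad_W, grad_b = compute_gradients(W, b, train_X[idx], train_y[idx])
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b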
Neural network training: forward/backward propagation Intuition from the linear classifier: Repeat: • Compute an output • Compute error • Adjust weights (my implementation in Octave)
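The author's Octave implementation is not reproduced in this transcript; below is a minimal NumPy sketch of the same repeat-compute-error-adjust step for one hidden ReLU layer with a softmax output (shapes and the learning rate are hypothetical):

import numpy as np

def train_step(W1, b1, W2, b2, x, y_one_hot, alpha=0.1):
    # --- forward pass: compute an output ---
    z1 = x @ W1 + b1
    h = np.maximum(0, z1)                  # ReLU hidden layer
    logits = h @ W2 + b2
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                   # softmax output
    # --- backward pass: compute the error and propagate it back ---
    d_logits = probs - y_one_hot           # error at the output layer
    grad_W2, grad_b2 = np.outer(h, d_logits), d_logits
    d_z1 = (W2 @ d_logits) * (z1 > 0)      # ReLU derivative gates the error
    grad_W1, grad_b1 = np.outer(x, d_z1), d_z1
    # --- adjust the weights ---
    W1 -= alpha * grad_W1; b1 -= alpha * grad_b1
    W2 -= alpha * grad_W2; b2 -= alpha * grad_b2
    return W1, b1, W2, b2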
3. Neural Networks with Google's TensorFlow https://www.youtube.com/watch?v=oZikw5k_2FM
Setup https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html
Get started
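A minimal "get started" sketch in the graph-and-session style of the TensorFlow 0.x/1.x API used by the course; the feature and class counts, the initialization, and the learning rate are placeholders, not values from the talk:

import tensorflow as tf

# placeholders for a mini-batch of inputs and one-hot labels (hypothetical sizes)
x = tf.placeholder(tf.float32, shape=(None, 784))
y = tf.placeholder(tf.float32, shape=(None, 10))

# a single linear (softmax) layer
W = tf.Variable(tf.truncated_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())   # variable initializer in the r0.11-era API
    # each training step would feed a mini-batch, e.g.:
    # _, l = sess.run([train_op, loss], feed_dict={x: batch_x, y: batch_y})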
Hyperparameter tuning (loss curve) • Number of hidden nodes • Learning rate • Batch size • Number of steps • Overfitting
Google Udacity course example: notMNIST
Example code for the notMNIST dataset (Udacity) • https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/udacity (These IPython notebooks are only a partial implementation, since they are meant as assignments to be completed. For a complete implementation, refer to the .ipynb and HTML files I uploaded to the corpling server.)