Neural Networks Many slides attributable to: Prof. Mike Hughes - PowerPoint PPT Presentation

Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/ Neural Networks Many slides attributable to: Prof. Mike Hughes Erik Sudderth (UCI), Emily Fox (UW), Finale Doshi-Velez (Harvard) James, Witten, Hastie, Tibshirani (ISL/ESL books) 2

Objectives Today : Neural Networks day 10 • How to learn feature representations • Feed-forward neural nets • Single neuron = linear function + activation • Multi-layer perceptrons (MLPs) • Universal approximation • The Rise of Deep Learning: • Success stories on Images and Language Mike Hughes - Tufts COMP 135 - Fall 2020 3

What will we learn? Evaluation Supervised Training Learning Data, Label Pairs Performance { x n , y n } N measure Task n =1 Unsupervised Learning data label x y Reinforcement Learning Prediction Mike Hughes - Tufts COMP 135 - Fall 2020 4

Task: Binary Classification y is a binary variable Supervised (red or blue) Learning binary classification x 2 Unsupervised Learning Reinforcement Learning x 1 Mike Hughes - Tufts COMP 135 - Fall 2020 5

Example: Hotdog or Not https://www.theverge.com/tldr/2017/5/14/15639784/hbo- silicon-valley-not-hotdog-app-download Mike Hughes - Tufts COMP 135 - Fall 2020 6

Text Sentiment Classification Mike Hughes - Tufts COMP 135 - Fall 2020 7

Image Classification Mike Hughes - Tufts COMP 135 - Fall 2020 8

Feature Transform Pipeline Data, Label Pairs { x n , y n } N n =1 Feature, Label Pairs Performance Task { φ ( x n ) , y n } N measure n =1 label data φ ( x ) y x Mike Hughes - Tufts COMP 135 - Fall 2020 9

Predicted Probas vs Binary Labels Mike Hughes - Tufts COMP 135 - Fall 2020 10

Decision Boundary is Linear { x ∈ R 2 : σ ( w T ˜ x ) = 0 . 5 } ←→ { x ∈ R 2 : w T ˜ x = 0 } Mike Hughes - Tufts COMP 135 - Fall 2020 11

Logistic Regr. Network Diagram 0 or 1 Credit: Emily Fox (UW) https://courses.cs.washington.edu/courses/cse41 6/18sp/slides/ Mike Hughes - Tufts COMP 135 - Fall 2020 12

A “Neuron” or “Perceptron” Unit Non-linear Linear function activation with weights w function Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 13

“Inspired” by brain biology Slide Credit: Bhiksha Raj (CMU) Mike Hughes - Tufts COMP 135 - Fall 2020 14

Challenge: Find w for these functions X_1 X_2 y X_1 X_2 y 0 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 0 1 1 1 1 1 1 1 Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 15

Challenge: Find w for these functions X_1 X_2 y X_1 X_2 y 0 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 0 1 1 1 1 1 1 1 Mike Hughes - Tufts COMP 135 - Fall 2020 Credit: Emily Fox (UW) 16

What we can’t do with linear decision boundary classifiers X_1 X_2 y 0 0 0 0 1 1 1 0 1 1 1 0 Mike Hughes - Tufts COMP 135 - Fall 2020 17

Idea: Compose Neurons together! φ ( x ) y x transform classify Mike Hughes - Tufts COMP 135 - Fall 2020 18

Can you find w to solve XOR? AND/ ? ? OR ? ? ? ? ? ? ? ? ? Mike Hughes - Tufts COMP 135 - Fall 2020 19

Can you find w to solve XOR? ? ? ? ? ? ? ? ? ? Mike Hughes - Tufts COMP 135 - Fall 2020 20

Can you find w to solve XOR? Mike Hughes - Tufts COMP 135 - Fall 2020 21

1D Input + 3 hidden units Mike Hughes - Tufts COMP 135 - Fall 2020 22

1D Input + 3 hidden units Example functions (before final threshold) f ( x 1 ) f ( x 1 ) Intuition: Piece-wise step function Partitioning input space into regions Mike Hughes - Tufts COMP 135 - Fall 2020 23

MLPs can approximate any functions with enough hidden units! Mike Hughes - Tufts COMP 135 - Fall 2020 24

Neuron Design What’s wrong with hard step activation function? Non-linear Linear function activation with weights w function Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 25

Neuron Design What’s wrong with hard step activation function? Not smooth! Gradient is zero almost everywhere, so hard to train weights! Non-linear Linear function activation with weights w function Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 26

Which Activation Function? Non-linear Linear function activation with weights w function Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 27

Activation Functions Advice Credit: Emily Fox (UW) Mike Hughes - Tufts COMP 135 - Fall 2020 28

Exciting Applications: Computer Vision Mike Hughes - Tufts COMP 135 - Fall 2020 29

Object Recognition from Images Mike Hughes - Tufts COMP 135 - Fall 2020 30

Deep Neural Networks for Object Recognition Scores for each possible object category Decision: “ leopard ” Mike Hughes - Tufts COMP 135 - Fall 2020 31

Deep Neural Networks for Object Recognition Scores for each possible object category Decision: “ mushroom ” Mike Hughes - Tufts COMP 135 - Fall 2020 32

Each Layer Extracts “Higher Level” Features Mike Hughes - Tufts COMP 135 - Fall 2020 33

More layers = less error! ImageNet challenge 1000 categories, 1.2 million images in training set top 5 classification error (%) shallow 2010 2011 2012 2013 2014 2015 Credit: KDD Tutorial by Sun, Xiao, & Choi: http://dl4health.org/ Figure idea originally from He et. al., CVPR 2016 Mike Hughes - Tufts COMP 135 - Fall 2020 34

2012 ImageNet Challenge Winner Mike Hughes - Tufts COMP 135 - Fall 2020 35

State of the art Results Mike Hughes - Tufts COMP 135 - Fall 2020 36

Semantic Segmentation Mike Hughes - Tufts COMP 135 - Fall 2020 37

Object Detection Mike Hughes - Tufts COMP 135 - Fall 2020 38

Exciting Applications: Natural Language (Spoken and Written) Mike Hughes - Tufts COMP 135 - Fall 2020 39

Reaching Human Performance in Speech-to-Text https://arxiv.org/pdf/1610.05256.pdf Mike Hughes - Tufts COMP 135 - Fall 2020 40

Gains in Translation Quality https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html Mike Hughes - Tufts COMP 135 - Fall 2020 41

Any Disadvantages? Mike Hughes - Tufts COMP 135 - Fall 2020 42

Deep Neural Networks can overfit! 1 layer Many layers Few units / layer Many units / layer Underfitting Overfitting Mike Hughes - Tufts COMP 135 - Fall 2020 43

Ways to avoid overfitting • More training data! • L2 / L1 penalties on weights • More tricks (next week) …. • Early stopping • Dropout • Data augmentation Mike Hughes - Tufts COMP 135 - Fall 2020 44

Objectives Today : Neural Networks Unit 1/2 • How to learn feature representations • Feed-forward neural nets • Single neuron = linear function + activation • Multi-layer perceptrons (MLPs) • Universal approximation • The Rise of Deep Learning: • Success stories on Images and Language Mike Hughes - Tufts COMP 135 - Fall 2020 45

Neural Networks Many slides attributable to: Prof. Mike Hughes - PowerPoint PPT Presentation

Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/ Neural Networks Many slides attributable to: Prof. Mike Hughes Erik Sudderth (UCI), Emily Fox (UW), Finale Doshi-Velez (Harvard) James, Witten, Hastie,

Learning Neural Networks Learning Neural Networks Neural Networks can represent complex Neural

Neural Networks and Handwriting Recognition Background Neural Networks Neural Network Steven

Neural Networks Neural networks arise from attempts to model Neural Networks human/animal

Sequential Data with Neural Networks Recurrent Neural Networks Sequential input / output Greg

Neural Information Retrieval Wassila Lalouani 1 Plan Neural network architectures Neural

CHAPTER II I CHAPTER I Recurrent Neural Networks Recurrent Neural Networks CHAPTER II : I :

CHAPTER II III I CHAPTER Neural Networks as Neural Networks as Associative Memory

Convolutional Neural Networks Convolutional neural networks One of the major kinds of ANNs in use

Neural Networks 0. Logistics Spring 2019 1 Neural Networks are taking over! Neural networks

Neural Networks and their Application to Go Neural Networks Learning Blackjack Theory Training

Neural Networks 1. Introduction Fall 2017 Neural Networks are taking over! Neural networks

Neural Networks Neural Net Basics Dan Klein, John DeNero UC Berkeley Slides adapted from Greg

Relaxation and Hopfield Networks Neural Networks Neural Networks - Hopfield 1 Bibliography

Neural Networks 1. Introduction Spring 2020 1 Neural Networks are taking over! Neural

Introduction to Artificial Intelligence Neural Networks - Deep Learning for NLP Janyl Jumadinova

Neural Networks 1. Introduction Spring 2019 1 Neural Networks are taking over! Neural

I is for Turtle, Bagels, Loop Tracing, Files Part 1 of 4 Identity Who are you?

Forward Guidance by Inflation-Targeting Central Banks Michael Woodford Columbia University

6.003: Signals and Systems Discrete Approximation of Continuous-Time Systems September 29, 2011 1

R and Reproducibility A Proposal David Smith Revolu0on

UBC CS 538B Distributed Systems Abstractions with Ivan Beschastnikh COVID Year 2020, Fall term

Epipolar Geometry EECS 442 David Fouhey Fall 2019, University of Michigan

AIRS Forward Model Validation and Status AIRS Science Team Meeting: Nov/Dec 2004 L. Strow, S.

The Purpose History The Mission The Mission The Commission will bring together persons ...and