Machine Learning 2007: Lecture 7

Instructor: Tim van Erven (Tim.van.Erven@cwi.nl)
Website: www.cwi.nl/~erven/teaching/0708/ml/
October 18, 2007

Overview
● Organisational Matters
● Answers Exercises 2
● Linear Functions as Inner Products
● Vector Valued Outputs in Regression and Classification
● Neural Networks and the Perceptron
  ✦ Neural Networks
  ✦ The Perceptron
  ✦ Implementing Boolean Functions with a Perceptron
● Convex Functions
● Gradient Descent (part 1)

Course Organisation
● Room of the intermediate exam changed to: Q105.
● Not necessary to enroll on tisvu.
● Next lecture (in two weeks) will be on Wednesday at 13.30-15.15 in room KC159.
● Do not submit Office 2007 (.docx) files for the homework. PDF is preferred; older Office (.doc) is acceptable.

Mitchell:
● Read: Chapter 4, sections 4.1–4.4.

This Lecture:
● Explanation of linear functions as inner products is needed to understand Mitchell.
● Neural networks are in Mitchell. I have some extra pictures.
● Convex functions are not discussed in Mitchell.
● I will give more background on gradient descent.

Linear Functions as Inner Products
Linear Function:

$$h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1 + \ldots + w_d x_d$$

● $\mathbf{x} = (x_1, \ldots, x_d)^\top$ is a $d$-dimensional feature vector.
● $\mathbf{w} = (w_0, w_1, \ldots, w_d)^\top$ is a $(d+1)$-dimensional weight vector.

As Inner Products (a standard trick):
We may change $\mathbf{x}$ into a $(d+1)$-dimensional vector $\mathbf{x}'$ by adding an imaginary extra feature $x_0$, which always has value 1:

$$\mathbf{x} = (x_1, \ldots, x_d)^\top \;\Rightarrow\; \mathbf{x}' = (1, x_1, \ldots, x_d)^\top$$

$$h_{\mathbf{w}}(\mathbf{x}) = \sum_{i=0}^{d} w_i x'_i = \langle \mathbf{w}, \mathbf{x}' \rangle$$

● Mitchell writes $\mathbf{w} \cdot \mathbf{x}'$ for $\langle \mathbf{w}, \mathbf{x}' \rangle$.
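
The trick of adding a constant feature can be checked numerically. The following is a minimal Python sketch, not taken from the slides: the function names `linear`, `augment` and `inner`, and the values of `w` and `x`, are assumptions chosen only for illustration. It computes the linear function once term by term and once as an inner product with the extended vector, and the two results agree.

```python
def linear(w, x):
    """h_w(x) = w_0 + w_1 x_1 + ... + w_d x_d, written out term by term."""
    return w[0] + sum(w_i * x_i for w_i, x_i in zip(w[1:], x))


def augment(x):
    """Prepend the constant feature x_0 = 1, giving x' = (1, x_1, ..., x_d)."""
    return [1.0] + list(x)


def inner(u, v):
    """Inner product <u, v> of two vectors of equal length."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


# Hypothetical values for d = 2, chosen only for illustration.
w = [0.5, 2.0, -1.0]   # (w_0, w_1, w_2)
x = [3.0, 4.0]         # (x_1, x_2)

direct = linear(w, x)              # 0.5 + 2.0*3.0 - 1.0*4.0
via_inner = inner(w, augment(x))   # <w, x'> with x' = (1, 3, 4)
print(direct, via_inner)
```

Treating the offset $w_0$ as just another weight means that any procedure written for inner products handles the offset without a special case.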

Vector Valued Outputs
Reminder:
● Regression: Predict the label y for any feature vector x. Typically y can take infinitely many values.
● Classification: Predict the class label y for any new feature vector x. Only finitely many categories for y.

Vector Valued Outputs:
● In our definition the label y is a single value.
● This can be generalised to a label vector y.
● Neural networks typically output label vectors.

Biology
A Neuron [Wikimedia Commons]:
[Figure: diagram of a neuron, labelled with dendrite, cell body, nucleus, axon, myelin sheath, Schwann cell, node of Ranvier and axon terminal.]

The Brain:
● The brain is a complex network of approximately $10^{11}$ = 100 000 000 000 neurons.
● On average each neuron is connected to approximately $10^{4}$ = 10 000 other neurons.
● Each neuron has many input channels (dendrites) and one output channel (axon).

Artificial Neurons
An Artificial Neuron:
An (artificial) neuron is some function h that gets a feature vector x as input and outputs a (single) label y.

The Perceptron:
The most famous type of (artificial) neuron is the perceptron:

$$h_{\mathbf{w}}(\mathbf{x}) = \begin{cases} 1 & \text{if } w_0 + w_1 x_1 + \ldots + w_d x_d > 0, \\ -1 & \text{otherwise.} \end{cases}$$

● Applies a threshold to a linear function of x.
● Has parameters w.
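
The perceptron definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lecture's own code: the function name `perceptron`, the weights `w_and` and the encoding of inputs as 0 and 1 are all hypothetical choices. With these particular weights the linear part is positive only when both inputs are 1, so the unit behaves like a Boolean AND.

```python
def perceptron(w, x):
    """Return 1 if w_0 + w_1 x_1 + ... + w_d x_d > 0, and -1 otherwise."""
    s = w[0] + sum(w_i * x_i for w_i, x_i in zip(w[1:], x))
    return 1 if s > 0 else -1


# Hypothetical weights (an assumption for illustration): inputs are
# encoded as 0 or 1, and the sum exceeds 0 only when both inputs are 1.
w_and = [-1.5, 1.0, 1.0]

outputs = {(a, b): perceptron(w_and, (a, b)) for a in (0, 1) for b in (0, 1)}
for inputs, label in sorted(outputs.items()):
    print(inputs, label)
```

Note that the threshold is strict: an input whose linear part is exactly 0 is labelled -1, in line with the "otherwise" case of the definition.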
