
Basis of Neural Networks, School of Data Science, Fudan University



  1. DATA130006 Text Management and Analysis: Basis of Neural Networks. Zhongyu Wei (魏忠钰), School of Data Science, Fudan University. Dec. 20th, 2017

  2. General Neural Architectures for NLP
1. Represent the words/features with dense vectors (embeddings) via a lookup table
2. Concatenate the vectors
3. Multi-layer neural networks
§ Classification § Matching § Ranking
R. Collobert et al., "Natural language processing (almost) from scratch"
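Roughly in code, the three steps above look like the NumPy sketch below. The vocabulary size, embedding width, window length, hidden width, class count, and all weights are illustrative assumptions rather than values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

V, D, H, K = 10_000, 50, 100, 5            # vocab size, embedding dim, hidden dim, classes (assumed)
E = rng.normal(scale=0.1, size=(V, D))     # the lookup table of word embeddings

def forward(window_word_ids, W1, b1, W2, b2):
    """Embed a window of word ids, concatenate, and run a 2-layer network."""
    x = E[window_word_ids].reshape(-1)     # step 1: lookup, step 2: concatenate
    h = np.tanh(W1 @ x + b1)               # step 3: hidden layer of the multi-layer network
    return W2 @ h + b2                     # class scores (for classification / matching / ranking)

window = [12, 845, 3, 77, 2048]            # hypothetical ids of the words in a context window
W1 = rng.normal(scale=0.1, size=(H, len(window) * D)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(K, H));               b2 = np.zeros(K)
print(forward(window, W1, b1, W2, b2).shape)   # (5,)
```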

  3. Machine Learning § Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. (from Wikipedia)

  4. Formal Specification of Machine Learning
§ Input Data: (x_i, y_i), 1 ≤ i ≤ n
§ Model
§ Linear Model: y = f(x) = w^T x + b
§ Generalized Linear Model: y = f(x) = w^T φ(x) + b
§ Non-linear Model: Neural Network
§ Criterion
§ Loss Function: L(y, f(x)) → Optimization
§ R(θ) = (1/n) Σ_{i=1}^{n} L(y_i, f(x_i; θ)) → Minimization
§ Regularization: ‖θ‖
§ Objective Function: R(θ) + λ‖θ‖²
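As a concrete illustration of the criterion, the sketch below evaluates a regularized objective R(θ) + λ‖θ‖² for a linear model. The choice of squared loss for L, the value of lam, and the toy data are assumptions for the example only; the slide does not fix them.

```python
import numpy as np

def objective(theta, X, y, lam=0.1):
    """R(theta) + lam * ||theta||^2 for a linear model, with squared loss as L."""
    preds = X @ theta                        # f(x_i) = w^T x_i for every row of X
    risk = np.mean((y - preds) ** 2)         # R(theta) = (1/n) sum_i L(y_i, f(x_i; theta))
    return risk + lam * np.sum(theta ** 2)   # add the L2 regularizer

X = np.array([[1.0, 2.0], [3.0, 4.0]])       # toy inputs, one example per row
y = np.array([1.0, 0.0])                     # toy targets
print(objective(np.array([0.1, -0.2]), X, y))
```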

  5. Linear Classifier: f(x, W) = Wx + b

  6. Generalized Linear Classification
§ Hypothesis is a logistic function of a linear combination of inputs: z = w^T x + b, F(x) = 1 / (1 + exp(-z))
§ We can interpret F(x) as P(y=1|x)
§ Then the log-odds ratio, ln [P(y=1|x) / P(y=0|x)] = w^T x, is linear
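A minimal sketch of this hypothesis; the values of w, b, and x are arbitrary examples. It also checks that the log-odds recover the linear score.

```python
import numpy as np

def logistic_prob(x, w, b):
    """F(x) = 1 / (1 + exp(-(w^T x + b))), read as P(y=1|x)."""
    z = w @ x + b
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([0.5, -1.2]), 0.3            # arbitrary example parameters
x = np.array([1.0, 2.0])
p1 = logistic_prob(x, w, b)                  # P(y=1|x)
log_odds = np.log(p1 / (1.0 - p1))           # equals w^T x + b, i.e. linear in x
print(np.isclose(log_odds, w @ x + b))       # True
```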

  7. Softmax
§ Softmax regression is a generalization of logistic regression to multi-class classification problems
§ With softmax, the posterior probability of y = c is: P(y = c | x) = softmax(w_c^T x) = exp(w_c^T x) / Σ_{i=1}^{C} exp(w_i^T x)
§ Class c is represented by the one-hot vector y = [I(1 = c), I(2 = c), …, I(C = c)]^T, where I(·) is the indicator function
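A small sketch of the softmax posterior and the matching one-hot target. The class count and scores below are made up for illustration; subtracting the maximum score only improves numerical stability and does not change the result.

```python
import numpy as np

def softmax(scores):
    shifted = scores - scores.max()          # subtract max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

C = 4                                        # number of classes (assumed)
scores = np.array([1.0, 2.0, 0.5, -1.0])     # w_i^T x for each class i (made up)
probs = softmax(scores)                      # posterior P(y=i|x); sums to 1
c = 2                                        # index of the true class
one_hot = np.eye(C)[c]                       # one-hot target [I(i = c) for each class i]
```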

  8. Examples of word classification
§ x: [D × 1], W: [K × D], b: [K × 1]
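A quick shape check of the score computation Wx + b under these dimensions; D and K below are arbitrary example sizes.

```python
import numpy as np

D, K = 300, 10                   # illustrative feature and class counts
x = np.random.randn(D, 1)        # input:   D x 1
W = np.random.randn(K, D)        # weights: K x D
b = np.random.randn(K, 1)        # bias:    K x 1
scores = W @ x + b               # class scores: K x 1
print(scores.shape)              # (10, 1)
```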

  9. How to learn W?
§ R(θ) = (1/n) Σ_{i=1}^{n} L(y_i, f(x_i; θ))
§ Hinge loss (SVM)
§ Softmax loss: cross-entropy loss
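The two losses named on the slide, sketched for a single example with class scores and a true class index. The margin of 1 and the sum-over-classes form of the hinge loss are the usual conventions, assumed here since the slide body is not reproduced in this transcript.

```python
import numpy as np

def hinge_loss(scores, y):
    """Multiclass hinge (SVM) loss with margin 1: sum_j max(0, s_j - s_y + 1), j != y."""
    margins = np.maximum(0.0, scores - scores[y] + 1.0)
    margins[y] = 0.0
    return margins.sum()

def cross_entropy_loss(scores, y):
    """Softmax / cross-entropy loss: -log P(y|x)."""
    shifted = scores - scores.max()                        # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[y]

s = np.array([2.0, -1.0, 0.5])                             # example scores for 3 classes
print(hinge_loss(s, 0), cross_entropy_loss(s, 0))
```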

  10. SVM vs Softmax (Quiz)

  11. Parameter Learning
§ In ML, our objective is to learn the parameters θ that minimize the loss function.
§ How do we learn θ?

  12. Gradient Descent
§ Gradient Descent: θ ← θ - α ∇_θ R(θ)
§ The step size α is also called the learning rate in ML.
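A minimal loop for this update rule; the quadratic objective used to exercise it is only an illustration, not something taken from the slides.

```python
import numpy as np

def gradient_descent(grad, theta0, alpha=0.1, steps=100):
    """Repeat theta <- theta - alpha * grad(theta); alpha is the learning rate."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(steps):
        theta -= alpha * grad(theta)
    return theta

# Illustration: minimize R(theta) = ||theta - 3||^2, whose gradient is 2 * (theta - 3).
print(gradient_descent(lambda t: 2.0 * (t - 3.0), np.zeros(2)))   # approx [3. 3.]
```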

  13. Gradient Descent

  14. Learning Rate

  15. Gradient Descent

  16. Stochastic Gradient Descent (SGD)
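A sketch of SGD: each update uses the gradient of one randomly chosen example rather than the full sum over the training set. The least-squares per-example gradient and the toy data are assumptions made for the illustration.

```python
import numpy as np

def sgd(X, y, alpha=0.01, epochs=10, seed=0):
    """SGD for least squares: one example's gradient per update, in shuffled order."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):                      # visit examples in random order
            grad_i = 2.0 * (X[i] @ theta - y[i]) * X[i]   # gradient of this example's loss
            theta -= alpha * grad_i
    return theta

X = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])   # toy design matrix
y = X @ np.array([1.0, 2.0])                                    # noise-free targets
print(sgd(X, y, alpha=0.1, epochs=200))                         # roughly [1. 2.]
```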

  17. Computational graphs

  18. Backpropagation: a simple example
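The slide's own worked example is not reproduced in this transcript. As a stand-in, here is the common simple case f(x, y, z) = (x + y) · z, evaluated forward on its computational graph and then differentiated backward, node by node, with the chain rule.

```python
# Forward pass through the graph q = x + y, f = q * z
x, y, z = -2.0, 5.0, -4.0
q = x + y            # q = 3
f = q * z            # f = -12

# Backward pass: apply the chain rule from the output back to the inputs
df_df = 1.0          # gradient of f with respect to itself
df_dq = z * df_df    # local gradient of q*z w.r.t. q is z
df_dz = q * df_df    # local gradient of q*z w.r.t. z is q
df_dx = 1.0 * df_dq  # local gradient of x+y w.r.t. x is 1
df_dy = 1.0 * df_dq  # local gradient of x+y w.r.t. y is 1
print(df_dx, df_dy, df_dz)   # -4.0 -4.0 3.0
```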

  19. Biological Neuron

  20. Artificial Neuron

  21. Activation Functions

  22. Activation Functions
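The transcript does not show which activations these two slides cover; the usual candidates in such a lecture are the sigmoid, tanh, and ReLU functions, sketched below.

```python
import numpy as np

def sigmoid(z):
    """Squashes to (0, 1); saturates for large |z|."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes to (-1, 1); zero-centered."""
    return np.tanh(z)

def relu(z):
    """max(0, z); cheap to compute and does not saturate for z > 0."""
    return np.maximum(0.0, z)
```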

  23. Feedforward Neural Network

  24. Neural Network

  25. Feedforward Computing
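A compact sketch of feedforward computing, a^(0) = x and a^(l) = g(W^(l) a^(l-1) + b^(l)) for each layer l. The layer sizes, the random weights, and the choice of tanh as the activation g are illustrative assumptions.

```python
import numpy as np

def feedforward(x, weights, biases, activation=np.tanh):
    """Propagate x through each layer: affine map followed by a nonlinearity."""
    a = x
    for W, b in zip(weights, biases):
        a = activation(W @ a + b)
    return a

rng = np.random.default_rng(0)
sizes = [4, 8, 3]                                            # input, hidden, output widths
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]
out = feedforward(rng.normal(size=4), Ws, bs)                # shape (3,)
print(out.shape)
```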
