Support Vector Machines
CSC 411 Tutorial - PowerPoint PPT Presentation


  1. Support Vector Machines. CSC 411 Tutorial, April 1, 2015. Tutor: Shenlong Wang. Many thanks to Renjie Liao, Jake Snell, Yujia Li and Kevin Swersky for much of the following material.

  2.

  3. Brief Review of SVMs

  4. Geometric Intuition (figure)

  5. Geometric Intuition (figure)

  6. Margin Derivation (figure; annotation: d · w/|w|)

  7. Margin Derivation: Compute the distance of an arbitrary point $x$ in the (+) class to the separating hyperplane: $\frac{w^T x + b}{\|w\|}$. If we let $t \in \{-1, +1\}$ denote the class of $x$, then the distance becomes $\frac{t\,(w^T x + b)}{\|w\|}$. We can set $t\,(w^T x + b) = 1$ for the point closest to the decision boundary, leading to the problem: $\max_{w, b} \frac{1}{\|w\|}$ subject to $t_i (w^T x_i + b) \ge 1$ for all $i$.

  8. SVM Problem: $\max_{w, b} \frac{1}{\|w\|}$ subject to $t_i (w^T x_i + b) \ge 1$. But scaling $w$ and $b$ doesn't change the decision boundary, or equivalently: $\min_{w, b} \frac{1}{2}\|w\|^2$ subject to $t_i (w^T x_i + b) \ge 1$ for all $i$. (A small solver sketch follows below.)
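
A minimal sketch of the hard-margin problem above, solved directly as a QP. This assumes cvxpy is installed, and the toy data is made up for illustration; it is not code from the tutorial.

```python
import numpy as np
import cvxpy as cp

# Toy linearly separable data, labels t in {-1, +1}.
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()

# min (1/2)||w||^2  s.t.  t_i (w^T x_i + b) >= 1
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(t, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print("w =", w.value, "b =", b.value)
print("margins:", t * (X @ w.value + b.value))  # all >= 1 at the optimum
```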

  9. Non-linear SVMs: For a linear SVM, $y(x) = w^T x + b$. We can just as well work in an alternate feature space: $y(x) = w^T \phi(x) + b$ (a small feature-map sketch follows below). http://i.imgur.com/WuxyO.png
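
As a small illustration of working in a feature space $\phi(x)$: the quadratic map below is an arbitrary choice (not necessarily the one pictured on the slide), but it shows how data that is not linearly separable in the original space can become separable after the mapping.

```python
import numpy as np

def phi(x):
    """Map a 2-D input to a quadratic feature space (illustrative choice)."""
    x1, x2 = x
    return np.array([x1, x2, x1 * x1, x2 * x2, x1 * x2])

# XOR-like data: not linearly separable in 2-D.
X = np.array([[1, 1], [-1, -1], [1, -1], [-1, 1]], dtype=float)
t = np.array([1, 1, -1, -1])

Phi = np.array([phi(x) for x in X])
# In the mapped space the coordinate x1*x2 alone separates the classes:
print(Phi[:, 4])  # [ 1.  1. -1. -1.]  -> linearly separable with w = (0, 0, 0, 0, 1)
```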

  10. Non-linear SVMs: SVM with polynomial kernel visualization: http://www.youtube.com/watch?v=3liCbRZPrZA

  11. Non-linear SVMs: Demo (by Andrej Karpathy and LIBSVM): http://cs.stanford.edu/people/karpathy/svmjs/demo/ https://www.csie.ntu.edu.tw/~cjlin/libsvm/ (see the scikit-learn sketch below for a local equivalent).
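
If scikit-learn is available, its SVC class (which wraps LIBSVM) gives a quick way to try a similar non-linear demo locally; the two-ring dataset below is invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Two noisy rings: a non-linearly-separable problem, in the spirit of the demo.
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.r_[rng.normal(1.0, 0.1, 100), rng.normal(3.0, 0.1, 100)]
X = np.c_[radii * np.cos(angles), radii * np.sin(angles)]
t = np.r_[np.ones(100), -np.ones(100)]

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, t)
print("training accuracy:", clf.score(X, t))
print("support vectors per class:", clf.n_support_)
```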

  12. SVMs vs Logistic Regression SVMs vs Logistic Regression 12 of 36

  13. Logistic Regression (figure)

  14. Logistic Regression: Train to maximize the likelihood $\prod_i p(t_i \mid x_i, w)$. Assign a probability to each outcome: $p(t = 1 \mid x, w) = \sigma(w^T x + b)$. Linear decision boundary. (A short numpy sketch follows.)
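
A minimal numpy sketch of that objective (the negative log-likelihood, with labels in {0, 1}) and a few gradient steps; the dataset and learning-rate choices here are placeholders, not the tutorial's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(w, b, X, t):
    """Negative log-likelihood of logistic regression, t in {0, 1}."""
    p = sigmoid(X @ w + b)
    return -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))

# Tiny made-up dataset and a few gradient-descent steps.
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
t = np.array([1.0, 1.0, 0.0, 0.0])
w, b, lr = np.zeros(2), 0.0, 0.1

for _ in range(200):
    p = sigmoid(X @ w + b)
    w -= lr * X.T @ (p - t)   # gradient of the NLL w.r.t. w
    b -= lr * np.sum(p - t)   # gradient of the NLL w.r.t. b

print("final NLL:", nll(w, b, X, t))
```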

  15. SVMs (figure)

  16. SVMs: Train to find the maximum margin: $\min_{w, b} \frac{1}{2}\|w\|^2$. Enforce a margin of separation: $t_i (w^T x_i + b) \ge 1$ for all $i$. Linear decision boundary.

  17. Comparison: Logistic regression wants to maximize the probability of the data; the greater the distance from each point to the decision boundary, the better. SVMs want to maximize the distance from the closest points (the support vectors) to the decision boundary, and don't care about points that aren't support vectors.

  18. A Different Take: Consider an alternate form of the logistic regression decision function: predict class 1 when $\frac{p(t = 1 \mid x, w)}{p(t = 0 \mid x, w)} > 1$, and class 0 otherwise.

  19. A Different Take: Suppose we don't actually care about the probabilities; all we want to do is make the right decision. We can put a constraint on the likelihood ratio, for some constant $c$: $\frac{p(t = 1 \mid x, w)}{p(t = 0 \mid x, w)} \ge c$ for positive examples.

  20. A Different Take: Take the log of both sides: $\log p(t = 1 \mid x, w) - \log p(t = 0 \mid x, w) \ge \log c$. Recalling that $p(t = 1 \mid x, w) = \sigma(w^T x + b)$ and $p(t = 0 \mid x, w) = 1 - \sigma(w^T x + b)$, this simplifies to $w^T x + b \ge \log c$. But $c$ is arbitrary, so set it s.t. $\log c = 1$: $w^T x + b \ge 1$. Similarly, the negative-example case should be $w^T x + b \le -1$. Try to derive it by yourself. (A numeric check of the log-ratio step follows.)
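
A quick numeric check of the key step above: for the logistic model, the log of the likelihood ratio collapses to exactly $w^T x + b$. The numbers below are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.7, -1.3])
b = 0.4
x = np.array([2.0, 0.5])

z = w @ x + b
p1 = sigmoid(z)      # p(t = 1 | x, w)
p0 = 1.0 - p1        # p(t = 0 | x, w)

# log(p1 / p0) should equal w^T x + b exactly.
print(np.log(p1 / p0), z)   # both ~1.15
```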

  21. A Different Take: So now we have $t_i (w^T x_i + b) \ge 1$. But this may not have a unique solution, so put a quadratic penalty on the weights to make the solution unique: $\min_{w, b} \frac{1}{2}\|w\|^2$ subject to $t_i (w^T x_i + b) \ge 1$. This gives us an SVM! By asking logistic regression to make the right decisions instead of maximizing the probability of the data, we derived an SVM.

  22. Likelihood Ratio: The likelihood ratio $\frac{p(t = 1 \mid x, w)}{p(t = 0 \mid x, w)}$ drives this derivation. Different classifiers assign different costs to this ratio.

  23. LR Cost: Choose the cost $-\log p(t = 1 \mid x, w) = \log\left(1 + e^{-(w^T x + b)}\right)$ (for a positive example). (figure)

  24. LR Cost: Minimizing $\sum_i \log\left(1 + e^{-t_i (w^T x_i + b)}\right)$, with $t_i \in \{-1, +1\}$, is the same as minimizing the negative log-likelihood objective for logistic regression!

  25. SVM with Slack Variables: If the data is not linearly separable, we can introduce slack variables $\xi_i$: $\min_{w, b, \xi} \frac{1}{2}\|w\|^2 + C \sum_i \xi_i$ subject to $t_i (w^T x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$. (A solver sketch follows.)
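
A minimal cvxpy sketch of the soft-margin problem above with explicit slack variables; the toy data (with one flipped label) and the value of C are arbitrary choices for illustration.

```python
import numpy as np
import cvxpy as cp

# Toy data that is NOT linearly separable (the last label is "flipped").
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0], [2.2, 1.8]])
t = np.array([1.0, 1.0, -1.0, -1.0, -1.0])
C = 1.0

w, b = cp.Variable(2), cp.Variable()
xi = cp.Variable(len(t))   # one slack variable per training point

objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
constraints = [cp.multiply(t, X @ w + b) >= 1 - xi, xi >= 0]
cp.Problem(objective, constraints).solve()

print("slacks:", np.round(xi.value, 3))  # nonzero only for margin-violating points
```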

  26. SVM with Slack Variables (figure)

  27. SVM Cost: Choose the cost $\max\left(0,\ 1 - t\,(w^T x + b)\right)$ (the hinge loss). (figure)

  28. Plotted in terms of $z = w^T x + b$ (figure comparing the LR and SVM costs)

  29. Plotted in terms of the likelihood ratio (figure comparing the LR and SVM costs; a plotting sketch follows)
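
The two cost curves being compared above can be reproduced with a short matplotlib sketch. Plotting against $z = t(w^T x + b)$ is one reasonable choice of axis; the slides' exact axes may differ.

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-3, 3, 300)
lr_cost = np.log1p(np.exp(-z))       # logistic regression cost, log(1 + e^{-z})
svm_cost = np.maximum(0.0, 1 - z)    # SVM hinge cost, max(0, 1 - z)

plt.plot(z, lr_cost, label="logistic regression cost")
plt.plot(z, svm_cost, label="SVM hinge cost")
plt.xlabel("z = t(w^T x + b)")
plt.ylabel("cost")
plt.legend()
plt.show()
```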

  30. Exploiting the Connection between LR and SVMs

  31. Kernel Trick for LR: In the dual form, the SVM decision boundary is $y(x) = \sum_n \alpha_n t_n\, k(x, x_n) + b$. We could plug this into the LR cost: $\sum_i \log\left(1 + e^{-t_i\, y(x_i)}\right)$. (A sketch follows.)
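
A numpy sketch of plugging the kernelized decision function into the LR cost. The RBF kernel, its bandwidth, the data, and the (untrained) dual coefficients are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """k(a, b) = exp(-gamma * ||a - b||^2) for all pairs of rows."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])        # labels in {-1, +1}

K = rbf_kernel(X, X)
alpha, b = np.zeros(len(X)), 0.0            # dual coefficients (not trained here)

y = K @ (alpha * t) + b                     # y(x_i) = sum_n alpha_n t_n k(x_i, x_n) + b
lr_cost = np.sum(np.log1p(np.exp(-t * y)))  # kernelized logistic regression cost
print("cost at alpha = 0:", lr_cost)        # = N * log 2
```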

  32. Multi-class SVMs: Recall multi-class logistic regression: $p(t = k \mid x, W) = \frac{\exp(w_k^T x)}{\sum_j \exp(w_j^T x)}$. (A softmax sketch follows.)
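
For reference, a small numpy sketch of the softmax probabilities recalled above, with made-up weights for three classes.

```python
import numpy as np

def softmax_probs(W, x):
    """p(t = k | x, W) = exp(w_k^T x) / sum_j exp(w_j^T x)."""
    scores = W @ x
    scores -= scores.max()           # subtract the max for numerical stability
    e = np.exp(scores)
    return e / e.sum()

W = np.array([[1.0, -0.5],           # one weight vector w_k per class
              [0.2,  0.8],
              [-1.0, 0.3]])
x = np.array([0.5, 1.5])

p = softmax_probs(W, x)
print(p, p.sum())                    # probabilities over the 3 classes, summing to 1
```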

  33. Multi-class SVMs: Suppose instead we just want the decision rule to satisfy $\frac{p(t = t_i \mid x_i, W)}{p(t = k \mid x_i, W)} \ge c$ for all $k \ne t_i$. Taking logs as before, $w_{t_i}^T x_i - w_k^T x_i \ge 1$ for all $k \ne t_i$.

  34. Multi-class SVMs: Now we have the quadratic program for multi-class SVMs: $\min_W \frac{1}{2} \sum_k \|w_k\|^2$ subject to $w_{t_i}^T x_i - w_k^T x_i \ge 1$ for all $i$ and all $k \ne t_i$. (A constraint-checking sketch follows.)
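
A small numpy sketch of checking the multi-class margin constraints from the QP above for a single example; the weight matrix and point are made up, and this is a constraint check, not a solver.

```python
import numpy as np

W = np.array([[1.0, -0.5],           # one weight vector w_k per class (3 classes)
              [0.2,  0.8],
              [-1.0, 0.3]])
x = np.array([2.0, 0.5])
t_i = 0                              # index of the true class of this example

scores = W @ x                       # w_k^T x for every class k
# Constraint: w_{t_i}^T x - w_k^T x >= 1 for all k != t_i
margins = scores[t_i] - np.delete(scores, t_i)
print("margins over competing classes:", margins)
print("constraints satisfied:", bool(np.all(margins >= 1)))
```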

  35. LR and SVMs are closely linked: Both can be viewed as taking a probabilistic model and minimizing some cost associated with the likelihood ratio. This allows us to extend both models in principled ways.
