Data Mining: Support Vector Machines
Introduction to Data Mining, 2nd Edition, by Tan, Steinbach, Karpatne, Kumar

Support Vector Machines
• Find a linear hyperplane (decision boundary) that will separate the data
Support Vector Machines
• One possible solution

Support Vector Machines
• Another possible solution
Support Vector Machines
• Other possible solutions

Support Vector Machines
• Which one is better? B1 or B2?
• How do you define better?
Support Vector Machines
• Find the hyperplane that maximizes the margin ⇒ B1 is better than B2

Support Vector Machines
• Decision boundary: $\mathbf{w} \cdot \mathbf{x} + b = 0$
• Margin hyperplanes: $\mathbf{w} \cdot \mathbf{x} + b = +1$ and $\mathbf{w} \cdot \mathbf{x} + b = -1$
• Classifier:
  $f(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \le -1 \end{cases}$
• Margin $= \dfrac{2}{\lVert \mathbf{w} \rVert}$
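A one-line derivation of the margin expression (basic geometry, not shown on the slide): the distance from a point $\mathbf{x}_0$ to the hyperplane $\mathbf{w} \cdot \mathbf{x} + b = 0$ is $|\mathbf{w} \cdot \mathbf{x}_0 + b| / \lVert \mathbf{w} \rVert$; a support vector lying on $\mathbf{w} \cdot \mathbf{x} + b = +1$ is therefore at distance $1/\lVert \mathbf{w} \rVert$ from the decision boundary, as is one lying on $\mathbf{w} \cdot \mathbf{x} + b = -1$, so the total width of the margin is $2/\lVert \mathbf{w} \rVert$.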
Linear SVM
• Linear model:
  $f(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \le -1 \end{cases}$
• Learning the model is equivalent to determining the values of w and b
  – How to find w and b from the training data?

Learning Linear SVM
• Objective is to maximize: $\text{Margin} = \dfrac{2}{\lVert \mathbf{w} \rVert}$
  – Which is equivalent to minimizing: $L(\mathbf{w}) = \dfrac{\lVert \mathbf{w} \rVert^2}{2}$
  – Subject to the following constraints:
    $y_i = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 \end{cases}$
    or equivalently $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1, \quad i = 1, 2, \ldots, N$
• This is a constrained optimization problem
  – Solve it using the Lagrange multiplier method
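A minimal numpy sketch of the quantities in this optimization problem: the objective $L(\mathbf{w}) = \lVert \mathbf{w} \rVert^2 / 2$ and the constraints $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1$. The toy data and the candidate (w, b) below are made up for illustration, not taken from the slides.

```python
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.0],      # class -1
              [4.0, 4.0], [5.0, 5.0]])     # class +1
y = np.array([-1, -1, 1, 1])

w = np.array([0.5, 0.5])   # a candidate weight vector
b = -3.0                   # a candidate bias

objective = 0.5 * np.dot(w, w)        # L(w) = ||w||^2 / 2
constraints = y * (X @ w + b)         # each entry should be >= 1 if (w, b) is feasible
print("L(w) =", objective)
print("y_i (w.x_i + b) =", constraints)
print("feasible:", np.all(constraints >= 1))
print("margin = 2/||w|| =", 2 / np.linalg.norm(w))
```

Among all feasible (w, b), the SVM picks the one with the smallest L(w), i.e. the widest margin.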
Example of Linear SVM
• The two points with a nonzero Lagrange multiplier λ are the support vectors:

  x1      x2      y    λ
  0.3858  0.4687   1   65.5261
  0.4871  0.6110  -1   65.5261
  0.9218  0.4103  -1    0
  0.7382  0.8936  -1    0
  0.1763  0.0579   1    0
  0.4057  0.3529   1    0
  0.9355  0.8132  -1    0
  0.2146  0.0099   1    0

Learning Linear SVM
• Decision boundary depends only on the support vectors
  – If you have a data set with the same support vectors, the decision boundary will not change
  – How to classify using SVM once w and b are found? Given a test record $\mathbf{x}_i$:
    $f(\mathbf{x}_i) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 0 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b < 0 \end{cases}$
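A sketch (assuming scikit-learn is available) that fits a linear SVM to the eight training points in the table above; with a very large cost C the fit approximates the hard-margin problem, so the reported support vectors should be the two rows with nonzero λ, and a new record is classified by the sign of w·x + b. The test record is hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.3858, 0.4687], [0.4871, 0.6110], [0.9218, 0.4103],
              [0.7382, 0.8936], [0.1763, 0.0579], [0.4057, 0.3529],
              [0.9355, 0.8132], [0.2146, 0.0099]])
y = np.array([1, -1, -1, -1, 1, 1, -1, 1])

model = SVC(kernel='linear', C=1e6).fit(X, y)   # huge C ~ hard margin
w, b = model.coef_[0], model.intercept_[0]
print("w =", w, " b =", b)
print("support vectors:\n", model.support_vectors_)   # expect the two rows with nonzero lambda

x_test = np.array([0.5, 0.2])                    # a hypothetical test record
print("f(x_test) =", 1 if w @ x_test + b >= 0 else -1)
print("sklearn prediction:", model.predict([x_test])[0])
```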
Support Vector Machines
• What if the problem is not linearly separable?

Support Vector Machines
• What if the problem is not linearly separable?
  – Introduce slack variables
    Need to minimize:
    $L(\mathbf{w}) = \dfrac{\lVert \mathbf{w} \rVert^2}{2} + C \sum_{i=1}^{N} \xi_i^{\,k}$
    Subject to:
    $y_i = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 - \xi_i \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 + \xi_i \end{cases}$
  – If k is 1 or 2, this leads to a similar objective function as the linear SVM but with different constraints (see textbook)
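A small numpy sketch of the soft-margin objective for k = 1: the slack $\xi_i = \max(0,\, 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b))$ measures how far a point falls on the wrong side of its margin hyperplane, and C trades margin width against total slack. The data and the candidate (w, b, C) are made up for illustration.

```python
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.0], [3.2, 3.0],   # one -1 point crosses over
              [4.0, 4.0], [5.0, 5.0], [2.8, 3.1]])  # one +1 point crosses over
y = np.array([-1, -1, -1, 1, 1, 1])

w, b, C = np.array([0.5, 0.5]), -3.0, 10.0

slack = np.maximum(0.0, 1.0 - y * (X @ w + b))       # xi_i, zero for well-separated points
objective = 0.5 * np.dot(w, w) + C * np.sum(slack)   # L(w) with k = 1
print("slack:", slack)
print("L(w) =", objective)
```

A larger C penalizes slack more heavily (fewer training errors, narrower margin); a smaller C tolerates more slack in exchange for a wider margin.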
Support Vector Machines
• Find the hyperplane that optimizes both factors (margin width and training errors)

Nonlinear Support Vector Machines
• What if the decision boundary is not linear?
Nonlinear Support Vector Machines
• Transform the data into a higher-dimensional space
• Decision boundary: $\mathbf{w} \cdot \Phi(\mathbf{x}) + b = 0$

Learning Nonlinear SVM
• Optimization problem:
  minimize $\dfrac{\lVert \mathbf{w} \rVert^2}{2}$ subject to $y_i(\mathbf{w} \cdot \Phi(\mathbf{x}_i) + b) \ge 1, \quad i = 1, 2, \ldots, N$
• Which leads to the same set of equations as before (but involving Φ(x) instead of x)
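A sketch (assuming scikit-learn) of fitting a linear SVM after an explicit, hand-chosen mapping Φ: points inside a circle versus outside it are not linearly separable in 2-D, but become separable once the radial feature x1² + x2² is added. The data and the choice of Φ are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0]**2 + X[:, 1]**2 < 0.5, 1, -1)   # labels defined by a circular boundary

def phi(X):
    # Phi(x1, x2) = (x1, x2, x1^2 + x2^2): one extra, hand-chosen feature
    return np.column_stack([X, X[:, 0]**2 + X[:, 1]**2])

linear_in_phi = SVC(kernel='linear', C=1.0).fit(phi(X), y)
print("training accuracy in Phi-space:", linear_in_phi.score(phi(X), y))
```

In the 3-D Φ-space the circular boundary corresponds to the plane x1² + x2² = 0.5, so a linear decision boundary suffices there.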
Learning Nonlinear SVM
• Issues:
  – What type of mapping function Φ should be used?
  – How to do the computation in the high-dimensional space?
    Most computations involve the dot product $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$
    Curse of dimensionality?

Learning Nonlinear SVM
• Kernel Trick:
  – $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) = K(\mathbf{x}_i, \mathbf{x}_j)$
  – $K(\mathbf{x}_i, \mathbf{x}_j)$ is a kernel function (expressed in terms of the coordinates in the original space)
    Examples:
    $K(\mathbf{x}, \mathbf{y}) = (\mathbf{x} \cdot \mathbf{y} + 1)^p$
    $K(\mathbf{x}, \mathbf{y}) = e^{-\lVert \mathbf{x} - \mathbf{y} \rVert^2 / (2\sigma^2)}$
    $K(\mathbf{x}, \mathbf{y}) = \tanh(\kappa\, \mathbf{x} \cdot \mathbf{y} - \delta)$
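A small numerical check of the kernel trick for the degree-2 polynomial kernel $K(\mathbf{x}, \mathbf{y}) = (\mathbf{x} \cdot \mathbf{y})^2$ in 2-D: it equals $\Phi(\mathbf{x}) \cdot \Phi(\mathbf{y})$ for $\Phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, so the dot product in the mapped space can be computed without ever forming Φ. The test vectors are arbitrary.

```python
import numpy as np

def phi(x):
    # explicit degree-2 feature map for a 2-D input
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def K(x, z):
    # degree-2 polynomial kernel evaluated in the original 2-D space
    return np.dot(x, z) ** 2

x = np.array([0.7, -1.3])
z = np.array([2.0, 0.5])
print("Phi(x).Phi(z) =", np.dot(phi(x), phi(z)))   # the two values should match
print("K(x, z)       =", K(x, z))
```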
Example of Nonlinear SVM
• SVM with polynomial degree 2 kernel

Learning Nonlinear SVM
• Advantages of using a kernel:
  – Don't have to know the mapping function Φ
  – Computing the dot product $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$ in the original space avoids the curse of dimensionality
• Not all functions can be kernels
  – Must make sure there is a corresponding Φ in some high-dimensional space
  – Mercer's theorem (see textbook)
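A sketch (assuming scikit-learn) of the kernelized classifier: a degree-2 polynomial kernel and a Gaussian (RBF) kernel both handle a circular class boundary directly in the original 2-D space, with no explicit mapping Φ. The synthetic data and parameter choices are illustrative.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)

poly2 = SVC(kernel='poly', degree=2, coef0=1, C=1.0).fit(X, y)
rbf   = SVC(kernel='rbf', gamma='scale', C=1.0).fit(X, y)
print("degree-2 polynomial kernel accuracy:", poly2.score(X, y))
print("RBF (Gaussian) kernel accuracy:     ", rbf.score(X, y))
```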
Characteristics of SVM
• The learning problem is formulated as a convex optimization problem
  – Efficient algorithms are available to find the global minimum
  – Many other methods use greedy approaches and find locally optimal solutions
  – High computational complexity for building the model
• Robust to noise
• Overfitting is handled by maximizing the margin of the decision boundary
• SVM can handle irrelevant and redundant attributes better than many other techniques
• The user needs to provide the type of kernel function and the cost function
• Difficult to handle missing values
• What about categorical variables?
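A sketch (assuming scikit-learn) of choosing the user-supplied pieces noted above, the kernel type and the cost parameter C, by cross-validated grid search rather than by hand. The dataset, candidate grid, and fold count are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

param_grid = {"kernel": ["linear", "poly", "rbf"], "C": [0.1, 1, 10, 100]}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print("best kernel and C:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
```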