SUPPORT VECTOR MACHINES
Matthieu R Bloch
Tuesday, February 25, 2020
LOGISTICS

TAs and office hours
- Tuesday: Dr. Bloch (College of Architecture Cafe), 11:00am - 11:55am
- Tuesday: TJ (VL C449 Cubicle D), 1:30pm - 2:45pm
- Thursday: Hossein (VL C449 Cubicle B), 10:45am - 12:00pm
- Friday: Brighton (TSRB 523a), 12:00pm - 1:15pm

Projects
- Thanks for forming teams
- Start working on your proposals!
- Discussion: proposal deadline extension

Midterm
- March 5th
- Sample midterm posted (do not share)
- Open notes
RECAP: KARUSH-KUHN-TUCKER CONDITIONS

Assume $f$, $\{g_i\}$, $\{h_j\}$ are all differentiable. Consider $x$, $\lambda$, $\mu$.
- Stationarity: $0 = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla g_i(x) + \sum_{j=1}^p \mu_j \nabla h_j(x)$
- Primal feasibility: $\forall i \in [1;m]\; g_i(x) \leq 0$ and $\forall j \in [1;p]\; h_j(x) = 0$
- Dual feasibility: $\forall i \in [1;m]\; \lambda_i \geq 0$
- Complementary slackness: $\forall i \in [1;m]\; \lambda_i g_i(x) = 0$
KKT CONDITIONS: NECESSITY AND SUFFICIENCY

Theorem (KKT necessity). If $x^*$ and $(\lambda^*, \mu^*)$ are primal and dual solutions with zero duality gap, then $x^*$ and $(\lambda^*, \mu^*)$ satisfy the KKT conditions.

Theorem (KKT sufficiency). If the original problem is convex and $\tilde{x}$ and $(\tilde{\lambda}, \tilde{\mu})$ satisfy the KKT conditions, then $\tilde{x}$ is primal optimal, $(\tilde{\lambda}, \tilde{\mu})$ is dual optimal, and the duality gap is zero.

If a constrained optimization problem is differentiable and convex:
- the KKT conditions are necessary and sufficient for primal/dual optimality (with zero duality gap);
- we can use the KKT conditions to find a solution to our optimization problem.

We're in luck: the optimal soft-margin hyperplane falls in this category!
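As a quick worked example (not from the slides), consider minimizing $f(x) = x^2$ subject to $g(x) = 1 - x \leq 0$. Stationarity requires $0 = \nabla f(x) + \lambda \nabla g(x) = 2x - \lambda$, so $\lambda = 2x$. Complementary slackness requires $\lambda(1 - x) = 0$: if $\lambda = 0$, stationarity forces $x = 0$, which violates primal feasibility $x \geq 1$; hence $1 - x = 0$, giving $x = 1$ and $\lambda = 2 \geq 0$. All four conditions hold and the problem is convex and differentiable, so by sufficiency $x^* = 1$ is primal optimal with zero duality gap.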
OPTIMAL SOFT-MARGIN HYPERPLANE REVISITED

The optimal soft-margin hyperplane is the solution of the following:

$$\operatorname*{argmin}_{w, b, \xi}\; \frac{1}{2}\|w\|_2^2 + \frac{C}{N}\sum_{i=1}^N \xi_i \quad \text{s.t.}\quad \forall i \in [1;N]\; y_i(w^\top x_i + b) \geq 1 - \xi_i \text{ and } \xi_i \geq 0$$

- The optimization problem is differentiable and convex.
- The KKT conditions are necessary and sufficient, and the duality gap is zero.
- We will kernelize the dual problem.

The Lagrangian is

$$L(w, b, \xi, \lambda, \mu) \triangleq \frac{1}{2} w^\top w + \frac{C}{N}\sum_{i=1}^N \xi_i + \sum_{i=1}^N \lambda_i \left(1 - \xi_i - y_i(w^\top x_i + b)\right) - \sum_{i=1}^N \mu_i \xi_i$$

with $\lambda \geq 0$, $\mu \geq 0$.

The Lagrange dual function is $L_D(\lambda, \mu) = \min_{w, b, \xi} L(w, b, \xi, \lambda, \mu)$, and the dual problem is $\max_{\lambda \geq 0, \mu \geq 0} L_D(\lambda, \mu)$.
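As an illustration (not part of the original slides), here is a minimal numerical sketch of this primal problem on a synthetic two-class dataset, assuming the cvxpy package is available; the data and all variable names are made up for the example.

import numpy as np
import cvxpy as cp

# Synthetic toy data: N points in d dimensions, labels in {-1, +1}
rng = np.random.default_rng(0)
N, d, C = 40, 2, 1.0
X = np.vstack([rng.normal(-1.0, 1.0, (N // 2, d)),
               rng.normal(+1.0, 1.0, (N // 2, d))])
y = np.hstack([-np.ones(N // 2), np.ones(N // 2)])

w = cp.Variable(d)
b = cp.Variable()
xi = cp.Variable(N)  # slack variables

# (1/2)||w||^2 + (C/N) sum_i xi_i  s.t.  y_i (w^T x_i + b) >= 1 - xi_i, xi_i >= 0
objective = cp.Minimize(0.5 * cp.sum_squares(w) + (C / N) * cp.sum(xi))
constraints = [cp.multiply(y, X @ w + b) >= 1 - xi, xi >= 0]
cp.Problem(objective, constraints).solve()
print(w.value, b.value)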
OPTIMAL SOFT-MARGIN HYPERPLANE: KERNELIZATION

Let's simplify $L_D(\lambda, \mu)$ using the KKT conditions. Stationarity gives $w = \sum_{i=1}^N \lambda_i y_i x_i$, $\sum_{i=1}^N \lambda_i y_i = 0$, and $\mu_i = \frac{C}{N} - \lambda_i$, which eliminates $w$, $b$, $\xi$, and $\mu$.

Lemma (Simplification of dual function). The dual function is

$$L_D(\lambda, \mu) = -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \lambda_i \lambda_j y_i y_j x_i^\top x_j + \sum_{i=1}^N \lambda_i$$

Lemma (Simplification of dual problem). The dual optimization problem is

$$\max_{\lambda}\; -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \lambda_i \lambda_j y_i y_j x_i^\top x_j + \sum_{i=1}^N \lambda_i \quad \text{s.t.}\quad \sum_{i=1}^N \lambda_i y_i = 0 \;\text{ and }\; \forall i \in [1;N]\; 0 \leq \lambda_i \leq \frac{C}{N}$$

The data enter only through the inner products $x_i^\top x_j$, which is what makes kernelization possible. This is a quadratic program: we can solve for $\lambda^*$ very efficiently.
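For comparison, a sketch of this dual QP (again assuming cvxpy, and reusing the hypothetical X, y, N, C from the primal sketch above); replacing the Gram matrix $K = XX^\top$ with a kernel matrix $(k(x_i, x_j))_{ij}$ is exactly the kernelization step.

import numpy as np
import cvxpy as cp

K = X @ X.T             # Gram matrix; swap in a kernel matrix to kernelize
Q = np.outer(y, y) * K  # Q_ij = y_i y_j x_i^T x_j (positive semidefinite)

lam = cp.Variable(N)
# psd_wrap tells cvxpy to trust that Q is PSD despite floating-point noise
objective = cp.Maximize(cp.sum(lam) - 0.5 * cp.quad_form(lam, cp.psd_wrap(Q)))
constraints = [lam >= 0, lam <= C / N, y @ lam == 0]
cp.Problem(objective, constraints).solve()
lam_star = lam.value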
OPTIMAL SOFT-MARGIN HYPERPLANE: PRIMAL SOLUTIONS

Assume that we now know $(\lambda^*, \mu^*)$; how do we find $(w^*, b^*)$?

Lemma (Finding primal solutions).

$$w^* = \sum_{i=1}^N \lambda_i^* y_i x_i \quad\text{and}\quad b^* = y_i - w^{*\top} x_i \;\text{ for some } i \in [1;N] \text{ such that } 0 < \lambda_i^* < \frac{C}{N}$$

- The only data points that matter are those for which $\lambda_i^* \neq 0$.
- By complementary slackness, these are the points for which $y_i(w^{*\top} x_i + b) = 1 - \xi_i^*$.
- These points are called support vectors; they lie on or inside the margin.
- In practice, the number of support vectors is often $\ll N$.
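Continuing the sketch (hypothetical lam_star, X, y, C, N carried over from the dual sketch above), the lemma is a few lines of numpy; averaging $b^*$ over all margin support vectors rather than picking a single $i$ is a common numerical-stability variant.

import numpy as np

eps = 1e-6  # tolerance: solver output is only approximately zero
w_star = (lam_star * y) @ X  # w* = sum_i lam_i* y_i x_i

# Margin support vectors: 0 < lam_i* < C/N
on_margin = (lam_star > eps) & (lam_star < C / N - eps)
b_star = np.mean(y[on_margin] - X[on_margin] @ w_star)

support = lam_star > eps  # all support vectors
print(f"{support.sum()} support vectors out of N = {N}")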