Convex Programs
COMPSCI 371D — Machine Learning
Support Vector Machines (SVMs) and Convex Programs
• SVMs are linear predictors in their original form
• Defined for both regression and classification
• Multi-class versions exist
• We will cover only binary SVM classification
• Why do we need another linear classifier?
• We'll need some new math: convex programs
• Optimization of convex functions subject to affine constraints
Outline
1 Logistic Regression → Support Vector Machines
2 Local Convex Minimization → Convex Programs
3 Shape of the Solution Set
4 The Karush-Kuhn-Tucker Conditions
Logistic Regression → SVMs
• A logistic-regression classifier places the decision boundary somewhere (and only approximately) between the two classes
• The loss is never zero → the exact location of the boundary can be influenced by samples that are very distant from it (even samples on the correct side)
• SVMs place the boundary "exactly half-way" between the two classes (with exceptions to allow for classes that are not linearly separable)
• Only samples close to the boundary matter: these are the support vectors
• A "kernel trick" allows going beyond linear classifiers
• We only look at the binary case
Roadmap for SVMs
• SVM training minimizes a convex function subject to constraints
• Convex: unique minimum risk
• Constraints: they define a convex program, the minimization of a convex function subject to affine constraints
• Representer theorem: the normal vector of the SVM hyperplane is a linear combination of a subset of the training samples (x_n, y_n); these x_n are the support vectors
• The proof of the representer theorem is based on a characterization of the solutions of a convex program
• Characterization for an unconstrained problem: ∇f(u) = 0
• Characterization for a convex program: the Karush-Kuhn-Tucker (KKT) conditions
• The representer theorem leads to the kernel trick, through which SVMs can be turned into nonlinear classifiers
• The decision boundary is then no longer necessarily a hyperplane
Roadmap Summary
Convex program → SVM formulation
KKT conditions → representer theorem → kernel trick
Local Convex Minimization → Convex Programs
• Convex function f : R^m → R
• f differentiable, with continuous first derivatives
• Unconstrained minimization: u* ∈ arg min_{u ∈ R^m} f(u)
• Constrained minimization: u* ∈ arg min_{u ∈ C} f(u) with C = {u ∈ R^m : Au + b ≥ 0}
• f is a convex function
• C is a convex set: if u, v ∈ C, then t u + (1 − t) v ∈ C for all t ∈ [0, 1]
• This particular C is bounded by hyperplanes
• This is a convex program
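To make the convexity of C concrete, here is a minimal numerical sketch (not from the slides; the matrix A, vector b, and test points are made-up examples): it checks that every point on the segment between two feasible points is itself feasible.

```python
import numpy as np

# Hypothetical affine constraint set C = {u in R^2 : A u + b >= 0}
A = np.array([[ 1.0,  0.0],    # u1 >= 0
              [ 0.0,  1.0],    # u2 >= 0
              [-1.0, -1.0]])   # u1 + u2 <= 2
b = np.array([0.0, 0.0, 2.0])

def in_C(u, tol=1e-12):
    """Feasibility test: all affine inequalities hold at u."""
    return np.all(A @ u + b >= -tol)

# Convexity check: interpolations of feasible points stay feasible.
u = np.array([0.5, 0.5])
v = np.array([1.0, 0.8])
assert in_C(u) and in_C(v)
for t in np.linspace(0.0, 1.0, 11):
    assert in_C(t * u + (1 - t) * v)
```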
Convex Program
u* ∈ arg min_{u ∈ C} f(u)   where, by definition,   C = {u ∈ R^m : c(u) ≥ 0}
• f differentiable, with continuous gradient, and convex
• The k inequalities in C are affine: c(u) = Au + b ≥ 0
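As a sketch of what solving such a program looks like in practice (the quadratic objective and the constraints are assumptions for illustration, not part of the slides), the snippet below uses scipy.optimize.minimize. The unconstrained minimizer lies outside C, so a constraint is active at the solution.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up convex objective: f(u) = ||u - p||^2 with p outside C,
# so the constraints bind at the solution.
p = np.array([2.0, 2.0])
f = lambda u: np.sum((u - p) ** 2)
grad_f = lambda u: 2 * (u - p)

# Same affine constraints c(u) = A u + b >= 0 as in the sketch above.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 2.0])

res = minimize(f, x0=np.zeros(2), jac=grad_f,
               constraints=[{'type': 'ineq',
                             'fun': lambda u: A @ u + b,
                             'jac': lambda u: A}])
print(res.x)   # approximately [1, 1], on the boundary u1 + u2 = 2
```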
Shape of the Solution Set
• Just as for the unconstrained problem:
  • There is one f* but there can be multiple u* (a flat valley)
  • The set of solution points u* is convex
  • If f is strictly convex at u*, then u* is the unique solution point
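A small sketch of the flat-valley case (the objective is made up for illustration): f(u) = (u1 + u2 − 1)^2 is minimized on the entire line u1 + u2 = 1, so different starting points can return different minimizers u* with the same value f*, and any convex combination of two minimizers is again a minimizer.

```python
import numpy as np
from scipy.optimize import minimize

# Convex objective with a flat valley: minimized on the line u1 + u2 = 1.
f = lambda u: (u[0] + u[1] - 1.0) ** 2

sol_a = minimize(f, x0=np.array([5.0, -3.0]))
sol_b = minimize(f, x0=np.array([-2.0, 4.0]))

# Different minimizers u*, same minimum value f* ...
print(sol_a.x, sol_b.x)      # two different points with u1 + u2 ≈ 1
print(sol_a.fun, sol_b.fun)  # both ≈ 0

# ... and the solution set is convex: midpoints are minimizers too.
mid = 0.5 * (sol_a.x + sol_b.x)
print(f(mid))                # ≈ 0
```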
Zero Gradient → KKT Conditions
• For the unconstrained problem, the solution is characterized by ∇f(u) = 0
• Constraints can generate new minima and maxima
• Example: f(u) = e^u on 0 ≤ u ≤ 1
[Figure: three plots of f(u) = e^u for u ∈ [0, 1], showing minima and maxima created at the interval endpoints]
• What is the new characterization?
• The Karush-Kuhn-Tucker conditions, which are necessary and sufficient
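The e^u example can be checked numerically (the sketch below is illustrative; the interval [0, 1] is read off the figure). The minimum is at the boundary point u = 0, where f'(0) = 1 ≠ 0, so the zero-gradient characterization fails; the KKT condition introduced later holds there with c1(u) = u active and multiplier α = 1 ≥ 0.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# f(u) = e^u on [0, 1]: the constrained minimum is at the boundary u = 0,
# where f'(0) = 1 != 0, so "gradient = 0" no longer characterizes it.
res = minimize_scalar(np.exp, bounds=(0.0, 1.0), method='bounded')
print(res.x)        # ≈ 0

# KKT preview: with c1(u) = u >= 0 active at u = 0,
# f'(0) = alpha * c1'(0) holds with alpha = 1 >= 0.
print(np.exp(res.x))  # f'(u) = e^u ≈ 1 at the solution
```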
Regular Points
[Figure: a regular point u on the boundary of C, with the gradient ∇f, a direction s, and the half-spaces H− and H+]
Corner Points
[Figure: two panels showing a corner point u of C where constraints c_1 and c_2 are both active, with the gradient ∇f, a direction s, and the half-spaces H− and H+]
The Convex Cone of the Constraint Gradients
[Figure: at a corner point u, the gradient ∇f lies in the convex cone spanned by the gradients of the active constraints c_1 and c_2]
Inactive Constraints Do Not Matter
[Figure: two panels; constraints c_1 and c_2 are active at the point u, while c_3 is inactive there (it is active at a different point v) and plays no role in the characterization at u]
Conic Combinations
[Figure: vectors a_1 and a_2 with normals n_1 and n_2, a vector v, and ∇f at a point u, with half-spaces H− and H+]
The convex cone spanned by a_1 and a_2: {v : v = α_1 a_1 + α_2 a_2 with α_1, α_2 ≥ 0}
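Membership in such a cone can be tested by nonnegative least squares: v is a conic combination of the generators iff the residual of min ||Gα − v|| over α ≥ 0 is zero. A minimal sketch (the generators and test vectors are made up) using scipy.optimize.nnls:

```python
import numpy as np
from scipy.optimize import nnls

def in_cone(v, generators, tol=1e-9):
    """Is v a nonnegative combination of the generator vectors?"""
    G = np.column_stack(generators)   # columns a_1, ..., a_k
    alpha, residual = nnls(G, v)      # min ||G alpha - v||, alpha >= 0
    return residual <= tol, alpha

a1 = np.array([1.0, 0.0])
a2 = np.array([1.0, 1.0])
print(in_cone(np.array([2.0, 1.0]), [a1, a2]))   # inside: alpha ≈ [1, 1]
print(in_cone(np.array([-1.0, 0.5]), [a1, a2]))  # outside the cone
```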
The KKT Conditions
u ∈ C is a solution to a convex program if and only if there exist α_i such that
∇f(u) = Σ_{i ∈ A(u)} α_i ∇c_i(u)   with   α_i ≥ 0
where A(u) = {i : c_i(u) = 0} is the active set at u
[Figure: ∇f at u in the convex cone of the active constraint gradients ∇c_1 and ∇c_2]
Convention: Σ_{i ∈ ∅} = 0, so the condition also holds in the interior of C, where it reduces to ∇f(u) = 0
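To close the loop, here is a sketch of a numerical KKT check for the earlier made-up quadratic program (the tolerance-based active-set detection is a simplification for illustration): detect the active set, then test whether ∇f(u) is a nonnegative combination of the active constraint gradients.

```python
import numpy as np
from scipy.optimize import nnls

# Same made-up program as before:
# minimize ||u - (2,2)||^2 subject to A u + b >= 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 2.0])
p = np.array([2.0, 2.0])
grad_f = lambda u: 2 * (u - p)

def kkt_holds(u, tol=1e-6):
    """Test grad f(u) = sum of alpha_i * grad c_i(u), alpha_i >= 0,
    over the constraints active at u."""
    c = A @ u + b
    active = np.abs(c) <= tol             # active set A(u)
    if not np.any(active):                # interior: need grad f(u) = 0
        return np.allclose(grad_f(u), 0.0, atol=tol)
    G = A[active].T                       # columns: active gradients
    alpha, residual = nnls(G, grad_f(u))
    return residual <= tol

print(kkt_holds(np.array([1.0, 1.0])))   # True: the solution
print(kkt_holds(np.array([0.5, 0.5])))   # False: interior, grad f != 0
```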