Kernel Properties - Convexity
Leila Wehbe, October 1st 2013


Kernel Properties
The data is not linearly separable: use a feature vector $\Phi(x)$ of the data in another space.
We can even use infinite-dimensional feature vectors.
Because of the kernel trick, you do not have to compute the feature vectors $\Phi(x)$ explicitly (you will kernelize an algorithm in HW2).

Kernels
A kernel is a dot product in feature space: $k(x, x') = \langle \Phi(x), \Phi(x') \rangle$.
We can write the kernel in matrix form over the data sample: $K_{ij} = \langle \Phi(x_i), \Phi(x_j) \rangle = k(x_i, x_j)$. This is called a Gram matrix.
$K$ is positive semi-definite, i.e. $\alpha^\top K \alpha \ge 0$ for all $\alpha \in \mathbb{R}^m$ and all kernel matrices $K \in \mathbb{R}^{m \times m}$.
Proof (from class):
$$\sum_{i,j}^{m} \alpha_i \alpha_j K_{ij} = \sum_{i,j}^{m} \alpha_i \alpha_j \langle \Phi(x_i), \Phi(x_j) \rangle = \Big\langle \sum_{i}^{m} \alpha_i \Phi(x_i), \sum_{j}^{m} \alpha_j \Phi(x_j) \Big\rangle = \Big\| \sum_{i}^{m} \alpha_i \Phi(x_i) \Big\|^2 \ge 0$$
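The identity in the proof can be spot-checked numerically. The sketch below is only an illustration: the feature map `phi(x) = (1, x, x^2)`, the sample size, and the random seed are assumptions made for this example and do not come from the slides.

```python
import random


def phi(x):
    # Hypothetical explicit feature map for a scalar input: (1, x, x^2).
    return [1.0, x, x * x]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def k(x, y):
    # Kernel defined as a dot product in feature space.
    return dot(phi(x), phi(y))


def gram(xs):
    # Gram matrix K[i][j] = k(x_i, x_j) over the sample.
    return [[k(xi, xj) for xj in xs] for xi in xs]


def quad_form(K, alpha):
    # alpha^T K alpha
    n = len(K)
    return sum(alpha[i] * alpha[j] * K[i][j] for i in range(n) for j in range(n))


rng = random.Random(0)
m = 5
xs = [rng.uniform(-2.0, 2.0) for _ in range(m)]
alpha = [rng.uniform(-1.0, 1.0) for _ in range(m)]

K = gram(xs)
lhs = quad_form(K, alpha)

# s = sum_i alpha_i * phi(x_i); the proof says lhs equals ||s||^2.
s = [sum(alpha[i] * phi(xs[i])[d] for i in range(m)) for d in range(3)]
rhs = dot(s, s)

print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding, and both are non-negative because the right-hand side is a squared norm. A single random `alpha` is a sanity check, not a proof.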

Kernels
By Mercer's theorem, for any symmetric, square-integrable function $k : X \times X \to \mathbb{R}$ that satisfies
$$\int_{X \times X} k(x, x') f(x) f(x') \, dx \, dx' \ge 0$$
for all square-integrable $f$, there exist a feature space $\Phi(x)$ and coefficients $\lambda_i \ge 0$ such that
$$k(x, x') = \sum_i \lambda_i \phi_i(x) \phi_i(x')$$
(so we have $k(x, x') = \langle \Phi'(x), \Phi'(x') \rangle$).
In discrete space the condition reads $\sum_i \sum_j K(x_i, x_j) c_i c_j \ge 0$.
Any Gram matrix derived from a kernel $k$ is positive semi-definite $\Leftrightarrow$ $k$ is a valid kernel (dot product).

Exercises
$k(x, x')$ is a valid kernel. Show that $f(x) f(x') k(x, x')$ is a kernel.

Exercises
Answer:
$$f(x) f(y) k(x, y) = f(x) f(y) \langle \phi(x), \phi(y) \rangle = \langle f(x) \phi(x), f(y) \phi(y) \rangle = \langle \phi'(x), \phi'(y) \rangle$$
where $\phi'(x) = f(x) \phi(x)$ is the new feature map.
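As a small illustration of this answer, the rescaled feature map can be written out and compared against the product form. The feature map `phi`, the function `f`, and the test points are arbitrary choices assumed for this sketch.

```python
import math


def phi(x):
    # Hypothetical feature map for scalar inputs.
    return [1.0, x, x * x]


def f(x):
    # Any real-valued function works; this one is an arbitrary choice.
    return math.sin(x) + 2.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def k(x, y):
    return dot(phi(x), phi(y))


def phi_prime(x):
    # Rescaled feature map phi'(x) = f(x) * phi(x).
    return [f(x) * c for c in phi(x)]


x, y = 0.7, -1.3
lhs = f(x) * f(y) * k(x, y)            # f(x) f(y) k(x, y)
rhs = dot(phi_prime(x), phi_prime(y))  # <phi'(x), phi'(y)>
print(lhs, rhs)
```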

Exercises
$k_1(x, x')$ and $k_2(x, x')$ are valid kernels. Show that $c_1 k_1(x, x') + c_2 k_2(x, x')$, where $c_1, c_2 \ge 0$, is a valid kernel (there are multiple ways to show it).

Exercises
Answer 1: For any function $f(\cdot)$:
$$\int_{x, x'} f(x) f(x') \left[ c_1 k_1(x, x') + c_2 k_2(x, x') \right] dx \, dx' = c_1 \int_{x, x'} f(x) f(x') k_1(x, x') \, dx \, dx' + c_2 \int_{x, x'} f(x) f(x') k_2(x, x') \, dx \, dx' \ge 0$$
since $\int_{x, x'} f(x) f(x') k_1(x, x') \, dx \, dx' \ge 0$ and $\int_{x, x'} f(x) f(x') k_2(x, x') \, dx \, dx' \ge 0$, because $k_1$ and $k_2$ are valid kernels.

Exercises
Answer 2: Here is another way to prove it. Given any finite set of instances $\{x_1, \ldots, x_n\}$, let $K_1$ (resp. $K_2$) be the $n \times n$ Gram matrix associated with $k_1$ (resp. $k_2$). The Gram matrix associated with $c_1 k_1 + c_2 k_2$ is just $K = c_1 K_1 + c_2 K_2$.
$K$ is PSD because for any $v \in \mathbb{R}^n$,
$$v^\top (c_1 K_1 + c_2 K_2) v = c_1 (v^\top K_1 v) + c_2 (v^\top K_2 v) \ge 0,$$
as $v^\top K_1 v \ge 0$ and $v^\top K_2 v \ge 0$ follow from $K_1$ and $K_2$ being positive semi-definite. Therefore $k$ is a valid kernel.

Exercises
Answer 3: Let $\Phi_1$ and $\Phi_2$ be the feature vectors associated with $k_1$ and $k_2$ respectively. Take the vector $\Phi$ which is the concatenation of $\sqrt{c_1} \Phi_1$ and $\sqrt{c_2} \Phi_2$, i.e.
$$\Phi(x) = \left[ \sqrt{c_1} \phi^1_1(x), \sqrt{c_1} \phi^1_2(x), \ldots, \sqrt{c_1} \phi^1_m(x), \sqrt{c_2} \phi^2_1(x), \sqrt{c_2} \phi^2_2(x), \ldots, \sqrt{c_2} \phi^2_m(x) \right].$$
It is easy to check that, with $\phi_i$ the $i$-th component of $\Phi$ and $N = 2m$,
$$\langle \Phi(x), \Phi(x') \rangle = \sum_{i=1}^{N} \phi_i(x) \phi_i(x') = c_1 \sum_{i=1}^{m} \phi^1_i(x) \phi^1_i(x') + c_2 \sum_{i=1}^{m} \phi^2_i(x) \phi^2_i(x')$$
$$= c_1 \langle \Phi_1(x), \Phi_1(x') \rangle + c_2 \langle \Phi_2(x), \Phi_2(x') \rangle = c_1 k_1(x, x') + c_2 k_2(x, x') = k(x, x'),$$
therefore $k$ is a valid kernel.
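The concatenation argument can be sketched in a few lines. The two feature maps `phi1` and `phi2`, the weights, and the test points are hypothetical choices for illustration. The two maps are given different lengths here, which the argument allows.

```python
import math


def phi1(x):
    # Hypothetical feature map associated with k1.
    return [1.0, x]


def phi2(x):
    # Hypothetical feature map associated with k2.
    return [x * x, math.cos(x), math.sin(x)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi_concat(x, c1, c2):
    # Concatenation of sqrt(c1) * phi1(x) and sqrt(c2) * phi2(x).
    scaled1 = [math.sqrt(c1) * a for a in phi1(x)]
    scaled2 = [math.sqrt(c2) * b for b in phi2(x)]
    return scaled1 + scaled2


c1, c2 = 0.5, 3.0  # any non-negative weights
x, y = 1.2, -0.4

lhs = dot(phi_concat(x, c1, c2), phi_concat(y, c1, c2))
rhs = c1 * dot(phi1(x), phi1(y)) + c2 * dot(phi2(x), phi2(y))
print(lhs, rhs)
```

The square roots require $c_1, c_2 \ge 0$, which is exactly where the non-negativity assumption of the exercise is used.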

Exercises
$k_1$ and $k_2$ are valid kernels. Show that $k_1(x, x') - k_2(x, x')$ is not necessarily a kernel.

Exercises
Proof by counterexample: Consider the kernel $k_1$ being the identity ($k_1(x, x') = 1$ if $x = x'$ and $0$ otherwise), and $k_2$ being twice the identity ($k_2(x, x') = 2$ if $x = x'$ and $0$ otherwise). Let $K_1 = I_p$ be the $p \times p$ identity matrix and $K_2 = 2 I_p$ be 2 times that identity matrix. $K_1$ and $K_2$ are the Gram matrices associated with $k_1$ and $k_2$ respectively. Clearly both $K_1$ and $K_2$ are positive semi-definite, however $K_1 - K_2 = -I_p$ is not, as all its eigenvalues are $-1$. Therefore $k = k_1 - k_2$ is not a valid kernel.
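The counterexample can be checked through the quadratic form $v^\top (K_1 - K_2) v$. This is a minimal sketch: the dimension `p = 3` and the vector `v` are arbitrary values picked for the example.

```python
def quad_form(K, v):
    # v^T K v
    n = len(K)
    return sum(v[i] * v[j] * K[i][j] for i in range(n) for j in range(n))


p = 3
K1 = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]  # I_p
K2 = [[2.0 * K1[i][j] for j in range(p)] for i in range(p)]          # 2 I_p
D = [[K1[i][j] - K2[i][j] for j in range(p)] for i in range(p)]      # K1 - K2 = -I_p

v = [1.0, -2.0, 0.5]
q1 = quad_form(K1, v)  # ||v||^2
q2 = quad_form(K2, v)  # 2 ||v||^2
qd = quad_form(D, v)   # -||v||^2, negative for any non-zero v
print(q1, q2, qd)
```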

Exercises
$A$ and $B$ are PSD matrices. Show that $AB$ is not necessarily PSD.

Exercises
For PSD matrices $A$ and $B$, it suffices to show that $AB$ is not symmetric, so just use
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix};$$
here
$$AB = \begin{pmatrix} 2 & 1 \\ 2 & 4 \end{pmatrix},$$
which is not symmetric.
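The product in this answer is quick to verify by hand or in code. The sketch below uses $A = \mathrm{diag}(1, 2)$ and $B$ with rows $(2, 1)$ and $(1, 2)$, as in the answer above; the helper `matmul` is written out for this example only.

```python
def matmul(A, B):
    # Plain matrix product for small list-of-lists matrices.
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 0], [0, 2]]
B = [[2, 1], [1, 2]]

AB = matmul(A, B)
is_symmetric = all(AB[i][j] == AB[j][i] for i in range(2) for j in range(2))
print(AB, is_symmetric)
```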

Exercises
$k_1$ and $k_2$ are valid kernels. Show that the element-wise product $k(x_i, x_j) = k_1(x_i, x_j) \times k_2(x_i, x_j)$ is a valid kernel.
Start by showing that if matrices $A$ and $B$ are PSD, then $C$ with $C_{ij} = A_{ij} \times B_{ij}$ is PSD.

Exercises
Answer: First show that $C$ s.t. $C_{ij} = A_{ij} \times B_{ij}$ is PSD. One way to show it:
1. Any PSD matrix $Q$ is a covariance matrix. To see this, think of a $p$-dimensional random variable $x$ with covariance matrix $I_p$, the identity matrix ($Q$ is $p \times p$). Because $Q$ is PSD it admits a non-negative symmetric square root $Q^{1/2}$. Then:
$$\mathrm{cov}(Q^{1/2} x) = Q^{1/2} \, \mathrm{cov}(x) \, Q^{1/2} = Q^{1/2} I Q^{1/2} = Q.$$
And therefore $Q$ is a covariance matrix.
2. We also know that any covariance matrix is PSD. So given $A$ and $B$ PSD, we know that they are covariance matrices. We want to show that $C$ is also a covariance matrix and therefore PSD.

Exercises
3. Let $u = (u_1, \ldots, u_p)^\top \sim N(0_p, A)$ and $v = (v_1, \ldots, v_p)^\top \sim N(0_p, B)$ be independent, where $0_p$ is a $p$-dimensional vector of zeros.
4. Define the vector $w = (u_1 v_1, \ldots, u_p v_p)^\top$.
$$\mathrm{cov}(w) = E[(w - \mu_w)(w - \mu_w)^\top] = E[w w^\top]$$
This is because $\mu_{w_i} = 0$ for all $i$: $u$ and $v$ are independent, so $\mu_w = \mu_u \times \mu_v = 0_p$.
$$\mathrm{cov}(w)_{i,j} = E[w_i w_j] = E[(u_i v_i)(u_j v_j)] = E[(u_i u_j)(v_i v_j)] = E[u_i u_j] \, E[v_i v_j]$$
This is again because $u$ and $v$ are independent.
$$\mathrm{cov}(w)_{i,j} = E[u_i u_j] \, E[v_i v_j] = A_{i,j} \times B_{i,j} = C_{i,j}$$

Exercises
5. Therefore $C$ is a covariance matrix and therefore PSD.
6. Since any kernel matrix created from $k(x_i, x_j) = k_1(x_i, x_j) \times k_2(x_i, x_j)$ is PSD, $k$ is a valid (PSD) kernel.
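The conclusion that the element-wise product of two PSD matrices is PSD can be spot-checked numerically. This sketch does not simulate the covariance argument. Instead it builds PSD matrices as $X X^\top$ (an assumption made for the example, along with the sizes and the seed) and evaluates the quadratic form of $C$ on random vectors.

```python
import random

rng = random.Random(1)
n, d = 4, 3  # matrix size and number of hypothetical feature columns


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def gram_of_rows(X):
    # X X^T is PSD for any real matrix X.
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def quad_form(M, v):
    # v^T M v
    size = len(M)
    return sum(v[i] * v[j] * M[i][j] for i in range(size) for j in range(size))


A = gram_of_rows(rand_matrix(n, d))
B = gram_of_rows(rand_matrix(n, d))

# Element-wise product C_ij = A_ij * B_ij.
C = [[A[i][j] * B[i][j] for j in range(n)] for i in range(n)]

# Smallest quadratic form value seen over many random directions.
min_q = min(
    quad_form(C, [rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(200)
)
print(min_q)
```

A non-negative `min_q` is consistent with $C$ being PSD, but random probing cannot prove it. The argument above is what establishes the result.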

Exercises
$A$ is PSD. Show that $A^m$ is PSD.

Exercises
Answer: Recall $A = U D U^\top$. First we show that $A^m = U D^m U^\top$.
Proof by induction: trivially true for $m = 1$.
$$A^{m+1} = A A^m = U D U^\top (U D^m U^\top) = U D (U^\top U) D^m U^\top = U D D^m U^\top = U D^{m+1} U^\top$$
Hence, the eigenvalues of $A^m$ are the diagonal elements of $D^m$, which are $\lambda_i^m$ (where $\{\lambda_i\}$ are the diagonal elements of $D$). Since $\lambda_i \ge 0$, these eigenvalues $\lambda_i^m$ are also $\ge 0$. This means $A^m$ is PSD.
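A tiny worked instance shows the eigenvalues being raised to the power $m$. The matrix `A`, the exponent `m = 3`, and the closed-form 2x2 eigenvalue helper are choices made for this sketch, not part of the slides.

```python
import math


def matmul(A, B):
    return [
        [sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def matpow(A, m):
    # A^m for an integer m >= 1 by repeated multiplication.
    result = A
    for _ in range(m - 1):
        result = matmul(A, result)
    return result


def eig_sym_2x2(M):
    # Eigenvalues (smallest first) of a symmetric 2x2 matrix [[a, b], [b, d]].
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return (mean - radius, mean + radius)


A = [[2, 1], [1, 2]]  # symmetric PSD with eigenvalues 1 and 3
m = 3

A_m = matpow(A, m)
lam = eig_sym_2x2(A)      # eigenvalues of A
lam_m = eig_sym_2x2(A_m)  # eigenvalues of A^m, expected lam_i ** m
print(A_m, lam, lam_m)
```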

Exercises
$k(x, x')$ is a valid kernel. Show that $k(x, y)^2 \le k(x, x) \, k(y, y)$.

Exercises
Answer:
$$k(x, y)^2 = \langle \phi(x), \phi(y) \rangle^2 = \|\phi(x)\|^2 \, \|\phi(y)\|^2 \, \cos^2\!\big(\theta_{\phi(x), \phi(y)}\big) \le \|\phi(x)\|^2 \, \|\phi(y)\|^2 = k(x, x) \, k(y, y)$$
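The inequality can be checked on a grid for a concrete kernel. The degree-2 polynomial kernel and the grid below are assumptions chosen for illustration; any valid kernel should behave the same way.

```python
def k(x, y):
    # Degree-2 polynomial kernel on scalars, a standard valid kernel.
    return (1.0 + x * y) ** 2


# Grid of points in [-3, 3] with step 0.25.
grid = [i / 4.0 for i in range(-12, 13)]

# Collect pairs where k(x, y)^2 exceeds k(x, x) k(y, y) beyond a small tolerance.
violations = [
    (x, y)
    for x in grid
    for y in grid
    if k(x, y) ** 2 > k(x, x) * k(y, y) + 1e-9
]
print(len(violations))
```

Equality is reached when $x = y$, which corresponds to $\cos\theta = 1$ in the derivation.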

Introduction to Convex Optimization
Xuezhi Wang, Computer Science Department, Carnegie Mellon University
10701-recitation, Jan 29

Outline
1. Convexity: Convex Sets, Convex Functions
2. Unconstrained Convex Optimization: First-order Methods, Newton's Method


Convex Sets
Definition: A set $X$ is convex if for $x, x' \in X$ it follows that $\lambda x + (1 - \lambda) x' \in X$ for $\lambda \in [0, 1]$.
Examples:
Empty set $\emptyset$, single point $\{x_0\}$, the whole space $\mathbb{R}^n$
Hyperplane: $\{x \mid a^\top x = b\}$, halfspaces $\{x \mid a^\top x \le b\}$
Euclidean balls: $\{x \mid \|x - x_c\|_2 \le r\}$
Positive semidefinite matrices: $S^n_+ = \{A \in S^n \mid A \succeq 0\}$ ($S^n$ is the set of symmetric $n \times n$ matrices)
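The definition can be probed numerically for the Euclidean ball example. This is only a sketch: the center, the radius, the seed, and the rejection-sampling helper are assumptions made for the illustration.

```python
import math
import random

rng = random.Random(0)
center = [1.0, -2.0]  # hypothetical ball center x_c
r = 1.5               # hypothetical radius


def dist_to_center(x):
    return math.hypot(x[0] - center[0], x[1] - center[1])


def sample_in_ball():
    # Rejection sampling from the bounding square of the ball.
    while True:
        x = [center[i] + rng.uniform(-r, r) for i in range(2)]
        if dist_to_center(x) <= r:
            return x


def convex_combination(x, y, lam):
    # lam * x + (1 - lam) * y with lam in [0, 1]
    return [lam * a + (1.0 - lam) * b for a, b in zip(x, y)]


# Every convex combination of two points in the ball should stay in the ball.
all_inside = all(
    dist_to_center(
        convex_combination(sample_in_ball(), sample_in_ball(), rng.random())
    )
    <= r + 1e-9
    for _ in range(1000)
)
print(all_inside)
```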
