
Lecture 1: Introduction to RKHS (MLSS Cadiz, 2016; Gatsby Unit, CSML, UCL)



  1. Lecture 1: Introduction to RKHS. MLSS Cadiz, 2016. Gatsby Unit, CSML, UCL. May 12, 2016. Lecture sections: Feature space; Basics of reproducing kernel Hilbert spaces; Kernel Ridge Regression.

  2. Kernels and feature space (1): XOR example. [Figure: scatter plot of red and blue points in the $(x_1, x_2)$ plane, both axes running from −5 to 5.] No linear classifier separates red from blue. Map points to a higher-dimensional feature space: $\phi(x) = \begin{bmatrix} x_1 & x_2 & x_1 x_2 \end{bmatrix}^\top \in \mathbb{R}^3$.
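
The XOR slide can be illustrated with a minimal sketch. The four points and their labels below are hypothetical (the data in the slide's figure is not recoverable); the sketch only assumes that the label is the sign of $x_1 x_2$. Under that assumption, a classifier that is linear in the feature space, with the illustrative weight vector `w = (0, 0, 1)`, separates the two classes.

```python
def phi(x):
    """Feature map from the slide: (x1, x2) -> (x1, x2, x1 * x2)."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

# Hypothetical XOR-style data (not taken from the slide's figure):
# label +1 when x1 and x2 share a sign, -1 otherwise.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# Illustrative weight vector: a linear classifier in feature space
# that only looks at the x1 * x2 coordinate.
w = (0.0, 0.0, 1.0)

def predict(x):
    """Sign of the linear score <w, phi(x)>."""
    score = sum(wi * fi for wi, fi in zip(w, phi(x)))
    return 1 if score > 0 else -1

predictions = [predict(x) for x in points]
print(predictions == labels)
```

No weight vector acting on $(x_1, x_2)$ alone can reproduce these labels, which is why the extra coordinate $x_1 x_2$ matters.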

  3. Kernels and feature space (2): smoothing. [Figure: three panels showing fits to the same data, horizontal axes from −0.5 to 1.5, vertical axes from −1 to 0.6.] Kernel methods can control smoothness and avoid overfitting/underfitting.

  4. Section: Basics of reproducing kernel Hilbert spaces (subsections: What is a kernel?; Constructing new kernels; Positive definite functions; Reproducing kernel Hilbert space). Outline: reproducing kernel Hilbert space. We will describe, in order: (1) Hilbert space; (2) Kernel (lots of examples: e.g. you can build kernels from simpler kernels); (3) Reproducing property.

  5. Hilbert space. Definition (Inner product): Let $\mathcal{H}$ be a vector space over $\mathbb{R}$. A function $\langle \cdot, \cdot \rangle_{\mathcal{H}} : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ is an inner product on $\mathcal{H}$ if
     (1) Linear: $\langle \alpha_1 f_1 + \alpha_2 f_2, g \rangle_{\mathcal{H}} = \alpha_1 \langle f_1, g \rangle_{\mathcal{H}} + \alpha_2 \langle f_2, g \rangle_{\mathcal{H}}$
     (2) Symmetric: $\langle f, g \rangle_{\mathcal{H}} = \langle g, f \rangle_{\mathcal{H}}$
     (3) $\langle f, f \rangle_{\mathcal{H}} \ge 0$, and $\langle f, f \rangle_{\mathcal{H}} = 0$ if and only if $f = 0$.

  6. Hilbert space (continued). Norm induced by the inner product: $\|f\|_{\mathcal{H}} := \sqrt{\langle f, f \rangle_{\mathcal{H}}}$.

  7. Hilbert space (continued). Definition (Hilbert space): An inner product space that contains the limits of all its Cauchy sequences, i.e. one that is complete with respect to the norm induced by the inner product.

  8. Kernel. Definition: Let $\mathcal{X}$ be a non-empty set. A function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a kernel if there exists an $\mathbb{R}$-Hilbert space $\mathcal{H}$ and a map $\phi : \mathcal{X} \to \mathcal{H}$ such that $\forall x, x' \in \mathcal{X}$,
     $k(x, x') := \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$.
     Almost no conditions on $\mathcal{X}$ (e.g. $\mathcal{X}$ itself doesn't need an inner product; it could be a set of documents). A single kernel can correspond to several possible feature maps. A trivial example for $\mathcal{X} := \mathbb{R}$:
     $\phi_1(x) = x$ and $\phi_2(x) = \begin{bmatrix} x/\sqrt{2} & x/\sqrt{2} \end{bmatrix}^\top$.
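
As a quick numerical check of the trivial example above, the following sketch evaluates both feature maps and confirms that they induce the same kernel $k(x, x') = x x'$ on $\mathbb{R}$. The helper names `dot` and `k_from` and the test inputs are illustrative choices, not part of the lecture.

```python
import math

def phi1(x):
    """One-dimensional feature map: phi_1(x) = x."""
    return (x,)

def phi2(x):
    """Two-dimensional feature map: phi_2(x) = (x / sqrt(2), x / sqrt(2))."""
    return (x / math.sqrt(2), x / math.sqrt(2))

def dot(u, v):
    """Euclidean inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def k_from(phi, x, y):
    """Kernel induced by a feature map: k(x, y) = <phi(x), phi(y)>."""
    return dot(phi(x), phi(y))

# Arbitrary test inputs; both feature maps should reproduce x * y.
pairs = [(1.5, -2.0), (0.3, 0.7), (-4.0, -0.25)]
agree = all(
    math.isclose(k_from(phi1, x, y), x * y)
    and math.isclose(k_from(phi2, x, y), x * y)
    for x, y in pairs
)
print(agree)
```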

  9. New kernels from old: sums, transformations. Theorem (Sums of kernels are kernels): Given $\alpha > 0$ and $k$, $k_1$ and $k_2$ all kernels on $\mathcal{X}$, then $\alpha k$ and $k_1 + k_2$ are kernels on $\mathcal{X}$. (Proof via positive definiteness: later!) A difference of kernels may not be a kernel (why?).

  10. New kernels from old: sums, transformations (continued). Theorem (Mappings between spaces): Let $\mathcal{X}$ and $\widetilde{\mathcal{X}}$ be sets, and define a map $A : \mathcal{X} \to \widetilde{\mathcal{X}}$. Define the kernel $k$ on $\widetilde{\mathcal{X}}$. Then $k(A(x), A(x'))$ is a kernel on $\mathcal{X}$. Example: $k(x, x') = x^2 (x')^2$.
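
A small sketch of the two statements on sums and transformations, using the linear kernel $k(x, x') = x x'$ on $\mathbb{R}$ as an assumed base kernel. The first part builds the example $x^2 (x')^2$ by composing the linear kernel with the map $A(x) = x^2$. The second part gives one possible answer to the "why?": any kernel satisfies $k(x, x) = \|\phi(x)\|^2 \ge 0$, and the particular difference chosen here (the linear kernel minus twice itself) violates that.

```python
def k_linear(x, y):
    """Linear kernel on the real line, with feature map phi(x) = x."""
    return x * y

def A(x):
    """Map between spaces used in the slide's example: A(x) = x^2."""
    return x ** 2

def k_mapped(x, y):
    """Kernel on the original space: k_linear(A(x), A(y)) = x^2 * y^2."""
    return k_linear(A(x), A(y))

def k_diff(x, y):
    """Difference of two kernels (k_linear and 2 * k_linear); equals -x * y."""
    return k_linear(x, y) - 2.0 * k_linear(x, y)

print(k_mapped(2.0, 3.0))   # x^2 * y^2 evaluated at x = 2, y = 3
print(k_diff(1.0, 1.0))     # negative, so k_diff(x, x) cannot be a squared norm
```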

  11. New kernels from old: products. Theorem (Products of kernels are kernels): Given $k_1$ on $\mathcal{X}_1$ and $k_2$ on $\mathcal{X}_2$, then $k_1 \times k_2$ is a kernel on $\mathcal{X}_1 \times \mathcal{X}_2$. If $\mathcal{X}_1 = \mathcal{X}_2 = \mathcal{X}$, then $k := k_1 \times k_2$ is a kernel on $\mathcal{X}$. Proof: main idea only! (The two colors on the slide are written $\bullet_1$ and $\bullet_2$ here, and $\mathbb{I}$ denotes an indicator.)
     $\mathcal{H}_1$, space of kernels between shapes: $\phi_1(x) = \begin{bmatrix} \mathbb{I}_{\square} \\ \mathbb{I}_{\triangle} \end{bmatrix}$, $\phi_1(\square) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $k_1(\square, \triangle) = 0$.
     $\mathcal{H}_2$, space of kernels between colors: $\phi_2(x) = \begin{bmatrix} \mathbb{I}_{\bullet_1} \\ \mathbb{I}_{\bullet_2} \end{bmatrix}$, $\phi_2(\bullet_2) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $k_2(\bullet_2, \bullet_2) = 1$.

  12. New kernels from old: products (continued). "Natural" feature space for colored shapes (two colors $\bullet_1$, $\bullet_2$; two shapes $\square$, $\triangle$; $\mathbb{I}$ denotes an indicator):
     $\Phi(x) = \begin{bmatrix} \mathbb{I}_{\bullet_1 \square} & \mathbb{I}_{\bullet_1 \triangle} \\ \mathbb{I}_{\bullet_2 \square} & \mathbb{I}_{\bullet_2 \triangle} \end{bmatrix} = \begin{bmatrix} \mathbb{I}_{\bullet_1} \\ \mathbb{I}_{\bullet_2} \end{bmatrix} \begin{bmatrix} \mathbb{I}_{\square} & \mathbb{I}_{\triangle} \end{bmatrix} = \phi_2(x)\, \phi_1^\top(x)$.

  13. New kernels from old: products (continued). With $\Phi(x) = \phi_2(x)\, \phi_1^\top(x)$ for two colors $\bullet_1$, $\bullet_2$ and two shapes $\square$, $\triangle$, the kernel is:
     $k(x, x') = \sum_{i \in \{\bullet_1, \bullet_2\}} \sum_{j \in \{\square, \triangle\}} \Phi_{ij}(x)\, \Phi_{ij}(x') = \operatorname{tr}\Big( \phi_1(x) \underbrace{\phi_2^\top(x)\, \phi_2(x')}_{k_2(x, x')} \phi_1^\top(x') \Big) = \operatorname{tr}\Big( \underbrace{\phi_1^\top(x')\, \phi_1(x)}_{k_1(x, x')} \Big)\, k_2(x, x') = k_1(x, x')\, k_2(x, x')$.
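
The product construction for colored shapes can be checked numerically with a short sketch. The color names `"red"` and `"blue"` are hypothetical stand-ins (the actual colors on the slides are not recoverable), and objects are represented as `(shape, color)` pairs purely for illustration. The sketch builds $\Phi(x) = \phi_2(x)\,\phi_1^\top(x)$ as an outer product and verifies that the entrywise (Frobenius) inner product equals $k_1 k_2$ for every pair of objects.

```python
SHAPES = ("square", "triangle")
COLORS = ("red", "blue")  # hypothetical color names

def phi1(x):
    """Shape indicator features; x is a (shape, color) pair."""
    shape, _ = x
    return [1.0 if shape == s else 0.0 for s in SHAPES]

def phi2(x):
    """Color indicator features; x is a (shape, color) pair."""
    _, color = x
    return [1.0 if color == c else 0.0 for c in COLORS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def k1(x, y):
    return dot(phi1(x), phi1(y))

def k2(x, y):
    return dot(phi2(x), phi2(y))

def Phi(x):
    """Outer product phi2(x) phi1(x)^T: rows index colors, columns index shapes."""
    return [[c * s for s in phi1(x)] for c in phi2(x)]

def k_joint(x, y):
    """Sum over all entries of Phi_ij(x) * Phi_ij(y)."""
    Px, Py = Phi(x), Phi(y)
    return sum(
        Px[i][j] * Py[i][j]
        for i in range(len(COLORS))
        for j in range(len(SHAPES))
    )

objects = [(s, c) for s in SHAPES for c in COLORS]
product_rule_holds = all(
    k_joint(x, y) == k1(x, y) * k2(x, y) for x in objects for y in objects
)
print(product_rule_holds)
```

With indicator features the joint kernel is 1 only when both shape and color match, which is exactly the product of the two individual kernels.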

  14. Sums and products $\implies$ polynomials. Theorem (Polynomial kernels): Let $x, x' \in \mathbb{R}^d$ for $d \ge 1$, and let $m \ge 1$ be an integer and $c \ge 0$ be a non-negative real. Then
     $k(x, x') := \left( \langle x, x' \rangle + c \right)^m$
     is a valid kernel. To prove: expand into a sum (with non-negative scalars) of kernels $\langle x, x' \rangle$ raised to integer powers. These individual terms are valid kernels by the product rule.
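
For a concrete instance of the polynomial kernel, the sketch below takes the special case $d = 2$, $m = 2$ and writes out one explicit feature map obtained by expanding $(x_1 x'_1 + x_2 x'_2 + c)^2$. The function names and test vectors are illustrative; the expansion itself is the standard one, with every coefficient non-negative as the proof idea requires.

```python
import math

def poly_kernel(x, y, c=1.0, m=2):
    """Polynomial kernel (<x, y> + c)^m."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** m

def phi_poly2(x, c=1.0):
    """Explicit features for d = 2, m = 2.

    (x1*y1 + x2*y2 + c)^2 expands to
    x1^2 y1^2 + x2^2 y2^2 + 2 x1 x2 y1 y2 + 2c x1 y1 + 2c x2 y2 + c^2.
    """
    x1, x2 = x
    return (
        x1 * x1,
        x2 * x2,
        math.sqrt(2.0) * x1 * x2,
        math.sqrt(2.0 * c) * x1,
        math.sqrt(2.0 * c) * x2,
        c,
    )

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, y, c=1.0, m=2)
rhs = dot(phi_poly2(x, c=1.0), phi_poly2(y, c=1.0))
print(math.isclose(lhs, rhs))
```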

  15. Infinite sequences. The kernels we've seen so far are dot products between finitely many features. E.g.
     $k(x, y) = \phi(x)^\top \phi(y) = \sin(x)\sin(y) + x^3 y^3 + \log x \log y$, where $\phi(x) = \begin{bmatrix} \sin(x) & x^3 & \log x \end{bmatrix}^\top$.
     Can a kernel be a dot product between infinitely many features?

  16. Infinite sequences (continued). Definition: The space $\ell_2$ (square summable sequences) comprises all sequences $a := (a_i)_{i \ge 1}$ for which
     $\|a\|_{\ell_2}^2 = \sum_{i=1}^{\infty} a_i^2 < \infty$.

  17. Infinite sequences (continued). Definition: Given a sequence of functions $(\phi_i(x))_{i \ge 1}$ in $\ell_2$, where $\phi_i : \mathcal{X} \to \mathbb{R}$ is the $i$th coordinate of $\phi(x)$. Then
     $k(x, x') := \sum_{i=1}^{\infty} \phi_i(x)\, \phi_i(x')$    (1)
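
To make the infinite-feature definition concrete, here is a sketch with one illustrative choice of features that is not taken from the slides: $\phi_i(x) = x^i / \sqrt{i!}$ for $i \ge 1$ on $\mathcal{X} = \mathbb{R}$. This sequence is square summable for every $x$, since $\sum_{i \ge 1} x^{2i}/i! = e^{x^2} - 1$, and the series in (1) sums to $e^{x x'} - 1$. The code compares a truncated sum against that closed form; the truncation length of 30 is an arbitrary assumption that is ample for the small inputs used.

```python
import math

def phi(x, n_terms):
    """First n_terms coordinates phi_i(x) = x**i / sqrt(i!), for i = 1..n_terms."""
    return [x ** i / math.sqrt(math.factorial(i)) for i in range(1, n_terms + 1)]

def k_truncated(x, y, n_terms=30):
    """Partial sum of sum_i phi_i(x) * phi_i(y)."""
    return sum(a * b for a, b in zip(phi(x, n_terms), phi(y, n_terms)))

def k_closed_form(x, y):
    """Limit of the series: exp(x * y) - 1."""
    return math.exp(x * y) - 1.0

x, y = 0.5, -1.2
approx = k_truncated(x, y)
exact = k_closed_form(x, y)
print(math.isclose(approx, exact, rel_tol=1e-9))

# Squared l2 norm of the feature sequence, compared with exp(x^2) - 1.
sq_norm = sum(a * a for a in phi(x, 30))
print(math.isclose(sq_norm, math.exp(x * x) - 1.0, rel_tol=1e-9))
```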
