Lecture 1: Introduction to RKHS
MLSS Cadiz, 2016
Gatsby Unit, CSML, UCL
May 12, 2016

Sections: Feature space; Basics of reproducing kernel Hilbert spaces; Kernel Ridge Regression

Kernels and feature space (1): XOR example

[Figure: red and blue points in the $(x_1, x_2)$ plane, both axes from $-5$ to $5$.]

No linear classifier separates red from blue.
Map points to a higher-dimensional feature space:
$\phi(x) = \begin{bmatrix} x_1 & x_2 & x_1 x_2 \end{bmatrix} \in \mathbb{R}^3$
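A minimal sketch of why the extra coordinate $x_1 x_2$ helps. The four sample points, their labels (class $+1$ when $x_1$ and $x_2$ share a sign) and the direction `w` are assumptions chosen for illustration, not values read off the figure.

```python
def phi(x):
    """Feature map from the slide: (x1, x2) -> (x1, x2, x1 * x2)."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

def dot(u, v):
    """Euclidean dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical XOR-style sample: label +1 if x1 and x2 share a sign, else -1.
points = [(2.0, 3.0), (-1.0, -4.0), (3.0, -2.0), (-2.0, 1.0)]
labels = [1, 1, -1, -1]

# Assumed separating direction in feature space: use only the x1 * x2 coordinate.
w = (0.0, 0.0, 1.0)
predictions = [1 if dot(w, phi(p)) > 0 else -1 for p in points]

print(predictions == labels)
```

In the original two coordinates no single line gives these labels, while in $\mathbb{R}^3$ the plane through the origin with normal `w` does.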

Kernels and feature space (2): smoothing

[Figure: three panels on identical axes, horizontal from $-0.5$ to $1.5$, vertical from $-1$ to $0.6$.]

Kernel methods can control smoothness and avoid overfitting/underfitting.

Basics of reproducing kernel Hilbert spaces
Subsections: What is a kernel?; Constructing new kernels; Positive definite functions; Reproducing kernel Hilbert space

Outline: reproducing kernel Hilbert space
We will describe in order:
1. Hilbert space
2. Kernel (lots of examples: e.g. you can build kernels from simpler kernels)
3. Reproducing property

Hilbert space

Definition (Inner product)
Let $\mathcal{H}$ be a vector space over $\mathbb{R}$. A function $\langle \cdot, \cdot \rangle_{\mathcal{H}} : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ is an inner product on $\mathcal{H}$ if
1. Linear: $\langle \alpha_1 f_1 + \alpha_2 f_2, g \rangle_{\mathcal{H}} = \alpha_1 \langle f_1, g \rangle_{\mathcal{H}} + \alpha_2 \langle f_2, g \rangle_{\mathcal{H}}$
2. Symmetric: $\langle f, g \rangle_{\mathcal{H}} = \langle g, f \rangle_{\mathcal{H}}$
3. $\langle f, f \rangle_{\mathcal{H}} \ge 0$ and $\langle f, f \rangle_{\mathcal{H}} = 0$ if and only if $f = 0$.

Norm induced by the inner product: $\|f\|_{\mathcal{H}} := \sqrt{\langle f, f \rangle_{\mathcal{H}}}$

Definition (Hilbert space)
An inner product space that contains the limits of all its Cauchy sequences.

Kernel

Definition
Let $\mathcal{X}$ be a non-empty set. A function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a kernel if there exists an $\mathbb{R}$-Hilbert space $\mathcal{H}$ and a map $\phi : \mathcal{X} \to \mathcal{H}$ such that for all $x, x' \in \mathcal{X}$,
$k(x, x') := \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$.

Almost no conditions on $\mathcal{X}$ (e.g. $\mathcal{X}$ itself doesn't need an inner product; it could be a set of documents).
A single kernel can correspond to several possible features. A trivial example for $\mathcal{X} := \mathbb{R}$:
$\phi_1(x) = x$ and $\phi_2(x) = \begin{bmatrix} x/\sqrt{2} \\ x/\sqrt{2} \end{bmatrix}$
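The trivial example can be checked numerically. This is a small sketch: the helper `kernel_from` and the test pairs are hypothetical names and values introduced here, and both feature maps are taken with the ordinary Euclidean inner product.

```python
import math

def phi1(x):
    """One-dimensional feature map: phi_1(x) = x."""
    return (x,)

def phi2(x):
    """Two-dimensional feature map: phi_2(x) = (x / sqrt(2), x / sqrt(2))."""
    return (x / math.sqrt(2), x / math.sqrt(2))

def kernel_from(phi, x, y):
    """Kernel induced by a feature map: the dot product of phi(x) and phi(y)."""
    return sum(a * b for a, b in zip(phi(x), phi(y)))

# Arbitrary pairs of real inputs (assumed for illustration).
pairs = [(1.5, -2.0), (0.0, 3.0), (4.0, 0.25)]
max_gap = max(
    abs(kernel_from(phi1, x, y) - kernel_from(phi2, x, y)) for x, y in pairs
)
print(max_gap < 1e-12)
```

Both maps give $k(x, x') = x x'$ up to floating-point rounding, so the kernel does not determine the feature map.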

New kernels from old: sums, transformations

Theorem (Sums of kernels are kernels)
Given $\alpha > 0$ and $k$, $k_1$ and $k_2$ all kernels on $\mathcal{X}$, then $\alpha k$ and $k_1 + k_2$ are kernels on $\mathcal{X}$.
(Proof via positive definiteness: later!)
A difference of kernels may not be a kernel (why?)

Theorem (Mappings between spaces)
Let $\mathcal{X}$ and $\widetilde{\mathcal{X}}$ be sets, and define a map $A : \mathcal{X} \to \widetilde{\mathcal{X}}$. Define the kernel $k$ on $\widetilde{\mathcal{X}}$. Then $k(A(x), A(x'))$ is a kernel on $\mathcal{X}$.
Example: $k(x, x') = x^2 (x')^2$.
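A short sketch of both statements on $\mathcal{X} = \mathbb{R}$. The choices here are assumptions for illustration: the base kernel is the linear kernel $k(x, x') = x x'$, the map is $A(x) = x^2$ (which reproduces the slide's example), and the difference is formed as $k - 2k$. Any kernel satisfies $k(x, x) = \|\phi(x)\|^2 \ge 0$, so a negative diagonal value rules a function out.

```python
def k_lin(x, y):
    """Linear kernel on the real line, with feature map phi(x) = x."""
    return x * y

def A(x):
    """Assumed map A for the slide's example: x -> x**2."""
    return x ** 2

def k_mapped(x, y):
    """k(A(x), A(y)) = x**2 * y**2, a kernel by the mapping theorem."""
    return k_lin(A(x), A(y))

def k_diff(x, y):
    """Difference of two kernels, k_lin - 2 * k_lin, which equals -k_lin."""
    return k_lin(x, y) - 2.0 * k_lin(x, y)

print(k_mapped(3.0, 2.0))
print(k_diff(1.0, 1.0))
```

Here `k_diff(1.0, 1.0)` is negative, so this particular difference cannot be written as an inner product of features.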

New kernels from old: products

Theorem (Products of kernels are kernels)
Given $k_1$ on $\mathcal{X}_1$ and $k_2$ on $\mathcal{X}_2$, then $k_1 \times k_2$ is a kernel on $\mathcal{X}_1 \times \mathcal{X}_2$. If $\mathcal{X}_1 = \mathcal{X}_2 = \mathcal{X}$, then $k := k_1 \times k_2$ is a kernel on $\mathcal{X}$.

Proof: Main idea only! (Below, $I$ denotes an indicator, and $\bullet_1$, $\bullet_2$ denote the two colors.)
$\mathcal{H}_1$ space of kernels between shapes,
$\phi_1(x) = \begin{bmatrix} I_{\square} \\ I_{\triangle} \end{bmatrix}$, $\phi_1(\square) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $k_1(\square, \triangle) = 0$.
$\mathcal{H}_2$ space of kernels between colors,
$\phi_2(x) = \begin{bmatrix} I_{\bullet_1} \\ I_{\bullet_2} \end{bmatrix}$, $\phi_2(\bullet_2) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $k_2(\bullet_2, \bullet_2) = 1$.

New kernels from old: products

"Natural" feature space for colored shapes (with $\bullet_1$, $\bullet_2$ denoting the two colors):
$\Phi(x) = \begin{bmatrix} I_{\bullet_1} I_{\square} & I_{\bullet_1} I_{\triangle} \\ I_{\bullet_2} I_{\square} & I_{\bullet_2} I_{\triangle} \end{bmatrix} = \begin{bmatrix} I_{\bullet_1} \\ I_{\bullet_2} \end{bmatrix} \begin{bmatrix} I_{\square} & I_{\triangle} \end{bmatrix} = \phi_2(x) \phi_1^\top(x)$

Kernel is:
$k(x, x') = \sum_{i \in \{\bullet_1, \bullet_2\}} \sum_{j \in \{\square, \triangle\}} \Phi_{ij}(x) \Phi_{ij}(x') = \mathrm{tr}\Big( \phi_1(x) \underbrace{\phi_2^\top(x) \phi_2(x')}_{k_2(x, x')} \phi_1^\top(x') \Big)$
$= \mathrm{tr}\Big( \underbrace{\phi_1^\top(x') \phi_1(x)}_{k_1(x, x')} \Big) k_2(x, x') = k_1(x, x') \, k_2(x, x')$
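The identity can be checked on the colored-shapes example. In this sketch the color names `"red"` and `"blue"` and the helper names are assumptions; items are `(shape, color)` pairs, and `Phi` is the outer product $\phi_2(x)\phi_1^\top(x)$ stored as a nested list.

```python
SHAPES = ("square", "triangle")
COLORS = ("red", "blue")  # assumed color names, for illustration only

def phi1(item):
    """Indicator features of the shape of an item (shape, color)."""
    return [1.0 if item[0] == s else 0.0 for s in SHAPES]

def phi2(item):
    """Indicator features of the color of an item (shape, color)."""
    return [1.0 if item[1] == c else 0.0 for c in COLORS]

def dot(u, v):
    """Euclidean dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def Phi(item):
    """Outer product phi2(x) phi1(x)^T: rows index colors, columns index shapes."""
    return [[c * s for s in phi1(item)] for c in phi2(item)]

def frobenius(M, N):
    """Sum over i, j of M[i][j] * N[i][j]."""
    return sum(a * b for row_m, row_n in zip(M, N) for a, b in zip(row_m, row_n))

items = [(s, c) for s in SHAPES for c in COLORS]
agree = all(
    frobenius(Phi(x), Phi(y)) == dot(phi1(x), phi1(y)) * dot(phi2(x), phi2(y))
    for x in items
    for y in items
)
print(agree)
```

The double sum over entries of $\Phi$ equals $k_1 k_2$ for every pair of items, which is the main idea of the proof in this finite case.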

Sums and products $\implies$ polynomials

Theorem (Polynomial kernels)
Let $x, x' \in \mathbb{R}^d$ for $d \ge 1$, and let $m \ge 1$ be an integer and $c \ge 0$ be a non-negative real. Then
$k(x, x') := \left( \langle x, x' \rangle + c \right)^m$
is a valid kernel.

To prove: expand into a sum (with non-negative scalars) of kernels $\langle x, x' \rangle$ raised to integer powers. These individual terms are valid kernels by the product rule.
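The expansion used in the proof can be written out with the binomial theorem. This is a sketch of the step rather than a full proof; the index $j$ is introduced here for illustration.

```latex
\[
\left( \langle x, x' \rangle + c \right)^m
  = \sum_{j=0}^{m} \binom{m}{j} \, c^{\,m-j} \, \langle x, x' \rangle^{j},
\qquad
\binom{m}{j} \, c^{\,m-j} \ge 0 \ \text{ for all } j .
\]
```

For $j \ge 1$, $\langle x, x' \rangle^{j}$ is a product of $j$ copies of the linear kernel, hence a kernel by the product rule. The $j = 0$ term is the constant $c^m \ge 0$, which is the kernel of the constant feature map $\phi(x) = \sqrt{c^m}$. Terms with a zero coefficient drop out, and the remaining terms combine into a kernel by the scaling and sum rules.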

Infinite sequences

The kernels we've seen so far are dot products between finitely many features. E.g.
$k(x, y) = \begin{bmatrix} \sin(x) & x^3 & \log x \end{bmatrix}^\top \begin{bmatrix} \sin(y) & y^3 & \log y \end{bmatrix}$
where $\phi(x) = \begin{bmatrix} \sin(x) & x^3 & \log x \end{bmatrix}$.

Can a kernel be a dot product between infinitely many features?

Infinite sequences

Definition
The space $\ell_2$ (square summable sequences) comprises all sequences $a := (a_i)_{i \ge 1}$ for which
$\|a\|_{\ell_2}^2 = \sum_{i=1}^{\infty} a_i^2 < \infty$.

Definition
Given a sequence of functions $(\phi_i(x))_{i \ge 1}$ in $\ell_2$, where $\phi_i : \mathcal{X} \to \mathbb{R}$ is the $i$th coordinate of $\phi(x)$, define
$k(x, x') := \sum_{i=1}^{\infty} \phi_i(x) \phi_i(x')$   (1)
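One concrete instance of definition (1), offered as an illustration rather than an example from the slides: assume $\mathcal{X} = \mathbb{R}$ and $\phi_i(x) = x^i / \sqrt{i!}$. Then $\sum_{i \ge 1} \phi_i(x)^2 = e^{x^2} - 1 < \infty$, so $\phi(x) \in \ell_2$, and the series in (1) sums to $e^{x x'} - 1$. The sketch below compares a truncated sum with that closed form; the truncation length 30 and the test inputs are arbitrary choices.

```python
import math

def phi(x, n_terms):
    """First n_terms coordinates of the assumed map phi_i(x) = x**i / sqrt(i!)."""
    return [x ** i / math.sqrt(math.factorial(i)) for i in range(1, n_terms + 1)]

def k_truncated(x, y, n_terms):
    """Partial sum of equation (1) over the first n_terms features."""
    return sum(a * b for a, b in zip(phi(x, n_terms), phi(y, n_terms)))

def k_closed_form(x, y):
    """Limit of the series: sum over i >= 1 of (x * y)**i / i! = exp(x * y) - 1."""
    return math.exp(x * y) - 1.0

x, y = 1.2, -0.7
gap = abs(k_truncated(x, y, 30) - k_closed_form(x, y))
print(gap < 1e-12)
```

The partial sums settle quickly because the terms decay factorially, which is why a modest truncation already matches the closed form to rounding error for these inputs.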
