HAL Id: hal-00668212
https://hal.archives-ouvertes.fr/hal-00668212
Submitted on 9 Feb 2012

To cite this version: Nathalie Villa-Vialaneix, Fabrice Rossi. Classification and regression based on derivatives: a consistency result. II Simposio sobre Modelamiento Estadístico, Dec 2010, Valparaiso, Chile. hal-00668212

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Classification and regression based on derivatives: a consistency result

Nathalie Villa-Vialaneix (joint work with Fabrice Rossi)
http://www.nathalievilla.org

II Simposio sobre Modelamiento Estadístico, Valparaiso, December 3rd
Outline
1 Introduction and motivations
2 A general consistency result
3 Examples
Introduction and motivations

Regression and classification from an infinite dimensional predictor

Settings
- $(X, Y)$ is a random pair of variables where $Y \in \{-1, 1\}$ (binary classification problem) or $Y \in \mathbb{R}$;
- $X \in (\mathcal{X}, \langle \cdot, \cdot \rangle_{\mathcal{X}})$, an infinite dimensional Hilbert space;
- we are given a learning set $S_n = \{(X_i, Y_i)\}_{i=1}^{n}$ of $n$ i.i.d. copies of $(X, Y)$.

Purpose: find $\phi_n : \mathcal{X} \to \{-1, 1\}$ or $\mathbb{R}$ that is universally consistent:
- Classification case: $\lim_{n \to +\infty} \mathbb{P}(\phi_n(X) \neq Y) = L^*$, where $L^* = \inf_{\phi : \mathcal{X} \to \{-1, 1\}} \mathbb{P}(\phi(X) \neq Y)$ is the Bayes risk.
- Regression case: $\lim_{n \to +\infty} \mathbb{E}\left[(\phi_n(X) - Y)^2\right] = L^*$, where $L^* = \inf_{\phi : \mathcal{X} \to \mathbb{R}} \mathbb{E}\left[(\phi(X) - Y)^2\right]$ will also be called the Bayes risk.
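To make the notion of Bayes risk concrete, here is a small simulation, a hypothetical finite-dimensional stand-in rather than anything from the talk: the regression function `eta(x)` (i.e. $\mathbb{P}(Y = 1 \mid X = x)$) is chosen by assumption, the Bayes classifier predicts $1$ iff $\eta(x) \geq 1/2$, and its empirical misclassification rate approaches $L^* = \mathbb{E}[\min(\eta(X), 1 - \eta(X))]$.

```python
import numpy as np

# Toy illustration of the Bayes risk L* = inf_phi P(phi(X) != Y).
# eta(x) = P(Y = 1 | X = x) is an assumed regression function, so here
# L* = E[min(eta(X), 1 - eta(X))] = 0.2 exactly.
rng = np.random.default_rng(2)

def eta(x):
    return np.where(x >= 0.5, 0.8, 0.2)

n = 200_000
X = rng.uniform(size=n)
Y = np.where(rng.uniform(size=n) < eta(X), 1, -1)

# The Bayes classifier: predict 1 iff eta(x) >= 1/2.
bayes_pred = np.where(X >= 0.5, 1, -1)
empirical_risk = np.mean(bayes_pred != Y)  # close to L* = 0.2
```

A consistent rule $\phi_n$ is one whose risk converges to this same limit as $n$ grows, without knowing $\eta$.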
Introduction and motivations

An example

Predicting the rate of yellow berry in durum wheat from its NIR spectrum.
Introduction and motivations

Using derivatives

Practically, $X^{(m)}$ is often more relevant than $X$ for the prediction. But $X \to X^{(m)}$ induces information loss:
$$\inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} \mathbb{P}\left(\phi(X^{(m)}) \neq Y\right) \geq \inf_{\phi : \mathcal{X} \to \{-1, 1\}} \mathbb{P}(\phi(X) \neq Y) = L^*$$
and
$$\inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} \mathbb{E}\left[\left(\phi(X^{(m)}) - Y\right)^2\right] \geq \inf_{\phi : \mathcal{X} \to \mathbb{R}} \mathbb{E}\left[(\phi(X) - Y)^2\right] = L^*.$$
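A minimal sketch of the derivative preprocessing step, on a synthetic curve rather than the NIR spectra of the talk (the grid, noise level, and spline orders below are all assumptions): a smoothing spline is fitted to noisy samples of $X$, and its derivative is evaluated on the grid.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical illustration: a sampled curve X(t) observed with noise,
# from which a smoothed 2nd derivative X^(2) is computed on the grid.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)                      # sampling grid tau_d
X = np.sin(2 * np.pi * t) + 0.01 * rng.standard_normal(t.size)

# Smoothing-spline fit; s targets the residual sum of squares (~ n * sigma^2).
spline = UnivariateSpline(t, X, k=5, s=t.size * 1e-4)
X_d2 = spline.derivative(n=2)(t)                    # estimated 2nd derivative

# For comparison: the true 2nd derivative is -(2*pi)^2 * sin(2*pi*t).
```

Away from the boundary, the spline derivative tracks the true one closely; the derivative curve often separates classes that the raw curves do not.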
Introduction and motivations

Sampled functions

Practically, $(X_i)_i$ are not perfectly known; only a discrete sampling is given: $X_i^{\tau_d} = (X_i(t))_{t \in \tau_d}$ where $\tau_d = \{t_1^{\tau_d}, \ldots, t_{|\tau_d|}^{\tau_d}\}$. The sampling can be non uniform, and the data can be corrupted by noise.

Then, $X^{(m)}$ is estimated from $X_i^{\tau_d}$ by $\widehat{X}_{\tau_d}^{(m)}$, which also induces information loss:
$$\inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} \mathbb{P}\left(\phi(\widehat{X}_{\tau_d}^{(m)}) \neq Y\right) \geq \inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} \mathbb{P}\left(\phi(X^{(m)}) \neq Y\right) \geq L^*$$
and
$$\inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} \mathbb{E}\left[\left(\phi(\widehat{X}_{\tau_d}^{(m)}) - Y\right)^2\right] \geq \inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} \mathbb{E}\left[\left(\phi(X^{(m)}) - Y\right)^2\right] \geq L^*.$$
Introduction and motivations

Purpose of the presentation

Find a classifier or a regression function $\phi_{n,\tau_d}$ built from $\widehat{X}_{\tau_d}^{(m)}$ such that the risk of $\phi_{n,\tau_d}$ asymptotically reaches the Bayes risk $L^*$:
$$\lim_{|\tau_d| \to +\infty} \lim_{n \to +\infty} \mathbb{P}\left(\phi_{n,\tau_d}(\widehat{X}_{\tau_d}^{(m)}) \neq Y\right) = L^*$$
or
$$\lim_{|\tau_d| \to +\infty} \lim_{n \to +\infty} \mathbb{E}\left[\left(\phi_{n,\tau_d}(\widehat{X}_{\tau_d}^{(m)}) - Y\right)^2\right] = L^*.$$

Main idea: use a relevant way to estimate $X^{(m)}$ from $X^{\tau_d}$ (by smoothing splines) and combine the consistency of splines with the consistency of an $\mathbb{R}^{|\tau_d|}$-classifier or regression function.
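The two-step scheme can be sketched as follows. This is a hedged toy version, not the talk's exact estimator: the curves, the noise level, the spline tuning, and the choice of a 1-nearest-neighbour rule as the finite-dimensional classifier are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Toy sketch of the two-step scheme:
# (1) estimate each curve's derivative by a smoothing spline,
# (2) apply a finite-dimensional rule (here 1-NN) in R^{|tau_d|}.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)  # sampling grid tau_d

def make_curve(label):
    # Two hypothetical classes of noisy sampled curves.
    base = np.sin(2 * np.pi * t) if label == 1 else np.sin(4 * np.pi * t)
    return base + 0.05 * rng.standard_normal(t.size)

def spline_derivative(x):
    # Step (1): smoothing-spline estimate of X' evaluated on tau_d.
    return UnivariateSpline(t, x, k=4, s=t.size * 0.05**2).derivative(1)(t)

labels = np.array([1, -1] * 30)
features = np.array([spline_derivative(make_curve(y)) for y in labels])

def knn_predict(x_new, k=1):
    # Step (2): k-NN on the derivative features in R^{|tau_d|}.
    dist = np.linalg.norm(features - spline_derivative(x_new), axis=1)
    nearest = labels[np.argsort(dist)[:k]]
    return 1 if nearest.sum() >= 0 else -1

pred = knn_predict(make_curve(1))
```

The consistency result of the talk makes this heuristic precise: if the finite-dimensional rule is consistent in $\mathbb{R}^{|\tau_d|}$ and the spline estimate converges, the combined risk reaches $L^*$ in the double limit.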
A general consistency result

Outline
1 Introduction and motivations
2 A general consistency result
3 Examples
A general consistency result

Basics about smoothing splines I

Suppose that $\mathcal{X}$ is the Sobolev space
$$\mathcal{H}^m = \left\{ h \in L^2[0, 1] \;\middle|\; \forall j = 1, \ldots, m, \ D^j h \text{ exists (weak sense) and } D^m h \in L^2 \right\}$$
equipped with the scalar product
$$\langle u, v \rangle_{\mathcal{H}^m} = \langle D^m u, D^m v \rangle_{L^2} + \sum_{j=1}^{m} B^j u \, B^j v$$
where $B$ are $m$ boundary conditions such that $\operatorname{Ker} B \cap \mathcal{P}^{m-1} = \{0\}$.

$(\mathcal{H}^m, \langle \cdot, \cdot \rangle_{\mathcal{H}^m})$ is a RKHS: there exist $k_0 : \mathcal{P}^{m-1} \times \mathcal{P}^{m-1} \to \mathbb{R}$ and $k_1 : \operatorname{Ker} B \times \operatorname{Ker} B \to \mathbb{R}$ such that
$$\forall u \in \mathcal{P}^{m-1}, \ t \in [0, 1], \quad \langle u, k_0(t, \cdot) \rangle_{\mathcal{H}^m} = u(t)$$
and
$$\forall u \in \operatorname{Ker} B, \ t \in [0, 1], \quad \langle u, k_1(t, \cdot) \rangle_{\mathcal{H}^m} = u(t).$$

See [Berlinet and Thomas-Agnan, 2004] for further details.
A general consistency result

Basics about smoothing splines II

A simple example of boundary conditions: $h(0) = h^{(1)}(0) = \ldots = h^{(m-1)}(0) = 0$. Then,
$$k_0(s, t) = \sum_{k=0}^{m-1} \frac{t^k s^k}{(k!)^2}$$
and
$$k_1(s, t) = \int_0^1 \frac{(t - w)_+^{m-1} (s - w)_+^{m-1}}{[(m-1)!]^2} \, dw.$$
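These two kernels are easy to evaluate numerically. The sketch below is an illustration written for this note, not code from the talk; the squared $(m-1)!$ normalisation in `k1` is taken to mirror the $(k!)^2$ in `k0` (the usual convention for this Sobolev kernel). For $m = 2$ and $s \leq t$, `k1` reduces to the familiar closed form $s^2 t / 2 - s^3 / 6$ of the cubic smoothing-spline kernel.

```python
from math import factorial
from scipy.integrate import quad

# Illustrative evaluation of the kernels for the boundary
# conditions h(0) = h'(0) = ... = h^{(m-1)}(0) = 0.
def k0(s, t, m):
    # k0(s, t) = sum_{k=0}^{m-1} t^k s^k / (k!)^2
    return sum((t ** k * s ** k) / factorial(k) ** 2 for k in range(m))

def k1(s, t, m):
    # k1(s, t) = int_0^1 (t-w)_+^{m-1} (s-w)_+^{m-1} / ((m-1)!)^2 dw,
    # computed by numerical quadrature.
    c = factorial(m - 1) ** 2
    val, _ = quad(
        lambda w: max(t - w, 0.0) ** (m - 1) * max(s - w, 0.0) ** (m - 1) / c,
        0.0, 1.0,
    )
    return val
```

Both kernels are symmetric by construction; only their sum reproduces a general element of $\mathcal{H}^m$, since $k_0$ handles the polynomial part and $k_1$ the part in $\operatorname{Ker} B$.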
A general consistency result

Estimating the predictors with smoothing splines I

Assumption (A1)
- the $|\tau_d| \geq m - 1$ sampling points are distinct in $[0, 1]$;
- the $B^j$ are linearly independent from $h \mapsto h(t)$ for all $t \in \tau_d$.