Anticipated and adaptive prediction in functional discriminant analysis
Cristian PREDA, Polytech’Lille, USTL, Lille, France
Gilbert SAPORTA, CNAM, Paris, France
Mohamed Hedi BEN MBAREK, Institut Supérieur de Gestion, Sousse, Tunisia
Anticipated and adaptive prediction in functional discriminant analysis – p.1/21
The context
Cookie quality at Danone and the kneading process
A cookie from Danone:
• choose a type of flour (components, density, etc.)
• kneading process (≈ 1 h)
• shape the dough and bake it (> 2 h)
• evaluate the quality of the cookies obtained
Idea: predict cookie quality from features derived from the kneading process, so that Danone saves time and money!
Dough resistance during the kneading process: $X = X(t)$, $t \in \{0, 2, 4, \ldots, 480\}$ s.
[Figure: dough resistance X(t) versus time (s).]
Flour evaluation
The quality of the cookies obtained with a given type of flour is assessed by experts.
The response (Y) is binary: the flour is Good or Bad.
90 flours were evaluated: 50 are good and 40 are bad.
90 flours: quality and dough resistance during 480 s.
[Figure: X(t) versus time for the 90 flours. Good flours in black, bad flours in red.]
Functional discriminant analysis
$X_t : \Omega \to \mathbb{R}$, $X = \{X_t\}_{t \in [0,T]}$, with
- $E(X_t^2) < \infty$,
- $X$ is $L_2$-continuous,
- $\forall \omega \in \Omega$: $(X_t(\omega))_{t \in [0,T]} \in L_2([0,T])$,
- $E(X_t) = 0$, $\forall t \in [0,T]$.
$Y : \Omega \to \{0, 1\}$.
Discriminant score: $d_T = \Phi(X)$
Discriminant score estimation
Sample: $\{(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$
• Linear discriminant score: $\Phi(X) = \langle \beta, X \rangle_{L_2[0,T]}$, $\beta \in L_2[0,T]$.
Criterion (Fisher):
$$\max_{\beta \in L_2[0,T]} \frac{V(E(\Phi(X) \mid Y))}{V(\Phi(X))}$$
Estimation by a functional linear regression model (PCR, PLS, etc.).
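As an illustration only, the linear score and the Fisher criterion can be evaluated on discretized curves. The sketch below assumes curves sampled on a common grid `t`, approximates the $L_2$ inner product by the trapezoidal rule, and computes the ratio of between-class variance to total variance of the scores. The function names `linear_score` and `fisher_ratio` are hypothetical, and this is not the PLS or PCR estimation used in the slides.

```python
import numpy as np

def linear_score(beta, x, t):
    """Approximate <beta, x> in L2[0, T] by the trapezoidal rule on grid t."""
    f = np.asarray(beta, dtype=float) * np.asarray(x, dtype=float)
    return float(np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2.0)

def fisher_ratio(scores, y):
    """Empirical V(E(score | Y)) / V(score) for a vector of scores."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y)
    overall = scores.mean()
    # Between-class variance: class proportions times squared mean shifts.
    between = sum(
        np.mean(y == c) * (scores[y == c].mean() - overall) ** 2
        for c in np.unique(y)
    )
    return float(between / scores.var())
```

A ratio close to 1 means the score is almost entirely determined by the class, which is what the maximization over $\beta$ aims for.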
• Nonparametric estimation
$$\hat{\Phi}(X) = \frac{\sum_{i=1}^{n} Y_i \, K(u(X, X_i)/h)}{\sum_{i=1}^{n} K(u(X, X_i)/h)}$$
• RKHS approximation
$$\hat{\Phi}(X) = \sum_{i=1}^{n} \alpha_i K(X_i, X)$$
Criterion (logistic loss): $L(x, y, \Phi) = -y\,\Phi(x) + \log(1 + e^{\Phi(x)})$
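A minimal sketch of the nonparametric score and of the logistic loss, under assumptions the slide does not state: $K$ is taken to be a Gaussian kernel, $u$ the $L_2$ distance between curves sampled on a uniform grid with step `dt`, and all function names are hypothetical.

```python
import numpy as np

def l2_distance(x, xi, dt):
    """L2 distance between two curves on a uniform grid (Riemann sum)."""
    return float(np.sqrt(np.sum((x - xi) ** 2) * dt))

def np_score(x, X_train, y_train, h, dt):
    """Kernel-weighted average of the labels Y_i (assumed Gaussian kernel)."""
    dists = np.array([l2_distance(x, xi, dt) for xi in X_train])
    weights = np.exp(-0.5 * (dists / h) ** 2)
    return float(np.sum(weights * np.asarray(y_train)) / np.sum(weights))

def logistic_loss(y, phi):
    """L = -y * phi + log(1 + exp(phi)), computed stably with logaddexp."""
    return float(-y * phi + np.logaddexp(0.0, phi))
```

With labels in {0, 1} the score lies in [0, 1] and can be read as an estimate of $P(Y = 1 \mid X)$. A bandwidth `h` that is very small relative to the distances can make every weight underflow to zero, so in practice `h` would be chosen by cross-validation.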
Kneading data results
[Figure: X(t) versus time for good (black) and bad (red) flours. Left: original data. Right: smoothed data.]

Model        PLS_FLDA   NP      PC_FLDA   Gaussian(6)   LDA
Error rate   0.112      0.103   0.142     0.108         0.154

Error rate averaged over 100 test samples.
Anticipated prediction
X is observed on $[0, T]$.
Problem: find the smallest $T^* < T$ such that the prediction of Y from X observed on $[0, T^*]$ is "similar" to the prediction obtained with X observed on $[0, T]$.
Discrimination power: ROC curve
– d: the discriminant score.
– threshold r: predict Y = 1 if d > r.
– "Sensitivity": $P(d > r \mid Y = 1)$
– "1 − Specificity": $P(d > r \mid Y = 0)$
Measure of discrimination: area under the ROC curve (AUC).
– Estimation of AUC
$\{Y = 1\}$: $X_1 = d \mid Y = 1$, sample of size $n_1$
$\{Y = 0\}$: $X_0 = d \mid Y = 0$, sample of size $n_2$
$$\widehat{AUC} = \frac{\#\{X_1 > X_0\}}{n_1 n_2}$$
$D = \{d_t\}_{0 < t \le T}$, $\{\widehat{AUC}(t)\}_{0 < t \le T}$
Criterion for $T^*$: compare $AUC(t)$ and $AUC(T)$ for $t < T$ and choose $T^*$ as the largest $t$ such that the test
$H_0 : AUC(t) = AUC(T)$ against $H_1 : AUC(t) < AUC(T)$
is significant (p-value < 0.05).
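The AUC estimator above counts the pairs in which a score from class Y = 1 exceeds a score from class Y = 0. The sketch below implements that count and a helper for the selection criterion. The names `auc_hat` and `select_t_star` are hypothetical, ties are counted as zero to match the strict inequality, and the p-values are assumed to come from whatever test of $H_0$ against $H_1$ is used, which the slide does not specify.

```python
import numpy as np

def auc_hat(scores_pos, scores_neg):
    """#{X1 > X0} / (n1 * n2) over all pairs of class-1 and class-0 scores."""
    s1 = np.asarray(scores_pos, dtype=float)[:, None]
    s0 = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.sum(s1 > s0) / (s1.shape[0] * s0.shape[1]))

def select_t_star(times, p_values, alpha=0.05):
    """Largest t whose one-sided test is significant, or None if none is."""
    significant = [t for t, p in zip(times, p_values) if p < alpha]
    return max(significant) if significant else None
```

For example, `auc_hat([0.9, 0.8], [0.1, 0.85])` compares four pairs, three of which are correctly ordered, giving 0.75.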
Simulation
Class $\{Y = 0\}$:
$$X_t = \begin{cases} W(1 - t), & 0 \le t \le 1 \\ -2\sin(t - 1) + W(t - 1), & 1 < t \le 2 \end{cases}$$
Class $\{Y = 1\}$:
$$X_t = \begin{cases} W(1 - t), & 0 \le t \le 1 \\ 2\sin(t - 1) + W(t - 1), & 1 < t \le 2 \end{cases}$$
[Figure: X(t) versus t on [0, 2]; Y = 1 in black, Y = 0 in blue. Sample of size n = 100 for each class of Y.]
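A sketch of how one curve of this simulation could be generated. It rests on assumptions not stated on the slide: $W$ is taken to be a standard Brownian motion with $W(0) = 0$, the same path of $W$ is used on both halves of $[0, 2]$, and the grid is uniform with `m` steps per unit of time. The function name `simulate_curve` is hypothetical.

```python
import numpy as np

def simulate_curve(label, m, rng):
    """Return (t, x) on a uniform grid of [0, 2] with 2 * m + 1 points."""
    s = np.linspace(0.0, 1.0, m + 1)
    # Assumed: W is standard Brownian motion on [0, 1], W(0) = 0.
    increments = rng.normal(0.0, np.sqrt(1.0 / m), size=m)
    w = np.concatenate(([0.0], np.cumsum(increments)))
    sign = 1.0 if label == 1 else -1.0
    first_half = w[::-1]                               # W(1 - t), 0 <= t <= 1
    second_half = sign * 2.0 * np.sin(s[1:]) + w[1:]   # 1 < t <= 2
    t = np.linspace(0.0, 2.0, 2 * m + 1)
    return t, np.concatenate((first_half, second_half))
```

Calling `simulate_curve(1, 50, np.random.default_rng(0))` gives a curve on a grid of step 0.02. Under these assumptions the two classes have the same distribution on $[0, 1]$ and differ only through the sign of the sine term afterwards, so the discrimination power can only grow for $t > 1$.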
M = 50 learning and test samples of size 100.
For each $t \in \{2.00, 1.98, \ldots, 0\}$: sample of size M = 50 of $\widehat{AUC}(t)$.
– $T^* = 1.46$.
– test statistic: $S = 1.663$
$\widehat{AUC}(T^*) = 0.856$, $\widehat{AUC}(T) = 0.872$
Kneading data
$Y \in \{\text{Bad}, \text{Good}\}$.
The sample of 90 flours is randomly divided into a learning sample of size 60 and a test sample of size 30.
Test error rate (PLS estimation) at T = 480: 0.112
– $\widehat{AUC}(T) = 0.746$.
Anticipated prediction: $T^* = 186$.
Test error rate (PLS estimation) at $T^* = 186$: 0.121
– $\widehat{AUC}(T^*) = 0.778$.
Conclusion: the predictive power of the dough curves for cookie quality is captured by the first 186 seconds of the kneading process.
Adaptive prediction
Remark: in anticipated prediction, $T^*$ is a constant.
Let $\omega \in \Omega$ be a new observation for which one wants to predict Y from X. Suppose that X is observed sequentially. The problem addressed by adaptive prediction is:
Problem: find the smallest $T^*(\omega) < T$ such that X observed on $[0, T^*(\omega)]$ provides a prediction similar to the one obtained when X is observed on $[0, T]$.
Remark:
- here $T^*$ is a random variable.
- observing $X(\omega)$ on $[T^*, T]$ will not change the prediction for $Y(\omega)$ obtained with X on $[0, T^*(\omega)]$.
Conservative index for prediction:
[Figure: the curves $x_1, \ldots, x_n$ observed on $[0, t]$ within $[0, T]$, and the discriminant score $d_t$.]
Denote by $\Omega_\omega(t) = \{\omega_i \in \Omega \mid \hat{Y}_t(\omega) = \hat{Y}_{t,i}\}$ and $\bar{\Omega}_\omega(t) = \Omega - \Omega_\omega(t)$ the class of elements having the same prediction as $\omega$ at time $t$ and its complement with respect to $\Omega$, respectively.
[Figure: the scores $d_t$ and $d_T$, with $\Omega_\omega(t)$ and $\bar{\Omega}_\omega(t)$ each split by the prediction at time T ($\hat{Y} = 0$ with rate $\hat{p}_0$, $\hat{Y} = 1$ with rate $\hat{p}_1$). Conservation rate of the prediction for $\omega$ and $t$.]
Let
$$p_{0 \mid \Omega_\omega(t)} = \frac{\left| \{\omega' \in \Omega \mid \hat{Y}_T(\omega') = 0\} \cap \Omega_\omega(t) \right|}{|\Omega_\omega(t)|}$$
be the observed rate of elements in $\Omega_\omega(t)$ predicted in the class Y = 0 at time T using the score $d_T$. The rates $p_{1 \mid \Omega_\omega(t)}$, $p_{0 \mid \bar{\Omega}_\omega(t)}$ and $p_{1 \mid \bar{\Omega}_\omega(t)}$ are defined similarly.
Define
$$C_{\Omega_\omega(t)} = \max\{p_{0 \mid \Omega_\omega(t)}, \, p_{1 \mid \Omega_\omega(t)}\}, \qquad C_{\bar{\Omega}_\omega(t)} = \max\{p_{0 \mid \bar{\Omega}_\omega(t)}, \, p_{1 \mid \bar{\Omega}_\omega(t)}\}$$
as the conservation rates of the prediction at time $t$ with respect to time $T$ for the elements of $\Omega_\omega(t)$ and of $\bar{\Omega}_\omega(t)$, respectively.
As a global measure of conservation we consider
$$C_\Omega(\omega, t) = \min\{C_{\Omega_\omega(t)}, \, C_{\bar{\Omega}_\omega(t)}\}.$$
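The global conservation measure can be computed directly from three ingredients: the prediction for $\omega$ at time $t$, the predictions of the learning elements at time $t$, and their predictions at time $T$. The sketch below assumes binary labels coded 0 and 1. The function name `conservation_rate` is hypothetical, and treating an empty group as imposing no constraint (rate 1) is an assumption, since the slides do not cover that case.

```python
import numpy as np

def conservation_rate(pred_t_new, pred_t, pred_T):
    """C(omega, t): min over the two groups of the majority rate at time T."""
    pred_t = np.asarray(pred_t)
    pred_T = np.asarray(pred_T)
    same = pred_t == pred_t_new          # Omega_omega(t)

    def group_rate(mask):
        if not mask.any():
            return 1.0                   # assumption: empty group, no constraint
        p1 = float(np.mean(pred_T[mask] == 1))
        return max(p1, 1.0 - p1)         # max{p0, p1} within the group

    return min(group_rate(same), group_rate(~same))
```

For instance, with predictions `[1, 1, 1, 0, 0]` at time $t$ and `[1, 1, 0, 0, 0]` at time $T$, a new element predicted 1 at time $t$ has conservation rate 2/3: two of the three elements sharing its prediction keep it at time $T$, while the complement is fully conserved.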
Remark: for each $t \in [0, T]$, $0.5 \le C_\Omega(\omega, t) \le 1$ and $C_\Omega(\omega, T) = 1$.
Given a conservation confidence threshold $\gamma \in (0, 1)$, e.g. $\gamma = 0.90$, we define the following adaptive prediction rule for $\omega$ and $t$:
(1) if $C_\Omega(\omega, t) \ge \gamma$, then the observation of X for $\omega$ on the time interval $[0, t]$ is sufficient for the prediction of $Y(\omega)$. $\hat{Y}(\omega)$ is then the prediction at time T of the subgroup of $\Omega_\omega(t)$ that attains $C_{\Omega_\omega(t)}$.
(2) if $C_\Omega(\omega, t) < \gamma$, then the observation of X for $\omega$ should continue after $t$. Set $t = t + h$ and repeat the adaptive prediction procedure.
Then $T^*(\omega)$ is the smallest $t$ such that condition (1) of the adaptive prediction rule is satisfied.
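The rule amounts to a sequential stopping loop. The sketch below assumes the conservation values $C_\Omega(\omega, t)$ have already been computed on an increasing grid of times with step $h$, and the function name `adaptive_stopping_time` is hypothetical.

```python
def adaptive_stopping_time(times, conservation, gamma=0.90):
    """Smallest grid time t with C(omega, t) >= gamma (condition (1))."""
    for t, c in zip(times, conservation):
        if c >= gamma:
            return t          # observation on [0, t] is sufficient
    # Guard only: C(omega, T) = 1, so a grid ending at T always stops earlier.
    return times[-1]
```

For example, with times `[100, 120, 140]` and conservation values `[0.70, 0.92, 0.85]`, the procedure stops at 120 even though the value drops below the threshold again afterwards, because $T^*(\omega)$ is defined as the first time the condition holds.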
[Figure. Left: X(t) versus time for a new flour $\omega$. Right: $C_\Omega(\omega, t)$ for $t \in [100, 480]$, with $\gamma = 0.90$.]
$T^*(\omega) = 220$.
The adaptive procedure is applied to 25 new flours.
[Figure: empirical cumulative distribution function Fn(x) of $T^*$ versus time; the time point t = 186 is marked in red.]