Conditional quantiles with functional covariates: an application to Ozone pollution forecasting

Hervé Cardot, Christophe Crambes & Pascal Sarda. Compstat 2004, Prague, August 2004.


  1. Conditional quantiles with functional covariates: an application to Ozone pollution forecasting. Hervé Cardot, Christophe Crambes & Pascal Sarda. Compstat 2004, Prague, August 2004.

  5. Presentation of the data (1). Data (ORAMIP):
     - 9 variables: NO, N2, O3, WD, WS, ... (hourly measurements)
     - 6 stations
     - 4 years: 1997-2000 (15th May to 15th Sept)

  6. Presentation of the data (2). [Figure: ozone level (vertical axis, 0 to 120) against time in hours (horizontal axis, 0 to 70).]

  11. Presentation of the data (3)
      - variable of interest: the daily maximum of O3, $Y = {}^t(Y_1, \ldots, Y_n)$
      - covariates: NO, N2, O3, WD or WS, each recorded as a curve of 24 hourly values running from 18h of one day to 17h of the next:

        $$\begin{array}{l|cccccc}
         & 18\text{h} & \cdots & 24\text{h} & 1\text{h} & \cdots & 17\text{h} \\ \hline
        \text{day } 0 / \text{day } 1 & X_{1,1} & \cdots & \cdots & \cdots & \cdots & X_{1,24} \\
        \vdots & \vdots & & & & & \vdots \\
        \text{day } n-1 / \text{day } n & X_{n,1} & \cdots & \cdots & \cdots & \cdots & X_{n,24}
        \end{array}$$

      - $(X_i, Y_i)_{i=1,\ldots,n}$ are pairs of random variables with $Y_i \in \mathbb{R}$ and $X_i \in L^2(I)$
      - $X_i$ is observed at $t_1, \ldots, t_p \in I$ (equispaced)
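The arrangement of hourly measurements into daily curves can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `build_curves`, the flat list layout (index 0 is 1h of day 0), the `lead` parameter and the synthetic numbers are all hypothetical. In particular, the slides do not say which day's maximum is paired with each curve, so `lead=1` (the day after the curve ends) is a guess that can be changed to `lead=0`.

```python
def build_curves(hourly_cov, hourly_o3, lead=1):
    """Split hourly series into 24-point curves and daily ozone maxima.

    Assumption: 24 values per calendar day, index 0 being 1h of day 0,
    so hour h (1..24) of day d sits at index 24*d + h - 1.
    """
    n_days = len(hourly_o3) // 24
    X, Y = [], []
    for d in range(1, n_days - lead):
        start = 24 * (d - 1) + 17                 # 18h of day d-1
        X.append(hourly_cov[start:start + 24])    # 18h of day d-1 ... 17h of day d
        target = d + lead                         # day whose maximum is the response
        Y.append(max(hourly_o3[24 * target:24 * (target + 1)]))
    return X, Y


# Synthetic series covering 4 days; the values are just their own index.
hourly = list(range(24 * 4))
X, Y = build_curves(hourly, hourly)
```

Each row of `X` then plays the role of $(X_{i,1}, \ldots, X_{i,24})$ and each entry of `Y` the role of $Y_i$.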

  15. Definition of the conditional quantiles
      - $\alpha \in \,]0,1[$, $x \in L^2(I)$
      - $\alpha$-conditional quantile: $P(Y \le g_\alpha(X) \mid X = x) = \alpha$
      - property: $g_\alpha(x) = \arg\min_{a \in \mathbb{R}} E\big(l_\alpha(Y - a) \mid X = x\big)$ with $l_\alpha(u) = |u| + (2\alpha - 1)u$
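The minimisation property can be checked on a small sample without covariates. The sketch below is illustrative only: the helper names and the sample values are assumptions, and the search is restricted to the data points because a piecewise-linear convex risk always has a minimiser at one of its kinks.

```python
def check_loss(u, alpha):
    """l_alpha(u) = |u| + (2*alpha - 1)*u, as defined on the slide."""
    return abs(u) + (2 * alpha - 1) * u


def empirical_quantile(ys, alpha):
    """Return a data point minimising the mean check loss."""
    def risk(a):
        return sum(check_loss(y - a, alpha) for y in ys) / len(ys)
    return min(sorted(ys), key=risk)


sample = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0]
median = empirical_quantile(sample, 0.5)   # alpha = 0.5 gives the median
q90 = empirical_quantile(sample, 0.9)      # a high quantile sits in the upper tail
```

With $\alpha = 0.5$ the loss reduces to $|u|$, which is why the conditional median is the special case studied in the results.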

  18. Presentation of the model
      - model (cf. Koenker and Bassett, 1978): $g_\alpha(X) = c + \langle \Psi_\alpha, X \rangle = c + \int_I \Psi_\alpha(t)\, X(t)\, dt$
      - we want to estimate the function $\Psi_\alpha \in L^2(I)$: spline estimation
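Since each curve is only observed at $p$ equispaced points, the integral in the model has to be approximated in practice. A minimal sketch, assuming a plain Riemann sum; the function names, the interval length of 24 hours and the numbers are hypothetical choices made for illustration.

```python
def inner_product(psi_vals, x_vals, length):
    """Riemann-sum approximation of the integral over I of psi(t) * x(t) dt.

    Assumption: both functions are sampled at the same p equispaced points
    of an interval I of the given length.
    """
    p = len(x_vals)
    return length / p * sum(ps * xv for ps, xv in zip(psi_vals, x_vals))


def predict_quantile(c, psi_vals, x_vals, length=24.0):
    """g_alpha(X) = c + <Psi_alpha, X> for one discretised curve."""
    return c + inner_product(psi_vals, x_vals, length)


# A constant weight function 1/24 turns the inner product into the curve mean.
psi = [1.0 / 24.0] * 24
curve = [40.0 + h for h in range(24)]
pred = predict_quantile(10.0, psi, curve)
```

Here the prediction is the intercept plus the average of the curve; a non-constant $\Psi_\alpha$ would weight some hours more than others.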

  24. Spline estimation of $\Psi_\alpha$
      - $k \in \mathbb{N}^\star$, $q \in \mathbb{N}$
      - the interval $I$ is divided into $k$ sub-intervals $I_1, \ldots, I_j, \ldots, I_k$
      - B-splines basis: $\mathbf{B}_{k,q} = {}^t(B_1, \ldots, B_{k+q})$
      - estimator: $\widehat{\Psi}_\alpha = {}^t\mathbf{B}_{k,q}\, \widehat{\theta} = \sum_{j=1}^{k+q} \widehat{\theta}_j B_j$
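The $k + q$ basis functions can be evaluated with the Cox-de Boor recursion. The sketch below is one possible construction, not necessarily the one used by the authors: it assumes equal sub-intervals on $[a, b]$ with each boundary knot repeated $q + 1$ times, and the function names are hypothetical.

```python
def bspline_basis(t, k, q, a=0.0, b=1.0):
    """Values at t of the k+q B-splines of degree q on [a, b]."""
    h = (b - a) / k
    interior = [a + j * h for j in range(1, k)]
    knots = [a] * (q + 1) + interior + [b] * (q + 1)
    n_spans = len(knots) - 1
    # Degree 0: indicator of the knot span containing t (last span closed at b).
    B = [0.0] * n_spans
    for j in range(n_spans):
        inside = knots[j] <= t < knots[j + 1]
        at_right_end = t == b and knots[j] < knots[j + 1] == b
        if inside or at_right_end:
            B[j] = 1.0
    # Cox-de Boor recursion, raising the degree from 1 to q.
    for d in range(1, q + 1):
        new = []
        for j in range(n_spans - d):
            left = right = 0.0
            if knots[j + d] > knots[j]:
                left = (t - knots[j]) / (knots[j + d] - knots[j]) * B[j]
            if knots[j + d + 1] > knots[j + 1]:
                right = ((knots[j + d + 1] - t)
                         / (knots[j + d + 1] - knots[j + 1]) * B[j + 1])
            new.append(left + right)
        B = new
    return B


def spline_value(theta, t, k, q):
    """Psi(t) = sum_j theta_j * B_j(t)."""
    return sum(th * bj for th, bj in zip(theta, bspline_basis(t, k, q)))


basis = bspline_basis(0.3, 8, 3)              # k = 8, q = 3: 11 functions
flat = spline_value([2.0] * 11, 0.3, 8, 3)    # constant coefficients
```

Because the basis functions sum to one on the interval, constant coefficients give a constant spline, which is a convenient sanity check.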

  30. Determination of $\widehat{c}$ and $\widehat{\theta}$
      - $\widehat{\theta}$ and $\widehat{c}$ are the solution of the minimisation problem

        $$\min_{\theta \in \mathbb{R}^{k+q}} \left\{ \frac{1}{n} \sum_{i=1}^{n} l_\alpha\big(Y_i - c - \langle {}^t\mathbf{B}_{k,q}\theta, X_i \rangle\big) + \rho\, \big\| ({}^t\mathbf{B}_{k,q}\theta)^{(m)} \big\|^2 \right\}$$

        where the first term is the empirical version of $E\big(l_\alpha(Y - c - \langle s, X \rangle)\big)$ and the second term is the penalization
      - no explicit solution
      - algorithm: Iteratively Reweighted Least Squares
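The idea behind the reweighting can be shown on the simplest case, an intercept-only problem with no spline term and no penalty. This is a simplified sketch, not the authors' algorithm: it rewrites $l_\alpha(u) = w(u)\,u^2$ with $w(u) = (1 + (2\alpha - 1)\,\mathrm{sign}(u)) / |u|$, so that each iteration is a weighted least-squares fit, here a weighted mean. The `eps` safeguard, the iteration count and the data are assumptions.

```python
def irls_quantile(ys, alpha, n_iter=200, eps=1e-8):
    """IRLS sketch for argmin_a sum_i l_alpha(y_i - a)."""
    a = sum(ys) / len(ys)                      # start from the ordinary mean
    for _ in range(n_iter):
        weights = []
        for y in ys:
            r = y - a
            sign = 1.0 if r > 0 else -1.0
            # eps avoids a division by zero when a residual vanishes
            weights.append((1.0 + (2.0 * alpha - 1.0) * sign) / max(abs(r), eps))
        # weighted least-squares update for a constant fit
        a = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return a


ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median_hat = irls_quantile(ys, 0.5)
q70_hat = irls_quantile(ys, 0.7)
```

The outlier at 100 pulls the starting mean to 22, but the reweighted iterations settle near the sample quantiles. In the full problem the weighted mean would be replaced by a penalized weighted least-squares solve in $(c, \theta)$.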

  34. Multiple conditional quantiles
      - $v$ covariates $X^1, \ldots, X^v$
      - model: $g_\alpha(X^1, \ldots, X^v) = c + \int_I \Psi^1_\alpha(t)\, X^1(t)\, dt + \ldots + \int_I \Psi^v_\alpha(t)\, X^v(t)\, dt$
      - algorithm: backfitting + Iteratively Reweighted Least Squares
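Backfitting updates one additive component at a time on the partial residuals left by the others. The sketch below shows that cycle in a deliberately reduced setting, which is an assumption of this example: scalar covariates and squared loss stand in for the functional terms and the quantile loss, so each update is a simple least-squares slope rather than an IRLS step. All names and data are hypothetical.

```python
def backfit(xs_list, ys, n_sweeps=50):
    """Backfitting for y ~ c + sum_j b_j * x_j with squared loss."""
    n = len(ys)
    means = [sum(xs) / n for xs in xs_list]
    centred = [[x - m for x in xs] for xs, m in zip(xs_list, means)]
    y_mean = sum(ys) / n
    b = [0.0] * len(xs_list)
    for _ in range(n_sweeps):
        for j, xj in enumerate(centred):
            # partial residuals: remove the current fit of all other covariates
            partial = [
                y - y_mean - sum(b[l] * centred[l][i]
                                 for l in range(len(b)) if l != j)
                for i, y in enumerate(ys)
            ]
            # refit component j alone on those residuals
            b[j] = sum(x * r for x, r in zip(xj, partial)) / sum(x * x for x in xj)
    c = y_mean - sum(bj * m for bj, m in zip(b, means))
    return c, b


x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
y = [2.0 + 3.0 * u - 1.0 * v for u, v in zip(x1, x2)]
c_hat, b_hat = backfit([x1, x2], y)
```

On this noiseless example the sweeps recover the intercept and both slopes; in the functional quantile setting each inner update would instead estimate one $\Psi^j_\alpha$ with the other components held fixed.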

  38. Application to the pollution data
      - learning sample: $(X_{l_i}, Y_{l_i})_{i=1,\ldots,n_{\text{learn}}}$
      - test sample: $(X_{t_i}, Y_{t_i})_{i=1,\ldots,n_{\text{test}}}$
      - number of knots: $k = 8$ (equispaced)
      - degree of the spline functions: $q = 3$
      - order of derivation in the penalization: $m = 2$
      - choice of $\rho$: Generalized Cross Validation

  42. Quality criteria of the models

      $$C_1 = \frac{\frac{1}{n_t} \sum_{i=1}^{n_t} (Y_{t_i} - \widehat{Y}_{t_i})^2}{\frac{1}{n_t} \sum_{i=1}^{n_t} (Y_{t_i} - \overline{Y}_l)^2}$$

      $$C_2 = \frac{1}{n_t} \sum_{i=1}^{n_t} \big| Y_{t_i} - \widehat{Y}_{t_i} \big|$$

      $$C_3 = \frac{\frac{1}{n_t} \sum_{i=1}^{n_t} l_\alpha(Y_{t_i} - \widehat{Y}_{t_i})}{\frac{1}{n_t} \sum_{i=1}^{n_t} l_\alpha(Y_{t_i} - q_\alpha(Y_l))}$$

      Here $n_t$ is the size of the test sample, $\widehat{Y}_{t_i}$ the prediction for test observation $i$, $\overline{Y}_l$ the mean of the learning-sample responses and $q_\alpha(Y_l)$ their empirical $\alpha$-quantile. $C_1$ and $C_3$ compare the model with a constant predictor, so values below 1 indicate an improvement.
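The three criteria are straightforward to compute from test responses and predictions. A small sketch follows; it assumes that the reference quantities are the mean and the empirical $\alpha$-quantile of the learning-sample responses, and the function names and numbers are made up for illustration.

```python
def check_loss(u, alpha):
    """l_alpha(u) = |u| + (2*alpha - 1)*u."""
    return abs(u) + (2 * alpha - 1) * u


def empirical_quantile(ys, alpha):
    """A data point minimising the mean check loss over the sample."""
    return min(sorted(ys), key=lambda a: sum(check_loss(y - a, alpha) for y in ys))


def quality_criteria(y_test, y_pred, y_learn, alpha):
    """Return (C1, C2, C3) for predictions on a test sample."""
    n_t = len(y_test)
    mean_learn = sum(y_learn) / len(y_learn)
    q_learn = empirical_quantile(y_learn, alpha)
    pairs = list(zip(y_test, y_pred))
    c1 = (sum((y - p) ** 2 for y, p in pairs)
          / sum((y - mean_learn) ** 2 for y in y_test))
    c2 = sum(abs(y - p) for y, p in pairs) / n_t
    c3 = (sum(check_loss(y - p, alpha) for y, p in pairs)
          / sum(check_loss(y - q_learn, alpha) for y in y_test))
    return c1, c2, c3


y_learn = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [2.0, 4.0, 6.0]
y_pred = [2.5, 3.5, 5.0]
c1, c2, c3 = quality_criteria(y_test, y_pred, y_learn, alpha=0.5)
```

The factors $1/n_t$ cancel in the two ratios, so the code omits them in `c1` and `c3`.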

  44. Results (conditional median)

      | Models | Variables | $C_1$ | $C_2$ | $C_3$ |
      |---|---|---|---|---|
      | 1 covariate | N2 | 0.814 | 16.916 | 0.906 |
      | | O3 | 0.414 | 12.246 | 0.656 |
      | | WS | 0.802 | 16.836 | 0.902 |
      | 2 covariates | O3, NO | 0.413 | 11.997 | 0.643 |
      | | O3, N2 | 0.413 | 11.880 | 0.637 |
      | | O3, WS | 0.414 | 12.004 | 0.635 |
      | 3 covariates | O3, NO, N2 | 0.412 | 12.127 | 0.644 |
      | | O3, N2, WD | 0.409 | 12.004 | 0.645 |
      | | O3, N2, WS | 0.410 | 11.997 | 0.642 |
      | 4 covariates | O3, NO, N2, WS | 0.400 | 11.718 | 0.634 |
      | 5 covariates | O3, NO, N2, WD, WS | 0.401 | 11.750 | 0.639 |

  45. Forecasting (conditional median)
