A general procedure to combine estimators

Frédéric Lavancier and Paul Rochet
Laboratoire de Mathématiques Jean Leray, University of Nantes
Outline

1. Introduction
2. The method
3. Theoretical results
4. Estimation of the MSE matrix Σ
5. Generalization to several parameters
6. Simulations
7. Conclusion
The problem

Let θ be an unknown quantity in a statistical model. Consider a collection of k estimators $T_1, \ldots, T_k$ of θ.

Aim: combine these estimators to obtain a better estimate.

Natural approach: choose a suitable combination

$$\hat\theta_\lambda = \sum_{j=1}^{k} \lambda_j T_j = \lambda^\top T, \qquad \lambda \in \Lambda \subseteq \mathbb{R}^k,$$

where $T = (T_1, \ldots, T_k)^\top$. This amounts to finding $\hat\lambda$.

Standard settings:
- Selection: $\Lambda = \{(1, 0, \ldots, 0), (0, 1, 0, \ldots, 0), \ldots, (0, \ldots, 0, 1)\}$
- Convex: $\Lambda = \{\lambda \in \mathbb{R}^k : \lambda_j \geq 0, \ \sum_j \lambda_j = 1\}$
- Affine: $\Lambda = \{\lambda \in \mathbb{R}^k : \sum_j \lambda_j = 1\}$
- Linear: $\Lambda = \mathbb{R}^k$

(A small numerical sketch of the affine case follows below.)
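To make the affine case concrete: if the joint MSE matrix $\Sigma = \mathbb{E}[(T - \theta \mathbf{1})(T - \theta \mathbf{1})^\top]$ were known, then for $\sum_j \lambda_j = 1$ the MSE of $\lambda^\top T$ equals $\lambda^\top \Sigma \lambda$, and the minimizer over the affine set has a closed form. The sketch below assumes a known (or separately estimated) Σ; estimating Σ is the subject of a later section, and the function name is illustrative.

```python
import numpy as np

def affine_weights(Sigma):
    """Minimize lambda' Sigma lambda subject to sum(lambda) = 1.
    Closed form (Lagrange multipliers): lambda = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(Sigma.shape[0])
    w = np.linalg.solve(Sigma, ones)   # Sigma^{-1} 1 without forming the inverse
    return w / (ones @ w)

# Hypothetical MSE matrix of two estimators (T1, T2) and their estimates:
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
lam = affine_weights(Sigma)
theta_hat = lam @ np.array([1.1, 0.9])  # combined estimate lambda' T
```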
Existing works: Aggregation and Averaging

Aggregation: $T_1, \ldots, T_k$ are not random (in practice: built from an independent training sample).

Non-parametric regression: $Y = \theta(X) + \varepsilon$,
$$\hat\lambda = \arg\min_{\lambda \in \Lambda} \left\{ \| Y - \hat\theta_\lambda(X) \|^2 + \mathrm{pen}(\lambda) \right\} \quad \text{(Juditsky, Nemirovsky 2000)}.$$
(A toy implementation of the unpenalized case is sketched below.)

Density estimation: $X_1, \ldots, X_n$ i.i.d. with density θ,
$$\hat\lambda = \arg\min_{\lambda \in \Lambda} \left\{ \| \hat\theta_\lambda \|^2 - \frac{2}{n} \sum_{i=1}^{n} \hat\theta_\lambda(X_i) \right\} \quad \text{(Rigollet, Tsybakov 2007)}.$$

- Flexibility in the choice of Λ
- Strong results (oracle inequalities, minimax rates, lower bounds, ...)
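As a toy illustration of the regression criterion above with pen(λ) ≡ 0: stack the predictions of the k fixed, pre-trained estimators on a sample into an n × k matrix and solve the constrained least-squares problem over the simplex. This is a minimal sketch under those assumptions; the solver choice (SLSQP) and the names are mine, not from the cited papers.

```python
import numpy as np
from scipy.optimize import minimize

def convex_aggregate(P, y):
    """Convex aggregation: minimize ||y - P @ lam||^2 over the simplex
    {lam_j >= 0, sum_j lam_j = 1}. P is the n x k matrix whose column j
    holds the predictions of the (fixed, pre-trained) estimator T_j."""
    k = P.shape[1]
    res = minimize(
        fun=lambda lam: np.sum((y - P @ lam) ** 2),
        x0=np.full(k, 1.0 / k),               # start from uniform weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,              # lam_j >= 0 (and trivially <= 1)
        constraints=[{"type": "eq", "fun": lambda lam: lam.sum() - 1.0}],
    )
    return res.x
```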
Existing works: Aggregation and Averaging

Averaging

Forecast averaging (time series): observations $X_1, \ldots, X_t$ and predictors $T_1(t), \ldots, T_k(t)$,
$$\hat\lambda = \arg\min_{\lambda \in \Lambda} \sum_{i=1}^{t} \left( X_i - \lambda^\top T(i) \right)^2 \quad \text{(Bates, Granger 1969)}.$$
(See the least-squares sketch after this slide.)

Model averaging (between misspecified models):
- Regression: $Y_i = \mu(X_i) + \varepsilon_i$; $\hat\lambda$ minimizes an estimator of the risk. Compromise estimator (Hjort, Claeskens 2003), jackknife (Hansen, Racine 2012), Mallows' Cp (Benito 2012).

Bayesian model averaging:
- Likelihood: $Y_i \sim f(y, \theta, \gamma)$; jackknife (Ando, Li 2014), AIC (Hjort, Claeskens 2003).
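When Λ is the affine set, the Bates–Granger criterion above is an equality-constrained least-squares problem with a closed-form solution via its KKT system. A minimal sketch, assuming x holds the past observations and F is the t × k matrix of the corresponding forecasts (names illustrative):

```python
import numpy as np

def forecast_weights(F, x):
    """Minimize sum_i (x_i - lam' F[i])^2 subject to sum(lam) = 1
    by solving the KKT system of the equality-constrained least squares."""
    t, k = F.shape
    A = np.zeros((k + 1, k + 1))
    A[:k, :k] = 2.0 * F.T @ F   # Hessian block of the quadratic objective
    A[:k, k] = 1.0              # constraint gradient (multiplier column)
    A[k, :k] = 1.0              # constraint row: sum(lam) = 1
    b = np.concatenate([2.0 * F.T @ x, [1.0]])
    sol = np.linalg.solve(A, b)
    return sol[:k]              # sol[k] is the Lagrange multiplier
```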
Other examples

Example 1: mean and median

Let $x_1, \ldots, x_n$ be n i.i.d. realisations of an unknown distribution on the real line. Assume this distribution is symmetric around some parameter $\theta \in \mathbb{R}$.

Two natural choices to estimate θ:
- the mean $T_1 = \bar{x}_n$
- the median $T_2 = x_{(n/2)}$

The idea of combining these two estimators goes back to Pierre-Simon de Laplace. In the Second Supplement of the Théorie Analytique des Probabilités (1812), he wrote:

"En combinant les résultats de ces deux méthodes, on peut obtenir un résultat dont la loi de probabilité des erreurs soit plus rapidement décroissante." [By combining the results of these two methods, one can obtain a result whose probability law of error decreases more rapidly.]
Laplace considered the combination $\lambda_1 \bar{x}_n + \lambda_2 x_{(n/2)}$ with $\lambda_1 + \lambda_2 = 1$.

1. He proved that the asymptotic law of this combination is Gaussian.
2. Minimizing the asymptotic variance in $\lambda_1, \lambda_2$, he concluded that:
   - if the underlying distribution is Gaussian, the best combination is $\lambda_1 = 1$ and $\lambda_2 = 0$;
   - for other distributions, the best combination depends on the distribution:

"L'ignorance où l'on est de la loi de probabilité des erreurs des observations rend cette correction impraticable." [Not knowing the probability law of the errors of observation makes this correction impracticable.]
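Laplace's computation is easy to mimic numerically: simulate many samples, estimate the variances $V_1, V_2$ of the mean and the median and their covariance $C$, and plug them into the variance-minimizing weight $\lambda_1 = (V_2 - C)/(V_1 + V_2 - 2C)$. This is a Monte Carlo sketch, not Laplace's asymptotic derivation:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_weight(sampler, n=1000, reps=5000):
    """Monte Carlo estimate of the weight lam1 minimizing the variance of
    lam1 * mean + (1 - lam1) * median under the constraint lam1 + lam2 = 1."""
    means, medians = np.empty(reps), np.empty(reps)
    for r in range(reps):
        x = sampler(n)
        means[r], medians[r] = x.mean(), np.median(x)
    C = np.cov(means, medians)  # C[0,0]=V1, C[1,1]=V2, C[0,1]=cov
    return (C[1, 1] - C[0, 1]) / (C[0, 0] + C[1, 1] - 2 * C[0, 1])

# Gaussian data: nearly all weight on the mean, as Laplace found.
print(laplace_weight(lambda n: rng.normal(size=n)))
# Heavier tails (Student t, 3 df): much of the weight shifts to the median.
print(laplace_weight(lambda n: rng.standard_t(3, size=n)))
```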
Other examples

Example 2: Weibull model

Let $x_1, \ldots, x_n$ be i.i.d. from the Weibull distribution with density
$$f(x) = \frac{\beta}{\eta} \left( \frac{x}{\eta} \right)^{\beta - 1} e^{-(x/\eta)^\beta}, \qquad x > 0.$$

We consider three standard methods to estimate β and η (sketched in code below):
- the maximum likelihood estimator (ML)
- the method of moments (MM)
- the ordinary least squares method, or Weibull plot (OLS)
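For reference, here is one possible implementation of the three estimators for the two-parameter Weibull above (location fixed at 0). The ML step relies on scipy's weibull_min.fit; the (i − 0.5)/n plotting positions in the Weibull plot are one common convention among several, and the root bracket for the moment equation covers typical shape values only.

```python
import numpy as np
from scipy import optimize, stats
from scipy.special import gamma

def weibull_ml(x):
    """Maximum likelihood: shape beta and scale eta, location fixed at 0."""
    beta, _, eta = stats.weibull_min.fit(x, floc=0)
    return beta, eta

def weibull_mm(x):
    """Method of moments: the squared coefficient of variation
    Gamma(1 + 2/b) / Gamma(1 + 1/b)^2 - 1 depends only on beta."""
    cv2 = x.var() / x.mean() ** 2
    f = lambda b: gamma(1 + 2 / b) / gamma(1 + 1 / b) ** 2 - 1 - cv2
    beta = optimize.brentq(f, 0.05, 50.0)   # bracket for typical shapes
    eta = x.mean() / gamma(1 + 1 / beta)
    return beta, eta

def weibull_ols(x):
    """Weibull plot: log(-log(1 - F)) = beta * log(x) - beta * log(eta),
    fitted by ordinary least squares on the order statistics."""
    xs = np.sort(x)
    F = (np.arange(1, len(xs) + 1) - 0.5) / len(xs)   # plotting positions
    slope, intercept = np.polyfit(np.log(xs), np.log(-np.log(1 - F)), 1)
    return slope, np.exp(-intercept / slope)           # (beta, eta)
```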