1. On the asymptotics of the m.l. estimators
Matematiikan päivät 4.-5.1.2006, Tampere
Esko Valkeila, Teknillinen korkeakoulu
4.1.2006

2.-8. Outline of the talk
◮ Motivation
◮ Some technical facts
◮ One result of Le Cam
◮ Multi-dimensional parameters
◮ Abstract filtered models
◮ Examples
◮ Conclusions

9. Motivation: Basic setup
We work with statistical models/experiments $\mathcal{E}^n(\Theta) := (\Omega^n, \mathcal{F}^n, P^n_\theta;\ \theta \in \Theta)$; here $(\Omega^n, \mathcal{F}^n)$ is a model for the observation scheme, and $P^n_\theta$, $\theta \in \Theta$, is a model for different statistical theories concerning the observations. We are interested in asymptotics: what happens as $n \to \infty$? More precisely, we would like to understand the minimal assumptions that guarantee that the maximum likelihood estimator is asymptotically normal, efficient, . . .

10.-11. Motivation: Textbook information
How are these problems treated in textbooks of Statistics? I would describe the situation as follows:
◮ Cookbooks on Statistics: under some regularity assumptions the m.l.e. is asymptotically normal. Typically no proof is given.
◮ Mainstream books on Statistics: the log-likelihood is smooth (in $C^2$ in a neighbourhood of the true parameter), there is some domination of the remainder term, and the support of the true distribution does not depend on the parameter. A detailed proof is given.

12.-13. Motivation: Textbook information, cont.
Next I will make some comments:
◮ If we want to understand the minimal conditions for the good properties of the m.l.e. to hold, we should forget cookbooks on Statistics.
◮ Mainstream books on Statistics are inaccurate: the support can depend on the parameter if the dependence is smooth (take $f(x;\theta) = (x-\theta)\,e^{-(x-\theta)}\,\mathbf{1}_{\{x \ge \theta\}}$).

14. Some technical facts: $L_2$-differentiability
The following definition will be very useful. We work with a statistical model/experiment $(\Omega, \mathcal{F}, P_\theta;\ \theta \in \Theta)$ with the following additional property: there exists a probability measure $Q$ such that $P_\theta \ll Q$ for all $\theta \in \Theta$. Put $f_\theta := \frac{dP_\theta}{dQ}$. Notation: $\Theta \subset \mathbb{R}^d$, and $(u,v)$ is the inner product in $\mathbb{R}^d$. The model is differentiable in $L_2$ if there exists a random variable $w_{\theta,2} \in L_2(Q)$ such that for all $u_n \to 0$ we have
$$E_Q\!\left[\left(\frac{\sqrt{f_{\theta+u_n}} - \sqrt{f_\theta} - (u_n, w_{\theta,2})}{|u_n|}\right)^{\!2}\right] \to 0$$
as $n \to \infty$.
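
For illustration (this example is mine, not from the slides), take the Gaussian location model $P_\theta = N(\theta,1)$ dominated by $Q = N(0,1)$, so that $f_\theta(x) = e^{\theta x - \theta^2/2}$. Then
$$\sqrt{f_{\theta+u}(x)} - \sqrt{f_\theta(x)} = \sqrt{f_\theta(x)}\,\Bigl(e^{\frac{u}{2}(x-\theta) - \frac{u^2}{4}} - 1\Bigr),$$
so the natural candidate for the derivative is $w_{\theta,2}(x) = \tfrac{1}{2}\sqrt{f_\theta(x)}\,(x-\theta) \in L_2(Q)$, and a routine dominated-convergence argument gives
$$E_Q\!\left[\left(\frac{\sqrt{f_{\theta+u}} - \sqrt{f_\theta} - u\, w_{\theta,2}}{|u|}\right)^{\!2}\right] \to 0 \quad\text{as } u \to 0,$$
i.e. this model is $L_2$-differentiable.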

15. Some technical facts: $L_q$-differentiability
We formally generalize the $L_2$-differentiability: take $q > 2$. The model is differentiable in $L_q$ if there exists a random variable $w_{\theta,q} \in L_q(Q)$ such that for all $u_n \to 0$ we have
$$E_Q\!\left[\left|\frac{\sqrt[q]{f_{\theta+u_n}} - \sqrt[q]{f_\theta} - (u_n, w_{\theta,q})}{|u_n|}\right|^{q}\right] \to 0$$
as $n \to \infty$.
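
In the Gaussian sketch above one can check in the same way that, for any $q > 2$, the $q$-th root expands as $\sqrt[q]{f_{\theta+u}} - \sqrt[q]{f_\theta} = f_\theta^{1/q}\bigl(e^{(u(x-\theta) - u^2/2)/q} - 1\bigr)$, which suggests $w_{\theta,q}(x) = \tfrac{1}{q}\, f_\theta(x)^{1/q}(x-\theta)$; since $E_Q\bigl[f_\theta\,|x-\theta|^q\bigr] = E_{P_\theta}|X-\theta|^q < \infty$, this candidate lies in $L_q(Q)$, and one can check that the model is in fact $L_q$-differentiable for every $q$.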

16. Some technical facts: Score and $L_q$-differentiability
To simplify the discussion, we assume that $P_\theta \sim Q$ and put $L_{\eta,\theta} = f_\eta / f_\theta$; $L_{\eta,\theta}$ is the likelihood. One can show the following:
$$E_Q\!\left[\left|\frac{f_{\theta+u_n} - f_\theta - (u_n, w_{\theta,1})}{|u_n|}\right|\right] \to 0$$
with some random variable $w_{\theta,1} \in L_1(Q)$ if and only if
$$E_{P_\theta}\!\left[\left|\frac{L_{\theta+u_n,\theta} - 1 - (u_n, v_\theta)}{|u_n|}\right|\right] \to 0$$
with some random variable $v_\theta \in L_1(P_\theta)$. The vector $v_\theta$ is the score vector.
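
In the Gaussian sketch, for instance, $L_{\theta+u,\theta}(x) = e^{u(x-\theta) - u^2/2}$, so $\bigl(L_{\theta+u,\theta} - 1 - u(x-\theta)\bigr)/|u| \to 0$ in $L_1(P_\theta)$ and the score is $v_\theta(x) = x - \theta$; as always, the score is centred, $E_{P_\theta} v_\theta = 0$.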

17. Some technical facts: Score and $L_q$-differentiability, cont.
Moreover, the model is differentiable in $L_q$ if and only if
$$E_{P_\theta}\!\left[\left|\frac{\sqrt[q]{L_{\theta+u_n,\theta}} - 1 - \tfrac{1}{q}(u_n, v_\theta)}{|u_n|}\right|^{q}\right] \to 0.$$
Hence $w_{\theta,2} = \tfrac{1}{2}\sqrt{f_\theta}\, v_\theta$ and $w_{\theta,q} = \tfrac{1}{q}\, f_\theta^{1/q}\, v_\theta$; for an $L_2$-differentiable model the Fisher information matrix $I(\theta)$ automatically exists and $I_{ij}(\theta) = E_{P_\theta}\bigl[v^i_\theta v^j_\theta\bigr]$.
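
Two concrete instances (my own illustrations): in the Gaussian sketch $v_\theta(x) = x - \theta$ and $I(\theta) = E_{P_\theta}(X-\theta)^2 = 1$; in the Bernoulli model with success probability $\theta \in (0,1)$,
$$v_\theta(x) = \frac{x}{\theta} - \frac{1-x}{1-\theta} = \frac{x-\theta}{\theta(1-\theta)}, \qquad I(\theta) = E_{P_\theta} v_\theta^2 = \frac{1}{\theta(1-\theta)}.$$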

18. One result of Le Cam
Consider next the case when the statistical model $\mathcal{E}^n(\Theta)$ is a product experiment, $\mathcal{E}^n(\Theta) = (\Omega^n, \otimes_{k=1}^n \mathcal{F}, P^n_\theta;\ \theta \in \Theta)$; here $P^n_\theta$ is the product measure $P^n_\theta = \otimes_{k=1}^n P_\theta$. One can show that the experiment $\mathcal{E}^n(\Theta)$ is $L_q$-differentiable if and only if the coordinate experiment $e(\Theta) = (\Omega, \mathcal{F}, P_\theta;\ \theta \in \Theta)$ is $L_q$-differentiable. Let $\hat\theta_n$ be the m.l. estimator of the parameter $\theta$ in the product experiment, i.e. the m.l. estimator based on $n$ independent and identically distributed observations from the model $(\Omega, \mathcal{F}, P_\theta;\ \theta \in \Theta)$.
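
It may help to spell out the product structure (a standard computation, added here for orientation): the likelihood, the score and the Fisher information of $\mathcal{E}^n(\Theta)$ factorize over the coordinates,
$$L^n_{\theta+u,\theta} = \prod_{k=1}^n L_{\theta+u,\theta}(X_k), \qquad v^n_\theta = \sum_{k=1}^n v_\theta(X_k), \qquad I_n(\theta) = n\, I(\theta),$$
which is the starting point both for the transfer of $L_q$-differentiability between the coordinate experiment and $\mathcal{E}^n(\Theta)$, and for the $\sqrt{n}$ normalization of $\hat\theta_n - \theta$.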

19.-24. One result of Le Cam, cont.
We can now formulate the result of Le Cam for product experiments. Assume that
◮ $\Theta \subset \mathbb{R}$ is open and bounded.
◮ The model $(\Omega, \mathcal{F}, P_\theta;\ \theta \in \Theta)$ is $L_2$-differentiable.
◮ $0 < \inf_\theta I(\theta)$, $\sup_\theta I(\theta) < \infty$, and the map $\theta \mapsto I(\theta)$ is continuous.
Then the following facts hold:
◮ There exist maximum likelihood estimators $\hat\theta_n$.
◮ The sequence $\sqrt{n}\,(\hat\theta_n - \theta)$ is asymptotically normal under $P_\theta$ with limit $N\bigl(0, I(\theta)^{-1}\bigr)$.
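
The statement can be checked numerically in any simple coordinate model; the following sketch (my own illustration, with the exponential model, sample sizes and seed chosen arbitrarily, not taken from the slides) simulates the product experiment for $f(x;\theta) = \theta e^{-\theta x}$, where $\hat\theta_n = 1/\bar X_n$ and $I(\theta) = 1/\theta^2$, and checks that $\sqrt{n}\,(\hat\theta_n - \theta)$ is approximately $N(0, I(\theta)^{-1})$.

# Numerical sketch: asymptotic normality of the m.l.e. in the exponential model.
# The m.l.e. is 1 / (sample mean) and I(theta) = 1 / theta^2, so
# sqrt(n) * (theta_hat - theta) / theta should be approximately N(0, 1).
import numpy as np

rng = np.random.default_rng(seed=2006)

theta = 2.0          # true parameter
n = 1000             # sample size per replication
replications = 5000  # number of independent product experiments

# rows = replications, columns = i.i.d. observations; scale = 1/rate
samples = rng.exponential(scale=1.0 / theta, size=(replications, n))
theta_hat = 1.0 / samples.mean(axis=1)                    # m.l.e. in each replication
standardized = np.sqrt(n) * (theta_hat - theta) / theta   # standardized errors

print("mean of standardized errors:", standardized.mean())  # should be close to 0
print("std  of standardized errors:", standardized.std())   # should be close to 1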

25. One result of Le Cam, discussion
Essentially, the good properties of the m.l.e. follow from the $L_2$-differentiability when the parameter is one-dimensional. In the mainstream textbooks the proof is based on a Taylor expansion with two terms and a correction term. This is not possible here, because we do not have a Taylor expansion with two terms, but with one term only. In the proof one must control the terms
$$\sup_{|u| \le \delta} \bigl|L^n_{\theta+u,\theta} - 1\bigr|$$
by using the Kolmogorov criterion for the modulus of continuity; here $L^n_{\theta+u,\theta}$ is the likelihood in the product experiment. If the parameter is one-dimensional, then $L_2$-differentiability is sufficient for the control we are looking for.
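
As a reminder (a standard formulation, stated here only for orientation): the Kolmogorov criterion says that if a process $(X_u)$, indexed by $u$ in a bounded cube of $\mathbb{R}^d$, satisfies $E|X_u - X_v|^a \le C\,|u-v|^{d+b}$ for some $a, b, C > 0$, then $X$ admits a continuous modification whose modulus of continuity can be controlled. With a one-dimensional parameter, the second moments supplied by $L_2$-differentiability already yield bounds of order $|u-v|^2 = |u-v|^{1+1}$, which is one way to read why $L_2$ suffices here, and why higher moments will be needed when $d \ge 2$.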

26. Multi-dimensional parameters
Assume now that $\Theta \subset \mathbb{R}^d$, where $d \ge 2$. We still assume that the model is $L_2$-differentiable, the parameter set $\Theta$ is an open and bounded subset of $\mathbb{R}^d$, the Fisher information is continuous and strictly non-degenerate, and the score vector $v_\theta$ satisfies $v_\theta \in L_q(P_\theta)$ for some $q > d$. Then
◮ There exist maximum likelihood estimators $\hat\theta_n$.
◮ The sequence $\sqrt{n}\,(\hat\theta_n - \theta)$ is asymptotically normal under $P_\theta$ with limit $N\bigl(0, I(\theta)^{-1}\bigr)$.
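
A concrete model satisfying all of these assumptions (my own illustration): the $d$-dimensional Gaussian location model $P_\theta = N(\theta, \mathrm{Id}_d)$ with $\Theta \subset \mathbb{R}^d$ open and bounded, where $v_\theta(x) = x - \theta$ and $I(\theta) = \mathrm{Id}_d$; the score has Gaussian tails, so $v_\theta \in L_q(P_\theta)$ for every $q$, in particular for some $q > d$.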

27. Multi-dimensional parameters, discussion
As explained earlier, the main problem with this approach is to control the expression
$$\sup_{|u| \le \delta} \bigl|L^n_{\theta+u,\theta} - 1\bigr|$$
or, equivalently,
$$\sup_{|u| \le \delta} \bigl|\sqrt[q]{L^n_{\theta+u,\theta}} - 1\bigr|.$$
If $v_\theta \in L_q(P_\theta)$, then the experiment is also $L_q$-differentiable, and this makes the desired control possible. The proof of these results is essentially in Ibragimov and Has'minskii, but the role of $L_q$-differentiability in their arguments is not made explicit.

28. General observation schemes: Filtered experiments
We now work with filtered models $(\Omega, \mathcal{F}, \mathbb{F}, P_\theta;\ \theta \in \Theta)$. Here $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ is an increasing family of sigma-fields, a so-called filtration. Assume that $P_\theta \sim Q$ and define the density processes by
$$z^\theta_t = \frac{dP^\theta_t}{dQ_t};$$
here $P^\theta_t = P_\theta|_{\mathcal{F}_t}$ and $Q_t = Q|_{\mathcal{F}_t}$. We have the following for free: the density processes $z^\theta$ are $(\mathbb{F}, Q)$-martingales.
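
A standard example of such a filtered model (added here for concreteness, not taken from the slides): observe a Poisson process $N = (N_t)_{0 \le t \le T}$ whose intensity is the unknown parameter $\theta > 0$, and let $Q$ be the law of a standard Poisson process (intensity 1). Then
$$z^\theta_t = \frac{dP^\theta_t}{dQ_t} = \exp\bigl(N_t \log\theta - (\theta-1)t\bigr),$$
and a direct computation confirms the martingale property: for $s \le t$,
$$E_Q\bigl[z^\theta_t \mid \mathcal{F}_s\bigr] = z^\theta_s\, e^{-(\theta-1)(t-s)}\, E_Q\bigl[\theta^{\,N_t - N_s}\bigr] = z^\theta_s\, e^{-(\theta-1)(t-s)}\, e^{(\theta-1)(t-s)} = z^\theta_s.$$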
