1. Hypothesis Testing Problem: choose, on the basis of data X, between two alternatives. Formally: choose between two hypotheses, H_0: θ ∈ Θ_0 or H_1: θ ∈ Θ_1, where Θ_0 and Θ_1 form a partition of the parameter space Θ of the model {P_θ; θ ∈ Θ}. That is, Θ_0 ∪ Θ_1 = Θ and Θ_0 ∩ Θ_1 = ∅. Make the desired choice using the rejection or critical region of the test: R = {X : we choose Θ_1 if we observe X}. Neyman-Pearson approach to hypothesis testing: treat the two hypotheses asymmetrically. Hypothesis H_0 is referred to as the null hypothesis (because traditionally it has been the hypothesis that some treatment has no effect).

2. Definition: the power function of a test with critical region R is π(θ) = P_θ(X ∈ R). Optimality theory: the problem of finding the best R. A good R has π(θ) small for θ ∈ Θ_0 and large for θ ∈ Θ_1. There is a trade-off, which can be made in many ways. Jargon: a Type I error is the error made when θ ∈ Θ_0 but we choose H_1, that is, X ∈ R. The other kind of error, when θ ∈ Θ_1 but we choose H_0, is called a Type II error. Definition: the level or size of a test is α ≡ max_{θ ∈ Θ_0} π(θ). (Worst-case probability of a Type I error.)
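To make the power function concrete, here is a minimal Python sketch (not from the notes) that evaluates π(µ) exactly for the one-sided test n^{1/2} X̄ > z_{α_0} in the N(µ, 1) model developed later in this section; the sample size n = 25 and level α_0 = 0.05 are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def power(mu, n=25, alpha0=0.05):
    """pi(mu) = P_mu(sqrt(n) * Xbar > z_{alpha0}) in the N(mu, 1) model.

    Under P_mu, sqrt(n) * Xbar ~ N(sqrt(n) * mu, 1), so the rejection
    probability is 1 - Phi(z_{alpha0} - sqrt(n) * mu).
    """
    z = norm.ppf(1 - alpha0)  # upper-alpha0 critical point z_{alpha0}
    return 1 - norm.cdf(z - np.sqrt(n) * mu)

for mu in [-0.2, 0.0, 0.2, 0.5]:
    print(f"pi({mu:+.2f}) = {power(mu):.4f}")
# pi is at most alpha0 on the null (mu <= 0) and climbs toward 1 for mu > 0.
```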

3. The other error probability is denoted β and defined as β(θ) = P_θ(X ∉ R) for θ ∈ Θ_1. Notice: β will depend on θ. Simple versus simple testing. Finding the best test is easiest when the hypotheses are very precise. Definition: a hypothesis H_i is simple if Θ_i contains only a single value θ_i. The simple-versus-simple testing problem arises when we test θ = θ_0 against θ = θ_1, so that Θ has only two points in it. This problem is important as a technical tool, not because it is a realistic situation. Suppose that the model specifies that if θ = θ_0 then the density of X is f_0(x) and if θ = θ_1 then the density of X is f_1(x). How should we choose R?

4. Minimize α + β, the total error probability: P_{θ_0}(X ∈ R) + P_{θ_1}(X ∉ R). Write this as an integral:

∫ [f_0(x) 1(x ∈ R) + {1 − 1(x ∈ R)} f_1(x)] dx

For each x, put x in R or not in such a way as to minimize the integrand. But for each x the quantity f_0(x) 1(x ∈ R) + {1 − 1(x ∈ R)} f_1(x) can be chosen to be either f_0(x) or f_1(x). Solution: put x ∈ R iff f_1(x) > f_0(x). Note: the condition can be rephrased in terms of the likelihood ratio f_1(x)/f_0(x).
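A small numerical sketch of this pointwise argument, under the hypothetical choice f_0 = N(0, 1) and f_1 = N(1, 1): on a grid it compares α + β for the region {f_1 > f_0} against a few competing regions. The grid limits and cutoffs are arbitrary illustration values.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical simple-vs-simple pair: f0 = N(0,1) density, f1 = N(1,1) density.
x = np.linspace(-8.0, 9.0, 100001)
dx = x[1] - x[0]
f0 = norm.pdf(x, loc=0.0, scale=1.0)
f1 = norm.pdf(x, loc=1.0, scale=1.0)

def alpha_plus_beta(in_R):
    # alpha = integral of f0 over R; beta = integral of f1 over the complement.
    return np.sum(f0[in_R]) * dx + np.sum(f1[~in_R]) * dx

print(f"alpha+beta for {{f1 > f0}}: {alpha_plus_beta(f1 > f0):.4f}")
for c in [0.0, 0.25, 0.75, 1.0]:          # some competing regions R = {x > c}
    print(f"alpha+beta for {{x > {c}}}:   {alpha_plus_beta(x > c):.4f}")
# Nothing beats {f1 > f0}, which for these densities is just {x > 0.5}.
```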

5. Theorem: For each fixed λ the quantity β + λα is minimized by the R given by

R = {x : f_1(x)/f_0(x) > λ}.

Neyman-Pearson: the two kinds of errors might have unequal consequences. So: pick the more serious kind of error, label it Type I, and require the rule to hold the probability α of a Type I error at or below a prespecified level α_0. Typically α_0 = 0.05, chiefly for historical reasons. Neyman-Pearson solution: minimize β subject to the constraint α ≤ α_0. Usually this is equivalent to the constraint α = α_0. A Most Powerful level α_0 test maximizes 1 − β subject to α ≤ α_0.

6. The Neyman-Pearson Lemma. Theorem: In testing f_0 against f_1, the probability β of a Type II error is minimized, subject to α ≤ α_0, by the rejection region

R = {x : f_1(x)/f_0(x) > λ}

where λ is the largest constant such that

P_0(f_1(X)/f_0(X) ≥ λ) = α_0.

Example: If X_1, ..., X_n are iid N(µ, 1) and we have µ_0 = 0 and µ_1 > 0, then

f_1(X_1, ..., X_n)/f_0(X_1, ..., X_n) = exp{µ_1 Σ X_i − n µ_1^2/2 − µ_0 Σ X_i + n µ_0^2/2}

which simplifies to

exp{µ_1 Σ X_i − n µ_1^2/2}.

7. Now choose λ so that

P_0(exp{µ_1 Σ X_i − n µ_1^2/2} > λ) = α_0.

Rewrite the probability as

P_0(Σ X_i > [log(λ) + n µ_1^2/2]/µ_1) = 1 − Φ([log(λ) + n µ_1^2/2]/[n^{1/2} µ_1]).

Notation: z_α is the upper-α critical point of the N(0, 1) distribution. Then

z_{α_0} = [log(λ) + n µ_1^2/2]/[n^{1/2} µ_1],

which you can solve to get a formula for λ in terms of z_{α_0}, n and µ_1. The rejection region looks complicated: reject if a complicated statistic is larger than a λ which has a complicated formula. But the rejection region can be re-expressed as

Σ X_i / n^{1/2} > z_{α_0}.
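A Monte Carlo sketch of this calculation, under the illustrative assumptions n = 25, µ_1 = 0.5 and α_0 = 0.05 (values not from the notes): it solves the display above for λ, then checks that thresholding the likelihood ratio at λ and thresholding Σ X_i/n^{1/2} at z_{α_0} reject on the same samples, each with probability about α_0 under H_0.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, mu1, alpha0 = 25, 0.5, 0.05            # illustrative values, not from the notes
z = norm.ppf(1 - alpha0)                  # z_{alpha0}

# Solve z = [log(lam) + n*mu1^2/2] / [sqrt(n)*mu1] for the NP threshold lam.
lam = np.exp(np.sqrt(n) * mu1 * z - n * mu1**2 / 2)

X = rng.normal(0.0, 1.0, size=(200_000, n))   # samples drawn under H0: mu = 0
S = X.sum(axis=1)                             # the statistic sum(X_i)
lr = np.exp(mu1 * S - n * mu1**2 / 2)         # likelihood ratio f1/f0

print("P0(LR > lam)        ~", (lr > lam).mean())           # both ~ alpha0
print("P0(sum/sqrt(n) > z) ~", (S / np.sqrt(n) > z).mean())
print("same rejections:", np.array_equal(lr > lam, S / np.sqrt(n) > z))
```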

8. Key point: the rejection region is the same for any µ_1 > 0. Definition: In the general problem of testing Θ_0 against Θ_1, the level of a critical region R is

α = sup_{θ ∈ Θ_0} P_θ(X ∈ R).

The power function is π(θ) = P_θ(X ∈ R). A test with rejection region R is Uniformly Most Powerful at level α_0 if:

1. the test has level α ≤ α_0;

2. if R* is another rejection region with level α ≤ α_0, then for every θ ∈ Θ_1 we have P_θ(X ∈ R*) ≤ P_θ(X ∈ R).

9. Application of the NP lemma: In the N(µ, 1) model consider Θ_1 = {µ > 0} and Θ_0 = {0} or Θ_0 = {µ ≤ 0}. The UMP level α_0 test of H_0: µ ∈ Θ_0 against H_1: µ ∈ Θ_1 is

R* = {x : n^{1/2} X̄ > z_{α_0}}.

Proof: For either choice of Θ_0 this test has level α_0 because for µ ≤ 0 we have

P_µ(n^{1/2} X̄ > z_{α_0}) = P_µ(n^{1/2}(X̄ − µ) > z_{α_0} − n^{1/2} µ)
= P(N(0, 1) > z_{α_0} − n^{1/2} µ)
≤ P(N(0, 1) > z_{α_0})
= α_0.

(Notice the use of µ ≤ 0.) Key idea: the critical point is fixed by the behaviour on the edge of the null hypothesis.

10. Now suppose R is any other level α_0 critical region: P_0((X_1, ..., X_n) ∈ R) ≤ α_0. Fix a µ > 0. According to the NP lemma,

P_µ{(X_1, ..., X_n) ∈ R} ≤ P_µ{(X_1, ..., X_n) ∈ R_λ}

where R_λ = {x : f_µ(x_1, ..., x_n)/f_0(x_1, ..., x_n) > λ} for a suitable λ. But we just checked that this test has a rejection region of the form

R* = {x : n^{1/2} X̄ > z_{α_0}}.

The NP lemma produces the same test for every µ > 0 chosen as an alternative. So this test is UMP at level α_0.
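To see the UMP property numerically, the following sketch compares the exact power of R* with a made-up competing level-α_0 test that rejects when X_1 > z_{α_0} (a rule that ignores all but the first observation); the choice of competitor and the values n = 25, α_0 = 0.05 are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

n, alpha0 = 25, 0.05                      # illustrative values
z = norm.ppf(1 - alpha0)

# Exact rejection probabilities in the N(mu, 1) model:
def ump(mu):                              # reject if sqrt(n) * Xbar > z_{alpha0}
    return 1 - norm.cdf(z - np.sqrt(n) * mu)

def rival(mu):                            # reject if X_1 > z_{alpha0}; also level alpha0
    return 1 - norm.cdf(z - mu)

print(" mu   UMP power   rival power")
for mu in [0.1, 0.3, 0.5, 1.0]:
    print(f"{mu:4.1f}   {ump(mu):9.4f}   {rival(mu):11.4f}")
# The UMP test wins at every alternative mu > 0, as the NP argument guarantees.
```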

11. Proof of the Neyman-Pearson lemma: Lagrange multipliers. Suppose you want to minimize f(x) subject to g(x) = 0. Consider first the function

h_λ(x) = f(x) + λ g(x).

If x_λ minimizes h_λ, then for any other x,

f(x_λ) ≤ f(x) + λ[g(x) − g(x_λ)].

Now suppose you can find a value of λ such that the solution x_λ has g(x_λ) = 0. Then for any x we have f(x_λ) ≤ f(x) + λ g(x), and for any x satisfying the constraint g(x) = 0 we have f(x_λ) ≤ f(x). This proves that for this special value of λ, the point x_λ minimizes f(x) subject to g(x) = 0. Notice that to find x_λ you set the usual partial derivatives equal to 0; then to find the special value of λ you add in the condition g(x_λ) = 0.
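A toy numerical version of this λ-tuning argument, with a made-up objective f(x) = (x − 2)^2 and constraint g(x) = x^2 − 1 = 0 (constrained minimizer x = 1): for each λ it minimizes h_λ, then root-finds the λ at which the unconstrained minimizer satisfies the constraint.

```python
from scipy.optimize import brentq, minimize_scalar

# Made-up example: minimize f(x) = (x - 2)^2 subject to g(x) = x^2 - 1 = 0.
f = lambda x: (x - 2.0) ** 2
g = lambda x: x ** 2 - 1.0

def x_lambda(lam):
    """Unconstrained minimizer x_lam of h_lam(x) = f(x) + lam * g(x)."""
    return minimize_scalar(lambda x: f(x) + lam * g(x),
                           bounds=(-10.0, 10.0), method="bounded").x

# Tune lam so the constraint holds at the minimizer: g(x_lam) = 0.
lam_star = brentq(lambda lam: g(x_lambda(lam)), 0.01, 10.0)
x_star = x_lambda(lam_star)
print(f"lambda* = {lam_star:.4f}, x* = {x_star:.4f}, f(x*) = {f(x_star):.4f}")
# Recovers the constrained minimizer x* = 1 (with lambda* = 1) without ever
# solving the constrained problem directly.
```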

12. Proof of the NP lemma: R_λ = {x : f_1(x)/f_0(x) ≥ λ} minimizes λα + β. As λ increases from 0 to ∞, the level of R_λ decreases from 1 to 0. (Ignore a technical problem: f_1(X)/f_0(X) might be discrete.) There is thus a value λ_0 where the level equals α_0. According to the theorem above, this test minimizes λ_0 α + β. Suppose R* is some other test with level α_{R*} ≤ α_0. Then

λ_0 α + β ≤ λ_0 α_{R*} + β_{R*}.

We can rearrange this as

β_{R*} ≥ β + (α − α_{R*}) λ_0.

Since α_{R*} ≤ α_0 = α, the second term is non-negative and β_{R*} ≥ β, which proves the Neyman-Pearson Lemma.
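In the N(µ, 1) example the level of R_λ has the closed form from item 7, so the existence of λ_0 can be checked directly; here is a sketch under the same illustrative values n = 25, µ_1 = 0.5, α_0 = 0.05 as before.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

n, mu1, alpha0 = 25, 0.5, 0.05        # same illustrative values as before

def level(lam):
    """Level of R_lam = {f1/f0 > lam} under P_0 in the N(mu, 1) example:
    1 - Phi([log(lam) + n*mu1^2/2] / [sqrt(n)*mu1])."""
    return 1 - norm.cdf((np.log(lam) + n * mu1**2 / 2) / (np.sqrt(n) * mu1))

for lam in [1e-3, 1.0, 1e3, 1e6]:     # level falls from near 1 toward 0
    print(f"level of R_lam at lam={lam:g}: {level(lam):.5f}")

lam0 = brentq(lambda lam: level(lam) - alpha0, 1e-6, 1e12)
print(f"lambda_0 = {lam0:.4f}  (level there = {level(lam0):.4f})")
```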

13. General phenomenon: for any µ > µ_0, the likelihood ratio f_µ/f_{µ_0} is an increasing function of Σ X_i. The rejection region of the NP test is thus always a region of the form Σ X_i > k. The value of the constant k is determined by the requirement that the test have level α_0; this depends on µ_0, not on µ_1. Definition: The family {f_θ; θ ∈ Θ ⊂ R} has monotone likelihood ratio with respect to a statistic T(X) if for each θ_1 > θ_0 the likelihood ratio f_{θ_1}(X)/f_{θ_0}(X) is a monotone increasing function of T(X). Theorem: For a monotone likelihood ratio family, the Uniformly Most Powerful level α_0 test of θ ≤ θ_0 (or of θ = θ_0) against the alternative θ > θ_0 is

R = {x : T(x) > t_{α_0}}

where P_{θ_0}(T(X) > t_{α_0}) = α_0.
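A quick sketch checking the MLR property for a family not treated above, the Poisson(θ) family with T(x) = x; the values θ_0 = 1, θ_1 = 2 and α_0 = 0.05 are illustrative assumptions, and discreteness means only certain levels are exactly achievable.

```python
import numpy as np
from scipy.stats import poisson

# Poisson(theta) family: f_theta(x) = exp(-theta) * theta^x / x!.
# For theta1 > theta0 the ratio f_theta1(x) / f_theta0(x) equals
# exp(theta0 - theta1) * (theta1 / theta0)^x, increasing in T(x) = x.
theta0, theta1 = 1.0, 2.0                     # illustrative values, theta1 > theta0
x = np.arange(0, 12)
ratio = poisson.pmf(x, theta1) / poisson.pmf(x, theta0)
print("ratio increasing in x:", bool(np.all(np.diff(ratio) > 0)))

# One-sided test of theta <= theta0 against theta > theta0: reject for large x.
alpha0 = 0.05
t = poisson.ppf(1 - alpha0, theta0)           # smallest t with P(X <= t) >= 1 - alpha0
print(f"reject if x > {t:g}; achieved level = {1 - poisson.cdf(t, theta0):.4f}")
```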

14. Usual application: one-parameter exponential families. In almost any other problem the method doesn't work and there is no uniformly most powerful test. For instance: in testing µ = µ_0 against the two-sided alternative µ ≠ µ_0, there is no UMP level α_0 test.
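A numerical illustration of why no UMP test exists in the two-sided problem (illustrative values n = 25, α_0 = 0.05, µ_0 = 0): each one-sided test dominates on its own side of µ_0 and is nearly powerless on the other, so no single level-α_0 test maximizes power at every alternative.

```python
import numpy as np
from scipy.stats import norm

n, alpha0 = 25, 0.05                          # illustrative values; mu_0 = 0
z = norm.ppf(1 - alpha0)                      # one-sided critical point
z2 = norm.ppf(1 - alpha0 / 2)                 # two-sided critical point

def right(mu):                                # reject if sqrt(n)*Xbar > z
    return 1 - norm.cdf(z - np.sqrt(n) * mu)

def left(mu):                                 # reject if sqrt(n)*Xbar < -z
    return norm.cdf(-z - np.sqrt(n) * mu)

def two_sided(mu):                            # reject if |sqrt(n)*Xbar| > z2
    return (1 - norm.cdf(z2 - np.sqrt(n) * mu)) + norm.cdf(-z2 - np.sqrt(n) * mu)

print("  mu    right     left   two-sided")
for mu in [-0.5, -0.2, 0.2, 0.5]:
    print(f"{mu:5.1f} {right(mu):8.4f} {left(mu):8.4f} {two_sided(mu):10.4f}")
# Each one-sided test beats the two-sided test on its own side of 0 and is
# nearly powerless on the other: no test is most powerful at every alternative.
```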
