Lecture 24: The Sample Variance $S^2$
The Squared Variation

Suppose we have $n$ numbers $x_1, x_2, \ldots, x_n$. Then their squared variation is

$$\mathrm{sv} = \mathrm{sv}(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Their mean (average) squared variation, msv (denoted $\sigma^2$ and called the "population variance" on page 33 of our text), is given by

$$\mathrm{msv} = \sigma^2 = \frac{1}{n}\,\mathrm{sv} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Here $\bar{x}$ is the average $\dfrac{1}{n} \sum_{i=1}^{n} x_i$.
The msv measures how much the numbers $x_1, x_2, \ldots, x_n$ vary (precisely, how much they vary from their average $\bar{x}$). For example, if they are all equal then they are all equal to their average $\bar{x}$, so $\mathrm{sv} = 0$ and $\mathrm{msv} = 0$.

We also define the sample variance $s^2$ by

$$s^2 = \frac{1}{n-1}\,\mathrm{sv} = \frac{n}{n-1}\,\mathrm{msv}, \qquad \text{that is,} \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Amazingly, $s^2$ is more important than msv in statistics.
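As a quick numerical illustration of the three definitions above (a sketch in Python; the function names and the sample data are my own choices, not from the text):

```python
# Direct computation of the squared variation sv, the mean squared
# variation msv, and the sample variance s^2 for a list of numbers.

def sv(xs):
    """Squared variation: sum of squared deviations from the mean."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)

def msv(xs):
    """Mean squared variation (population variance): sv / n."""
    return sv(xs) / len(xs)

def sample_variance(xs):
    """Sample variance: s^2 = sv / (n - 1)."""
    return sv(xs) / (len(xs) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean is 5.0
print(sv(data))               # 32.0
print(msv(data))              # 4.0
print(sample_variance(data))  # 32/7, about 4.571
```

Note that msv and $s^2$ differ noticeably here because $n = 8$ is small; as $n$ grows, the factor $n/(n-1)$ tends to 1.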
The Shortcut Formula for the Squared Variation

Theorem.

$$\mathrm{sv}(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \qquad (*)$$

Proof. Note that since $\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$, we have $\sum_{i=1}^{n} x_i = n\bar{x}$. Now
Proof (cont.)

$$\begin{aligned}
\mathrm{sv} = \sum_{i=1}^{n} (x_i - \bar{x})^2
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 \\
&= \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \qquad \square
\end{aligned}$$
Corollary 1. Divide both sides of $(*)$ by $n$ to get

$$\mathrm{msv} = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \frac{1}{n^2} \left( \sum_{i=1}^{n} x_i \right)^2$$

Corollary 2 (Shortcut formula for $s^2$). Divide both sides of $(*)$ by $n - 1$ to get

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} x_i^2 - \frac{1}{n(n-1)} \left( \sum_{i=1}^{n} x_i \right)^2$$

It is this last formula that we will need.
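A minimal sanity check that the shortcut formulas agree with the direct definitions (a sketch in Python; the helper names are mine):

```python
# Verify numerically that
#   sv  = sum(x_i^2) - (sum(x_i))^2 / n
#   s^2 = (1/(n-1)) sum(x_i^2) - (1/(n(n-1))) (sum(x_i))^2
# agree with the direct sum-of-squared-deviations definitions.

def sv_direct(xs):
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)

def sv_shortcut(xs):
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n

def s2_shortcut(xs):
    n = len(xs)
    return sum(x * x for x in xs) / (n - 1) - sum(xs) ** 2 / (n * (n - 1))

xs = [1.5, 2.0, 3.5, 4.0, 10.0]
assert abs(sv_direct(xs) - sv_shortcut(xs)) < 1e-9
assert abs(s2_shortcut(xs) - sv_direct(xs) / (len(xs) - 1)) < 1e-9
print("shortcut formulas agree with the definitions")
```

In floating-point arithmetic the shortcut form can lose precision when the mean is large relative to the spread, so the direct formula is usually preferred in software; the shortcut's value here is for hand computation and for the proof of unbiasedness below.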
Let me give a conceptual proof of the theorem, the way a professional mathematician would prove it.

Definition. A polynomial $p(x_1, x_2, \ldots, x_n)$ is symmetric if it is unchanged by permuting the variables.

Examples. $p(x, y, z) = x^2 + y^2 + z^2$ is symmetric; $p(x, y, z) = xy + z^2$ is not symmetric.

Theorem. Any symmetric polynomial $p$ in $x_1, x_2, \ldots, x_n$ can be rewritten as a polynomial in the power sums $\sum_{i=1}^{n} x_i^k$; that is,

$$p(x_1, \ldots, x_n) = q\!\left( \sum x_i,\ \sum x_i^2,\ \ldots,\ \sum x_i^{\ell} \right) \quad \text{if } \deg p = \ell.$$
Bottom Line. $\mathrm{sv} = \sum_{i=1}^{n} (x_i - \bar{x})^2$ is a symmetric polynomial in $x_1, x_2, \ldots, x_n$, so there exist $a$ and $b$ with

$$\mathrm{sv}(x_1, x_2, \ldots, x_n) = a \sum_{i=1}^{n} x_i^2 + b \left( \sum_{i=1}^{n} x_i \right)^2 \qquad (**)$$

This is true for all $x_1, \ldots, x_n$ (an "identity"), so we just choose $x_1, \ldots, x_n$ cleverly to get $a$ and $b$.

First choose $x_1 = 1$, $x_2 = -1$, $x_3 = \cdots = x_n = 0$, so $\sum_{i=1}^{n} x_i = 0$ and $\sum_{i=1}^{n} x_i^2 = 2$. Since $\bar{x} = 0$ we have $\mathrm{sv} = 2$, and $(**)$ becomes

$$2 = a \cdot 2 + b \cdot 0, \quad \text{so} \quad a = 1.$$
To find $b$, take all the $x$'s to be $1$, so $\bar{x} = 1$ and $\mathrm{sv}(1, 1, \ldots, 1) = 0$ (there is no variation in the $x$'s). Then $\sum_{i=1}^{n} x_i^2 = n$ and $\sum_{i=1}^{n} x_i = n$, so

$$\mathrm{sv}(x_1, \ldots, x_n) = a \sum_{i=1}^{n} x_i^2 + b \left( \sum_{i=1}^{n} x_i \right)^2$$

gives us (with $a = 1$)

$$0 = n + b n^2, \quad \text{so} \quad b = -\frac{1}{n},$$

and hence

$$\mathrm{sv}(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2$$

as before.

Remark 1. Any symmetric (homogeneous) quadratic function $q(x_1, x_2, \ldots, x_n)$ is a linear combination of $\sum_{i=1}^{n} x_i^2$ and $\left( \sum_{i=1}^{n} x_i \right)^2$; that is,

$$q(x_1, \ldots, x_n) = a \sum_{i=1}^{n} x_i^2 + b \left( \sum_{i=1}^{n} x_i \right)^2$$
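The two "clever choices" in the argument can be replayed numerically: plug the special data into sv and solve for the coefficients $a$ and $b$ (a sketch in Python; the function name is mine):

```python
# Recover the coefficients a and b in the identity
#   sv(x) = a * sum(x_i^2) + b * (sum(x_i))^2
# by evaluating sv on the two special data sets from the proof.

def sv(xs):
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)

n = 5

# Choice 1: x = (1, -1, 0, ..., 0) gives sum x_i = 0, sum x_i^2 = 2,
# so the identity reads sv = 2a, forcing a = sv / 2.
xs1 = [1, -1] + [0] * (n - 2)
a = sv(xs1) / 2

# Choice 2: x = (1, 1, ..., 1) gives sv = 0, sum x_i^2 = n, sum x_i = n,
# so the identity reads 0 = a*n + b*n^2, forcing b = -a/n.
b = -a / n

print(a, b)   # 1.0 -0.2, i.e. a = 1 and b = -1/n with n = 5
```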
In Which We Return to Statistics: Estimating the Population Variance

We have seen that $\bar{X}$ is a good (the best) estimator of the population mean $\mu$; in particular, it was an unbiased estimator. How do we estimate the population variance?
Answer: use the sample variance $S^2$ to estimate the population variance $\sigma^2$. The reason is that if we take the associated sample-variance random variable

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2,$$

then we have the

Amazing Theorem. $E(S^2) = \sigma^2$; that is, $S^2$ is an unbiased estimator of $\sigma^2$.

Why do you need $n - 1$? We will see.
Before starting the proof, we first note that Corollary 2 above implies the

Proposition (Shortcut formula for the sample-variance random variable).

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} X_i^2 - \frac{1}{n(n-1)} \left( \sum_{i=1}^{n} X_i \right)^2 \qquad \text{(b)}$$

Why does this follow from the formula for $s^2$?

We will also need the following

Proposition. Suppose $Y$ is a random variable. Then

$$E(Y^2) = E(Y)^2 + V(Y) \qquad (\#)$$

Proof. $V(Y) = E(Y^2) - (E(Y))^2$ (shortcut formula for $V(Y)$). $\square$
Corollary. Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a population of mean $\mu$ and variance $\sigma^2$, and let $T_0 = \sum_{i=1}^{n} X_i$. Then

(i) $E(X_i^2) = \mu^2 + \sigma^2$

(ii) $E(T_0^2) = n^2 \mu^2 + n \sigma^2$

Proof.
(i) $E(X_i) = \mu$ and $V(X_i) = \sigma^2$, so plug into $(\#)$.
(ii) $E(T_0) = n\mu$ and $V(T_0) = n\sigma^2$, so plug into $(\#)$. $\square$
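The corollary can be checked by simulation (a sketch assuming a normal population; the parameters, seed, and trial count are arbitrary choices of mine):

```python
import random

# Monte Carlo check of E(X_i^2) = mu^2 + sigma^2 and
# E(T_0^2) = n^2 mu^2 + n sigma^2, where T_0 = X_1 + ... + X_n.
random.seed(0)
mu, sigma, n = 3.0, 2.0, 10
trials = 100_000

sum_x2, sum_t0sq = 0.0, 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    sum_x2 += xs[0] ** 2          # one draw of X_i^2 per trial
    sum_t0sq += sum(xs) ** 2      # one draw of T_0^2 per trial

print(sum_x2 / trials)    # close to mu^2 + sigma^2 = 13
print(sum_t0sq / trials)  # close to n^2 mu^2 + n sigma^2 = 940
```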
We can now prove the Amazing Theorem. Using (b),

$$\begin{aligned}
E(S^2) &= E\!\left( \frac{1}{n-1} \sum_{i=1}^{n} X_i^2 - \frac{1}{n(n-1)}\, T_0^2 \right) \\
&= \frac{1}{n-1} \sum_{i=1}^{n} E(X_i^2) - \frac{1}{n(n-1)}\, E(T_0^2) && \text{since $E$ is linear} \\
&= \frac{1}{n-1} \sum_{i=1}^{n} (\mu^2 + \sigma^2) - \frac{1}{n-1} \cdot \frac{1}{n} (n^2 \mu^2 + n \sigma^2) && \text{by (i) and (ii)} \\
&= \frac{1}{n-1} \left( n\mu^2 + n\sigma^2 - \frac{1}{n}(n^2 \mu^2 + n \sigma^2) \right) \\
&= \frac{1}{n-1} \left( n\mu^2 + n\sigma^2 - n\mu^2 - \sigma^2 \right) \\
&= \frac{1}{n-1} \left( (n-1)\sigma^2 \right) = \sigma^2 \qquad \square
\end{aligned}$$

Amazing: you need $\dfrac{1}{n-1}$, not $\dfrac{1}{n}$.
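The theorem, and the bias of the $\frac{1}{n}$ version, can be seen directly in a simulation (a sketch; the normal population, seed, and trial count are my arbitrary choices):

```python
import random

# Average S^2 and msv over many samples: S^2 centers on sigma^2,
# while msv = sv/n centers on the smaller value (n-1)/n * sigma^2.
random.seed(1)
mu, sigma, n = 0.0, 3.0, 5       # population N(0, 9), samples of size 5
trials = 100_000

total_s2, total_msv = 0.0, 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ssq = sum((x - xbar) ** 2 for x in xs)   # squared variation sv
    total_s2 += ssq / (n - 1)                # sample variance S^2
    total_msv += ssq / n                     # mean squared variation

print(total_s2 / trials)   # close to sigma^2 = 9        (unbiased)
print(total_msv / trials)  # close to (4/5) * 9 = 7.2    (biased low)
```

Intuitively, the deviations are measured from $\bar{X}$ rather than from $\mu$, and the sample mean "chases" the data, making the squared deviations too small on average; dividing by $n - 1$ instead of $n$ exactly compensates.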