Measures of Variation Summary of Section 9.2 Range The difference Largest Data - Smallest Data in a Sample. Deviation from the Mean � x 2 i − nx 2 � ( x i − x ) 2 1 Variance σ 2 = s 2 = = n − 1 n − 1 √ 2 Standard Deviation σ = s = s 2 These are random variables called Sample Variance and Sample Standard Deviation. For a random variable X , µ = E ( X ) is called the mean. The variance Var ( X ) is σ 2 = Var ( X ) = E (( X − µ ) 2 ). Main Property/ Explanation for dividing by n − 1: If X i are i.i.d with distribution X , then if you set S 2 = � ( X i − X ) 2 , its n − 1 expected value is E ( S 2 ) = σ 2 . This is not true for the standard deviation, E ( S ) � = σ. �� f i x 2 M , i − nx 2 Grouped Data s = . n − 1 Dan Barbasch Math 1105 Chapter 9 Week of October 2 1 / 1
Examples I Example (Range) Data 15 , − 3 , 4 , 7 , 18. The smallest is − 3, the largest 18 so Range = 18 − ( − 3) = 21 . Always a nonnegative number. Example (Deviation from the Mean) In the previous example, x = 15 − 3+4+7+18 = 8 . 2. So 5 15 − 8 . 2 = 6 . 8 , − 3 − 8 . 2 = − 11 . 2 , 4 − 8 . 2 = − 3 . 8 , 7 − 8 . 2 = − 1 . 2 , 18 − 8 . 2 = 9 . 8 . Example (Variance and Standard Deviation) s 2 = 6 . 8 2 +11 . 2 2 +3 . 8 2 +1 . 2 2 +9 . 8 2 = 15 2 +3 2 +4 2 +7 2 +18 2 − 5 · 8 . 2 2 √ 4 4 s 2 . s = Dan Barbasch Math 1105 Chapter 9 Week of October 2 2 / 1
Examples II Example (Binomial Distribution) P ( X = 1) = p , P ( X = 0) = 1 − p . Then µ = E ( X ) = p , and σ 2 = E (( X − p ) 2 ) = (1 − p ) 2 p + (0 − p ) 2 (1 − p ) = p (1 − p ) . This is the same as E ( X 2 − p 2 ) = (1 − p 2 ) p + ( − p 2 )(1 − p ) = (1 − p ) p . Remark: Note that the formula for variance and standard deviation only holds for n > 2 . Otherwise, for n = 1 , you would be dividing by 0. For one random variable, the variance is defined as Var ( X ) = E (( X − E ( X )) 2 ) . For X 1 , X 2 , , two independent random variables, Var ( X 1 + X 2 ) = Var ( X 1 ) + Var ( X 2 ) . Suppose X is a random variable. We can write a table . . . X a 1 a 2 a n P ( X ) p 1 p 2 . . . p n Dan Barbasch Math 1105 Chapter 9 Week of October 2 3 / 1
Examples III For the expected value µ = E ( X ) , you multiply the two terms in each column, and add � a i × p n = a 1 p 1 + · · · + a n p n . i In a spreadsheet program, the data would be in columns and you would add over the products from the rows. You use a command like sumproduct to perform the operation. If you have some other variable like ( X − µ ) 2 , you would use the values ( a i − µ ) 2 and the same p i . Dan Barbasch Math 1105 Chapter 9 Week of October 2 4 / 1
Examples IV Example 2 3 − 1 1 X X 2 4 9 1 1 ( X − µ ) 2 (2 − 1 / 4) 2 (3 − 1 / 4) 2 ( − 1 − 1 / 4) 2 (1 − 1 / 4) 2 P ( X ) 1 / 2 1 / 8 1 / 4 1 / 8 Computing the expected values is below. µ = E ( X ) = (2) × (1 / 2) + (3) × (1 / 8) + ( − 1) × (1 / 4) + (1) × (1 / 8) = 1 / 4 . Var ( X ) =(2 − 1 / 4) 2 · (1 / 2) + (3 − 1 / 4) 2 · (1 / 8) + ( − 1 − 1 / 4) 2 · (1 / 4)+ +(1 − 1 / 4) 2 · (1 / 8) = 47 / 16 . Dan Barbasch Math 1105 Chapter 9 Week of October 2 5 / 1
Normal Distribution I Definition Data are said to be normally distributed if the rate at which the frequencies fall off is proportional to the distance of the score from the mean, and to the frequencies themselves. This definition requires Calculus. We don’t assume or do Calculus in this course. We will however learn how to work with this distribution. It is very useful in that many phenomena can be modeled by this. We will see how the binomial distribution is related to the normal distribution later in the chapter. Suppose you have a random variable X , and you would like to know about its mean µ . So you perform many n independent trials, and draw a histogram. The larger the n , the closer the outcome will look like the 2 πσ e − ( x − µ )2 1 2 σ 2 . The pictures in the text show what it looks curve f ( x ) = √ like. The resulting probability is called N ( µ, σ 2 ) , normal with mean µ and Dan Barbasch Math 1105 Chapter 9 Week of October 2 6 / 1
Normal Distribution II variance σ 2 . There is a precise statement called the Central Limit Theorem which says that for large n , √ n ( S n − µ ) “looks” like a normal distribution N (0 , σ 2 ) . it is used in practice to model large populations and “ errors”. There are many examples that can be approximated by normal distributions. Heights of people, and scores on tests are examples. This is not a finite distribution. For a random variable that is normally distributed, we write N ( µ, σ 2 ) , P ( X ≤ a ) = the area under the normal curve from − ∞ to a . This is tabulated for µ = 0 and σ = 1 . The rest is computed by simple formulas involving arithmetic. Dan Barbasch Math 1105 Chapter 9 Week of October 2 7 / 1
Height Example I Example (from the practice prelim) 8. (14 points) Assume that the height in inches of American women follows a normal distribution with mean mu = 64 ′′ (5’4”) and standard deviation σ = 3 ′′ . (a) (3 points) How many standard deviations above or below the mean is a height of 73” (6’1”)? (b) (4 points) What fraction of women are taller than 73 inches? (c) (4 points) In a room with 30 women, what is the probability that at least one of them is taller than 73”? (d) (3 points) What assumptions did you make when answering part (c)? Are there circumstances under which those assumptions would not be justified? Dan Barbasch Math 1105 Chapter 9 Week of October 2 8 / 1
Height Example II Answer. (a) same as before 3 standard deviations away. ( b ) P ( X ≥ 73) = P ( X − 64 ≥ 73 − 64 = 9 = 3 σ ) = P ( X − µ ≥ 3) = σ =1 − P ( X − µ ≤ 3) = 1 − 0 . 999 = 0 . 001 . σ This is 1 / 1000 . The random variable X has probability distribution N (64 , 17). The probability P ( X ≥ 73) comes from this normal distribution. To actually look it up in the tables, you rewrite it in terms of Z = X − 64 which has 3 probability distribution N (0 , 1). This is the one in the tables. ( c ) P (at least 1 / 30 ≥ 73) =1 − P (30 / 30 ≤ 73) = 1 − P ( X ≤ 73) 30 = =1 − (0 . 998) 30 . Dan Barbasch Math 1105 Chapter 9 Week of October 2 9 / 1
z − value The principle is ⇒ Z = X − µ X normal N ( µ, σ 2 ) ⇐ normal N (0 , 1). σ So P ( X ≤ a ) = P ( Z ≤ a − µ ) . σ z = a − µ is called the z − value. This is what you look up in the tables. σ Dan Barbasch Math 1105 Chapter 9 Week of October 2 10 / 1
Example with Grades Example A professor (not this one!) of a course wants to give grades so that A top 8% F bottom 8% B next 20% below A D next 20% above the F C the rest The mean is µ = 67 and the standard deviation is σ = 17. Find the cutoffs. Answer. P ( ≤ A ) = 0 . 92 z = 1 . 41 a = µ + z σ = 67 + 17 · 1 . 41 = 91 P ( ≤ B ) = 0 . 72 z = 0 . 58 a = µ + z σ = 67 + 17 · 0 . 58 = 77 P ( ≤ C ) = 0 . 28 z = − . 59 a = µ + z σ = 67 + 17 · ( − . 59) = 57 P ( ≤ D ) = 0 . 08 z = − 1 . 39 a = µ + z σ = 67 + 17 · ( − 1 . 39) = 43 from the tables. In Excel or alike you can write norminv (0 . 92 , 67 , 17) ∼ = 91 . Dan Barbasch Math 1105 Chapter 9 Week of October 2 11 / 1
Approximate Binomial Distribution by the Normal Dstribution I Theorem Let B = X 1 + · · · + X n be the binomial Distribution, coming from adding up X i = X i.i.d. with P ( X = 1) = p , P ( X = 0) = 1 − p . Then E ( B ) = np , Var ( B ) = np (1 − p ) . The normal approxinmation of the binomial distribution is a − np P ( B ≤ a ) ≃ P ( Z ≤ ) . � np (1 − p ) where Z has the normal distribution N (0 , 1) . In other words, the binomial distribution is approximately the normal distribution with the same mean and variance, N ( np , np (1 − p ) . Dan Barbasch Math 1105 Chapter 9 Week of October 2 12 / 1
Approximate Binomial Distribution by the Normal Dstribution II Remember the notation N ( µ, σ 2 ) for the normal distribution. σ 2 is the variance, its square root σ is the standard deviation. Example Approximate C (100 , 50) . Dan Barbasch Math 1105 Chapter 9 Week of October 2 13 / 1
Approximate Binomial Distribution by the Normal Dstribution III Answer. Use the binomial distribution with p = 0 . 5 and n = 100 . C (100 , 50) · (0 . 5) 100 ≃ P (49 . 5 < B < 50 . 5) . √ The mean is 100 · 0 . 5 = 50 . The standard deviation is 100 · 0 . 5 · 0 . 5 = 5 . So � 49 . 5 − 50 ≤ Z ≤ 50 . 5 − 50 � C (100 , 50) ≃ 2 100 · P = 5 5 =0 . 5398 − 0 . 4602 = 0 . 08 . You have to box in 50 by two numbers: 49 . 5 < 50 < 50 . 5 is a reasonable choice. Dan Barbasch Math 1105 Chapter 9 Week of October 2 14 / 1
Drug Effectiveness I Example (Drug Effectiveness, Problem 24 in 9.4) A new drug cures 80% of the patients to whom it is administered. It is given to 25 patients. Find the probabilities that among these patients, the following results occur. a. Exactly 20 are cured. b. All are cured. c. No one is cured. d. Twelve or fewer are cured. Dan Barbasch Math 1105 Chapter 9 Week of October 2 15 / 1
Recommend
More recommend