CS 147: Computer Systems Performance Analysis
Comparing Systems and Analyzing Alternatives
2015-06-15
Overview
◮ Finding Confidence Intervals
  ◮ Basics
  ◮ Using the z Distribution
  ◮ Using the t Distribution
◮ Comparing Alternatives
  ◮ Paired Observations
  ◮ Unpaired Observations
  ◮ Proportions
◮ Special Considerations
  ◮ Sample Sizes
Comparing Systems Using Sample Data
◮ It’s not usually enough to collect data
◮ Usually we also want to say what’s better
Review
◮ How tall is Fred?
◮ Suppose 90% of humans are between 155 and 190 cm
  ∴ Fred is between 155 and 190 cm
◮ We are 90% confident that Fred is between 155 and 190 cm
Confidence Interval of Sample Mean
◮ Knowing where 90% of sample means fall, we can state a 90% confidence interval
◮ Key is Central Limit Theorem:
  ◮ Sample means are normally distributed
  ◮ Only if samples independent
  ◮ Mean of sample means is population mean $\mu$
  ◮ Standard deviation (standard error) is $\sigma/\sqrt{n}$
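The Central Limit Theorem claims above can be checked with a small simulation. This is only a sketch under assumed conditions: the uniform population on [0, 100), the seed, the sample size of 35, and the number of samples are all illustrative choices, not values from the slides.

```python
import random
import statistics

# Hypothetical setup: a uniform population on [0, 100), chosen only for illustration.
rng = random.Random(147)   # fixed seed so the run is repeatable
n = 35                     # observations per sample
num_samples = 5000         # number of independent samples drawn

# Draw many independent samples and record the mean of each one.
sample_means = [
    statistics.fmean(rng.uniform(0, 100) for _ in range(n))
    for _ in range(num_samples)
]

population_mean = 50.0                        # mean of uniform(0, 100)
population_sd = 100 / 12 ** 0.5               # standard deviation of uniform(0, 100)
predicted_standard_error = population_sd / n ** 0.5

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means:    {mean_of_means:.2f} (population mean {population_mean})")
print(f"std dev of sample means: {sd_of_means:.2f} "
      f"(sigma/sqrt(n) = {predicted_standard_error:.2f})")
```

Even though the population here is far from normal, the mean of the sample means should land near the population mean and their spread near $\sigma/\sqrt{n}$.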
Estimating Confidence Intervals
◮ Two formulas for confidence intervals
  ◮ Over 30 samples from any distribution: z-distribution
  ◮ Small sample from normally distributed population: t-distribution
◮ Common error: using t-distribution for non-normal population
  ◮ Central Limit Theorem often saves us
The z Distribution
◮ Interval on either side of mean:
  $\bar{x} \mp z_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$
◮ Significance level $\alpha$ is small for large confidence levels
◮ Tables of z are tricky: be careful!
Example of z Distribution
◮ 35 samples: 10, 16, 47, 48, 74, 30, 81, 42, 57, 67, 7, 13, 56, 44, 54, 17, 60, 32, 45, 28, 33, 60, 36, 59, 73, 46, 10, 40, 35, 65, 34, 25, 18, 48, 63
◮ Sample mean $\bar{x} = 42.1$. Standard deviation $s = 20.1$. $n = 35$.
◮ 90% confidence interval is
  $42.1 \mp (1.645)\,\dfrac{20.1}{\sqrt{35}} = (36.5, 47.7)$
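The z-based interval for these 35 samples can be recomputed in a few lines. This is a minimal sketch: the function name `z_confidence_interval` is hypothetical, and the quantile comes from Python's `statistics.NormalDist` rather than a printed z table.

```python
from statistics import NormalDist, mean, stdev

def z_confidence_interval(samples, confidence):
    """Return (low, high) for the mean using the z distribution (meant for n > 30)."""
    n = len(samples)
    alpha = 1 - confidence                     # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)    # quantile z[1 - alpha/2]
    half_width = z * stdev(samples) / n ** 0.5 # z * s / sqrt(n)
    center = mean(samples)
    return center - half_width, center + half_width

# The 35 samples from the slide.
samples = [10, 16, 47, 48, 74, 30, 81, 42, 57, 67, 7, 13, 56, 44, 54, 17, 60, 32,
           45, 28, 33, 60, 36, 59, 73, 46, 10, 40, 35, 65, 34, 25, 18, 48, 63]

low, high = z_confidence_interval(samples, 0.90)
print(f"n = {len(samples)}, mean = {mean(samples):.1f}, s = {stdev(samples):.1f}")
print(f"90% C.I.: ({low:.1f}, {high:.1f})")
```

Note that `stdev` is the sample standard deviation (divisor n - 1), which is the $s$ the formula expects.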
Graph of z Distribution Example
[Figure: plot on an axis from 0 to 80 with the 90% C.I. marked]
The t Distribution
◮ Formula is almost the same:
  $\bar{x} \mp t_{[1-\alpha/2;\,n-1]}\,\dfrac{s}{\sqrt{n}}$
◮ Usable only for normally distributed populations!
◮ But works with small samples
Example of t Distribution
◮ 10 height samples: 148, 166, 170, 191, 187, 114, 168, 180, 177, 204
◮ Sample mean $\bar{x} = 170.5$. Standard deviation $s = 25.1$, $n = 10$.
◮ 90% confidence interval is
  $170.5 \mp (1.833)\,\dfrac{25.1}{\sqrt{10}} = (156.0, 185.0)$
◮ 99% interval is (144.7, 196.3)
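The same calculation can be sketched for the height samples. Python's standard library has no t distribution, so this sketch assumes the two quantiles for 9 degrees of freedom are copied from a standard t table; the names `T_QUANTILES_9_DF` and `t_confidence_interval` are hypothetical.

```python
from statistics import mean, stdev

# Quantiles t[1 - alpha/2; 9], copied from a standard t table (an assumption of
# this sketch; the standard library cannot compute them).
T_QUANTILES_9_DF = {0.90: 1.833, 0.99: 3.250}

def t_confidence_interval(samples, confidence):
    """Return (low, high) for the mean of a 10-observation sample using t[.; 9]."""
    n = len(samples)
    if n - 1 != 9:
        raise ValueError("this sketch only carries table values for 9 degrees of freedom")
    t = T_QUANTILES_9_DF[confidence]
    half_width = t * stdev(samples) / n ** 0.5   # t * s / sqrt(n)
    center = mean(samples)
    return center - half_width, center + half_width

# The 10 height samples from the slide, in cm.
heights = [148, 166, 170, 191, 187, 114, 168, 180, 177, 204]

low90, high90 = t_confidence_interval(heights, 0.90)
low99, high99 = t_confidence_interval(heights, 0.99)
print(f"mean = {mean(heights):.1f}, s = {stdev(heights):.1f}")
print(f"90% C.I.: ({low90:.2f}, {high90:.2f})")
print(f"99% C.I.: ({low99:.2f}, {high99:.2f})")
```

Because the code keeps full precision for $s$ instead of rounding it to 25.1 first, the printed bounds may differ from the slide's values in the last digit.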
Graph of t Distribution Example
[Figure: plot on an axis from 0 to 200 with the 90% C.I. and 99% C.I. marked]
Getting More Confidence
◮ Asking for a higher confidence level widens the confidence interval
  ◮ Counterintuitive?
◮ How tall is Fred?
  ◮ 90% sure he’s between 155 and 190 cm
  ◮ We want to be 99% sure we’re right
  ◮ So we need more room: 99% sure he’s between 145 and 200 cm
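One way to see why more confidence means a wider interval is to look at how the multiplier in the interval formula grows. The sketch below uses z quantiles from `statistics.NormalDist` for convenience; the three confidence levels are illustrative choices, and t quantiles grow in the same direction.

```python
from statistics import NormalDist

# Multiplier z[1 - alpha/2] for a few confidence levels (levels chosen for illustration).
z_by_confidence = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z_by_confidence[confidence] = NormalDist().inv_cdf(1 - alpha / 2)

for confidence, z in z_by_confidence.items():
    # The interval half-width is z * s / sqrt(n), so it scales directly with z.
    print(f"{confidence:.0%} confidence: half-width = {z:.3f} * s / sqrt(n)")
```

With $s$ and $n$ fixed, the only thing that changes is the quantile, so the interval must widen as the confidence level rises.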
Making Decisions
◮ Why do we use confidence intervals?
  ◮ Summarizes error in sample mean
  ◮ Gives way to decide if measurement is meaningful
  ◮ Allows comparisons in face of error
◮ But remember: at 90% confidence, 10% of sample C.I.s do not include population mean
  ◮ In other words, 10% of experiments give wrong answer!
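The warning that about 10% of 90% confidence intervals miss the population mean can be illustrated by repeating an experiment many times. This is a sketch under assumed conditions: the normal population (mean 42, standard deviation 20), the seed, and the experiment count are hypothetical choices.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(2015)             # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 42.0, 20.0       # hypothetical population parameters
n, experiments = 35, 2000             # sample size and number of repeated experiments
z = NormalDist().inv_cdf(0.95)        # z[1 - alpha/2] for 90% confidence

misses = 0
for _ in range(experiments):
    samples = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    half_width = z * stdev(samples) / n ** 0.5
    center = mean(samples)
    # Count the experiments whose interval fails to contain the true mean.
    if not (center - half_width <= TRUE_MEAN <= center + half_width):
        misses += 1

miss_rate = misses / experiments
print(f"{misses} of {experiments} intervals missed the population mean ({miss_rate:.1%})")
```

The miss rate should come out near 10%, which is exactly the fraction of experiments that would report a "wrong answer" at this confidence level.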
Testing for Zero Mean
◮ Is population mean significantly $\neq 0$?
◮ If confidence interval includes 0, answer is no
◮ Can test for any value: subtract it from each observation and test for zero (mean of sums is sum of means)
◮ Our height samples are consistent with average height of 170 cm
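The test against an arbitrary value can be sketched by shifting the data and checking whether the interval includes zero. The function name `consistent_with` is hypothetical, the t quantile 1.833 (90% confidence, 9 degrees of freedom) is taken from a standard t table, and the comparison value of 150 cm is added purely as a contrasting illustration.

```python
from statistics import mean, stdev

T_090_9DF = 1.833   # t[0.95; 9] from a standard t table, for a 90% interval with n = 10

def consistent_with(samples, hypothesized_mean, t_quantile):
    """Return (includes_zero, (low, high)) for the C.I. of samples minus the hypothesis."""
    differences = [x - hypothesized_mean for x in samples]
    n = len(differences)
    half_width = t_quantile * stdev(differences) / n ** 0.5
    center = mean(differences)
    low, high = center - half_width, center + half_width
    # If the interval of differences includes 0, the data do not contradict the value.
    return low <= 0 <= high, (low, high)

# The 10 height samples from the slides, in cm.
heights = [148, 166, 170, 191, 187, 114, 168, 180, 177, 204]

ok_170, interval_170 = consistent_with(heights, 170, T_090_9DF)
ok_150, interval_150 = consistent_with(heights, 150, T_090_9DF)   # illustrative value
print(f"170 cm: interval of differences {interval_170}, consistent = {ok_170}")
print(f"150 cm: interval of differences {interval_150}, consistent = {ok_150}")
```

Subtracting a constant shifts the mean but leaves the standard deviation unchanged, so this is equivalent to asking whether the value lies inside the original confidence interval.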