stat 113 final exam practice problems



STAT 113: FINAL EXAM PRACTICE PROBLEMS

COLIN REIMER DAWSON, FALL 2015

Date: December 14, 2015.

Research Design / Describing Samples.

(1) The following measures can be used to describe distributions (either population or sample distributions). For each one, describe conceptually (without mathematical notation, and without simply describing how to calculate it) and as concisely as possible, what information it captures.
    (a) The mean
    (b) The median
    (c) The range
    (d) The interquartile range (IQR)
    (e) The variance
    (f) The standard deviation
(2) Describe what it means for a measure to be robust/resistant (two terms for the same thing). For each of the measures above, indicate whether it is or is not relatively robust/resistant. What considerations go into choosing whether or not to use a robust/resistant measure?
(3) (Modified/abridged from A.3) In a study investigating how students use their laptop computers in class, researchers recruited 45 students at one university in the Northeast who regularly take their laptops to class. On average, the students cycled through 65 active windows per lecture, with one student averaging 174 active windows per lecture. They found that, on average, 62% of the windows students open in class are completely unrelated to the class, and students had distracting windows open and active 42% of the time, on average. The study included a measure of how each student performed on a test of the relevant material. Not surprisingly, the study finds that the students who spent more time on distracting websites generally had lower test scores.

    (a) Identify the cases and sample size for this study.
    (b) Is this an experiment or an observational study?
    (c) From the description given, what variables are recorded for each case? Identify each as categorical or quantitative.
    (d) What graph is most appropriate to display the data about number of active windows open per lecture if we want to quickly determine whether the maximum value (174) is an outlier?
    (e) The last sentence of the paragraph describes an association. Identify a graph and a statistic that could be used to display and quantify this association, respectively.
    (f) From the information given, can we conclude that students who allocate their cognitive resources to distracting sites during class get lower grades because of it? Why or why not?
(4) (Modified from A.27) The number of consecutive frost-free days in a year is called the growing season. A farmer considering moving to a new region finds that the median growing season for the area for the last 50 years is 275 days while the mean growing season is 240 days.
    (a) Explain how it is possible for the mean to be so much lower than the median, and describe the distribution of the growing season lengths in this area for the last 50 years.
    (b) Sketch either a possible histogram or a possible density curve for the shape of this distribution. Label the mean and median on the horizontal axis.

Inference Foundations. Study Exam 2 and the practice problems for exam 2.

Inference for Correlation and Regression.

(1) (Modified from D.46) Is depression a possible factor in students missing classes? A study analyzed relationships among various variables pertaining to a population of college students. Two of those variables are DepressionScore, scores on a standard depression scale with higher numbers indicating greater depression, and ClassesMissed, the number of classes missed during the semester. Computer output is shown below for a linear regression model used to predict the number of classes missed based on the depression score.

        Coefficients:
                         Estimate  Std. Error  t-value   P-value
        (Intercept)       1.77712     0.26714    6.652  1.79e-10
        DepressionScore   0.08312     0.03368    2.468    0.0142

        Residual standard error: 3.208 on 251 degrees of freedom
        Multiple R-squared: 0.0237

    (a) Interpret the slope of the regression line in the context of depression and missed classes.
    (b) Based on the output above, what can we conclude about the relationship between these variables in the population?
    (c) Interpret R² in the context of depression and missed classes. (What does it tell us about the relationship?)
(2) (Modified from D.50 and D.51) We can use data from a sample of NBA basketball games to construct a regression model to predict points in a season for a player based on the number of free throws made. For our sample data, the number of free throws made in a season ranges from 16 to 594, while the number of points ranges from 104 to 2161. For the information in (a) and (b), interpret the confidence and prediction interval given in the context of free throws and points scored per season. Make a specific statement about what the value of 95% means in each case.
    (a) The predicted number of points made for a player who makes 100 free throws in a season is 710.8 points, with a 95% confidence interval of 675.7 to 745.8 points. The prediction interval at the same free throw number is 340.7 to 1080.8 points.
    (b) The predicted number of points made for a player who makes 400 free throws in a season is 1613.6 points, with a 95% confidence interval of 1559.3 to 1667.9 points. The prediction interval at the same free throw number is 1241.2 to 1986.0 points.
    (c) Use the information above to find the slope of the regression line.
    (d) How do you expect the width of the confidence interval for a player who makes 20 free throws in a season to compare to the intervals given in (a) and (b)? Why?

Goodness of Fit and Association Tests for Categorical Variables.

(1) An Ipsos/Reuters poll conducted between Dec. 5th and 9th of this year asked a random sample of 494 adult Americans identifying as members of the Republican party who their preferred presidential candidate was. Donald Trump was the choice of 183 respondents, Ben Carson was chosen by 64, Marco Rubio by 59 and Ted Cruz by 54. A total of 104 respondents identified one of the other candidates, and 30 were undecided.
    (a) Set aside the undecided respondents and those who identified a candidate outside the top four. Can we conclude that the proportion of the population from which the respondents were selected who prefer Trump is higher than the combined proportion who prefer one of Carson, Rubio and Cruz? Use a chi-square test and show all details.
    (b) Setting aside the Trump voters as well, can we conclude that Carson, Rubio and Cruz are not equally preferred by the population from which the respondents were selected? Use a chi-square statistic and show all details.
(2) On November 15-18, 2012 Gallup conducted a survey of 1,015 randomly selected U.S. adults. They were asked whether they planned to go shopping on “Black Friday” (the day after Thanksgiving). The results, broken down by sex (as self-reported by the participants), are summarized in the following two-way table.

                       Shopping Plans?
                       Yes     No   Total
        Sex   M         82    433     515
              F        100    400     500
              Total    182    833    1015

    (a) Compute the expected cell count for the Male/Yes Shopping cell, to two decimal places.
    (b) The appropriate chi-square distribution for this test has 1 degree of freedom ((R − 1)(C − 1) = (2 − 1)(2 − 1) = 1). Explain why the test has 1 degree of freedom.
    (c) Here is some computer output for
            H0: Planning to shop the Friday after Thanksgiving is unrelated to sex
            H1: Sex and planning to shop on the Friday after Thanksgiving are related

            Chi-Square = 2.866, DF = 1, P-Value = 0.090

        What is the test conclusion at the 5% significance level? Do you reject H0? Why or why not?
    (d) Describe a different approach that could have been used to test these same hypotheses, instead of the chi-square test. Without doing any calculations, what P-value would you expect to get if you did the test this other way?
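
The computer output quoted in the Black Friday problem can be checked against the two-way table by hand or with a few lines of code. The following is a minimal sketch, not the software that produced the original output: the function names (`expected_counts`, `chi_square_statistic`, `p_value_one_df`) are hypothetical, and it assumes the usual chi-square test of association with no continuity correction. The P-value shortcut relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so it applies only to 2 x 2 tables.

```python
import math

# Observed counts from the two-way table:
# rows are sex (M, F), columns are shopping plans (Yes, No).
OBSERVED = [[82, 433], [100, 400]]


def expected_counts(table):
    """Expected counts if the variables are unrelated:
    (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def p_value_one_df(statistic):
    """Upper-tail area of a chi-square distribution with 1 degree of freedom.

    Assumption: df = 1 only. In that case the statistic is a squared
    standard normal, so the tail area equals erfc(sqrt(statistic / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2))


if __name__ == "__main__":
    stat = chi_square_statistic(OBSERVED)
    print(f"Chi-Square = {stat:.3f}, DF = 1, P-Value = {p_value_one_df(stat):.3f}")
```

Running the script should agree with the quoted Chi-Square and P-Value up to rounding. Calling `expected_counts(OBSERVED)` returns all four expected counts, which is a convenient way to check a hand calculation for any single cell.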
