
Week 2: Inference for SLR



  1. BUS41100 Applied Regression Analysis. Week 2: Inference for SLR. Inference: sampling distributions, testing, confidence intervals, and prediction intervals. Max H. Farrell, The University of Chicago Booth School of Business.

  2. Back to House Prices. Understand the relationship between price and size. How? Last week we fit a line through a bunch of points: price = 39 + 35 × size. [Figure: scatterplot of price (vertical axis) against size (horizontal axis).]
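As a small illustration of how last week's fitted line is used, the sketch below plugs a size into price = 39 + 35 × size. The helper name `predict_price` and the example sizes are assumptions made for illustration; only the two coefficients come from the slide.

```r
# Coefficients of the fitted line from the slide: price = 39 + 35 * size
b0 <- 39
b1 <- 35

# Hypothetical helper: point prediction of price for a given size
predict_price <- function(size) b0 + b1 * size

predict_price(2)                      # 39 + 35 * 2 = 109
predict_price(3) - predict_price(2)   # one extra unit of size adds b1 = 35
```

The second line of output is just the slope: the line says nothing yet about how uncertain that 35 is, which is the topic of this week.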

  3. CAPM. Another example of conditional distributions: individual returns given market return. The Capital Asset Pricing Model (CAPM) for asset A relates the return $R_{At} = (V_{At} - V_{A,t-1}) / V_{A,t-1}$ to the “market” return, $R_{Mt}$. In particular, the relationship is given by the regression model $R_{At} = \alpha + \beta R_{Mt} + \varepsilon$ with observations at times $t = 1, \dots, T$ (more on $(\alpha, \beta)$ vs $(b_0, b_1)$ vs $(\beta_0, \beta_1)$ in a minute). When asset A is a mutual fund, this CAPM regression can be used as a performance benchmark for fund managers.
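The two ingredients of the CAPM slide, computing a return from asset values and regressing asset returns on market returns, can be sketched as follows. All numbers here are made up: the value series `V`, the assumed alpha = 0.003 and beta = 0.9, and the simulated market returns are illustrative assumptions, not data from the course.

```r
set.seed(1)

# Return from a value series: R_t = (V_t - V_{t-1}) / V_{t-1}
V <- c(100, 102, 101, 105)           # hypothetical asset values
R_from_V <- diff(V) / head(V, -1)    # first value is (102 - 100) / 100 = 0.02
R_from_V

# Simulated CAPM data under assumed alpha = 0.003 and beta = 0.9
n_T <- 200
R_M <- rnorm(n_T, mean = 0.005, sd = 0.05)          # "market" returns
R_A <- 0.003 + 0.9 * R_M + rnorm(n_T, sd = 0.02)    # asset returns

# CAPM regression: R_At = alpha + beta * R_Mt + error
capm <- lm(R_A ~ R_M)
coef(capm)   # first entry estimates alpha, second estimates beta
```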

  4. > mfund <- read.csv("mfunds.csv", stringsAsFactors=TRUE)
     > mu <- apply(mfund, 2, mean)
     > mu
          drefus       fidel     keystne    Putnminc     scudinc
     0.006767000 0.004696739 0.006542550 0.005517072 0.004432333
         windsor     valmrkt       tbill
     0.010021906 0.006812983 0.005978333
     > stdev <- apply(mfund, 2, sd)
     > stdev
          drefus       fidel     keystne    Putnminc     scudinc
     0.047237111 0.056587091 0.084236450 0.030079074 0.035969261
         windsor     valmrkt       tbill
     0.048639473 0.048000146 0.002522863

  5. > plot(mu, stdev, col=0)
     > text(x=mu, y=stdev, labels=names(mfund), col=4)
     [Figure: fund names plotted at their (mu, stdev) positions, with mu on the horizontal axis and stdev on the vertical axis. keystne has the largest stdev and tbill the smallest.]

  6. Let's look at just windsor (which dominates the market).
     > windsor.reg <- lm(mfund$windsor ~ mfund$valmrkt)
     > plot(mfund$valmrkt, mfund$windsor, pch=20)
     > abline(windsor.reg, col="green")
     [Figure: scatterplot of mfund$windsor against mfund$valmrkt with the fitted line; b_0 = 0.0036, b_1 = 0.9357.]
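To make the reported b_0 and b_1 less of a black box, here is a minimal sketch that recovers the `lm` coefficients from the usual least squares formulas. Since `mfunds.csv` is not reproduced here, the vectors `market` and `fund` are simulated stand-ins with arbitrary assumed parameters; they are not the windsor data.

```r
set.seed(2)

# Simulated stand-ins for market and fund returns (not the mfunds.csv data)
market <- rnorm(180, mean = 0.007, sd = 0.048)
fund   <- 0.004 + 0.94 * market + rnorm(180, sd = 0.02)

fit <- lm(fund ~ market)

# Least squares by hand: slope = cov(x, y) / var(x), intercept from the means
b1 <- cov(market, fund) / var(market)
b0 <- mean(fund) - b1 * mean(market)

# The two rows should agree
rbind(lm = coef(fit), by_hand = c(b0, b1))
```

On the real data, `coef(windsor.reg)` would return the two numbers shown on the plot.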

  7. What is a good line? Statistics version! In a happy coincidence, the least squares line makes good statistical sense too. To see why, we need a model and we need to remember the conditional distribution. We will also use the model to talk about uncertainty. Okay, so lm(Y ~ X) makes a great line, but how “likely” is it that our answer is useful?
     ◮ The concept of a sampling distribution is the fundamental idea in all of statistics, and understanding it is our main job today.
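One way to build intuition for a sampling distribution is to simulate it: draw many data sets from a known model, refit the line each time, and look at how the slope estimate varies. The sketch below does this under an assumed "true" model (beta0 = 1, beta1 = 2, sigma = 1); those values, the sample size, and the number of replications are all illustrative choices.

```r
set.seed(3)

# Assumed "true" model for the simulation: Y = 1 + 2 X + eps, eps ~ N(0, 1)
beta0 <- 1; beta1 <- 2; sigma <- 1
n      <- 50     # observations per simulated data set
n_sims <- 1000   # number of simulated data sets

# Refit the line on each fresh sample and keep the slope estimate
b1_draws <- replicate(n_sims, {
  x <- runif(n, 0, 4)
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  unname(coef(lm(y ~ x))[2])
})

mean(b1_draws)                       # centered near beta1
sd(b1_draws)                         # spread of the sampling distribution
quantile(b1_draws, c(0.025, 0.975))  # where the middle 95% of estimates fall
```

In practice we only ever see one data set, so the rest of the lecture is about estimating this spread without being able to resample.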

  8. Normal Distribution – Quick Review. Why do we like the Normal distribution?
     ◮ Symmetric
     ◮ Concentration around the mean! → 95% of the data within 2 s.d.
     [Figure: Normal density with the central 95% between $z_{0.025}$ and $z_{0.975}$ and 2.5% in each tail; horizontal axis marked at the mean and at ±1, ±2, ±3 sd.]
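The "95% within 2 s.d." rule of thumb can be checked directly with R's built-in Normal functions `qnorm` and `pnorm`; nothing here depends on course data.

```r
# Quantiles that cut off 2.5% in each tail of a standard Normal
qnorm(c(0.025, 0.975))    # about -1.96 and 1.96

# Probability of falling within 1, 2, and 3 standard deviations of the mean
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)
within_k                  # about 0.6827, 0.9545, 0.9973
```

So "2 s.d." is a rounding of 1.96, and the exact coverage of ±2 s.d. is slightly more than 95%.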

  9. Simple linear regression (SLR) model: $Y = \beta_0 + \beta_1 X + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$. What's important?
     ◮ It is a model, so we are assuming this relationship holds for some fixed but unknown values of $\beta_0$, $\beta_1$.
     ◮ It is linear.
     ◮ The error $\varepsilon$ is independent & mean zero
        1. $E[\varepsilon] = 0 \Leftrightarrow E[Y \mid X] = \beta_0 + \beta_1 X$
        2. Fixed but unknown variance $\sigma^2$; constant over X
        3. Most things are approx. Normal (Central Limit Theorem)
        4. $\varepsilon$ represents anything left, not captured in the linear function of X
     ◮ It just works! This is a very robust model for the world.
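A concrete way to read the model is as a recipe for generating data. The sketch below draws one data set from Y = β0 + β1 X + ε with assumed values β0 = 39, β1 = 35, σ = 10 (chosen only for illustration), then checks that `lm` recovers them approximately.

```r
set.seed(4)

# Assumed parameter values, chosen only for illustration
beta0 <- 39; beta1 <- 35; sigma <- 10
n <- 500

x   <- runif(n, 1, 3.5)
eps <- rnorm(n, mean = 0, sd = sigma)   # independent, mean zero, constant variance
y   <- beta0 + beta1 * x + eps          # the SLR model as a data-generating recipe

fit <- lm(y ~ x)
coef(fit)            # estimates of beta0 and beta1
summary(fit)$sigma   # estimate of sigma (residual standard error)
```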

  10. Remember the two types of regression questions:
     1. Prediction: $\hat{Y} = b_0 + b_1 X$, with $Y = b_0 + b_1 X + e$
     2. Model: $Y = \beta_0 + \beta_1 X + \varepsilon$
     1. Predicting Y
        ◮ Best guess for Y given (or “conditional on”) X.
     2. Properties of $\beta_k$
        ◮ Sign: Does Y go up when X goes up?
        ◮ Magnitude: By how much?
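The distinction between the fitted quantities (b0, b1, e) and the model quantities (β0, β1, ε) can be seen in a short simulation. The data-generating values and the new size of 2 are assumptions for illustration.

```r
set.seed(5)

# Simulated data from an assumed model: Y = 39 + 35 X + eps, eps ~ N(0, 10^2)
x <- runif(30, 1, 3.5)
y <- 39 + 35 * x + rnorm(30, sd = 10)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)   # b0 + b1 * x
e     <- resid(fit)    # y - y_hat

# Y = b0 + b1 X + e holds exactly in the sample
all.equal(unname(y_hat + e), y)

# Prediction: best guess for Y at a new X (hypothetical value x = 2)
predict(fit, newdata = data.frame(x = 2))

# Properties of the slope: sign and magnitude of b1
coef(fit)[2]
```

The residuals `e` are computed from the data, while ε is never observed; b0 and b1 change from sample to sample, while β0 and β1 do not.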

  11. Conditional distributions. Regression models are really all about modeling the conditional distribution of Y given X. Why are conditional distributions important? We want to develop models for forecasting. What we are doing is exploiting the information in the conditional distribution of Y given X. The conditional distribution is obtained by “slicing” the point cloud in the scatterplot to obtain the distribution of Y conditional on various ranges of X values.

  12. Conditional v. marginal distribution. Consider a regression of house price on size:
     [Figure: scatterplot of price against size with one “slice” of data marked, shown next to the marginal distribution of price and the conditional distribution of price given 3 < size < 3.5.]
     [Figure: boxplots of price for the marginal distribution (marg) and for the size bins 1–1.5, 1.5–2, 2–2.5, 2.5–3, 3–3.5, with the regression line overlaid.]

  13. Key observations from these plots:
     ◮ Conditional distributions answer the forecasting problem: if I know that a house is between 1 and 1.5 thousand sq. ft., then the conditional distribution (second boxplot) gives me a point forecast (the mean) and prediction interval.
     ◮ The conditional means (medians) seem to line up along the regression line.
     ◮ The conditional distributions have much smaller dispersion than the marginal distribution.
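The "slicing" behind these observations can be sketched in a few lines: bin X, then compare the mean and standard deviation of Y within each bin to the overall values. The house data below are simulated under assumed parameters and are not the course data set.

```r
set.seed(6)

# Simulated house data (illustrative values, not the course data set)
size  <- runif(400, 1, 3.5)
price <- 39 + 35 * size + rnorm(400, sd = 10)

# Slice size into the same bins as the boxplot axis
bins <- cut(size, breaks = seq(1, 3.5, by = 0.5), include.lowest = TRUE)

cond_mean <- tapply(price, bins, mean)   # point forecast within each slice
cond_sd   <- tapply(price, bins, sd)     # dispersion within each slice

cond_mean                                # rises with size
cbind(cond_sd, marginal_sd = sd(price))  # each slice is tighter than the whole
```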

  14. This suggests two general points:
     ◮ If X has no forecasting power, then the marginal and conditionals will be the same.
     ◮ If X has some forecasting information, then the conditional means will differ from the marginal (overall) mean, and the conditional standard deviation of Y given X will be less than the marginal standard deviation of Y.

  15. Intuition from an example where X has no predictive power. House price (Y) v. number of stop signs within a two-block radius of a house (X).
     [Figure: scatterplot of price against # stops (0 to 4).]
     [Figure: boxplots of price for the marginal distribution (marg) and for each number of stops, 0 through 4.]
     See that in this case the marginal and conditionals are not all that different.
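The same slicing exercise shows what "no predictive power" looks like numerically. In this sketch the stop-sign counts and prices are simulated independently of each other (all values are made up), so the conditional summaries should roughly match the marginal ones.

```r
set.seed(7)

# Simulated data where X carries no information about Y (values are made up)
stops <- sample(0:4, 500, replace = TRUE)
price <- rnorm(500, mean = 250, sd = 60)   # generated without reference to stops

cond_mean <- tapply(price, stops, mean)
cond_sd   <- tapply(price, stops, sd)

rbind(cond_mean, cond_sd)          # one column per number of stops
c(mean = mean(price), sd = sd(price))

slope <- unname(coef(lm(price ~ stops))[2])
slope                              # small relative to the spread of price
```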

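  16. Before looking at any data, the model specifies
     ◮ how Y varies with X on average: $E[Y \mid X] = \beta_0 + \beta_1 X$; i.e. what's the trend?
     ◮ and the influence of factors other than X: $\varepsilon \sim N(0, \sigma^2)$, independently of X.
     [Figure: the line $E[Y \mid X] = \beta_0 + \beta_1 X$ plotted with Y against X, with $\varepsilon$ marked as the deviation of a point from the line.]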
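Putting the two pieces together gives the full conditional distribution of Y given X. The short derivation below uses only the assumptions stated on the slide (mean-zero Normal error with variance $\sigma^2$, independent of X):

```latex
\begin{aligned}
E[Y \mid X] &= \beta_0 + \beta_1 X + E[\varepsilon \mid X] = \beta_0 + \beta_1 X, \\
\operatorname{Var}(Y \mid X) &= \operatorname{Var}(\varepsilon \mid X) = \sigma^2, \\
Y \mid X &\sim N\!\left(\beta_0 + \beta_1 X,\ \sigma^2\right).
\end{aligned}
```

Combined with the Normal review, this says that if the parameters were known, about 95% of Y values at a given X would fall within roughly $2\sigma$ of the line.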
