
201ab Quantitative methods L.12 Linear model: Categorical predictors


  1. 201ab Quantitative methods L.12 Linear model: Categorical predictors E D V UL | UCSD Psychology Psych 201ab: Quantitative methods

  2. Overly specific named procedures (rows: response type; in parentheses: predictor)
     • Numerical response: 1-sample t-test (~null); 2-sample t-test (~binary); ANOVA (~category); regression, Pearson correlation (~numerical); ANCOVA (~numerical + category).
     • Ranked-numerical response: Mann-Whitney U (~binary); Kruskal-Wallis (~category); Spearman correlation (~numerical).
     • 2-category response: binomial test (~null); Fisher’s exact test (~binary); chi-squared independence (~category); logistic regression (~numerical).
     • k-category response: chi-squared goodness of fit (~null); chi-squared independence (~binary, ~category).

  3. Conceptually correct, but some restrictions apply.

  4. Overly specific named procedures, with their linear-model equivalents (a leading "~" marks an approximate equivalent)
     • Numerical response: 1-sample t-test, lm(y~1); 2-sample t-test and ANOVA, lm(y~f); regression and Pearson correlation, lm(y~x); ANCOVA, lm(y~x+f).
     • Ranked-numerical response: Mann-Whitney U and Kruskal-Wallis, ~ lm(rank(y)~f); Spearman correlation, ~ lm(rank(y)~rank(x)).
     • 2-category response (binomial test, Fisher’s exact test, chi-squared independence, logistic regression): glm(y~…, family=binomial()).
     • k-category response (chi-squared goodness of fit, chi-squared independence): ~ glm(y~…, family=poisson()).
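One of the approximate rank-based equivalences can be checked directly. The sketch below uses simulated data (the variable names, sample size, and effect size are illustrative assumptions, not taken from the slides) to compare the Spearman correlation with the slope of a regression on ranks.

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(1)
n <- 40
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Spearman correlation is the Pearson correlation of the ranks.
rho <- cor(x, y, method = "spearman")

# Without ties, rank(x) and rank(y) have the same variance, so the slope
# of the rank-on-rank regression equals that correlation.
rank_fit <- lm(rank(y) ~ rank(x))
slope <- unname(coef(rank_fit)[2])

print(c(spearman = rho, rank_slope = slope))
```

The estimates agree, but the p-value from `summary(rank_fit)` will generally differ slightly from the one reported by `cor.test(x, y, method = "spearman")`, which is consistent with the "~" on the slide.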

  6. GLM: Categorical predictors (factors)
     • Why?
     • Making it go in R.
       – Data representation for categorical variables
       – lm() implementation
     • What is it actually doing?
       – Different perspectives on categorical predictors
       – Predictors / design matrix in LM.
       – Coding categories into the design matrix.
     • Variations that require extensions of LM
       – Unequal variance t-test or ANOVA
       – Repeated measures and other random effects / correlated error structures.

  7. Why categorical predictors?
     • Does mean y differ between… (predictor is treated as a dichotomous / binary categorical variable)
       – Treatment and control?
       – Males and females?
       – Dogs and cats?
     • Does mean y vary among… (predictor is treated as a categorical variable)
       – Drug types?
       – Ethnicities? Religions? Etc.
       – Dog breeds?

  8. Do the groups have different means?
     • If we have 1 group and a point null for the mean, we test the intercept: lm(y~1), a “one-sample t-test”.
     • If we have 2 groups and a null of equal means, we test the difference coefficient: lm(y~f), a “2-sample t-test”.
     • If we have 3+ groups and a null of equal means, we test the ANOVA: lm(y~f), an “analysis of variance”.
       – Lots of t-tests between pairs of groups are impractical and don’t answer the right question.
       – Instead we test the variance of means across groups: this is the “analysis of variance”.
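To make "testing the variance of means across groups" concrete, here is a minimal sketch on simulated data (the group labels, sizes, and means are assumptions for illustration). It computes the F statistic by hand from the between-group and within-group sums of squares and compares it with the value reported by `anova()` on the `lm(y~f)` fit.

```r
set.seed(2)
# Hypothetical data: three groups of 20 observations each.
d <- data.frame(
  f = factor(rep(c("a", "b", "c"), each = 20)),
  y = rnorm(60, mean = rep(c(0, 0.5, 1), each = 20))
)

fit <- lm(y ~ f, data = d)
tab <- anova(fit)

# F by hand: variability of group means relative to residual variability.
group_means <- ave(d$y, d$f)                    # each row's own group mean
ss_between  <- sum((group_means - mean(d$y))^2) # spread of group means
ss_within   <- sum((d$y - group_means)^2)       # spread around group means
k <- nlevels(d$f)
n <- nrow(d)
f_by_hand <- (ss_between / (k - 1)) / (ss_within / (n - k))

print(c(anova_F = tab[["F value"]][1], by_hand = f_by_hand))
```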

  9. Three ways to think about factors
     • Cell organization: Common formulation for doing ANOVA calculations by hand. We avoid hand calculations, but this formulation helps us understand what we are estimating.
     • Tidy data frame/table: How we will see our data.
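The two views describe the same data. As a small sketch (the column names and values are made up for illustration), a tidy data frame can be split into cells, and the cell means are the quantities the hand-calculation formulation estimates.

```r
# Tidy representation: one row per observation, one column per variable.
# The names `group` and `y` and all values are illustrative assumptions.
tidy <- data.frame(
  group = factor(c("a", "a", "b", "b", "c", "c")),
  y     = c(1, 3, 4, 6, 8, 10)
)

# Cell organization: one vector of observations per factor level.
cells <- split(tidy$y, tidy$group)

# Cell means, the estimates the cell formulation focuses on.
cell_means <- sapply(cells, mean)
print(cell_means)
```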

  10. Categorical predictors in R

  11. Categorical predictors in R: 1-sample t-test
      • Does the mean of a group differ from some null mean?
      • E.g., does the mean level of conscientiousness deviate from random responding?
        – 10 items (1-5 Likert scale), 6 positively coded, 4 negatively coded.
        – Mean expected from random responding: 6 (3*6 – 3*4)

  12. Categorical predictors in R: 1-sample t-test
      • Does the mean of a group differ from some null mean?
      • E.g., does the mean level of conscientiousness deviate from random responding?
        – 10 items (1-5 Likert scale), 6 positively coded, 4 negatively coded.
        – Mean expected from random responding: 6 (3*6 – 3*4)
      • Why is this wrong?

  13. Categorical predictors in R: 1-sample t-test
      • Does the mean of a group differ from some null mean?
      • E.g., does the mean level of conscientiousness deviate from random responding?
        – 10 items (1-5 Likert scale), 6 positively coded, 4 negatively coded.
        – Mean expected from random responding: 6 (3*6 – 3*4)
      • Two routes: via lm(), or via the t-test function.
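The code from this slide did not survive in the transcript, so the following is only a sketch of the two routes on simulated scores (the vector `consc`, its distribution, and the sample size are assumptions; the null mean of 6 is the value from the slide). Because `lm(y~1)` tests the intercept against 0, one way to test against a non-zero null is to subtract the null mean first.

```r
set.seed(3)
# Hypothetical conscientiousness scores; 6 is the null mean from the slide.
consc <- rnorm(50, mean = 8, sd = 5)
null_mean <- 6

# Via lm(): subtract the null so the intercept is tested against 0.
fit <- lm(I(consc - null_mean) ~ 1)
lm_row <- summary(fit)$coefficients["(Intercept)", ]

# Via the t-test function, passing the null mean directly.
tt <- t.test(consc, mu = null_mean)

print(lm_row)
print(tt)
```

Both routes report the same t statistic and p-value; the lm() intercept estimates the mean difference from the null, while `t.test()` reports the raw sample mean.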

  14. Categorical predictors in R: 2-sample t-test
      • Do the two groups have the same mean?
      • E.g., does the mean level of conscientiousness differ between males and females?

  15. Categorical predictors in R: 2-sample t-test
      • Do the two groups have the same mean?
      • E.g., does the mean level of conscientiousness differ between males and females?
      • Two routes: via the t-test function, or via lm().
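As with the one-sample case, the slide's code is missing from the transcript. A minimal sketch on simulated data (the data frame `d`, its column names, and the group means are assumptions) shows the correspondence, using the equal-variance form of the t-test because that is the version `lm()` reproduces.

```r
set.seed(4)
# Hypothetical data: 30 observations per gender.
d <- data.frame(
  gender = factor(rep(c("female", "male"), each = 30)),
  consc  = rnorm(60, mean = rep(c(9, 8), each = 30), sd = 5)
)

# Via the t-test function; var.equal = TRUE gives the pooled-variance test.
tt <- t.test(consc ~ gender, data = d, var.equal = TRUE)

# Via lm(): "gendermale" is the male-minus-female difference in means,
# because "female" is the reference (first) level of the factor.
fit <- lm(consc ~ gender, data = d)
lm_row <- summary(fit)$coefficients["gendermale", ]

print(tt)
print(lm_row)
```

The two t statistics have opposite signs only because `t.test()` reports female minus male; the p-values are identical. The default `t.test()` call (Welch, unequal variances) would not match `lm()` exactly.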

  16. Categorical predictors in R: one-way ANOVA
      • Do the groups have the same mean? I.e., is there non-zero variance across group means?
      • E.g., does the mean level of conscientiousness differ among religions?

  17. Categorical predictors in R: one-way ANOVA
      • Do groups have the same mean? Is there variance across group means?
      • Does mean conscientiousness differ among religions?

  18. Categorical predictors in R: two-way ANOVA
      • Does the mean vary across either/both factors? Consistently?
      • Does mean conscientiousness vary among religion, gender?
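A two-way analysis can be sketched with two factors and their interaction. The data below are simulated (the level labels, cell size, and column names are assumptions), and reading "Consistently?" as a question about the interaction term is an interpretation, not something the transcript states.

```r
set.seed(5)
# Hypothetical balanced design: 3 religions x 2 genders x 15 people per cell.
d <- expand.grid(
  religion = factor(c("A", "B", "C")),
  gender   = factor(c("female", "male")),
  rep      = 1:15
)
d$consc <- rnorm(nrow(d), mean = 8, sd = 5)

# religion * gender expands to both main effects plus their interaction.
fit <- lm(consc ~ religion * gender, data = d)
tab <- anova(fit)
print(tab)
```

The `religion:gender` row asks whether the effect of one factor is the same at every level of the other. Note that `anova()` on an lm fit uses sequential sums of squares, so with unbalanced data the order of terms in the formula matters.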

  21. Three ways to think about factors
      • Cell organization: Common formulation for doing ANOVA calculations by hand. We avoid hand calculations, but this formulation helps us understand what we are estimating.
      • Tidy data frame/table: How we will see our data.
      • Matrix notation: How statistical software represents our data to do the analysis. Makes it easier to think about coding schemes.
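The matrix view can be inspected in R with `model.matrix()`, which returns the design matrix a formula would produce. The sketch below uses a made-up three-level factor and compares R's default treatment (dummy) coding with sum-to-zero coding, as one example of what "coding schemes" can mean here.

```r
# A small hypothetical factor with three levels, two observations each.
f <- factor(c("a", "a", "b", "b", "c", "c"))

# Default treatment coding: an intercept column plus one 0/1 indicator
# for each non-reference level ("a" is the reference).
X_treatment <- model.matrix(~ f)

# Sum-to-zero coding: the last level is coded -1 in every contrast column.
X_sum <- model.matrix(~ f, contrasts.arg = list(f = "contr.sum"))

print(X_treatment)
print(X_sum)
```

Both matrices span the same model and give the same fitted values; what changes is the meaning of each coefficient (difference from a reference level versus deviation from the mean of the group means).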

  22. $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
      [Figure: the response plane over predictors $X_1$ and $X_2$, with intercept $\beta_0$ and slopes $\beta_1$ and $\beta_2$; the point on the plane is $\hat{Y}_i \equiv \mu_{Y \mid X_{1i}, X_{2i}}$, and $\varepsilon_i$ separates it from the response $Y_i$. From Julian Parris.]

  23. $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$, written in matrix form:

      $$
      \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix}
      =
      \begin{bmatrix}
      1 & x_{11} & x_{21} \\
      1 & x_{12} & x_{22} \\
      1 & x_{13} & x_{23} \\
      \vdots & \vdots & \vdots \\
      1 & x_{1i} & x_{2i} \\
      \vdots & \vdots & \vdots \\
      1 & x_{1n} & x_{2n}
      \end{bmatrix}
      \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}
      +
      \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_n \end{bmatrix}
      $$

  24. Same matrix equation as slide 23, annotated: the left-hand side holds all the y data points in a single vector.

  25. Same matrix equation as slide 23, annotated further: all of the x predictors sit in one matrix, with a constant 1 column for the intercept (sometimes called X0).
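The matrix form can be reproduced numerically. This is a sketch on simulated data (the true coefficients and variable names are assumptions); it builds the design matrix by hand and solves the normal equations, which is one textbook route to the least-squares estimates. `lm()` itself uses a QR decomposition rather than this formula, but the estimates agree.

```r
set.seed(6)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)   # assumed true betas: 1, 2, -1

# Design matrix: a constant 1 column for the intercept, then the predictors.
X <- cbind(1, x1, x2)

# Least-squares estimates from the normal equations (X'X) b = X'y.
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# The same model fitted with lm().
fit <- lm(y ~ x1 + x2)

print(cbind(by_hand = as.vector(beta_hat), lm = unname(coef(fit))))
```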
