PS 405 – Week 4 Section: Difference of means, ANOVA, and Matrix Algebra
D.J. Flynn
February 4, 2014

t-tests
◮ test for equality of two means
◮ hypotheses:
    $H_0$: no difference between the two population means
    $H_A$: the population means differ
◮ calculating the t-stat:
    $t = \dfrac{\text{statistic} - \text{hypothesized difference}}{\text{SE of estimate}}$
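The t-stat formula can be checked by hand in R. This is a minimal sketch, not part of the original slides: it assumes the built-in mtcars data (mpg for automatic vs. manual cars) as a stand-in for two samples, and it uses the unequal-variance (Welch) standard error, which is what t.test() uses by default.

```r
# Illustrative data (not from the slides): mpg for automatic vs. manual cars
x <- mtcars$mpg[mtcars$am == 0]
y <- mtcars$mpg[mtcars$am == 1]

# statistic = observed difference in sample means; hypothesized difference = 0
diff.hat <- mean(x) - mean(y)
se.hat <- sqrt(var(x) / length(x) + var(y) / length(y))  # Welch SE of the difference
t.manual <- (diff.hat - 0) / se.hat

# t.test() uses the Welch SE by default, so the two statistics should agree
tt <- t.test(x, y)
c(manual = t.manual, t.test = unname(tt$statistic))
```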

Gender/partisanship example
Question: are men and women equally likely to be Democrats?
t-stat for difference in proportions:
$t = \dfrac{P_m - P_f}{\sqrt{\dfrac{P_m(1 - P_m)}{n_m} + \dfrac{P_f(1 - P_f)}{n_f}}}$
where $P_m$ and $P_f$ are the sample proportions of Democrats among men and women, and $n_m$ and $n_f$ are the two group sizes.
The p-value that R reports is for the null of no difference; the confidence interval is for the difference between the two group means (here, proportions).
Interpretation of p-value: “if the null hypothesis is true, how often would we observe a difference at least this large under repeated sampling?” NOT “there is a p% chance that the true difference is equal to X.”
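A worked version of the difference-in-proportions t-stat, as a sketch only. The counts below are hypothetical (they are not from the slides or from any survey), and the two-sided p-value uses the normal approximation, which is an assumption that is reasonable only for large samples.

```r
# Hypothetical survey counts (assumed for illustration only)
n.m <- 500; dem.m <- 210   # men sampled, men identifying as Democrats
n.f <- 520; dem.f <- 260   # women sampled, women identifying as Democrats

p.m <- dem.m / n.m
p.f <- dem.f / n.f

# SE of the difference in proportions, then the t-stat
se <- sqrt(p.m * (1 - p.m) / n.m + p.f * (1 - p.f) / n.f)
t.stat <- (p.m - p.f) / se

# two-sided p-value from the normal approximation (large samples)
p.value <- 2 * pnorm(-abs(t.stat))
c(t = t.stat, p = p.value)
```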

Logic of ANOVA and the F test
◮ running theme: experiments with > 2 groups
◮ does assignment to a particular group (X) affect some continuous outcome (Y)?
◮ this question can be answered with one-way ANOVA (AKA F-test)
◮ two sources of variation in DV:
    ◮ intended: independent variable/factor
    ◮ unintended: error/residual
◮ goal of ANOVA: determine share of variance explained by X
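The two sources of variation can be computed directly. The sketch below assumes the built-in chickwts data (chick weight by feed type) and splits the total sum of squares into a between-group (intended) part and a within-group (unintended) part; the variable names are illustrative.

```r
y <- chickwts$weight
g <- chickwts$feed

grand.mean <- mean(y)
group.mean <- ave(y, g)   # each observation's own group mean

ss.total   <- sum((y - grand.mean)^2)
ss.between <- sum((group.mean - grand.mean)^2)   # intended: explained by X
ss.within  <- sum((y - group.mean)^2)            # unintended: error/residual

# the two parts add up to the total; their ratio is the share explained by X
c(total = ss.total, between = ss.between, within = ss.within,
  share.explained = ss.between / ss.total)
```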

ANOVA table
◮ Go through table quickly
◮ F statistic (sometimes called F-act):
    $F = \dfrac{\text{explained variance}}{\text{unexplained variance}} = \dfrac{MS_A}{MS_E}$
◮ look up critical F-stat based on numerator df, denominator df, and confidence level
◮ if F-act > F-critical, then we reject the null of independence (i.e., that all group means are equal)
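Instead of a printed F table, the critical value can be obtained with qf(). The sketch below plugs in the mean squares and degrees of freedom from the chickwts ANOVA output shown in these slides; the 95% confidence level is an assumption.

```r
# Values read from the one-way ANOVA table for chickwts (weight by feed)
ms.a <- 46226; df.a <- 5     # explained: mean square and df for the factor
ms.e <- 3009;  df.e <- 65    # unexplained: mean square and df for the residuals

f.act  <- ms.a / ms.e
f.crit <- qf(0.95, df1 = df.a, df2 = df.e)            # critical F at the 95% level
p.value <- pf(f.act, df.a, df.e, lower.tail = FALSE)  # area to the right of F-act

f.act > f.crit   # TRUE means we reject the null
```

Because the mean squares in the printed table are rounded, f.act here can differ from the reported F value in the second decimal place.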

ANOVA in R
1. identify independent and dependent variables
2. determine variable structures (and change if necessary)
3. estimate ANOVA and call up results

Determining variable structure
◮ str(variable) returns the structure of a variable: integer (int), numeric (num), factor, character (chr), logical (logi)
◮ important because ANOVAs are used for categorical IVs
◮ practice:
    library(datasets)       # ships with base R and is attached by default; no install.packages() needed
    names(chickwts)
    str(chickwts$weight)    # numeric: the DV
    str(chickwts$feed)      # factor: the IV
    levels(chickwts$feed)

Estimating ANOVAs in R
anova<-aov(weight ~ feed,data=chickwts)
summary(anova)
            Df Sum Sq Mean Sq F value   Pr(>F)
feed         5 231129   46226   15.37 5.94e-10 ***
Residuals   65 195556    3009
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What happens if we instead estimate aov(feed ~ weight)?
wrong.model<-aov(feed ~ weight,data=chickwts)
Warning messages:
1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : '-' not meaningful for factors
The DV in aov() must be numeric. feed is a factor, so R cannot compute residuals and the resulting model is meaningless.

Another example
We have data on which undergraduate institution people attended and mid-life satisfaction (0-100):
names(my.data)
[1] "school"       "satisfaction"
table(my.data$school)
school
fsu  uf  um
  5   5   5
my.anova<-aov(satisfaction ~ school,data=my.data)
summary(my.anova)
            Df Sum Sq Mean Sq F value  Pr(>F)
school       2   7216    3608   11.85 0.00144 **
Residuals   12   3655     305
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fsu<-subset(my.data,school=="fsu")
uf<-subset(my.data,school=="uf")
um<-subset(my.data,school=="um")
mean(fsu$satisfaction)
[1] 92.6
mean(uf$satisfaction)
[1] 39.2
mean(um$satisfaction)
[1] 60.8
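The subset-then-mean steps can be collapsed into a single call. Since my.data is not available here, this sketch assumes the built-in chickwts data instead; the same pattern would apply to satisfaction by school.

```r
# Group means for every level of a factor in one call
feed.means <- tapply(chickwts$weight, chickwts$feed, mean)
feed.means

# Equivalent, returning a data frame with one row per group
aggregate(weight ~ feed, data = chickwts, FUN = mean)
```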

Changing variable structure
◮ Current structure: is.factor, is.numeric, is.character, is.vector, ... will return TRUE or FALSE
◮ New structure: as.factor, as.numeric, as.character, as.vector, ... will change object to desired structure
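A short sketch of checking and changing structure, using a made-up character vector (not from the slides). It also shows a common pitfall: calling as.numeric() directly on a factor returns the internal level codes, not the original values.

```r
x <- c("10", "20", "20", "30")   # made-up character vector
is.character(x)                  # TRUE
is.numeric(x)                    # FALSE

x.num <- as.numeric(x)           # 10 20 20 30
x.fac <- as.factor(x)            # factor with levels "10" "20" "30"

# Pitfall: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(x.fac)                   # 1 2 2 3
# Convert through character to recover the original values
values <- as.numeric(as.character(x.fac))    # 10 20 20 30
```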

Generalizations of the one-way ANOVA
1. two-way ANOVA: if we have more than 1 explanatory factor (e.g., soil type + type of potato = potato yield)
2. ANCOVA: ANOVA with a continuous covariate (e.g., soil type + type of potato + weather = potato yield)
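Both generalizations use the same aov() syntax with extra terms on the right-hand side. The sketch below assumes the built-in mtcars data as a stand-in for the potato example: cylinders and transmission play the role of the factors, and car weight plays the role of the continuous covariate.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)   # number of cylinders as a categorical factor
cars$am  <- factor(cars$am)    # transmission type as a categorical factor

# two-way ANOVA: two categorical explanatory factors
two.way <- aov(mpg ~ cyl + am, data = cars)
summary(two.way)

# ANCOVA: one factor plus a continuous covariate (weight, wt)
ancova <- aov(mpg ~ cyl + wt, data = cars)
summary(ancova)
```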

Example of two-way ANOVA in R
Does income depend on type of profession and education?
library(car)
names(Prestige)
[1] "education" "income"    "women"     "prestige"  "census"    "type"
str(Prestige$education)
 num [1:102] 13.1 12.3 12.8 11.4 14.6 ...
str(Prestige$type)
 Factor w/ 3 levels "bc","prof","wc": 2 2 2 2 2 2 2 ...

summary(Prestige$education)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  6.380   8.445  10.540  10.740  12.650  15.970
Prestige$education.recoded<-recode(Prestige$education,
  "6.38:8.445=1;8.446:10.54=2;10.55:10.74=3;10.75:12.65=4;
  12.66:15.97=5;else=NA")
table(Prestige$education.recoded)
 1  2  3  4  5
26 25  2 23 26
as.factor(Prestige$education.recoded)
 [1] 5 4 5 4 5 5 5 5 5 5 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 4
[37] 4 3 4 2 4 4 2 2 2 4 4 4 4 2 4 2 2 2 4 4 4 2 4 1 2 3 2
[73] 1 1 1 2 2 1 1 1 2 2 2 1 1 2 1 2 2 1 1 1 1 1 1 4 2 1 1
Levels: 1 2 3 4 5

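my.two.way<-aov(income ~ type+education.recoded, data=Prestige)
summary(my.two.way)
                  Df    Sum Sq   Mean Sq F value   Pr(>F)
type               2 5.960e+08 297978078  25.266 1.65e-09 ***
education.recoded  1 2.952e+07  29520188   2.503    0.117
Residuals         94 1.109e+09  11793647
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
4 observations deleted due to missingness
Note: education.recoded has Df = 1 here because the as.factor() result was printed but never assigned, so the variable entered the model as numeric. To treat it as a 5-level factor (Df = 4), store the result first: Prestige$education.recoded<-as.factor(Prestige$education.recoded)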
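The degrees of freedom in the table show whether R treated a predictor as numeric or as a factor. A minimal sketch of that check, assuming the built-in ToothGrowth data (not the Prestige data used in the slides), where dose takes only three values:

```r
str(ToothGrowth$dose)   # num: dose is stored as a number

# dose left as numeric: it enters as a single slope, so Df = 1
numeric.fit <- aov(len ~ supp + dose, data = ToothGrowth)
summary(numeric.fit)

# dose stored as a factor: one parameter per non-baseline level, so Df = 2
tg <- ToothGrowth
tg$dose <- as.factor(tg$dose)   # note the assignment back into the data frame
factor.fit <- aov(len ~ supp + dose, data = tg)
summary(factor.fit)
```

Calling as.factor() without assigning the result only prints the converted values; the column in the data frame is unchanged.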

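Matrix algebra terms
◮ scalar: a single number
◮ vector: an ordered list of numbers (one row or one column)
◮ matrix: a rectangular array of numbers with rows and columns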

Matrix algebra operations
◮ addition
◮ subtraction
◮ multiplication
◮ inverse
◮ transpose
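Each of these operations has a direct counterpart in base R. The matrices below are made up for illustration.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # filled column by column: rows (2, 1) and (1, 3)
B <- matrix(c(1, 0, 2, 1), nrow = 2)   # rows (1, 2) and (0, 1)

A + B                # addition (element by element)
A - B                # subtraction
AB <- A %*% B        # matrix multiplication (A * B would be element-wise)
t(B)                 # transpose: rows become columns
A.inv <- solve(A)    # inverse; requires a square, non-singular matrix
A %*% A.inv          # identity matrix, up to rounding error
```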

Why we care: the linear model
◮ Scalar form: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \epsilon_i$
◮ Matrix form: $Y_i = X_i \beta + \epsilon_i$
Benefits of matrix form:
1. more parsimonious expression of models with lots of covariates
2. understand what’s going on behind the scenes. For example, the parameter $\beta$ is estimated by calculating $(X^T X)^{-1} X^T y$
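The estimator $(X^T X)^{-1} X^T y$ can be computed by hand and compared with lm(). This is a sketch under assumptions: the model (mpg on weight and horsepower from the built-in mtcars data) is illustrative and does not come from the slides.

```r
# Illustrative model (not from the slides): mpg on weight and horsepower
X <- cbind(1, mtcars$wt, mtcars$hp)   # column of 1s for the intercept
y <- mtcars$mpg

beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y   # (X'X)^{-1} X'y

# lm() should return the same estimates
fit <- lm(mpg ~ wt + hp, data = mtcars)
cbind(matrix.algebra = drop(beta.hat), lm = unname(coef(fit)))
```

lm() does not literally invert $X^T X$ (it uses a QR decomposition for numerical stability), but the estimates should agree up to rounding error.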

This is the linear model in matrix form:
$Y_i = X_i \beta + \epsilon_i$
For each term in this equation...
◮ scalar, vector, or matrix?
◮ size?
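One way to check your answers is to inspect the pieces of a fitted model in R. The sketch below assumes an illustrative model with K = 2 covariates on the built-in mtcars data (n = 32); none of these names come from the slides.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)   # illustrative model, K = 2 covariates
X <- model.matrix(fit)                    # n x (K + 1) matrix, intercept column included
beta <- coef(fit)                         # vector of length K + 1
e <- residuals(fit)                       # vector of length n

dim(X)          # 32 3
length(beta)    # 3
length(e)       # 32

# For a single observation i: X_i is a 1 x (K + 1) row; Y_i and the residual are scalars
i <- 1
y.i <- drop(X[i, , drop = FALSE] %*% beta) + unname(e[i])
y.i             # reproduces mtcars$mpg[i]
```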
