Find a significant difference between groups: Multivariate Analysis of Variance (MANOVA)
Consider Univariate ANOVA
Used when you have 3 or more samples.

[Figure: group means $\bar{x}_A = 727.5$, $\bar{x}_B = 514.25$, $\bar{x}_C = 508$ and overall mean $\bar{x}_{ABC} = 583.25$ marked on one axis]

$H_0: \mu_A = \mu_B = \mu_C$
$H_1: \mu_A \neq \mu_B \neq \mu_C$

The alternative could be true because all the means are different, or because just one of them differs from the others. If we reject the null hypothesis, we need further analysis to draw conclusions about which population means differ from the others and by how much.
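Consider Univariate ANOVA
Used when you have 3 or more samples.

[Figure: group means $\bar{x}_A = 727.5$, $\bar{x}_B = 514.25$, $\bar{x}_C = 508$ and overall mean $\bar{x}_{ABC} = 583.25$, with SIGNAL (spread between the group means) and NOISE (spread within each group) marked]

$F = \frac{\text{signal}}{\text{noise}} = \frac{\text{variance}_{between}}{\text{variance}_{within}}$

$\text{variance}_{between} = \frac{n \sum_i (\bar{x}_i - \bar{x}_{All})^2}{k - 1}$

$\text{variance}_{within} = \frac{\sum_i \text{variance}_i}{k}$

Here $n$ is the number of observations per group and $k$ is the number of groups.

A large F-value indicates a significant difference.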
Consider Univariate ANOVA
Used when you have 3 or more samples.

[Figure: group means $\bar{x}_A = 727.5$, $\bar{x}_B = 514.25$, $\bar{x}_C = 508$ and overall mean $\bar{x}_{ABC} = 583.25$, with SIGNAL and NOISE marked]

$\sum_{A,B,C} (\bar{x}_i - \bar{x}_{ABC})^2 = (727.5 - 583.25)^2 + (514.25 - 583.25)^2 + (508 - 583.25)^2$

$\text{variance}_{between} = \frac{4 \sum_{A,B,C} (\bar{x}_i - \bar{x}_{ABC})^2}{3 - 1} = 62463.25$

$\text{variance}_{within} = \frac{\text{var}_A + \text{var}_B + \text{var}_C}{3} = \frac{891.6667 + 819.3333 + 305.5833}{3} = 672.1943$

$F = \frac{\text{variance}_{between}}{\text{variance}_{within}} = \frac{62463.25}{672.1943} \approx 92.92$

One-way ANOVA in R: anova(lm(YIELD~VARIETY))
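The arithmetic on this slide can be checked by hand in R. This is a minimal sketch that uses only the group means and variances printed on the slide; the group size n = 4 is inferred from the slide's numbers, and the variable names are illustrative, not part of the original lab.

```r
# Group summaries from the slide; n = 4 per group is inferred, not stated.
group_means <- c(A = 727.5, B = 514.25, C = 508)
group_vars  <- c(A = 891.6667, B = 819.3333, C = 305.5833)
n <- 4                       # observations per group (assumed equal)
k <- length(group_means)     # number of groups

# With equal group sizes the overall mean is the mean of the group means
grand_mean <- mean(group_means)

# Signal: variance between groups
var_between <- n * sum((group_means - grand_mean)^2) / (k - 1)

# Noise: variance within groups (average of the group variances)
var_within <- sum(group_vars) / k

f_value <- var_between / var_within
print(c(var_between = var_between, var_within = var_within, F = f_value))
```

With the raw data available, anova(lm(YIELD~VARIETY)) should report the same F-value in its table.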
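F-Distribution (a family of distributions; the shape depends on the degrees of freedom)

$F = \frac{\text{signal}}{\text{noise}} = \frac{\text{variance}_{between}}{\text{variance}_{within}}$

$\text{variance}_{between} = \frac{n \sum_i (\bar{x}_i - \bar{x}_{All})^2}{k - 1}$

$\text{variance}_{within} = \frac{\sum_i \text{variance}_i}{k}$

[Figure: F-distribution density curve; y-axis "Probability of observation"; F-values with signal < noise on the left and signal > noise in the right tail; percentile axis marked 0, 0.50, 0.95; the tail area beyond the 0.95 percentile is $\alpha = 0.05$]

In R: qf(p, df1, df2) returns the F-value at percentile p.
In R: pf(F, df1, df2) returns the percentile (cumulative probability) of an F-value; present 1 minus this value as the p-value.

The larger the F-value, the further it lies into the tail AND the smaller the probability of getting an F-value that large by chance alone when the group means are in fact equal. A small p-value is therefore evidence that something is causing a real difference between the groups.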
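As a short illustration of qf and pf, the sketch below looks up the critical F-value at α = 0.05 and the p-value for the F-value from the yield example. The degrees of freedom assume 3 groups of 4 observations (an inference from the earlier slide arithmetic), and the variable names are illustrative.

```r
f_value <- 92.92        # F-value from the ANOVA example (rounded)
df1 <- 3 - 1            # between-groups df: k - 1
df2 <- 3 * (4 - 1)      # within-groups df: k * (n - 1), assuming n = 4 per group
alpha <- 0.05

# F-value that cuts off the upper 5% of the distribution
f_critical <- qf(1 - alpha, df1, df2)

# pf() gives the cumulative probability, so the p-value is the upper tail
p_value <- 1 - pf(f_value, df1, df2)

print(c(f_critical = f_critical, p_value = p_value))
```

The same upper-tail probability can be requested directly with pf(f_value, df1, df2, lower.tail = FALSE).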
Using DISCRIM to predict which group

Problem: A new skull is found, but we don't know whether it belongs to Homo erectus or Homo habilis, or whether it is a new group.

[Figure: scatter of skull measurements showing Homo erectus, Homo habilis, each group centroid, and the new find (unknown origin)]

Popular method in taxonomy and anthropology.

How predictions work:
1. Calculate the group centroids
2. Find out which centroid is closest to the unknown data point

New groups are defined when we find a significant difference between the new find and the predefined groups.
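The two prediction steps can be sketched in a few lines of R. The measurements below are made-up numbers for illustration only, and plain Euclidean distance is used as a simplifying assumption; a real discriminant analysis would typically also account for the spread of each group.

```r
# Hypothetical skull measurements (invented values, illustration only)
skulls <- data.frame(
  group  = rep(c("erectus", "habilis"), each = 4),
  length = c(180, 185, 178, 183, 150, 148, 155, 152),
  width  = c(140, 138, 142, 141, 120, 118, 122, 121)
)

# 1. Calculate the centroid (mean of each measurement) for every group
centroids <- aggregate(cbind(length, width) ~ group, data = skulls, FUN = mean)

# 2. Find the centroid closest to the new find (Euclidean distance, assumed)
new_find <- c(length = 176, width = 137)
distances <- sqrt((centroids$length - new_find[["length"]])^2 +
                  (centroids$width  - new_find[["width"]])^2)
names(distances) <- as.character(centroids$group)

predicted_group <- names(which.min(distances))
print(distances)
print(predicted_group)
```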
Multivariate Analysis of Variance (MANOVA)

Is there a significant difference among groups based on multiple response variables? (i.e. ANOVA with multiple response variables)

[Figure: skull measurements with a probability distribution drawn around each group centroid]

When we calculate the centroid of a group, we also build a probability distribution around that centroid for comparison.

You can then run repeated t-tests (with p-values adjusted for multiple comparisons) to compare the new data to the groups, but MANOVA does it all for you in one shot!

MANOVA in R: output=manova(responseMatrix~predictorMatrix) (stats package)

Another lab on MANOVA for reference: Laura's website, RENR 480, Lab 22
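A minimal sketch of the manova() call, using R's built-in iris data as a stand-in (an assumption; the lab's own data set is not shown here). The four flower measurements form the response matrix and Species is the grouping variable.

```r
# iris ships with R; it stands in for the lab data (assumption)
response_matrix <- cbind(iris$Sepal.Length, iris$Sepal.Width,
                         iris$Petal.Length, iris$Petal.Width)
groups <- iris$Species

# Response matrix on the left of the formula, grouping variable on the right
output <- manova(response_matrix ~ groups)
print(output)
```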
Assumptions of MANOVA

MANOVA is VERY sensitive to invalid assumptions and outliers.

Within groups we need to have:
1. Normality: residuals have to be normally distributed
2. Homogeneity of variances: residuals need to have equal variances

The assumptions need to be met in the univariate context before they can be met for multivariate analyses. You therefore first have to check each individual measurement (response variable) for normality and homogeneity, e.g. by making boxplots or plotting ANOVA residuals for each variable.
Assumptions of MANOVA

Boxplots in R (multiple plots): boxplot(ResponseVariable~Group)

Generate boxplots for each response variable and assess shape & whiskers.

[Figure: frequency distributions and their boxplot representations, with mean, median and mode marked: left skewed (negatively skewed), normal (perfectly symmetric), right skewed (positively skewed), and bi-modal (two different modes, not necessarily symmetric)]
Assumptions of MANOVA

Residual plots in R (multiple plots): plot(lm(ResponseVariable~Group)) (2nd plot)

Testing for Normality & Equal Variances: Residual Plots

[Figure: four residual plots, each with axes labelled "Predicted values" and "Observed (original units)" and a reference line at 0]

Panel 1:
• NORMAL distribution: equal number of points along observed
• EQUAL variances: equal spread on either side of the mean predicted value = 0
• Good to go!

Panel 2:
• NON-NORMAL distribution: unequal number of points along observed
• EQUAL variances: equal spread on either side of the mean predicted value = 0
• Optional to fix

Panel 3:
• NORMAL/NON-NORMAL: look at histogram or test
• UNEQUAL variances: cone shape, away from or towards zero
• This needs to be fixed for MANOVA (transformations)

Panel 4:
• OUTLIERS: points that deviate from the majority of data points
• This needs to be fixed for MANOVA (transformations or removal)
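The boxplot and residual checks can be run for every response variable along the following lines. This is a sketch on R's built-in iris data (an assumption standing in for the lab data); the `which` argument of plot() selects the individual diagnostic plots of a fitted lm.

```r
# iris stands in for the lab data (assumption); substitute your own variables
response_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# One boxplot per response variable, arranged in a 2 x 2 grid
par(mfrow = c(2, 2))
for (v in response_names) {
  boxplot(iris[[v]] ~ iris$Species, main = v, xlab = "Species", ylab = v)
}

# Residual diagnostics for one response variable
fit <- lm(Sepal.Length ~ Species, data = iris)
par(mfrow = c(1, 2))
plot(fit, which = 1)   # residuals vs fitted values: look for a cone shape
plot(fit, which = 2)   # normal Q-Q plot: look for departures from the line
par(mfrow = c(1, 1))   # reset the plotting layout
```

Repeating the lm() and plot() lines for each response variable covers the univariate checks described above.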
Assumptions of MANOVA

If you violate the assumptions of MANOVA:
1. Transform your data (follow the examples we will discuss on the board)
2. Use non-parametric options (e.g. perMANOVA, Lab 6)
Multivariate Analysis of Variance (MANOVA) - output

You can see whether there is a significant difference among groups across all response variables together using the Wilks' MANOVA test statistic.

Or you can see whether there is a significant difference among groups for each response variable separately.

P-value: the probability of seeing a difference between groups as large as the observed one, or larger, through random chance alone.

Thus a small p-value means that chance is an unlikely explanation, and something is having an effect on the groups and causing the difference.
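Both views of the output can be pulled from a fitted manova object. The sketch below again uses R's built-in iris data as a stand-in (an assumption): summary() with test = "Wilks" gives the overall test, and summary.aov() gives one ANOVA table per response variable.

```r
# iris stands in for the lab data (assumption)
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris)

# Overall test across all response variables (Wilks' lambda)
overall <- summary(fit, test = "Wilks")
print(overall)

# Separate univariate ANOVA for each response variable
per_variable <- summary.aov(fit)
print(per_variable)

# P-value of the overall Wilks test for the grouping variable
wilks_p <- overall$stats["Species", "Pr(>F)"]
print(wilks_p)
```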