Lesson 10: Analysis of Covariance (ANCOVA)
See Textbook: Chapter 22
A Few Comments About ANCOVA
In the next two units we build on the concepts covered so far in this course and revisit the principles and foundations of regression that you learned in STAT 501. Both units expand on the idea of the general linear model and how it can handle quantitative and qualitative predictors together. The general linear model can be thought of as the larger picture, an 'umbrella' procedure if you will. If you have a model with no continuous predictors, you simply have an ANOVA. If you have a model with no categorical predictors, you simply have a regression. If you have a model with both continuous and categorical predictors, then you have a general linear model, and ANCOVA is how you include both types of predictor.
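The 'umbrella' idea can be made concrete by looking at one row of the design matrix under each model. The sketch below is illustrative only: the function `design_row`, its arguments, and the reference (dummy) coding with the first level as baseline are assumptions of this example, not something prescribed by the lesson or by any particular software.

```python
def design_row(level, covariate, levels, use_factor=True, use_covariate=True):
    """Build one design-matrix row: intercept, treatment dummies, covariate.

    Hypothetical helper for illustration. Reference coding is assumed:
    the first entry of `levels` is the baseline and gets no column.
    """
    row = [1.0]  # intercept column
    if use_factor:
        # One indicator column per non-baseline treatment level.
        row += [1.0 if level == other else 0.0 for other in levels[1:]]
    if use_covariate:
        # The continuous predictor enters as an ordinary regression column.
        row.append(float(covariate))
    return row


levels = ["A", "B", "C"]

# One observation: treatment level "B" with covariate value 2.5.
anova_row = design_row("B", 2.5, levels, use_covariate=False)   # [1.0, 1.0, 0.0]
regression_row = design_row("B", 2.5, levels, use_factor=False)  # [1.0, 2.5]
ancova_row = design_row("B", 2.5, levels)                        # [1.0, 1.0, 0.0, 2.5]

print(anova_row, regression_row, ancova_row)
```

The ANCOVA row is just the ANOVA row with the regression column appended, which is why a single general linear model routine can fit all three.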
You might find it interesting that historically, when SAS first came out, it had PROC ANOVA and PROC REG (regression), and that was it. Then people asked, "What about the case when you have categorical factors and you want to do an ANOVA, but you also have a continuous variable that you can use as a covariate to account for extraneous variability in the response?" So SAS came out with PROC GLM, the general linear model. With PROC GLM you could take the continuous regression variable, pop it into the ANOVA model, and it runs. Or, conversely, if you are running a regression and you have a categorical predictor like gender, you could include it in the regression model and it runs. The general linear model handles both the continuous and the categorical variables in the same model. There is no PROC ANCOVA in SAS, but there is PROC MIXED. PROC GLM had problems when it came to random effects, and was effectively replaced by PROC MIXED. The same sort of process can be seen in Minitab, and it accounts for the multiple entries under Stat > ANOVA and Stat > Regression. In SAS PROC MIXED or in Minitab's General Linear Model, you have the capacity to include covariates and to work correctly with random effects. But enough about history; let's get to this lesson.
In the first of these two units we address the classic case of ANCOVA, where the ANOVA is potentially improved by adjusting for a linear covariate. In the second we deal with a little more complexity by considering functions of the covariate that are not linear. There we generalize the treatment of continuous predictors to include polynomials, with linear, quadratic, and cubic components that can interact with the categorical treatment levels.
We find ANCOVA interesting not only because it merges these two statistical concepts, but also because it can be a very powerful Aha! moment for students studying statistics.
Introduction to Analysis of Covariance (ANCOVA)
A ‘classic’ ANOVA tests for differences in mean response among the levels of a categorical factor (treatment). When the experimental units are heterogeneous, restrictions on the randomization (blocking) can sometimes improve the test for treatment effects. In some cases we don’t have the opportunity to construct blocks, but we can recognize and measure a continuous variable that contributes to the heterogeneity in the experimental units.
These sources of extraneous variability historically have been referred to as ‘nuisance’ or ‘concomitant’ variables. More recently, these variables are referred to as ‘covariates’.
When a continuous covariate is included in an ANOVA we have the analysis of covariance (ANCOVA). The continuous covariates enter the model as regression variables, and we have to be careful to go through several steps to employ the ANCOVA method.
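One common way to write the single-covariate ANCOVA model with a common slope is sketched here. The notation is an assumption of this sketch rather than something taken from the lesson: $Y_{ij}$ is the response of unit $j$ under treatment $i$, $\tau_i$ is the treatment effect, $X_{ij}$ is the covariate (centered at its overall mean $\bar{X}_{..}$), $\beta$ is the slope shared by all treatments, and $\epsilon_{ij}$ is the random error.

```latex
Y_{ij} = \mu + \tau_i + \beta\,\bigl(X_{ij} - \bar{X}_{..}\bigr) + \epsilon_{ij},
\qquad i = 1, \dots, a, \quad j = 1, \dots, n_i,
\qquad \epsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)
```

Setting $\beta = 0$ gives the one-way ANOVA model, and dropping the $\tau_i$ terms gives a simple linear regression, so both appear as special cases of the same model.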
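Including a covariate often makes the difference between concluding that there are, or are not, significant differences among treatment means. Variability in the response that an ANOVA would leave in the error term is instead attributed to the covariate, which can sharpen the test for treatment effects.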
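A small numerical sketch can show how much the conclusion may change. The data below are made up for illustration, and the function name `compare_anova_ancova` is hypothetical. The code assumes a one-way layout with a single covariate and a common slope across treatments, and it uses the textbook sums-of-squares formulas rather than any statistics package.

```python
def mean(values):
    return sum(values) / len(values)


def cross(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def compare_anova_ancova(groups):
    """groups maps a treatment label to a list of (covariate, response) pairs."""
    a = len(groups)
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    all_y = [y for pairs in groups.values() for _, y in pairs]
    n = len(all_y)

    # Within-treatment (error) sums of squares and cross-products.
    exx = exy = eyy = 0.0
    for pairs in groups.values():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        exx += cross(xs, xs)
        exy += cross(xs, ys)
        eyy += cross(ys, ys)

    # Total sums of squares and cross-products (treatments ignored).
    sxx = cross(all_x, all_x)
    sxy = cross(all_x, all_y)
    syy = cross(all_y, all_y)

    # ANOVA: treatment mean square over within-treatment mean square.
    f_anova = ((syy - eyy) / (a - 1)) / (eyy / (n - a))

    # ANCOVA: compare the covariate-only model with treatments + covariate.
    slope = exy / exx                    # pooled within-treatment slope
    sse_full = eyy - exy ** 2 / exx      # treatments + covariate
    sse_reduced = syy - sxy ** 2 / sxx   # covariate only
    f_ancova = ((sse_reduced - sse_full) / (a - 1)) / (sse_full / (n - a - 1))

    return {
        "f_anova": f_anova,
        "f_ancova": f_ancova,
        "slope": slope,
        "sse_anova": eyy,
        "sse_ancova": sse_full,
    }


# Made-up data: (covariate, response) pairs for three treatments.
data = {
    "A": [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8)],
    "B": [(1, 4.0), (2, 6.1), (3, 7.9), (4, 10.2)],
    "C": [(1, 5.2), (2, 6.8), (3, 9.1), (4, 11.0)],
}

result = compare_anova_ancova(data)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these made-up numbers the ANOVA F statistic is below 1, while the covariate-adjusted F statistic is above 100, because nearly all of the within-treatment variability in the response tracks the covariate. Real data will rarely be this clean, and the sketch does not check the common-slope assumption.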