Lesson 9: Influential Points
Overview of this Lesson
In this lesson, we learn how individual observations can influence a regression analysis in different ways. If an observation has a response value that is very different from the value predicted by the model, that observation is called an outlier. If, on the other hand, an observation has a particularly unusual combination of predictor values (e.g., one predictor takes a value far from its values for all the other observations), that observation is said to have high leverage. Outliers and high leverage observations are therefore distinct, and each can affect a regression analysis differently. An observation can also be both an outlier and a high leverage point. It is therefore important to know how to detect both kinds of point. Once we have identified any outliers and/or high leverage data points, we need to determine whether they actually have an undue influence on the model. This lesson addresses these issues using the following measures:
 leverages
 residuals
 standardized residuals
 deleted residuals (or PRESS prediction errors)
 studentized residuals
 difference in fits (DFFITS)
 Cook's distances
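As a rough preview of how the measures listed above relate to one another, here is a minimal NumPy sketch for an ordinary least squares fit with an intercept. The function name `influence_measures`, the dictionary keys, and the example data are hypothetical and not part of the lesson. The formulas are the common textbook definitions, which we assume match the ones developed later: "standardized" residuals use the full-data MSE, while "studentized" residuals use the MSE computed with the observation in question deleted.

```python
import numpy as np


def influence_measures(X, y):
    """Compute common influence diagnostics for an OLS fit with intercept.

    X: (n,) or (n, k) array of predictor values, without an intercept column.
    y: (n,) array of responses.
    Assumes n > p + 1, where p = k + 1 is the number of coefficients.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    p = Xd.shape[1]

    # Hat matrix H = Xd (Xd'Xd)^{-1} Xd'; its diagonal holds the leverages.
    H = Xd @ np.linalg.pinv(Xd)
    h = np.diag(H)

    resid = y - H @ y                      # ordinary residuals e_i
    sse = resid @ resid
    mse = sse / (n - p)

    # Standardized residuals: scaled by the full-data MSE.
    standardized = resid / np.sqrt(mse * (1.0 - h))

    # Deleted residuals (PRESS prediction errors): y_i minus the prediction
    # from the model fitted without observation i, via the shortcut e_i/(1-h_ii).
    deleted = resid / (1.0 - h)

    # Studentized residuals: scaled by the MSE with observation i removed.
    mse_deleted = (sse - resid**2 / (1.0 - h)) / (n - p - 1)
    studentized = resid / np.sqrt(mse_deleted * (1.0 - h))

    # DFFITS and Cook's distance combine residual size with leverage.
    dffits = studentized * np.sqrt(h / (1.0 - h))
    cooks = standardized**2 * h / (p * (1.0 - h))

    return {
        "leverages": h,
        "residuals": resid,
        "standardized_residuals": standardized,
        "deleted_residuals": deleted,
        "studentized_residuals": studentized,
        "dffits": dffits,
        "cooks_distances": cooks,
    }


# Made-up data: ten points near a line, plus one point whose x value is
# unusual (high leverage) and whose y value is far from the trend (outlier).
x_demo = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25], dtype=float)
y_demo = 2.0 + 3.0 * x_demo
y_demo[-1] = 20.0
demo = influence_measures(x_demo, y_demo)
```

In this made-up data set, the last observation is the one we would expect to stand out on leverage, DFFITS, and Cook's distance, since it is unusual in both its predictor value and its response. Where it is available, a statistics package's built-in diagnostics are likely a better choice than hand-rolled formulas like these.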
Key Learning Goals for this Lesson: 
