Confidence Intervals for Regression Parameters
Before we can derive confidence intervals for α and β, we first need to derive the probability distributions of a, b and \(\hat{\sigma}^2\). In doing so, let's adopt the more traditional estimator notation, which is also the one our textbook follows, of putting a hat on Greek letters. That is, here we'll use:
\(a=\hat{\alpha}\) and \(b=\hat{\beta}\)
Theorem. Under the assumptions of the simple linear regression model: \[\hat{\alpha}\sim N\left(\alpha,\dfrac{\sigma^2}{n}\right)\]
Proof. Recall that the ML (and least squares!) estimator of α is:
\(a=\hat{\alpha}=\bar{Y}\)
where the responses \(Y_i\) are independent and normally distributed. More specifically:
\[Y_i \sim N(\alpha+\beta(x_i-\bar{x}),\sigma^2)\]
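The expected value of \(\hat{\alpha}\) is α, as shown here:
\[E(\hat{\alpha})=E(\bar{Y})=\dfrac{1}{n}\sum_{i=1}^n E(Y_i)=\dfrac{1}{n}\sum_{i=1}^n \left(\alpha+\beta(x_i-\bar{x})\right)=\alpha+\dfrac{\beta}{n}\sum_{i=1}^n (x_i-\bar{x})=\alpha\]
The last equality holds because \(\sum_{i=1}^n (x_i-\bar{x})=0\).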
The variance of \(\hat{\alpha}\) follows directly from what we know about the variance of a sample mean of independent random variables with common variance σ², namely:
\(Var(\hat{\alpha})=Var(\bar{Y})=\dfrac{\sigma^2}{n}\)
Therefore, since a linear combination of normal random variables is also normally distributed, we have:
\[\hat{\alpha} \sim N\left(\alpha,\dfrac{\sigma^2}{n}\right)\]
as was to be proved!
Theorem. Under the assumptions of the simple linear regression model: \[\hat{\beta}\sim N\left(\beta,\dfrac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2}\right)\]
Proof. Recalling one of the shortcut formulas for the ML (and least squares!) estimator of β:
\[b=\hat{\beta}=\dfrac{\sum_{i=1}^n (x_i-\bar{x})Y_i}{\sum_{i=1}^n (x_i-\bar{x})^2}\]
we see that the ML estimator is a linear combination of independent normal random variables \(Y_i\) with:
\[Y_i \sim N(\alpha+\beta(x_i-\bar{x}),\sigma^2)\]
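The expected value of \(\hat{\beta}\) is β, as shown here:
\[E(\hat{\beta})=\dfrac{\sum_{i=1}^n (x_i-\bar{x})E(Y_i)}{\sum_{i=1}^n (x_i-\bar{x})^2}=\dfrac{\alpha\sum_{i=1}^n (x_i-\bar{x})+\beta\sum_{i=1}^n (x_i-\bar{x})^2}{\sum_{i=1}^n (x_i-\bar{x})^2}=\beta\]
where the last equality holds because \(\sum_{i=1}^n (x_i-\bar{x})=0\). And, because the \(Y_i\) are independent, the variance of \(\hat{\beta}\) is:
\[Var(\hat{\beta})=\dfrac{\sum_{i=1}^n (x_i-\bar{x})^2 Var(Y_i)}{\left[\sum_{i=1}^n (x_i-\bar{x})^2\right]^2}=\dfrac{\sigma^2\sum_{i=1}^n (x_i-\bar{x})^2}{\left[\sum_{i=1}^n (x_i-\bar{x})^2\right]^2}=\dfrac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2}\]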
Therefore, since a linear combination of normal random variables is also normally distributed, we have:
\[\hat{\beta}\sim N\left(\beta,\dfrac{\sigma^2}{\sum_{i=1}^n (x_i-\bar{x})^2}\right)\]
as was to be proved!
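The two sampling distributions just derived can be checked empirically. The following is only a minimal simulation sketch, not part of the proof: the design points, the parameter values, the seed, and the function name `simulate_estimators` are all made up for illustration.

```python
import random
import statistics

def simulate_estimators(x, alpha, beta, sigma, reps=20000, seed=1):
    """Draw repeated samples from Y_i ~ N(alpha + beta*(x_i - x_bar), sigma^2)
    and return the simulated values of alpha-hat and beta-hat."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    dev = [xi - x_bar for xi in x]          # x_i - x_bar
    sxx = sum(d * d for d in dev)           # sum of (x_i - x_bar)^2
    a_hats, b_hats = [], []
    for _ in range(reps):
        y = [rng.gauss(alpha + beta * d, sigma) for d in dev]
        a_hats.append(sum(y) / n)                                  # alpha-hat = Y-bar
        b_hats.append(sum(d * yi for d, yi in zip(dev, y)) / sxx)  # shortcut formula
    return a_hats, b_hats, sxx

# Hypothetical design points and parameter values (assumptions, not from the text).
x = [1.0, 2.0, 4.0, 5.0, 7.0, 8.0, 10.0, 11.0]
alpha, beta, sigma = 3.0, -2.0, 1.5

a_hats, b_hats, sxx = simulate_estimators(x, alpha, beta, sigma)

print("alpha-hat: mean", statistics.fmean(a_hats), "vs", alpha)
print("alpha-hat: var ", statistics.pvariance(a_hats), "vs", sigma**2 / len(x))
print("beta-hat:  mean", statistics.fmean(b_hats), "vs", beta)
print("beta-hat:  var ", statistics.pvariance(b_hats), "vs", sigma**2 / sxx)
```

With enough replications, the simulated means and variances should land close to the theoretical values α, σ²/n, β and σ²/Σ(x_i − x̄)².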
Theorem. Under the assumptions of the simple linear regression model: \[\dfrac{n\hat{\sigma}^2}{\sigma^2}\sim \chi^2_{(n-2)}\] and \(a=\hat{\alpha}\), \(b=\hat{\beta}\), and \(\hat{\sigma}^2\) are mutually independent.
Argument. First, note that the heading here says Argument, not Proof. That's because we are going to be doing some hand-waving and pointing to another reference, as the proof is beyond the scope of this course. That said, let's start our hand-waving. For homework, you are asked to show that:
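\[\sum\limits_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2=n(\hat{\alpha}-\alpha)^2+(\hat{\beta}-\beta)^2\sum\limits_{i=1}^n (x_i-\bar{x})^2+\sum\limits_{i=1}^n (Y_i-\hat{Y}_i)^2\]
where \(\hat{Y}_i=\hat{\alpha}+\hat{\beta}(x_i-\bar{x})\) is the fitted value for the ith observation.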
Now, if we divide through both sides of the equation by the population variance \(\sigma^2\), we get:
\[\dfrac{\sum_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2 }{\sigma^2}=\dfrac{n(\hat{\alpha}-\alpha)^2}{\sigma^2}+\dfrac{(\hat{\beta}-\beta)^2\sum\limits_{i=1}^n (x_i-\bar{x})^2}{\sigma^2}+\dfrac{\sum\limits_{i=1}^n (Y_i-\hat{Y}_i)^2}{\sigma^2}\]
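Rewriting a few of those terms just a bit, and recalling that the ML estimator of the variance satisfies \(n\hat{\sigma}^2=\sum_{i=1}^n (Y_i-\hat{Y}_i)^2\), we get: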
\[\dfrac{\sum_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2 }{\sigma^2}=\dfrac{(\hat{\alpha}-\alpha)^2}{\sigma^2/n}+\dfrac{(\hat{\beta}-\beta)^2}{\sigma^2/\sum\limits_{i=1}^n (x_i-\bar{x})^2}+\dfrac{n\hat{\sigma}^2}{\sigma^2}\]
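Now, the terms are written so that we should be able to readily identify the distributions of each of the terms. The left side is a sum of n squared independent standard normal random variables, and, by the two theorems above, each of the first two terms on the right is the square of a single standard normal random variable. The distributions are:
\[\dfrac{\sum_{i=1}^n (Y_i-\alpha-\beta(x_i-\bar{x}))^2 }{\sigma^2}\sim \chi^2_{(n)}, \qquad \dfrac{(\hat{\alpha}-\alpha)^2}{\sigma^2/n}\sim \chi^2_{(1)}, \qquad \dfrac{(\hat{\beta}-\beta)^2}{\sigma^2/\sum\limits_{i=1}^n (x_i-\bar{x})^2}\sim \chi^2_{(1)}\]
Now, since the degrees of freedom ought to balance as n = 1 + 1 + (n−2), it might seem reasonable that the last term is a chi-square random variable with n−2 degrees of freedom. That is ... hand-waving! ... indeed the case. That is: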
\[\dfrac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(n-2)}\]
and furthermore (more hand-waving!), \(a=\hat{\alpha}\), \(b=\hat{\beta}\), and \(\hat{\sigma}^2\) are mutually independent. (For a proof, you can refer to any number of mathematical statistics textbooks, but for a proof presented by one of the authors of our textbook, see Hogg, McKean, and Craig, Introduction to Mathematical Statistics, 6th ed.)
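Although the full proof is outside our scope, both the sum-of-squares decomposition and the claimed chi-square mean of n−2 can be checked numerically. The sketch below is only a sanity check under made-up design points and parameter values; the helper name `decomposition` is hypothetical, and agreement in a simulation is of course not a proof.

```python
import random

def decomposition(x, y, alpha, beta):
    """Return the left side of the identity and the three terms on the right."""
    n = len(x)
    x_bar = sum(x) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    a_hat = sum(y) / n
    b_hat = sum(d * yi for d, yi in zip(dev, y)) / sxx
    lhs = sum((yi - alpha - beta * d) ** 2 for d, yi in zip(dev, y))
    term_a = n * (a_hat - alpha) ** 2
    term_b = (b_hat - beta) ** 2 * sxx
    # Residual sum of squares, which equals n * sigma-hat^2.
    sse = sum((yi - a_hat - b_hat * d) ** 2 for d, yi in zip(dev, y))
    return lhs, term_a, term_b, sse

# Hypothetical design points and parameter values (assumptions, not from the text).
x = [1.0, 2.0, 4.0, 5.0, 7.0, 8.0, 10.0, 11.0]
alpha, beta, sigma = 3.0, -2.0, 1.5

rng = random.Random(7)
n = len(x)
x_bar = sum(x) / n
reps = 20000
max_gap = 0.0      # largest violation of the identity seen across samples
ratios = []        # simulated values of n * sigma-hat^2 / sigma^2

for _ in range(reps):
    y = [rng.gauss(alpha + beta * (xi - x_bar), sigma) for xi in x]
    lhs, term_a, term_b, sse = decomposition(x, y, alpha, beta)
    max_gap = max(max_gap, abs(lhs - (term_a + term_b + sse)))
    ratios.append(sse / sigma**2)

mean_ratio = sum(ratios) / reps
print("largest gap in the identity:", max_gap)
print("mean of n*sigma-hat^2/sigma^2:", mean_ratio, "vs n - 2 =", n - 2)
```

The identity should hold up to floating-point rounding in every sample, and the average of the simulated ratios should sit near n−2, the mean of a chi-square random variable with n−2 degrees of freedom.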
With the distributional results behind us, we can now derive (1−α)100% confidence intervals for α and β!
Theorem. Under the assumptions of the simple linear regression model, a (1−α)100% confidence interval for the slope parameter β is: \[b \pm t_{\alpha/2,n-2}\times \left(\dfrac{\sqrt{n}\hat{\sigma}}{\sqrt{n-2} \sqrt{\sum (x_i-\bar{x})^2}}\right)\] or equivalently: \[\hat{\beta} \pm t_{\alpha/2,n-2}\times \sqrt{\dfrac{MSE}{\sum (x_i-\bar{x})^2}}\]
Proof. Recall the definition of a T random variable. That is, recall that if (1) Z is a standard normal (N(0,1)) random variable, (2) U is a chi-square random variable with r degrees of freedom, and (3) Z and U are independent, then:
\[T=\dfrac{Z}{\sqrt{U/r}}\]
follows a T distribution with r degrees of freedom. Now, our work above tells us that:
\[\dfrac{\hat{\beta}-\beta}{\sigma/\sqrt{\sum (x_i-\bar{x})^2}} \sim N(0,1) \] and \[\dfrac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(n-2)}\] are independent
Therefore, we have that:
\[T=\dfrac{\dfrac{\hat{\beta}-\beta}{\sigma/\sqrt{\sum (x_i-\bar{x})^2}}}{\sqrt{\dfrac{n\hat{\sigma}^2}{\sigma^2}/(n-2)}}=\dfrac{\hat{\beta}-\beta}{\sqrt{\dfrac{n\hat{\sigma}^2}{n-2}/\sum (x_i-\bar{x})^2}}=\dfrac{\hat{\beta}-\beta}{\sqrt{MSE/\sum (x_i-\bar{x})^2}} \sim t_{n-2}\]
That is, T follows a T distribution with n−2 degrees of freedom, where \(MSE=n\hat{\sigma}^2/(n-2)\). Now, deriving a confidence interval for β reduces to the usual manipulation of the inside of a probability statement:
\[P\left(-t_{\alpha/2,n-2} \leq \dfrac{\hat{\beta}-\beta}{\sqrt{MSE/\sum (x_i-\bar{x})^2}} \leq t_{\alpha/2,n-2}\right)=1-\alpha\]
leaving us with:
\[\hat{\beta} \pm t_{\alpha/2,n-2}\times \sqrt{\dfrac{MSE}{\sum (x_i-\bar{x})^2}}\]
as was to be proved!
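For readers who want the "usual manipulation" spelled out, here is one way to write it. The symbol SE is shorthand introduced only for this sketch (it is not notation used elsewhere in these notes), standing for \(\sqrt{MSE/\sum (x_i-\bar{x})^2}\), and t stands for \(t_{\alpha/2,n-2}\):

\[-t \leq \dfrac{\hat{\beta}-\beta}{SE} \leq t \;\Longleftrightarrow\; -t\cdot SE \leq \hat{\beta}-\beta \leq t\cdot SE \;\Longleftrightarrow\; -\hat{\beta}-t\cdot SE \leq -\beta \leq -\hat{\beta}+t\cdot SE \;\Longleftrightarrow\; \hat{\beta}-t\cdot SE \leq \beta \leq \hat{\beta}+t\cdot SE\]

The last step multiplies through by −1, which reverses both inequalities. Since each step leaves the event unchanged, its probability stays at 1−α.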
Now, for the confidence interval for the intercept parameter α.
Theorem. Under the assumptions of the simple linear regression model, a (1−α)100% confidence interval for the intercept parameter α is: \[a \pm t_{\alpha/2,n-2}\times \left(\sqrt{\dfrac{\hat{\sigma}^2}{n-2}}\right)\] or equivalently: \[a \pm t_{\alpha/2,n-2}\times \left(\sqrt{\dfrac{MSE}{n}}\right)\]
Proof. The proof, which again may or may not appear on a future assessment, is left for you for homework.
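The two interval formulas above can be turned into a short calculation. What follows is a minimal sketch in plain Python, not the method used by any particular statistics package: the function name `regression_cis` and the small data set are made up for illustration, and the critical value \(t_{\alpha/2,n-2}\) is passed in by hand (from a t-table) because the Python standard library has no t quantile function.

```python
import math

def regression_cis(x, y, t_crit):
    """Confidence intervals for alpha and beta in the centered model
    Y_i = alpha + beta*(x_i - x_bar) + error, given the t critical value."""
    n = len(x)
    x_bar = sum(x) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)                  # sum of (x_i - x_bar)^2
    a_hat = sum(y) / n                             # alpha-hat = Y-bar
    b_hat = sum(d * yi for d, yi in zip(dev, y)) / sxx
    sse = sum((yi - a_hat - b_hat * d) ** 2 for d, yi in zip(dev, y))
    mse = sse / (n - 2)                            # MSE = n*sigma-hat^2/(n-2)
    half_a = t_crit * math.sqrt(mse / n)           # half-width for alpha
    half_b = t_crit * math.sqrt(mse / sxx)         # half-width for beta
    return {
        "alpha_hat": a_hat,
        "beta_hat": b_hat,
        "mse": mse,
        "alpha_ci": (a_hat - half_a, a_hat + half_a),
        "beta_ci": (b_hat - half_b, b_hat + half_b),
    }

# Hypothetical data with n = 5, so the t-table value is t_{0.025,3} = 3.182.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = regression_cis(x, y, t_crit=3.182)
print("95% CI for alpha:", result["alpha_ci"])
print("95% CI for beta: ", result["beta_ci"])
```

Note that both intervals share the same MSE and the same critical value; they differ only in the divisor under the square root, n for the intercept and Σ(x_i − x̄)² for the slope.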
Example
The following table shows x, the catches of Peruvian anchovies (in millions of metric tons) and y, the prices of fish meal (in current dollars per ton) for 14 consecutive years. (Data from Bardach, JE and Santerre, RM, Climate and the Fish in the Sea, Bioscience 31(3), 1981).
Find a 95% confidence interval for the slope parameter β.
Solution. The following portion of output was obtained using Minitab's regression analysis package, with the parts useful to us here circled:
Minitab's basic descriptive analysis can also calculate the standard deviation of the x-values, 3.91, for us. Therefore, the formula for the sample variance tells us that:
\[\sum\limits_{i=1}^n (x_i-\bar{x})^2=(n-1)s^2=(13)(3.91)^2=198.7453\]
Putting the parts together, along with the fact that \(t_{0.025,12}=2.179\), we get:
\[-29.402 \pm 2.179 \sqrt{\dfrac{5139}{198.7453}}\]
which simplifies to:
\[-29.402 \pm 11.08\]
That is, we can be 95% confident that the slope parameter falls between −40.482 and −18.322. In other words, we can be 95% confident that the average price of fish meal decreases by between 18.322 and 40.482 dollars per ton for every one-unit (one million metric tons) increase in the Peruvian anchovy catch.
Find a 95% confidence interval for the intercept parameter α.
Solution. We can use Minitab (or our calculator) to determine that the mean of the 14 responses is:
\[\dfrac{190+160+\cdots +410}{14}=270.5\]
Using that, as well as the MSE = 5139 obtained from the output above, along with the fact that \(t_{0.025,12}=2.179\), we get:
\[270.5 \pm 2.179 \sqrt{\dfrac{5139}{14}}\]
which simplifies to:
\[270.5 \pm 41.75\]
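That is, we can be 95% confident that the intercept parameter falls between 228.75 and 312.25 dollars per ton. Because the model is written in terms of \(x_i-\bar{x}\), the intercept α here is the mean price of fish meal when the anchovy catch equals its average value \(\bar{x}\).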
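The arithmetic in both parts of this example can be reproduced from the summary values quoted above (n = 14, b = −29.402, the mean response 270.5, MSE = 5139, the standard deviation 3.91 of the x-values, and \(t_{0.025,12}=2.179\)). The variable names below are chosen only for this sketch.

```python
import math

# Summary values quoted in the example above.
n = 14
t_crit = 2.179      # t_{0.025,12}
mse = 5139.0
b_hat = -29.402     # estimated slope
a_hat = 270.5       # mean of the 14 responses
s_x = 3.91          # sample standard deviation of the x-values

sxx = (n - 1) * s_x**2                     # sum of (x_i - x_bar)^2
half_b = t_crit * math.sqrt(mse / sxx)     # half-width of the slope interval
half_a = t_crit * math.sqrt(mse / n)       # half-width of the intercept interval

print("Sxx:", round(sxx, 4))
print("slope interval:    ", round(b_hat - half_b, 3), "to", round(b_hat + half_b, 3))
print("intercept interval:", round(a_hat - half_a, 2), "to", round(a_hat + half_a, 2))
```

Rounded to two decimal places, the two half-widths agree with the 11.08 and 41.75 used in the solutions.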