Hey, we've checked off the estimation of a number of population parameters already. Let's check off a few more! In this lesson, we'll derive (1−α)100% confidence intervals for:
(1) a single population variance: \(\sigma^2\)
(2) the ratio of two population variances: \(\dfrac{\sigma^2_X}{\sigma^2_Y}\) or \(\dfrac{\sigma^2_Y}{\sigma^2_X}\)
Along the way, we'll take a side path to explore the characteristics of the probability distribution known as the F-distribution.
Let's start right out by stating the confidence interval for one population variance.
Theorem. If \(X_1, X_2, \ldots, X_n\) are normally distributed and \(a=\chi^2_{1-\alpha/2,n-1}\) and \(b=\chi^2_{\alpha/2,n-1}\), then a (1−α)100% confidence interval for the population variance \(\sigma^2\) is: \(\left(\dfrac{(n-1)s^2}{b} \leq \sigma^2 \leq \dfrac{(n-1)s^2}{a}\right)\) And a (1−α)100% confidence interval for the population standard deviation σ is: \(\left(\dfrac{\sqrt{(n-1)}}{\sqrt{b}}s \leq \sigma \leq \dfrac{\sqrt{(n-1)}}{\sqrt{a}}s\right)\)
Proof. We learned previously that if \(X_1, X_2, \ldots, X_n\) are normally distributed with mean μ and population variance \(\sigma^2\), then:
\(\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}\)
Then, with \(a=\chi^2_{1-\alpha/2,n-1}\) and \(b=\chi^2_{\alpha/2,n-1}\) each cutting off an area of α/2, in the lower and upper tail respectively, of the chi-square distribution with n−1 degrees of freedom, we can write the following probability statement:
\(P\left[a\leq \dfrac{(n-1)S^2}{\sigma^2} \leq b\right]=1-\alpha\)
Now, as always, it's just a matter of manipulating the quantity in the brackets. That is:
\(a\leq \dfrac{(n-1)S^2}{\sigma^2} \leq b\)
Taking the reciprocal of all three terms (all of which are positive), and thereby changing the direction of the inequalities, we get:
\(\dfrac{1}{a}\geq \dfrac{\sigma^2}{(n-1)S^2} \geq \dfrac{1}{b}\)
Now, multiplying through by \((n-1)S^2\), and rearranging the direction of the inequalities, we get the confidence interval for \(\sigma^2\):
\(\dfrac{(n-1)S^2}{b} \leq \sigma^2 \leq \dfrac{(n-1)S^2}{a}\)
as was to be proved. And, taking the square root, we get the confidence interval for σ:
\(\dfrac{\sqrt{(n-1)S^2}}{\sqrt{b}} \leq \sigma \leq \dfrac{\sqrt{(n-1)S^2}}{\sqrt{a}}\)
as was to be proved.
A large candy manufacturer produces, packages and sells packs of candy targeted to weigh 52 grams. A quality control manager working for the company was concerned that the variation in the actual weights of the targeted 52-gram packs was larger than acceptable. That is, he was concerned that some packs weighed significantly less than 52 grams and some weighed significantly more than 52 grams. In an attempt to estimate σ, the standard deviation of the weights of all of the 52-gram packs the manufacturer makes, he took a random sample of n = 10 packs off of the factory line. The random sample yielded a sample variance of 4.2 grams squared. Use the random sample to derive a 95% confidence interval for σ.
Solution. First, we need to determine the two chi-square values with (n−1) = 9 degrees of freedom. Using the table in the back of the textbook, we see that they are:
\(a=\chi^2_{1-\alpha/2,n-1}=\chi^2_{0.975,9}=2.7\) and \(b=\chi^2_{\alpha/2,n-1}=\chi^2_{0.025,9}=19.02\)
Now, it's just a matter of substituting what we know into the formula for the confidence interval for the population variance. Doing so, we get:
\(\left(\dfrac{9(4.2)}{19.02} \leq \sigma^2 \leq \dfrac{9(4.2)}{2.7}\right)\)
Simplifying, we get:
\((1.99\leq \sigma^2 \leq 14.0)\)
We can be 95% confident that the variance of the weights of all of the packs of candy coming off of the factory line is between 1.99 and 14.0 grams squared. Taking the square root of the confidence limits, we get the 95% confidence interval for the population standard deviation σ:
\((1.41\leq \sigma \leq 3.74)\)
That is, we can be 95% confident that the standard deviation of the weights of all of the packs of candy coming off of the factory line is between 1.41 and 3.74 grams.
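If you'd like to double-check the arithmetic, here is a minimal Python sketch of the calculation. The function name `variance_ci` and its arguments are made up for illustration, and the two chi-square values are simply typed in from the table rather than computed:

```python
import math

def variance_ci(n, sample_variance, chi2_lower, chi2_upper):
    """Return ((var_lo, var_hi), (sd_lo, sd_hi)) for a normal sample.

    chi2_lower is a = chi-square_{1-alpha/2, n-1} (the smaller table value);
    chi2_upper is b = chi-square_{alpha/2, n-1} (the larger table value).
    """
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    scaled = (n - 1) * sample_variance
    var_lo = scaled / chi2_upper  # dividing by the larger value gives the lower limit
    var_hi = scaled / chi2_lower  # dividing by the smaller value gives the upper limit
    return (var_lo, var_hi), (math.sqrt(var_lo), math.sqrt(var_hi))

# Candy example: n = 10, s^2 = 4.2, table values 2.70 and 19.02 (9 df).
var_interval, sd_interval = variance_ci(10, 4.2, 2.70, 19.02)
print("variance: (%.2f, %.2f)" % var_interval)
print("std dev:  (%.2f, %.2f)" % sd_interval)
```

Rounded to two decimal places, the printed limits should agree with the intervals found by hand above.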
To obtain the confidence interval for one variance using Minitab:
(1) Under the Stat menu, select Basic Statistics, and then select 1 Variance...
(2) In the pop-up window that appears, in the box labeled Data, select Sample variance. Then, fill in the boxes labeled Sample size and Sample variance.
(3) Click on the button labeled Options... In the pop-up window that appears, specify the confidence level and "not equal" for the alternative.
Then, click on OK to return to the main pop-up window.
(4) Then, upon clicking OK on the main pop-up window, the output should appear in the Session window.
As we'll soon see, the confidence interval for the ratio of two variances requires the use of the probability distribution known as the F-distribution. So, let's spend a few minutes learning the definition and characteristics of the F-distribution.
Definition. If U and V are independent chi-square random variables with \(r_1\) and \(r_2\) degrees of freedom, respectively, then: \(F=\dfrac{U/r_1}{V/r_2}\) follows an F-distribution with \(r_1\) numerator degrees of freedom and \(r_2\) denominator degrees of freedom. We write \(F \sim F(r_1, r_2)\).
(1) F-distributions are generally skewed. The shape of an F-distribution depends on the values of \(r_1\) and \(r_2\), the numerator and denominator degrees of freedom, respectively.
(2) The probability density function of an F random variable with \(r_1\) numerator degrees of freedom and \(r_2\) denominator degrees of freedom is:
\(f(w)=\dfrac{(r_1/r_2)^{r_1/2}\Gamma[(r_1+r_2)/2]w^{(r_1/2)-1}}{\Gamma[r_1/2]\Gamma[r_2/2][1+(r_1w/r_2)]^{(r_1+r_2)/2}}\)
over the support w ≥ 0.
(3) The definition of an F random variable:
\(F=\dfrac{U/r_1}{V/r_2}\)
implies that if the distribution of W is \(F(r_1, r_2)\), then the distribution of 1/W is \(F(r_2, r_1)\).
One of the primary ways that we will need to interact with an F-distribution is by needing to know either (1) an F-value, or (2) the probabilities associated with an F random variable, in order to complete a statistical analysis. We could go ahead and try to work with the above probability density function to find the necessary values, but I think you'll agree before long that we should just turn to an F-table, and let it do the dirty work for us. For that reason, we'll now explore how to use a typical F-table to look up F-values and/or F-probabilities. Let's start with two definitions.
Definition. Let α be some probability between 0 and 1 (most often, a small probability less than 0.10). The upper \(100\alpha^{\text{th}}\) percentile of an F-distribution with \(r_1\) and \(r_2\) degrees of freedom is the value \(F_\alpha(r_1,r_2)\) such that the area under the curve and to the right of \(F_\alpha(r_1,r_2)\) is α.

The above definition is used in Table VII, the F-distribution table in the back of your textbook. While the next definition is not used directly in Table VII, you'll still find it necessary when looking for F-values (or F-probabilities) in the left tail of an F-distribution.
Definition. Let α be some probability between 0 and 1 (most often, a small probability less than 0.10). The \(100\alpha^{\text{th}}\) percentile of an F-distribution with \(r_1\) and \(r_2\) degrees of freedom is the value \(F_{1-\alpha}(r_1,r_2)\) such that the area under the curve and to the right of \(F_{1-\alpha}(r_1,r_2)\) is 1−α (and hence the area to the left is α).
With the two definitions behind us, let's now take a look at the F-table in the back of your textbook.
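In summary, here are the steps you should take in using the F-table to find an F-value:
(1) Find the column that corresponds to the numerator degrees of freedom, \(r_1\).
(2) Find the group of rows that corresponds to the denominator degrees of freedom, \(r_2\).
(3) Within that group, find the row headed by the upper-tail probability α of interest.
(4) Read the F-value where the \(r_1\) column and the α row intersect.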
Now, at least theoretically, you could also use the F-table to find the probability associated with a particular F-value. But, as you can see, the table is pretty (very!) limited in that direction. For example, if you have an F random variable with 6 numerator degrees of freedom and 2 denominator degrees of freedom, you could only find the probabilities associated with the F-values of 19.33, 39.33, and 99.33.
What would you do if you wanted to find the probability that an F random variable with 6 numerator degrees of freedom and 2 denominator degrees of freedom was less than 6.2, say? Well, the answer is, of course... statistical software, such as SAS or Minitab! For what we'll be doing, the F-table will (mostly) serve our purpose. When it doesn't, we'll use Minitab. At any rate, let's get a bit more practice now using the F-table.
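If you have neither package handy, one rough alternative is to integrate the probability density function given earlier numerically. The sketch below is only an illustration, not how SAS or Minitab do it: the helper names `f_pdf` and `f_cdf` are made up, Simpson's rule is just one convenient choice of integration method, and the code assumes at least 2 numerator degrees of freedom so that the density is finite at 0.

```python
import math

def f_pdf(w, r1, r2):
    """Density of an F(r1, r2) random variable at w (assumes r1 >= 2)."""
    if w < 0:
        return 0.0
    # Work on the log scale for the gamma-function constant to avoid overflow.
    log_const = (
        (r1 / 2) * math.log(r1 / r2)
        + math.lgamma((r1 + r2) / 2)
        - math.lgamma(r1 / 2)
        - math.lgamma(r2 / 2)
    )
    return math.exp(log_const) * w ** (r1 / 2 - 1) / (1 + r1 * w / r2) ** ((r1 + r2) / 2)

def f_cdf(x, r1, r2, steps=4000):
    """Approximate P(F <= x) with composite Simpson's rule on [0, x]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = x / steps
    total = f_pdf(0.0, r1, r2) + f_pdf(x, r1, r2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, r1, r2)
    return total * h / 3

# Probability that an F random variable with 6 and 2 df is less than 6.2.
p = f_cdf(6.2, 6, 2)
print(round(p, 4))
```

For 6 and 2 degrees of freedom the probability comes out to roughly 0.85. As a sanity check, `f_cdf(19.33, 6, 2)` should land very close to 0.95, matching the table value quoted above.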
Let X be an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom. What is the upper fifth percentile?
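Solution. The upper fifth percentile is the F-value x such that the probability to the right of x is 0.05, and therefore the probability to the left of x is 0.95. To find x using the F-table, we find the column for \(r_1 = 4\) numerator degrees of freedom, the group of rows for \(r_2 = 5\) denominator degrees of freedom, and, within that group, the row for α = 0.05.
Now, all we need to do is read the F-value where the \(r_1 = 4\) column and the identified α = 0.05 row intersect. What do you get?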
The table tells us that the upper fifth percentile of an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom is 5.19.
Let X be an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom. What is the first percentile?
Solution. The first percentile is the F-value x such that the probability to the left of x is 0.01 (and hence the probability to the right of x is 0.99). Since such an F-value isn't directly readable from the F-table, we need to do a little finagling to find x using the F-table. That is, we need to recognize that the F-value we are looking for, namely \(F_{0.99}(4,5)\), is related to \(F_{0.01}(5,4)\), a value we can read off of the table, by way of this relationship:
\(F_{0.99}(4,5)=\dfrac{1}{F_{0.01}(5,4)}\)
That said, to find x using the F-table, we find the column for \(r_1 = 5\) numerator degrees of freedom, the group of rows for \(r_2 = 4\) denominator degrees of freedom, and, within that group, the row for α = 0.01.
Now, all we need to do is read the F-value where the \(r_1 = 5\) column and the identified α = 0.01 row intersect, and take the reciprocal. What do you get?
The table, along with a minor calculation, tells us that the first percentile of an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom is 1/15.52 = 0.064.
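The reciprocal relationship used in this solution follows from the fact, noted earlier, that 1/W is \(F(r_2, r_1)\) whenever W is \(F(r_1, r_2)\). Here is a short sketch of the argument, written for a general α, \(r_1\) and \(r_2\). By the definition of \(F_{1-\alpha}(r_1,r_2)\), and because W is positive:
\(1-\alpha = P\left[W \geq F_{1-\alpha}(r_1,r_2)\right] = P\left[\dfrac{1}{W} \leq \dfrac{1}{F_{1-\alpha}(r_1,r_2)}\right]\)
so that:
\(P\left[\dfrac{1}{W} \geq \dfrac{1}{F_{1-\alpha}(r_1,r_2)}\right] = \alpha\)
Because 1/W is \(F(r_2, r_1)\), the value with area α to its right is, by definition, \(F_{\alpha}(r_2,r_1)\). Therefore:
\(F_{1-\alpha}(r_1,r_2)=\dfrac{1}{F_{\alpha}(r_2,r_1)}\)
Setting α = 0.01, \(r_1 = 4\) and \(r_2 = 5\) gives the relationship used above.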
What is the probability that an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom is greater than 7.39?
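Solution. There I go... just a minute ago, I said that the F-table isn't very helpful in finding probabilities, then I turn around and ask you to use the table to find a probability! Doing it at least once helps us make sure that we fully understand the table. In this case, we are going to need to read the table "backwards." To find the probability, we find the column for \(r_1 = 4\) numerator degrees of freedom and the group of rows for \(r_2 = 5\) denominator degrees of freedom, locate the F-value 7.39 within that group, and read off the probability α that heads its row.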
What do you get?
The table tells us that the probability that an F random variable with 4 numerator degrees of freedom and 5 denominator degrees of freedom is greater than 7.39 is 0.025.
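One way to convince yourself of a table lookup like this is to simulate F random variables straight from the definition, as a ratio of two independent chi-square random variables each divided by its degrees of freedom. The sketch below is only a sanity check under that definition. The function names, the number of draws and the seed are arbitrary choices, and a chi-square random variable with r degrees of freedom is generated as a gamma random variable with shape r/2 and scale 2.

```python
import random

def simulate_f(r1, r2, rng):
    """Draw one F(r1, r2) value as (U/r1)/(V/r2) with independent chi-squares."""
    u = rng.gammavariate(r1 / 2, 2)  # chi-square with r1 df
    v = rng.gammavariate(r2 / 2, 2)  # chi-square with r2 df
    return (u / r1) / (v / r2)

def tail_probability(x, r1, r2, draws=200_000, seed=415):
    """Monte Carlo estimate of P(F > x) for an F(r1, r2) random variable."""
    rng = random.Random(seed)
    hits = sum(simulate_f(r1, r2, rng) > x for _ in range(draws))
    return hits / draws

estimate = tail_probability(7.39, 4, 5)
print(round(estimate, 3))
```

Being a simulation, the estimate wobbles a little from seed to seed, but it should sit close to the 0.025 read from the table.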
Now that we have the characteristics of the F-distribution behind us, let's again jump right in by stating the confidence interval for the ratio of two population variances.
Theorem. If \(X_1,X_2,\ldots,X_n \sim N(\mu_X,\sigma^2_X)\) and \(Y_1,Y_2,\ldots,Y_m \sim N(\mu_Y,\sigma^2_Y)\) are independent random samples, and: (1) \(c=F_{1-\alpha/2}(m-1,n-1)=\dfrac{1}{F_{\alpha/2}(n-1,m-1)}\) and (2) \(d=F_{\alpha/2}(m-1,n-1)\), then a (1−α)100% confidence interval for \(\sigma^2_X/\sigma^2_Y\) is: \(\left(\dfrac{1}{F_{\alpha/2}(n-1,m-1)} \dfrac{s^2_X}{s^2_Y} \leq \dfrac{\sigma^2_X}{\sigma^2_Y}\leq F_{\alpha/2}(m-1,n-1)\dfrac{s^2_X}{s^2_Y}\right)\)
Proof. Because \(X_1,X_2,\ldots,X_n \sim N(\mu_X,\sigma^2_X)\) and \(Y_1,Y_2,\ldots,Y_m \sim N(\mu_Y,\sigma^2_Y)\), we know that:
\(\dfrac{(n-1)S^2_X}{\sigma^2_X}\sim \chi^2_{n-1}\) and \(\dfrac{(m-1)S^2_Y}{\sigma^2_Y}\sim \chi^2_{m-1}\)
Then, by the independence of the two samples, as well as the definition of an F random variable, we know that:
\(F=\dfrac{\dfrac{(m-1)S^2_Y}{\sigma^2_Y}/(m-1)}{\dfrac{(n-1)S^2_X}{\sigma^2_X}/(n-1)}=\dfrac{\sigma^2_X}{\sigma^2_Y}\cdot \dfrac{S^2_Y}{S^2_X} \sim F(m-1,n-1)\)
Therefore, the following probability statement holds:
\(P\left[F_{1-\frac{\alpha}{2}}(m-1,n-1) \leq \dfrac{\sigma^2_X}{\sigma^2_Y}\cdot \dfrac{S^2_Y}{S^2_X} \leq F_{\frac{\alpha}{2}}(m-1,n-1)\right]=1-\alpha\)
Finding the (1−α)100% confidence interval for the ratio of the two population variances then reduces, as always, to manipulating the quantity in the brackets. Multiplying through the inequality by:
\(\dfrac{S^2_X}{S^2_Y}\)
and recalling the fact that:
\(F_{1-\frac{\alpha}{2}}(m-1,n-1)=\dfrac{1}{F_{\frac{\alpha}{2}}(n-1,m-1)}\)
the (1−α)100% confidence interval for the ratio of the two population variances reduces to:
\(\dfrac{1}{F_{\frac{\alpha}{2}}(n-1,m-1)}\dfrac{S^2_X}{S^2_Y}\leq \dfrac{\sigma^2_X}{\sigma^2_Y} \leq F_{\frac{\alpha}{2}}(m-1,n-1)\dfrac{S^2_X}{S^2_Y}\)
as was to be proved.
Let's return to the example in which the feeding habits of two species of net-casting spiders are studied. The species, the deinopis and menneus, coexist in eastern Australia. Summary statistics were obtained on the size, in millimeters, of the prey of the two species; the solution below uses sample sizes n = m = 10 and sample standard deviations \(s_X=2.51\) and \(s_Y=1.90\).
Estimate, with 95% confidence, the ratio of the two population variances.
Solution. In order to estimate the ratio of the two population variances, we need to obtain two F-values from the F-table, namely:
\(F_{0.025}(9,9)=4.03\) and \(F_{0.975}(9,9)=\dfrac{1}{F_{0.025}(9,9)}=\dfrac{1}{4.03}\)
Then, the 95% confidence interval for the ratio of the two population variances is:
\(\dfrac{1}{4.03} \left(\dfrac{2.51^2}{1.90^2}\right) \leq \dfrac{\sigma^2_X}{\sigma^2_Y} \leq 4.03 \left(\dfrac{2.51^2}{1.90^2}\right)\)
Simplifying, we get:
\(0.433\leq \dfrac{\sigma^2_X}{\sigma^2_Y} \leq7.033\)
That is, we can be 95% confident that the ratio of the two population variances is between 0.433 and 7.033. (Because the interval contains the value 1, we cannot conclude that the population variances differ.)
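The same calculation can be sketched in a few lines of Python. The function name `variance_ratio_ci` and its argument names are made up for illustration, and the F-value 4.03 is typed in from the table rather than computed:

```python
def variance_ratio_ci(s_x, s_y, f_upper_nm, f_upper_mn):
    """Confidence interval for sigma_X^2 / sigma_Y^2 from two normal samples.

    f_upper_nm is F_{alpha/2}(n-1, m-1), used for the lower limit;
    f_upper_mn is F_{alpha/2}(m-1, n-1), used for the upper limit.
    """
    ratio = s_x ** 2 / s_y ** 2  # ratio of the two sample variances
    return ratio / f_upper_nm, ratio * f_upper_mn

# Spider example: s_X = 2.51, s_Y = 1.90, and F_{0.025}(9, 9) = 4.03 for both limits.
lower, upper = variance_ratio_ci(2.51, 1.90, 4.03, 4.03)
print("(%.3f, %.3f)" % (lower, upper))

# If the interval contains 1, we cannot conclude that the variances differ.
contains_one = lower <= 1 <= upper
print(contains_one)
```

When the two sample sizes differ, the two F-values differ as well, which is why the sketch keeps them as separate arguments.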
Now that we've spent two pages learning confidence intervals for variances, I have a confession to make. It turns out that confidence intervals for variances have generally lost favor with statisticians, because they are not very accurate when the data are not normally distributed. In that case, we say they are "sensitive" to the normality assumption, or the intervals are "not robust."
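To get a feel for what "not robust" means here, we can run a small simulation: draw many samples of size 10, build the 95% chi-square interval for the variance each time, and count how often the interval captures the true variance. The sketch below does this once for normal data and once for exponential data. The exponential distribution is just one skewed example chosen for illustration, and the function name, number of trials and seed are arbitrary.

```python
import random

CHI2_LOWER, CHI2_UPPER = 2.70, 19.02  # table values for 9 df, 95% interval

def coverage(draw, true_variance, n=10, trials=20_000, seed=1):
    """Fraction of chi-square intervals for sigma^2 that contain true_variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        mean = sum(sample) / n
        scaled = sum((x - mean) ** 2 for x in sample)  # equals (n-1) * S^2
        if scaled / CHI2_UPPER <= true_variance <= scaled / CHI2_LOWER:
            hits += 1
    return hits / trials

# Normal data with mean 52 and standard deviation 2 (variance 4).
normal_cov = coverage(lambda rng: rng.gauss(52, 2), true_variance=4.0)
# Exponential data with rate 1 (variance 1): skewed, so the assumption fails.
skewed_cov = coverage(lambda rng: rng.expovariate(1.0), true_variance=1.0)
print(normal_cov, skewed_cov)
```

The normal samples should give coverage close to the advertised 95%, while the exponential samples should fall noticeably short of it.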
To obtain the confidence interval for the ratio of two variances using Minitab:
(1) Under the Stat menu, select Basic Statistics, and then select 2 Variances...
(2) In the pop-up window that appears, in the box labeled Data, select Sample standard deviations (or alternatively Sample variances). In the box labeled Sample size, type in the size n of the First sample and m of the Second sample. In the box labeled Standard deviation, type in the sample standard deviations for the First and Second samples.
(3) Click on the button labeled Options... In the pop-up window that appears, specify the confidence level, and in the box labeled Alternative, select not equal.
Then, click on OK to return to the main pop-up window.
(4) Then, upon clicking OK on the main pop-up window, the output should appear in the Session window.