7.5 - Power and Sample Size Determination for Testing a Population Mean

Unit Summary

• Why Do We Need to Compute the Power of a Test?
• Power and Type II Error of a Test
• Choosing the Sample Size for Testing Population Mean
• Using Minitab to Perform a One-Sample t-Test and to Compute Power
• Last Words About Using Minitab's Power & Sample Size Tools

An Introduction to Statistical Methods and Data Analysis, (See Course Schedule).

Why Do We Need to Compute the Power of a Test?

When the data indicate that one cannot reject the null hypothesis, does it mean that one can accept the null hypothesis? For example, when the p-value computed from the data is 0.12, one fails to reject the null hypothesis at $$\alpha$$ = 0.05. Can we say that the data support the null hypothesis?

Answer: When you perform hypothesis testing, you only set the size of Type I error and guard against it. Thus, we can only present the strength of evidence against the null hypothesis. One can sidestep the concern about Type II error if the conclusion never mentions that the null hypothesis is accepted. When the null hypothesis cannot be rejected, there are two possible cases: 1) one can accept the null hypothesis, 2) the sample size is not large enough to either accept or reject the null hypothesis. To make the distinction, one has to check $$\beta$$. If $$\beta$$ at a likely value of the parameter is small, then one accepts the null hypothesis. If the $$\beta$$ is large, then one cannot accept the null hypothesis.

The relationship between $$\alpha$$ and $$\beta$$ :

If the sample size is fixed, then decreasing $$\alpha$$ will increase $$\beta$$. If one wants both to decrease, then one has to increase the sample size.
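The trade-off can be illustrated numerically. The following is a minimal sketch, assuming a two-tailed z-test with known $$\sigma$$ (a simplification of the t-test used later in this lesson). The function name `type_ii_error` and the inputs (sigma = 10, difference = 5) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def type_ii_error(alpha, n, sigma, diff):
    """Approximate beta for a two-tailed z-test of a mean (sigma assumed known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # two-tailed critical value
    shift = abs(diff) * n ** 0.5 / sigma     # true difference in standard-error units
    # beta = P(test statistic lands inside the acceptance region | true mean differs by diff)
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


# Hypothetical inputs, for illustration only.
sigma, diff = 10.0, 5.0

# Fixed n = 25: a smaller alpha gives a larger beta.
betas_by_alpha = {a: type_ii_error(a, 25, sigma, diff) for a in (0.10, 0.05, 0.01)}

# Fixed alpha = 0.05: a larger n gives a smaller beta.
betas_by_n = {n: type_ii_error(0.05, n, sigma, diff) for n in (25, 50, 100)}

for a, b in betas_by_alpha.items():
    print(f"n = 25, alpha = {a:.2f}: beta = {b:.3f}")
for n, b in betas_by_n.items():
    print(f"alpha = 0.05, n = {n}: beta = {b:.3f}")
```

With the sample size held at 25, $$\beta$$ grows as $$\alpha$$ shrinks. With $$\alpha$$ held at 0.05, $$\beta$$ shrinks as the sample size grows.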

Power and Type II Error of a Test

Power = the probability of correctly rejecting a false null hypothesis = $$1 - \beta$$ .

Choosing the Sample Size for Testing Population Mean

Refer to page 218 (edition 5) or pg 243 (edition 6) of our textbook to see the graphs that show the probability of Type II error.

Usually, acceptable values of power are larger than 0.7. One usually sets the power to be 0.8 or 0.85. Again, the acceptable values of power depend on the problem just as the value of α depends on the problem.

The following are interrelated: Power (which is $$1 - \beta$$), sample size, α , and the distance between the actual mean and the mean specified in the null hypothesis.

Using Minitab to Perform a One-Sample t-Test and to Compute Power

To calculate the smallest sample size needed for specified $$\alpha$$, $$\beta$$, and $$\mu_a$$ (where $$\mu_a$$ is the likely value of $$\mu$$ at which you want to evaluate the power, chosen subjectively from the user's prior knowledge):

One-Tailed test:

$n=\sigma^2 \frac{(t_\alpha + t_\beta)^2}{(\mu_0-\mu_a)^2}$

Two-Tailed test:

$n=\sigma^2 \frac{(t_{\alpha/2 }+ t_\beta)^2}{(\mu_0-\mu_a)^2}$

Note: The above two formulas are included for your reference only. When you need to compute the sample size, you can simply use Minitab. The formulas given in our book are approximations to the above formulas, replacing t by z.
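As a rough illustration of the z-based approximation, the sketch below evaluates both formulas with t replaced by z. The function name `sample_size_z` is hypothetical, and the inputs ($$\sigma$$ = 13.08, $$\mu_0$$ = 0, $$\mu_a$$ = -5, $$\alpha$$ = 0.05, power = 0.8) are borrowed from the weight-change example in this lesson. This is not how Minitab computes the sample size; Minitab's t-based calculation can give a somewhat larger answer.

```python
import math
from statistics import NormalDist


def sample_size_z(sigma, mu_0, mu_a, alpha, power, two_tailed=True):
    """Smallest n from n = sigma^2 (z_alpha + z_beta)^2 / (mu_0 - mu_a)^2."""
    z = NormalDist()
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail)   # upper-tail critical value for alpha (or alpha/2)
    z_beta = z.inv_cdf(power)       # upper-tail critical value for beta = 1 - power
    n = sigma ** 2 * (z_alpha + z_beta) ** 2 / (mu_0 - mu_a) ** 2
    return math.ceil(n)             # round up to a whole number of observations


n_two = sample_size_z(13.08, 0, -5, alpha=0.05, power=0.80)
n_one = sample_size_z(13.08, 0, -5, alpha=0.05, power=0.80, two_tailed=False)
print("two-tailed:", n_two)   # 54
print("one-tailed:", n_one)   # 43
```

The two-tailed z-based answer (54) is slightly smaller than a t-based answer, because t critical values are larger than the corresponding z values at any finite sample size.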

Using Minitab to Compute the Sample Size or the Power

In the main menu in Minitab select:

Stat > Power and Sample Size > 1-Sample t

Note: The minimum difference referred to in Minitab is the difference between $$\mu_0$$ and $$\mu_a$$.

Note: One-sample t-tests are used to perform hypothesis tests of the mean. To calculate power or sample size for these tests, you need to determine the minimum difference (effect) that you consider to be meaningful. Then, you can determine the power or the sample size you need to be able to reject the null hypothesis when the true value differs from the hypothesized value by this minimum difference.

Example: Weight Change

The weight changes (in pounds) of 14 female subjects after a six-week exercise program were recorded:

 17 7 -4 -18 2 9 12 9 -12 -9 -18 -14 -18 -20

Is there sufficient evidence that the average weight change is different from 0? (set $$\alpha$$ = 0.05)

a. State the null and alternative hypotheses:

$$H_0 : \mu = 0$$
$$H_a : \mu \ne 0$$

b. Use Minitab to check whether the one-sample t-test may be used.

Now, the sample size is only 14 and thus we need to use the normal probability plot to check whether the data may come from a normal distribution.

With the 14 data points entered into Minitab, from the menu we can select Graph > Probability Plot.

We can see that we can use the t-test since the normal probability plot indicates that there is no evidence to suggest that the data do not come from a normal distribution.
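For readers without Minitab, the points of a normal probability plot can be computed by hand. The sketch below pairs each ordered observation with a normal score and reports how close the points lie to a straight line. The plotting positions (i - 0.3)/(n + 0.4) are an assumed convention, the helper names are hypothetical, and the correlation is only an informal summary of the plot, not a formal normality test.

```python
from statistics import NormalDist, mean

weights = [17, 7, -4, -18, 2, 9, 12, 9, -12, -9, -18, -14, -18, -20]


def normal_plot_points(data):
    """Pair each ordered observation with its approximate normal score."""
    n = len(data)
    ordered = sorted(data)
    # Median-rank plotting positions (i - 0.3) / (n + 0.4): an assumed convention.
    scores = [NormalDist().inv_cdf((i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]
    return list(zip(scores, ordered))


def correlation(pairs):
    """Pearson correlation of the (score, observation) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


points = normal_plot_points(weights)
r = correlation(points)
for score, value in points:
    print(f"{score:6.2f}  {value:4d}")
print(f"correlation of plot points: {r:.3f}")
```

A correlation close to 1 means the plotted points fall near a straight line, which is what one looks for when judging the plot by eye.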

c. Use Minitab to perform the test and draw a conclusion using the p-value.

From the Minitab menu select: Stat > Basic Statistics > 1-Sample t

Dialog box items:

• Variables: Select the column(s) containing the variable(s) on which you want to perform the hypothesis test.
• Test mean: Choose to perform a one-sample t-test by checking the box for Perform hypothesis test; then specify the null hypothesis test value by entering this value into the text box for Hypothesized mean. For this example, enter the value 0.

Click on Options. In the Confidence Level text box, type your desired confidence level (for this example, use 95). In the Alternative hypothesis text box, select the desired alternative hypothesis from: mean ≠ hypothesized mean, mean < hypothesized mean, mean > hypothesized mean. For this example, select mean ≠ hypothesized mean.

Here is the resulting output:

One-Sample T: C1

Test of μ = 0 vs ≠ 0

Variable   N   Mean  StDev  SE Mean
C1        14  -4.07  13.08     3.50

Variable         95.0% CI       T      P
C1         (-11.62, 3.48)   -1.16  0.265

Using Minitab, we see that the observed t-value is -1.16 and the p-value is 0.265 which is greater than $$\alpha$$ = 0.05. We conclude that we cannot reject the null hypothesis.
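The Minitab figures can be checked by hand. The sketch below recomputes the sample mean, standard deviation, standard error, and t statistic from the 14 observations, and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. The helper names `t_pdf` and `two_sided_p` are hypothetical, and the numerical integration is a stand-in for a statistical library, not Minitab's own method.

```python
import math
from statistics import mean, stdev

weights = [17, 7, -4, -18, 2, 9, 12, 9, -12, -9, -18, -14, -18, -20]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|): one minus the central area, by Simpson's rule (steps must be even)."""
    a = abs(t)
    h = 2 * a / steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


n = len(weights)
x_bar = mean(weights)
s = stdev(weights)                 # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)              # standard error of the mean
t_stat = (x_bar - 0) / se          # hypothesized mean mu_0 = 0
p_value = two_sided_p(t_stat, n - 1)

print(f"mean = {x_bar:.2f}, s = {s:.2f}, SE = {se:.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The printed values should agree with the Minitab output above to the displayed precision.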

There are two possible reasons for failing to reject the null hypothesis:

1. the null hypothesis is reasonable, or
2. there's an insufficient sample size to achieve a powerful test.

We do not yet know which one is the real reason and thus proceed to compute the power of the test.

d. Use Minitab to compute the power $$(1 - \beta)$$ of the test at the likely value $$\mu_a = -5.0$$.

Based on the computed power, would you accept the null hypothesis?

n = 14, $$\alpha$$ = 0.05

Difference to detect = 0 - (-5) = 5.

Using Stat > Power and Sample Size > 1-Sample t in Minitab, we enter our sample information: a sample size of 14, a difference of 5 that we want to be able to detect, and the sample standard deviation of 13.08. Click Options to select the alternative of interest (Not Equal) and enter the significance level (this is our $$\alpha$$ value). When finished, click OK for the Options and OK again.

We find that power = 0.2635, which is the probability that we correctly reject the null hypothesis when the true difference is 5. This is not very good! A common target for power is 0.8, or 80 percent. Because the power is so low, we cannot accept the null hypothesis: the probability of a Type II error is $$\beta$$ = 1 - power = 1 - 0.2635 = 0.7365, which is too high. The most likely reason is that our sample size is too small to detect this difference with any reasonable power.

[NOTE: The choice of 5 for the difference was a "researcher's decision". There is no specific reason we selected 5 other than for illustrative purposes in this example. The difference one selects is just the smallest (minimum) difference the researcher wants to detect between the hypothesized population value and the actual value.]
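One way to see what this power value means is to simulate it. The sketch below repeatedly draws samples of size 14 from a normal population with mean -5 and standard deviation 13.08, runs the two-tailed t-test on each, and counts how often the null hypothesis is rejected. The function name `simulated_power`, the number of simulations, and the seed are hypothetical choices. The critical value 2.160 is the tabled $$t_{0.025}$$ with 13 degrees of freedom. This is a Monte Carlo approximation, not the calculation Minitab performs.

```python
import math
import random


def simulated_power(n, mu_0, mu_a, sigma, t_crit, n_sims=20000, seed=1):
    """Estimate power as the fraction of simulated samples that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Draw one sample from the assumed alternative: Normal(mu_a, sigma).
        sample = [rng.gauss(mu_a, sigma) for _ in range(n)]
        m = sum(sample) / n
        s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        t_stat = (m - mu_0) / (s / math.sqrt(n))
        if abs(t_stat) > t_crit:       # two-tailed rejection rule
            rejections += 1
    return rejections / n_sims


# 2.160 is the tabled two-tailed critical value for alpha = 0.05 with 13 df.
power = simulated_power(n=14, mu_0=0, mu_a=-5, sigma=13.08, t_crit=2.160)
print(f"estimated power = {power:.3f}")
print(f"estimated beta  = {1 - power:.3f}")
```

The estimate should land close to the 0.2635 reported by Minitab, up to simulation error: roughly three out of four such studies would fail to detect a true difference of 5.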

e. Use Minitab to find how large a sample size is needed.

Suppose we want α = 0.05, power = 0.8, and a minimum detectable difference of 5.

From the Minitab output of the one-sample t-test, we see that the standard deviation is 13.08. We can thus estimate $$\sigma$$ by 13.08 for the sample size computation problem:

Minitab gives us the following results:

Power and Sample Size

1-Sample t Test

Testing mean = null (versus not = null)
Calculating power for mean = null + difference
Alpha = 0.05 Sigma = 13.08

Difference  Sample Size  Target Power  Actual Power
         5           56        0.8000        0.8024

Thus, we see that a sample of 56 subjects is needed to achieve a power of at least 0.8 for detecting a difference of 5 in this hypothesis testing problem.

Last Words About Using Minitab's Power & Sample Size Tools

Gathering data is like tasting fine wine—you need the right amount. With wine, too small a sip keeps you from accurately assessing a subtle bouquet, but too large a sip overwhelms the palate.

We can’t tell you how big a sip to take at a wine-tasting event, but when it comes to collecting data, Minitab Statistical Software’s Power and Sample Size tools can tell you how much data you need to be sure about your results.