The P-Value Approach

Up until now, we have used the critical region approach in conducting our hypothesis tests. Now, let's take a look at an example in which we use what is called the P-value approach.

Example

Among patients with lung cancer, usually 90% or more die within three years. As a result of new forms of treatment, it is felt that this rate has been reduced. In a recent study of n = 150 lung cancer patients, y = 128 died within three years. Is there sufficient evidence at the α = 0.05 level, say, to conclude that the death rate due to lung cancer has been reduced?

Solution. The sample proportion is:

\[\hat{p}=\dfrac{128}{150}=0.853\]

The null and alternative hypotheses are:

H0: p = 0.90  and HA: p < 0.90

The test statistic is, therefore:

\[Z=\dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}=\dfrac{0.853-0.90}{\sqrt{\dfrac{0.90(0.10)}{150}}}=-1.92\]

And the rejection region, at the α = 0.05 level, is Z ≤ −1.645.

Since the test statistic Z = −1.92 < −1.645, we reject the null hypothesis. There is sufficient evidence at the α = 0.05 level to conclude that the rate has been reduced.
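The calculation above can be checked numerically. The following is a minimal sketch, not part of the original lesson; the helper name z_statistic is our own, and we use Python's standard-library NormalDist to look up the critical value. Note that the text rounds the sample proportion to 0.853 before computing Z; using the unrounded 128/150 gives a statistic of about −1.91, which leads to the same decision.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(p_hat, p0, n):
    """One-sample Z statistic for a proportion, using the null standard error."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Values from the example; p_hat is rounded to three decimals, as in the text.
p_hat, p0, n = 0.853, 0.90, 150
z = z_statistic(p_hat, p0, n)

alpha = 0.05
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645

# Left-tailed test: reject H0 when the statistic falls at or below the cutoff.
reject = z <= z_crit
print(round(z, 2), round(z_crit, 3), reject)
```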

Example (continued)

What if we set the significance level α = P(Type I Error) to 0.01? Is there still sufficient evidence to conclude that the death rate due to lung cancer has been reduced?

Solution. In this case, with α = 0.01, we reject if the test statistic falls in the rejection region defined by Z ≤ −2.33.

Because the test statistic Z = −1.92 > −2.33, we do not reject the null hypothesis. There is insufficient evidence at the α = 0.01 level to conclude that the rate has been reduced.

Example (continued)

In the first part of this example, we rejected the null hypothesis when α = 0.05. And, in the second part of this example, we failed to reject the null hypothesis when α = 0.01. There must be some level of α, then, at which we cross the threshold from rejecting to not rejecting the null hypothesis. What is the smallest α-level that would still cause us to reject the null hypothesis?

Solution. We would, of course, reject any time the critical value was at or to the right of our test statistic −1.92, so that the test statistic still falls in the rejection region.

That is, we would reject if the critical value were −1.645, −1.83, or −1.92. But, we wouldn't reject if the critical value were −1.93. The α-level associated with the test statistic −1.92 is called the P-value. It is the smallest α-level that would lead to rejection. In this case, the P-value is:

P(Z < −1.92) = 0.0274
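To see why the P-value is the threshold between rejecting and not rejecting, here is a short sketch (our own illustration, using the standard-library NormalDist) that computes the left-tail area for Z = −1.92 and then confirms, for a few α-levels we picked arbitrarily, that "test statistic in the rejection region" and "α at least as large as the P-value" are the same decision.

```python
from statistics import NormalDist

std_normal = NormalDist()          # standard normal: mean 0, sd 1
z = -1.92                          # test statistic from the example

# Left-tailed P-value: area under the standard normal curve to the left of z.
p_value = std_normal.cdf(z)
print(round(p_value, 4))

# Rejecting when z <= critical value matches rejecting when alpha >= P-value.
for alpha in (0.05, 0.03, 0.02, 0.01):
    z_crit = std_normal.inv_cdf(alpha)
    assert (z <= z_crit) == (alpha >= p_value)
    print(alpha, round(z_crit, 3), z <= z_crit)
```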

So far, all of the examples we've considered have involved a one-tailed hypothesis test in which the alternative hypothesis involved either a less than (<) or a greater than (>) sign. What happens if we weren't sure of the direction in which the proportion could deviate from the hypothesized null value? That is, what if the alternative hypothesis involved a not-equal sign (≠)? Let's take a look at an example. 

Example (continued)

What if we wanted to perform a "two-tailed" test? That is, what if we wanted to test:

H0: p = 0.90 versus HA: p ≠ 0.90

at the α = 0.05 level?

Solution. Let's first consider the critical value approach. If we allow for the possibility that the sample proportion could either prove to be too large or too small, then we need to specify a threshold value, that is, a critical value, in each tail of the distribution. In this case, we divide the significance level α by 2 and place α/2 = 0.025 in each tail of the standard normal distribution, which gives the critical values −1.96 and 1.96.

That is, our rejection rule is that we should reject the null hypothesis H0 if Z ≥ 1.96 or if Z ≤ −1.96. Alternatively, we can write that we should reject the null hypothesis H0 if |Z| ≥ 1.96. Because our test statistic is −1.92, we just barely fail to reject the null hypothesis, because |−1.92| = 1.92 < 1.96. In this case, we would say that there is insufficient evidence at the α = 0.05 level to conclude that the true proportion p differs from 0.90.

Now for the P-value approach. Again, needing to allow for the possibility that the sample proportion is either too large or too small, we multiply the P-value we obtain for the one-tailed test by 2. That is, the P-value is:

\[P=P(|Z|\geq 1.92)=P(Z>1.92 \text{ or } Z<-1.92)=2 \times 0.0274=0.055\]

Because the P-value 0.055 is (just barely) greater than the significance level α = 0.05, we barely fail to reject the null hypothesis. Again, we would say that there is insufficient evidence at the α = 0.05 level to conclude that the true proportion p differs from 0.90.
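Both two-tailed approaches can be run side by side. The sketch below is our own illustration (the variable names are not from the lesson); it takes the test statistic −1.92 as given and uses the standard-library NormalDist for the normal areas and quantiles.

```python
from statistics import NormalDist

std_normal = NormalDist()
z, alpha = -1.92, 0.05

# Critical value approach: put alpha/2 in each tail.
z_crit = std_normal.inv_cdf(1 - alpha / 2)      # about 1.96
reject_by_critical_value = abs(z) >= z_crit

# P-value approach: double the one-tailed area beyond |z|.
p_two_sided = 2 * std_normal.cdf(-abs(z))       # about 0.055
reject_by_p_value = p_two_sided <= alpha

print(round(z_crit, 2), round(p_two_sided, 3),
      reject_by_critical_value, reject_by_p_value)
```

The two flags always agree, because |Z| ≥ z_{α/2} holds exactly when the doubled tail area is at most α.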

Let's close this example by formalizing the definition of a P-value, as well as summarizing the P-value approach to conducting a hypothesis test.

Definition. The P-value is the smallest significance level α that leads us to rejecting the null hypothesis. 

Alternatively (and the way I prefer to think of P-values), the P-value is the probability that we'd observe a statistic at least as extreme as the one we did if the null hypothesis were true.

If the P-value is small, that is, if P ≤ α, then we reject the null hypothesis H0.

Note

By the way, to test H0: p = p0, some statisticians will use the test statistic:

\[Z=\dfrac{\hat{p}-p_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}}\]

rather than the one we've been using:

\[Z=\dfrac{\hat{p}-p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}\]

One advantage of doing so is that the interpretation of the confidence interval — does it contain p0? — is always consistent with the hypothesis test decision, as illustrated here:

For the sake of ease, let:

\[se(\hat{p})=\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\]

Two-tailed test. In this case, the critical region approach tells us to reject the null hypothesis H0: p = p0 against the alternative hypothesis HA: p ≠ p0:

if   \[Z=\dfrac{\hat{p}-p_0}{se(\hat{p})} \geq z_{\alpha/2}\]    or  if   \[Z=\dfrac{\hat{p}-p_0}{se(\hat{p})} \leq -z_{\alpha/2}\]

which is equivalent to rejecting the null hypothesis:

if    \[\hat{p}-p_0 \geq z_{\alpha/2}se(\hat{p})\]   or  if    \[\hat{p}-p_0 \leq -z_{\alpha/2}se(\hat{p})\]

which is equivalent to rejecting the null hypothesis:

if   \[p_0 \geq \hat{p}+z_{\alpha/2}se(\hat{p})\]    or  if    \[p_0 \leq \hat{p}-z_{\alpha/2}se(\hat{p})\]

That's the same as saying that we should reject the null hypothesis H0 if p0 is not in the (1−α)100% confidence interval!
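The equivalence is easy to check numerically. In this sketch (our own; the function name two_tailed_wald and the list of trial p0 values are assumptions made for illustration), we compute the test decision and the confidence interval check separately, using the estimated standard error se(p̂), and print both so they can be compared.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_wald(p_hat, p0, n, alpha=0.05):
    """Return the test decision, the interval check, and the interval itself.

    Uses the estimated standard error se(p_hat) rather than the null one.
    """
    se = sqrt(p_hat * (1 - p_hat) / n)
    z_half = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    z = (p_hat - p0) / se
    lower, upper = p_hat - z_half * se, p_hat + z_half * se
    reject = abs(z) >= z_half                        # critical region rule
    p0_outside = p0 <= lower or p0 >= upper          # confidence interval rule
    return reject, p0_outside, (lower, upper)

# Example data from this section, tried against a few hypothesized values p0.
for p0 in (0.75, 0.80, 0.85, 0.90, 0.95):
    reject, p0_outside, interval = two_tailed_wald(0.853, p0, 150)
    print(p0, reject, p0_outside, [round(b, 3) for b in interval])
```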

Left-tailed test. In this case, the critical region approach tells us to reject the null hypothesis H0: p = p0 against the alternative hypothesis HA: p < p0:

if   \[Z=\dfrac{\hat{p}-p_0}{se(\hat{p})} \leq -z_{\alpha}\]

which is equivalent to rejecting the null hypothesis:

if   \[\hat{p}-p_0 \leq -z_{\alpha}se(\hat{p})\]

which is equivalent to rejecting the null hypothesis:

if   \[p_0 \geq \hat{p}+z_{\alpha}se(\hat{p})\]

That's the same as saying that we should reject the null hypothesis H0 if p0 is not in the upper (1−α)100% confidence interval:

\[(0,\hat{p}+z_{\alpha}se(\hat{p}))\]
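The left-tailed case can be sketched the same way. Again, this is only an illustration under our own naming (left_tailed_wald is hypothetical), applied to the lung cancer data with the estimated standard error se(p̂).

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_wald(p_hat, p0, n, alpha=0.05):
    """Left-tailed test decision and the matching one-sided upper bound."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)        # z_alpha, about 1.645
    z = (p_hat - p0) / se
    upper = p_hat + z_alpha * se                     # upper confidence bound
    reject = z <= -z_alpha                           # critical region rule
    p0_outside = p0 >= upper                         # confidence interval rule
    return z, upper, reject, p0_outside

z, upper, reject, p0_outside = left_tailed_wald(0.853, 0.90, 150)
print(round(z, 2), round(upper, 3), reject, p0_outside)
```

With the example's data this statistic is about −1.63, which falls just short of the −1.645 cutoff, so the decision here differs from the one we reached earlier using the null standard error. The test decision and the interval check still agree with each other, which is the point of the derivation.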