8.2 - Hypothesis Testing for a Proportion
Here we will be using hypothesis tests to compare a proportion in one group to a specified population proportion.
Examples: Research Questions
The following are research questions that could be answered using a hypothesis test for one proportion. In each case, we would test the hypothesis by comparing data from a sample to the hypothesized population parameter.
Handedness. Are more than 80% of Americans right-handed?
Babies. Is the proportion of babies born male different from .50?
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%?
Recall from the last section the five step hypothesis testing procedure that we will be using in this course:
Five Step Hypothesis Testing Procedure
1. Check any necessary assumptions and write null and alternative hypotheses. The assumptions will vary depending on the test. The null and alternative hypotheses will also be written in terms of population parameters; the null hypothesis will always contain an equality (i.e., \(=\), \(\geq\), or \(\leq\)).
2. Calculate an appropriate test statistic. This will vary depending on the test, but it will typically be the difference observed in the sample divided by a standard error. In this class we will see \(z\), \(t\), \(\chi ^{2}\), and \(F\) test statistics.
3. Determine a p-value associated with the test statistic. This can be found using the tables in Appendix A or using Minitab Express.
4. Decide between the null and alternative hypotheses. If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis.
5. State a "real world" conclusion. Based on your decision in step 4, write a conclusion in terms of the original research question.
Some steps may vary depending on the test. Let's walk through these five steps specifically for comparing the proportion of one group to a specified value.
1. Check Any Necessary Assumptions and Write Null and Alternative Hypotheses.
As in previous lessons, the assumption is that both \(n \times p \geq 10\) and \(n \times (1-p) \geq 10\). Note that some textbooks use 15 instead of 10 believing that 10 is too liberal. We will continue to use 10 for our discussions.
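The sample-size condition can also be checked mechanically. The following is a minimal Python sketch, not part of the course materials: the function name `meets_normal_approximation`, its `threshold` parameter, and the numbers passed to it are all illustrative assumptions.

```python
def meets_normal_approximation(n, p, threshold=10):
    """Return True when both n*p and n*(1-p) are at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Hypothetical values chosen only to illustrate the check.
print(meets_normal_approximation(100, 0.80))                # both counts are large enough
print(meets_normal_approximation(50, 0.10))                 # n*p is only 5
print(meets_normal_approximation(100, 0.80, threshold=15))  # stricter textbook rule
```

Passing `threshold=15` applies the more conservative rule that some textbooks use.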
In terms of the hypotheses, the null hypothesis will always contain an equality, the alternative hypothesis will never contain an equality. Below is a table with the possible combinations of null and alternative hypotheses. \(p_0\) is the hypothesized value of the population proportion.
Research Question | Is the proportion different from \(p_0\)? | Is the proportion greater than \(p_0\)? | Is the proportion less than \(p_0\)? |
Null Hypothesis, \(H_{0}\) | \(p=p_0\) | \(p\leq p_0\) | \(p\geq p_0\) |
Alternative Hypothesis, \(H_{a}\) | \(p\neq p_0\) | \(p> p_0\) | \(p< p_0\) |
Type of Hypothesis Test | Two-tailed, non-directional | Right-tailed, directional | Left-tailed, directional |
Note: Some sources (e.g., MyStatLab) will always use the equality (=) in the null hypothesis regardless of whether a one- or two-tailed test is being performed.
Examples: Writing Hypotheses
Handedness. Are more than 80% of Americans right-handed?
\(H_{0}:p\leq.80\)
\(H_{a}:p>.80\)
This is a right-tailed test.
Babies. Is the proportion of babies born male different from .50?
\(H_{0}:p=.50\)
\(H_{a}:p\neq.50\)
This is a two-tailed test.
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%?
\(H_{0}:p\geq.80\)
\(H_{a}:p<.80\)
This is a left-tailed test.
2. Calculate an Appropriate Test Statistic.
When testing one proportion, we will be using a \(z\) test statistic computed with the following formula:
Test statistic: One Group Proportion
\[z=\frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}\]
\(\widehat{p}\) = sample proportion
\(p_{0}\) = hypothesized population proportion
\(n\) = sample size
Note that this formula is actually the difference between the sample proportion and hypothesized population proportion divided by the standard error of \(\widehat{p}\). In doing so, this formula is finding the z score for the observed sample in terms of the hypothesized distribution of sample proportions.
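As a rough illustration, the formula above can be written as a short Python function. This is a sketch under stated assumptions: the name `one_proportion_z` and the example values (a sample proportion of .55, a hypothesized proportion of .50, and a sample size of 400) are hypothetical and do not come from the lesson.

```python
from math import sqrt

def one_proportion_z(p_hat, p0, n):
    """z statistic for a one-proportion test.

    p_hat: sample proportion
    p0:    hypothesized population proportion
    n:     sample size
    """
    # Standard error of p-hat, computed under the hypothesized proportion p0.
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Hypothetical values, used only to show the call.
z = one_proportion_z(p_hat=0.55, p0=0.50, n=400)
print(round(z, 3))
```

Note that the standard error uses \(p_0\), not \(\widehat{p}\), because the test statistic is computed under the assumption that the null hypothesis is true.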
Examples: Computing Test Statistics
Handedness. Are more than 80% of Americans right-handed? In a random sample of 100 Americans, 87 said that they were right-handed. Let’s compute a test statistic.
\(\widehat{p}=\frac{87}{100}=.87\), \(p_{0}=.80\), \(n=100\)
\(z= \frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}= \frac{.87-.80}{\sqrt{\frac{.80 (1-.80)}{100}}}=1.75\)
Our z test statistic is 1.75. If the population proportion is .80, a sample proportion of .87 from a sample of \(n=100\) corresponds to a z score of 1.75.
Babies. Is the proportion of babies born male different from .50? In a sample of 200 babies, 96 were male. Let’s compute a test statistic.
\(\widehat{p}=\frac{96}{200}=.48\), \(p_{0}=.50\), \(n=200\)
\(z= \frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}= \frac{.48-.50}{\sqrt{\frac{.50 (1-.50)}{200}}}=-0.566\)
Our z test statistic is -0.566. If the population proportion is .50, a sample proportion of .48 from a sample of \(n=200\) corresponds to a z score of -0.566.
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%? In a random sample of 50 Creamery customers, 60% said that they prefer chocolate over vanilla. Let’s compute a test statistic.
\(\widehat{p}=.60\), \(p_{0}=.80\), \(n=50\)
\(z= \frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}= \frac{.60-.80}{\sqrt{\frac{.80 (1-.80)}{50}}}=-3.536\)
Our z test statistic is -3.536. If the population proportion is .80, a sample proportion of .60 from a sample of \(n=50\) corresponds to a z score of -3.536.
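The three test statistics above can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the helper name `one_proportion_z` and the dictionary layout are illustrative choices, not something the course prescribes.

```python
from math import sqrt

def one_proportion_z(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)"""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# (sample proportion, hypothesized proportion, sample size) for each example
examples = {
    "handedness": (87 / 100, 0.80, 100),
    "babies": (96 / 200, 0.50, 200),
    "ice cream": (0.60, 0.80, 50),
}

z_scores = {name: one_proportion_z(*args) for name, args in examples.items()}
for name, z in z_scores.items():
    print(f"{name}: z = {z:.3f}")
```

Up to rounding, the printed values should match the hand calculations of 1.75, -0.566, and -3.536.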
3. Determine the p-value Associated with the Test Statistic.
Now, we use the test statistic that we computed in step 2 to determine the probability of obtaining a sample that deviates from the hypothesized population as much as or more than the sample that we have. In other words, given that the null hypothesis is true, the probability that a randomly selected sample of \(n\) would have a sample statistic as different as the one obtained (or more different) is the p-value.
Note that p-values are also symbolized by \(p\). Do not confuse this with the population proportion which shares the same symbol.
We can look up the p-value using the Standard Normal Table or using Minitab Express. If we are conducting a one-tailed test (i.e., right- or left-tailed), we look up the area of the sampling distribution that is beyond our test statistic. If we are conducting a two-tailed (i.e., non-directional) test, there is one additional step: we need to multiply the area by two to take into account the possibility of being in the right or left tail. Let's continue with the handedness, babies, and ice cream examples:
Examples: Determining p-values
Handedness. Are more than 80% of Americans right-handed? In a random sample of 100 Americans, 87 said that they were right-handed.
Previously we found the following: \(H_{0}:p\leq.80\), \(H_{a}:p>.80\), this is a right-tailed test
\(z=1.75\)
Using the standard normal distribution, we want to find the probability of obtaining a z score of 1.75 or more extreme (i.e., greater than 1.75).
According to the table, \(P(z<1.75)=.9599\). Therefore, \(P(z\geq1.75)=1-.9599=.0401\)
Using Minitab Express, we find the probability \(P(z\geq1.75)=.0400592\) which may be rounded to \(p\; value=.0401\). Our p-value is .0401.
Babies. Is the proportion of babies born male different from .50? In a sample of 200 babies, 96 were male.
Previously we found the following: \(H_{0}:p=.50\), \(H_{a}:p\neq.50\), this is a two-tailed test
\(z=-0.566\)
Using the standard normal distribution, we want to find the probability of obtaining a z score of -0.566 or more extreme (i.e., less than -0.566).
According to the table (rounding z to -0.57), \(P(z<-0.57)=.2843\)
Because this is a two-tailed test we must take into account both the left and right tails. To do so, we multiply the value above by two. \(p\; value=.2843\times2=.5686\). Our p-value is .5686.
We could also find this probability using Minitab Express; we would still need to add the proportions in both tails.
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%? In a random sample of 50 Creamery customers, 60% said that they prefer chocolate over vanilla.
Previously we found the following: \(H_{0}:p\geq.80\), \(H_{a}:p<.80\), this is a left-tailed test
\(z=-3.536\)
Using the standard normal distribution, we want to find the probability of obtaining a z score of -3.536 or more extreme (i.e., less than -3.536).
\(P(z<-3.536)\approx.0002\); our p-value is approximately .0002
We could also find this probability using Minitab Express.
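As an alternative to the table or Minitab Express, the same tail areas can be sketched in Python with the standard library's `statistics.NormalDist`. The function name `p_value` and its `tail` argument are assumptions made for this illustration. Because software uses the unrounded z value while the printed table rounds z to two decimal places, the results may differ slightly from the table-based p-values above.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value for a z test statistic.

    tail: "left", "right", or "two" (two-tailed, non-directional)
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)             # area below z
    if tail == "right":
        return 1 - std_normal.cdf(z)         # area above z
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z))   # one tail, doubled
    raise ValueError("tail must be 'left', 'right', or 'two'")

p_handedness = p_value(1.75, "right")
p_babies = p_value(-0.566, "two")
p_ice_cream = p_value(-3.536, "left")

print(f"handedness: {p_handedness:.4f}")
print(f"babies:     {p_babies:.4f}")
print(f"ice cream:  {p_ice_cream:.6f}")
```

Doubling the area beyond `-abs(z)` handles the two-tailed case regardless of the sign of the test statistic.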
4. Decide Between the Null and Alternative Hypotheses.
We can decide between the null and alternative hypotheses by examining our p-values. If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis. Unless stated otherwise, assume that \(\alpha=.05\).
When we reject the null hypothesis, our results are said to be statistically significant.
Examples: Deciding Between the Null and Alternative Hypotheses
Handedness. Are more than 80% of Americans right-handed? In a random sample of 100 Americans, 87 said that they were right-handed.
Previously we found the following: \(H_{0}:p\leq.80\), \(H_{a}:p>.80\), this is a right-tailed test, \(z=1.75\), \(p \;value=.0401\)
\(p\leq .05\), therefore our decision is to reject the null hypothesis
Babies. Is the proportion of babies born male different from .50? In a sample of 200 babies, 96 were male.
Previously we found the following: \(H_{0}:p=.50\), \(H_{a}:p\neq.50\), this is a two-tailed test
\(z=-0.566\), \(p\; value=.5686\)
\(p>.05\), therefore our decision is to fail to reject the null hypothesis
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%? In a random sample of 50 Creamery customers, 60% said that they prefer chocolate over vanilla.
Previously we found the following: \(H_{0}:p\geq.80\), \(H_{a}:p<.80\), this is a left-tailed test
\(z=-3.536\), \(p \;value=.0002\)
\(p \leq.05\), therefore our decision is to reject the null hypothesis
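The decision rule is simple enough to express directly in code. The sketch below uses a hypothetical helper named `decide` and applies it to the three p-values found earlier, with the default \(\alpha=.05\).

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# p-values taken from the handedness, babies, and ice cream examples
p_values = {"handedness": 0.0401, "babies": 0.5686, "ice cream": 0.0002}
decisions = {name: decide(p) for name, p in p_values.items()}

for name, decision in decisions.items():
    print(f"{name}: {decision}")
```

Note that a p-value exactly equal to \(\alpha\) leads to rejecting the null hypothesis, because the rule uses \(p \leq \alpha\).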
5. State a "Real World" Conclusion.
Based on our decision in the previous step, we will write a sentence or two concerning our decision in relation to the original research question.
Examples: State a "Real World" Conclusion
Handedness. Are more than 80% of Americans right-handed? In a random sample of 100 Americans, 87 said that they were right-handed.
Previously we found: \(H_{0}:p\leq.80\), \(H_{a}:p>.80\), \(p \;value=.0401\), reject the null hypothesis
Yes, there is statistical evidence to state that more than 80% of Americans are right-handed.
Babies. Is the proportion of babies born male different from .50? In a sample of 200 babies, 96 were male.
Previously we found: \(H_{0}:p=.50\), \(H_{a}:p\neq.50\), \(p\;value=.5686\), fail to reject the null hypothesis
We do not have sufficient evidence to state that in the population the proportion of babies born male is different from .50
Ice cream. Is the percentage of Creamery customers who prefer chocolate ice cream over vanilla less than 80%? In a random sample of 50 Creamery customers, 60% said that they prefer chocolate over vanilla.
Previously we found: \(H_{0}:p\geq.80\), \(H_{a}:p<.80\), \(p\;value=.0002\), reject the null hypothesis
Yes, there is evidence that the percentage of all Creamery customers who prefer chocolate ice cream over vanilla is less than 80%.
Let's walk through a few full examples together before you work through a few on your own.
Example: Proportion of Overweight Citizens
According to the Centers for Disease Control and Prevention (CDC), the percentage of adults 20 years of age and over in the United States who are overweight is 69.0% (see http://www.cdc.gov/nchs/fastats/obesity-overweight.htm). One city’s council wants to know if the proportion of overweight citizens in their city is different from this known national proportion. They take a random sample of 150 adults 20 years of age or older in their city and find that 98 are classified as overweight. Let’s use the five step hypothesis testing procedure to determine if there is evidence that the proportion in this city is different from the known national proportion.
1. Check any necessary assumptions and write null and alternative hypotheses.
\(\widehat{p}=\frac{98}{150}=.653\)
\(n \times \widehat{p} = 150 \times .653 = 98 \geq 10\), \(n \times (1-\widehat{p}) = 150(1-.653) = 52 \geq 10\)
Research question: Is this city’s proportion of overweight individuals different from .690? This is a non-directional test because our question states that we are looking for a difference as opposed to a specific direction. This will be a two-tailed test.
\(H_{0}:p=.690\)
\(H_{a}:p\neq.690\)
2. Calculate an appropriate test statistic.
We are testing the proportion of one group.
\( z= \frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}} =\frac{.653- .690 }{\sqrt{\frac{.690 (1- .690)}{150}}} = -0.980 \)
Our test statistic is \(z=-0.980\)
3. Determine the p-value associated with the test statistic.
If we look up \(z=-0.980\) on the standard normal table, we find that .1635 lies below this z value. Because this is a two-tailed test, we must multiply this proportion by 2 in order to take into account both the left and right tails.
\(p\;value=.1635\times2=.327\)
Our p-value is .327. Given that the null hypothesis is true (i.e., if the population from which this sample was drawn had a proportion of .690 overweight citizens), the probability that we would obtain a sample with a proportion that was this or more different from .690 is .327.
In Minitab Express, we could find the proportion of a normal curve beyond \(\pm0.980\).
Minitab Express gives the proportion in each tail; again, this proportion must be multiplied by 2 in order to obtain the total shaded region. \(p\;value=.163543\times2=.327086\)
4. Decide between the null and alternative hypotheses.
\(\alpha=.05\) and \(p\;value=.327\)
\(p>\alpha\), therefore we fail to reject the null hypothesis
5. State a "real world" conclusion.
There is not sufficient evidence to state that the proportion of citizens of this city who are overweight is different from the national average of .690.
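For readers who want to check the whole example with software other than Minitab Express, here is a minimal Python sketch of steps 2 through 4. The function name `one_proportion_test` and the layout of its return value are assumptions made for illustration. The code works from the unrounded sample proportion \(98/150\), so its test statistic and p-value come out slightly different from the hand calculation above, which rounds \(\widehat{p}\) to .653; the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_test(successes, n, p0, tail="two", alpha=0.05):
    """One-proportion z test.

    tail: "left", "right", or "two" (non-directional)
    Returns a dict with the sample proportion, z, p-value, and decision.
    """
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # step 2: test statistic

    cdf = NormalDist().cdf                       # standard normal CDF
    if tail == "left":
        p = cdf(z)
    elif tail == "right":
        p = 1 - cdf(z)
    elif tail == "two":
        p = 2 * cdf(-abs(z))                     # step 3: both tails
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    return {
        "p_hat": p_hat,
        "z": z,
        "p_value": p,
        "reject_null": p <= alpha,               # step 4: decision
    }

# Overweight citizens example: 98 of 150 sampled, hypothesized proportion .690
result = one_proportion_test(successes=98, n=150, p0=0.690, tail="two")
print(result)
```

Steps 1 and 5 (checking assumptions, writing the hypotheses, and stating the real-world conclusion) still require judgment about the research question and are left to the reader.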
On Your Own
Before continuing to the next section, take some time to work through the problems below on your own.
1. According to the Centers for Disease Control and Prevention (CDC), 17.7% of children 6-11 years of age are obese (see http://www.cdc.gov/nchs/fastats/obesity-overweight.htm). The parent-teacher association at one elementary school wants to find out if the obesity rate at their school is higher than the national average. They took a random sample of 50 children 6-11 years of age from their school and found that 13 were obese. Use the five-step hypothesis testing approach to answer the parent-teacher association’s question.
2. According to the Centers for Disease Control and Prevention (CDC), 49.4% of American adults 18 years old and over meet the Physical Activity Guidelines for aerobic physical activity (see http://www.cdc.gov/nchs/fastats/exercise.htm). The owners of a gym want to know if the proportion of their members who meet these guidelines is different from the national average. They took a random sample of 40 members and found that 28 met the CDC’s guidelines. Use the five-step hypothesis testing approach to answer their question.