Hypothesis Testing for a Proportion

Ultimately we will measure statistics (e.g. sample proportions and sample means) and use them to draw conclusions about unknown parameters (e.g. population proportion and population mean). This process of using statistics to make judgments or decisions regarding population parameters is called statistical inference.

Example 2 above produced a sample proportion of 47% heads and is written:

 \(\hat{p}\) [read "p-hat"] = 47/100 = 0.47

P-hat is called the sample proportion; remember that it is a statistic (soon we will look at sample means, \(\bar{x}\)). But how can p-hat be an accurate measure of p, the population parameter, when another sample of 100 coin flips could produce 53 heads? And for that matter, we only did 100 coin flips out of an unlimited number of possible flips!

The fact that these sample results vary in repeated random sampling is referred to as sampling variability. Sampling variability is acceptable because it follows a predictable pattern: if we took many samples of 100 coin flips, calculated the proportion of heads in each sample, and then constructed a histogram or boxplot of the sample proportions, the resulting shape would look approximately normal (i.e. bell-shaped) with a mean of 50%.
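The repeated-sampling idea can be tried out with a short simulation. This is only a sketch under stated assumptions: the function name, the choice of 10,000 samples, and the fixed seed are illustrative and not part of the original example, and the coin is assumed to be fair (p = 0.5).

```python
import random
import statistics

def simulate_sample_proportions(num_samples=10_000, flips_per_sample=100,
                                p_heads=0.5, seed=1):
    """Return the proportion of heads in each simulated sample of coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Count heads in one sample of flips_per_sample flips.
        heads = sum(rng.random() < p_heads for _ in range(flips_per_sample))
        proportions.append(heads / flips_per_sample)
    return proportions

proportions = simulate_sample_proportions()
print("mean of sample proportions:", round(statistics.mean(proportions), 3))
print("std. dev. of sample proportions:", round(statistics.stdev(proportions), 3))
```

Individual samples give different values of p-hat (0.47 in one, 0.53 in another), but the mean of all the sample proportions should land close to 0.50, and their standard deviation should be close to sqrt(0.5 × 0.5 / 100) = 0.05.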

[The reason we selected a simple coin flip as an example is that the concepts just discussed can be difficult to grasp, especially since earlier we mentioned that the population parameter value is rarely known. But most people accept that a coin will produce roughly equal numbers of heads and tails when flipped many times.]

A statistical hypothesis test is a procedure for deciding between two possible statements about a population. The phrase significance test means the same thing as the phrase "hypothesis test."

The two competing statements about a population are called the null hypothesis and the alternative hypothesis.

  • A typical null hypothesis is a statement that two variables are not related. Other examples are statements that there is no difference between two groups (or treatments) or that there is no difference from an existing standard value.
  • An alternative hypothesis is a statement that there is a relationship between two variables or there is a difference between two groups or there is a difference from a previous or existing standard.

NOTATION: The notation Ho represents a null hypothesis and Ha represents an alternative hypothesis. The symbol po is read as "p-naught" or "p-zero" and represents the null hypothesized value. Shortly, we will substitute μo for po when discussing a test of means.

Ho: p = po
Ha: p ≠ po     or     Ha: p > po     or     Ha: p < po    [Remember, only select one Ha]

The first Ha is called a two-sided test since "not equal" implies that the true value could be either greater than or less than the test value, po. The other two Ha are referred to as one-sided tests since they are restricting the conclusion to a specific side of po.

Example 3 – This is a test of a proportion:

A Tufts University study finds that 40% of 12th grade females feel they are overweight. Is this percentage lower for college age females? Let p = proportion of college age females who feel they are overweight. The competing hypotheses are:

Ho: p = .40 (or greater). That is, no difference from the Tufts study finding.
Ha: p < .40 (the proportion feeling they are overweight is lower for college age females).

Example 4 – This is a test of a mean:

Is there a difference between the mean amount that men and women study per week? Competing hypotheses are:

Null hypothesis: There is no difference between mean weekly hours of study for men and women, written in statistical language as μ1 = μ2
Alternative hypothesis: There is a difference between mean weekly hours of study for men and women, written in statistical language as μ1 ≠ μ2

This notation is used since the study would consider two independent samples: one from Women and another from Men.

Test Statistic and p-value

  • A test statistic is a summary of a sample that is in some way sensitive to differences between the null and alternative hypothesis.
  • A p-value is the probability that the test statistic would "lean" as much (or more) toward the alternative hypothesis as it does if the real truth is the null hypothesis. That is, the p-value is the probability of obtaining a sample statistic at least as extreme as the one observed, under the presumption that the null hypothesis is true.

A small p-value favors the alternative hypothesis. A small p-value means the observed data would not be very likely to occur if we believe the null hypothesis is true. So we believe in our data and disbelieve the null hypothesis. An easy (hopefully!) way to grasp this is to consider the situation where a professor states that you are just a 70% student. You doubt this statement and want to show that you are better than a 70% student. If you took a random sample of 10 of your previous exams and calculated the mean percentage of these 10 tests, which mean would be less likely to occur if in fact you were a 70% student (the null hypothesis): a sample mean of 72% or one of 90%? Obviously the 90% would be less likely and therefore would have a smaller probability (i.e. a smaller p-value).

Using the p-value to Decide between the Hypotheses

  • The significance level of a test is the border used for deciding between the null and alternative hypotheses.
  • Decision Rule: We decide in favor of the alternative hypothesis when a p-value is less than or equal to the significance level. The most commonly used significance level is 0.05.

In general, the smaller the p-value the stronger the evidence is in favor of the alternative hypothesis.
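The decision rule can be written as a small function. This is a minimal sketch: the function name `decide`, the wording of the returned strings, and the two p-values passed in are illustrative assumptions, not results from a study.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: favor Ha when the p-value is <= alpha."""
    if p_value <= alpha:
        return "reject Ho in favor of Ha"
    return "fail to reject Ho"

# Hypothetical p-values, chosen only to show both outcomes.
print(decide(0.03))  # reject Ho in favor of Ha
print(decide(0.20))  # fail to reject Ho
```

Note that a p-value exactly equal to the significance level still leads to a decision in favor of the alternative, matching the "less than or equal to" wording of the rule.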

Example 3 Continued:

In a recent elementary statistics survey, the sample proportion (of women) saying they felt overweight was 37/129 = .287. Note that this leans toward the alternative hypothesis that the "true" proportion is less than .40. [Recall that the Tufts University study finds that 40% of 12th grade females feel they are overweight. Is this percentage lower for college age females?]

Step 1: Let p = proportion of college age females who feel they are overweight.

Ho: p = .40 (or greater). That is, no difference from the Tufts study finding.
Ha: p < .40 (the proportion feeling they are overweight is lower for college age females).

Step 2:

If npo ≥ 10 and n(1 – po) ≥ 10 then we can use the following Z-test statistic. Since both (129) × (0.4) = 51.6 and (129) × (0.6) = 77.4 are greater than 10 [or consider that the number of successes and failures, 37 and 92 respectively, are at least 10], we calculate the test statistic by:

\[z=\frac{\hat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}\]

Note: In computing the Z-test statistic for a proportion we use the hypothesized value po, not the sample proportion p-hat, in calculating the standard error! We do this because we "believe" the null hypothesis to be true until evidence says otherwise.

\[z=\frac{0.287-0.40}{\sqrt{\frac{0.40(1-0.40)}{129}}}=-2.62\]
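The same calculation can be checked in a few lines of Python. This is a sketch, not the course's software procedure: the function name `z_statistic` and its argument names are assumptions made for illustration. It checks the sample-size condition first and uses po, not p-hat, in the standard error.

```python
import math

def z_statistic(successes, n, p0):
    """Z-test statistic for one proportion, using p0 in the standard error."""
    # Sample-size condition: n*p0 >= 10 and n*(1 - p0) >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("normal approximation condition not met")
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

z = z_statistic(37, 129, 0.40)
print(round(z, 2))  # -2.62
```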

Step 3: The p-value can be found from the Standard Normal Table.

Calculating p-value:

The method for finding the p-value is based on the alternative hypothesis:

2 × P(Z ≥ |z|) for Ha : p ≠ po, where |z| is the absolute value of z
P(Z ≥ z ) for Ha : p > po
P(Z ≤ z) for Ha : p < po

In our example we are using Ha : p < .40, so our p-value is found from P(Z ≤ z) = P(Z ≤ -2.62), and from the Standard Normal Table this is equal to 0.0044.
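In place of a printed table, the three p-value rules can be sketched with the standard normal distribution from Python's standard library (`statistics.NormalDist`, available in Python 3.8 and later). The function name `p_value` and the labels "less", "greater", and "two-sided" are illustrative assumptions.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value for a z statistic under the chosen alternative hypothesis."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "less":        # Ha: p < po  ->  P(Z <= z)
        return std_normal.cdf(z)
    if alternative == "greater":     # Ha: p > po  ->  P(Z >= z)
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":   # Ha: p != po ->  2 * P(Z >= |z|)
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

print(round(p_value(-2.62, "less"), 4))  # 0.0044
```

For the same z, the two-sided p-value is twice the matching one-sided value, which is why a two-sided test requires stronger evidence to reach the same significance level.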

Step 4: We compare the p-value to the significance level, alpha, which we set at 0.05. Since 0.0044 is less than 0.05, we reject the null hypothesis and decide in favor of the alternative, Ha.

Step 5: We’d conclude that the percentage of college age females who felt they were overweight is less than 40%. [Note: we are assuming that our sample, although not random, is representative of all college age females.]

The p-value = .004 indicates that we should decide in favor of the alternative hypothesis. Thus we decide that less than 40% of college women think they are overweight.

The "Z-value" (-2.62) is the test statistic. It is a standardized score for the difference between the sample proportion p-hat = .287 and the null hypothesis value p = .40. The p-value is the probability that the z-score would lean toward the alternative hypothesis as much as (or more than) it does if the true population proportion really was p = .40.

Using Software to Perform a One Proportion Test Analysis Using Raw Data

To perform a one proportion test analysis in Minitab using raw data:

  1. Open Minitab data set Class_Survey.MTW
  2. Go to Stat > Basic Stat > 1- proportion
  3. Click the radio button for Samples in Columns (this is the default)
  4. Click the text box under this title (cursor should be in this box)
  5. Select from the variables list the variable Gender (be sure the variable Gender appears in the text box)
  6. Check the box for Perform Hypothesis Test and enter 0.5 (note that for Minitab versions earlier than 15 this test is found under the Options)
  7. Click Options and select the correct Alternative (e.g. not equal to)
  8. Check the box for Use Test and Interval Based on Normal Distribution (remember to verify this use by checking that the number of successes and failures are at least ten)
  9. Click OK twice

[Figure: Minitab dialog windows for performing a one proportion test and calculating its confidence interval using raw data.]

[Figure: Minitab output for a one proportion test of the proportion of male students, showing a 95% confidence interval of (0.3734, 0.5027), a z-value of -1.86, and a p-value of 0.063.]

To perform a one proportion test analysis in Minitab Express using raw data:

  1. Open Minitab data set Class_Survey.MTW.
  2. From the menu bar, select Statistics > One Sample > Proportion.
  3. Double-click the variable Gender to insert it into the "Sample" box.
  4. Check the box next to "Perform hypothesis test" and enter 0.50 in the "Hypothesized proportion" box.
  5. Click the "Options" tab at the top of the window.
  6. Verify that the alternative hypothesis is "Proportion ≠ hypothesized value" and that the confidence level is 95. Select the "Normal approximation" method (remember to verify this use by checking that the number of successes and failures are at least 10).
  7. Click OK twice

The result should be the following output:

[Figure: Minitab Express output for a one proportion test.]

Using Software to Perform a Summarized One Proportion Test Analysis

To perform a summarized one proportion test analysis in Minitab:

  1. Open Minitab without data
  2. Go to Stat > Basic Stat > 1- proportion
  3. Click the radio button for Summarized Data
  4. Enter 37 for Number of Events and 129 for Number of Trials
  5. Check the box for Perform Hypothesis Test and enter 0.4 (note that for Minitab versions earlier than 15 this test is found under the Options)
  6. Click Options and select the correct Alternative (e.g. less than)
  7. Check the box for Use Test and Interval Based on Normal Distribution (remember to verify this use by checking that the number of successes and failures are at least ten)
  8. Click OK twice

[Figure: Minitab dialog windows for performing a one proportion test and calculating its confidence interval using summarized data.]

This should result in the following output:

[Figure: Minitab output for a one proportion test with summarized data, showing a sample p of 0.2868, a z-value of -2.62, and a p-value of 0.004.]
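For readers without Minitab, the summarized-data analysis can be cross-checked with a short Python sketch. This is not Minitab's implementation; the function name `one_proportion_z_test` and its arguments are assumptions for illustration, and it covers only the normal-approximation test (no confidence interval).

```python
import math
from statistics import NormalDist

def one_proportion_z_test(events, trials, p0, alternative="less"):
    """Normal-approximation test for one proportion from summarized counts."""
    p_hat = events / trials
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / trials)
    cdf = NormalDist().cdf(z)  # P(Z <= z) under the standard normal
    p_values = {
        "less": cdf,
        "greater": 1 - cdf,
        "two-sided": 2 * min(cdf, 1 - cdf),
    }
    return p_hat, z, p_values[alternative]

# 37 events out of 129 trials, hypothesized proportion 0.4, Ha: p < 0.4
p_hat, z, p = one_proportion_z_test(37, 129, 0.4, "less")
print(f"sample p = {p_hat:.4f}, z = {z:.2f}, p-value = {p:.3f}")
# sample p = 0.2868, z = -2.62, p-value = 0.004
```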

To perform a summarized one proportion test analysis in Minitab Express:

  1. Open Minitab Express without data.
  2. From the menu bar, select Statistics > One Sample > Proportion.
  3. From the drop-down menu, change "Sample data in a column" to "Summarized data".
  4. Enter 37 for Number of Events and 129 for Number of Trials.
  5. Check the box for Perform Hypothesis Test and enter 0.4.
  6. Click the Options tab, and select the Alternative hypothesis "Proportion < hypothesized value".
  7. Verify that the confidence level is 95, and the method is the normal approximation.
  8. Click OK.

This should result in the following output:

[Figure: Minitab Express output for a one proportion test analysis with summarized data.]
