9.1 - Chi-Square Test of Independence


Unit Summary

  • Chi-square Test of Independence
  • Finding Expected Counts from Observed Counts
  • Minitab Steps for Chi-square Test for Independence
  • Condition for Using the Chi-square Test
  • A 2 × 2 Table: Special Case of Chi-square Test Similar to Z-Test of Two Independent Proportions

Reading Assignment
An Introduction to Statistical Methods and Data Analysis, (see Course Schedule).

 

Chi-square Test of Independence

How do we test the independence of two categorical variables? We use the Chi-square test of independence. As with all prior statistical tests, we need to define null and alternative hypotheses.  Also, as we have learned, the null hypothesis is what is assumed to be true until we have evidence against it.  In this lesson, we are interested in researching whether two categorical variables are related or associated (i.e. dependent).  Therefore, until we have evidence to suggest that they are, we must assume that they are not.  This is the motivation behind the hypotheses for the Chi-square Test of Independence:

\(H_0\): In the population, the two categorical variables are independent.
\(H_a\): In the population, the two categorical variables are dependent.

[NOTE: There are several ways to phrase these hypotheses.  Instead of using the words "independent" and "dependent", one could say "there is no relationship between the two categorical variables" versus "there is a relationship between the two categorical variables".  The important part is that the null hypothesis refers to the two categorical variables not being related, while the alternative states that they are related.]

Once we have gathered our data we summarize the data in the two-way contingency table.  This table represents the observed counts and is called the Observed Counts Table or simply the Observed Table.  The contingency table on the introduction page to this lesson represented the observed counts of the party affiliation and opinion for those surveyed. The question becomes, "How would this table look if the two variables were not related?"  That is, under the null hypothesis that the two variables are independent, what would we expect to find in our data if the two variables (e.g. Party Affiliation and Opinion) were not related?  We need to find what is called the Expected Counts Table or simply the Expected Table.  This table displays what the counts would be for our sample data if there were no association between the variables.

Finding Expected Counts from Observed Counts

Once we have the observed counts, we need to compute the expected counts under the null hypothesis that the two categorical variables are independent.  This is done using the marginal totals and the overall total.  In words, to find the expected count for a cell in the table, we multiply that cell's row total by its column total and divide by the overall total.  As a formula, for each cell this is:

\[E=\frac{row\ total \times column\ total}{sample\ size} \]

To demonstrate, we will use the Party Affiliation and Opinion on Tax Reform example.

Observed Table:

  favor indifferent opposed total
democrat 138 83 64 285
republican 64 67 84 215
total 202 150 148 500

Calculating Expected Counts from Observed Counts

  favor indifferent opposed total
democrat \[\frac{285\cdot 202}{500}=115.14\] \[\frac{285\cdot 150}{500}=85.5\] \[\frac{285\cdot 148}{500}=84.36\] 285
republican \[\frac{215\cdot 202}{500}=86.86\] \[\frac{215\cdot 150}{500}=64.5\] \[\frac{215\cdot 148}{500}=63.64\] 215
total 202 150 148 500

To better understand what these expected counts represent, first recall that the expected counts table is designed to reflect what the sample data counts would be if the two variables were independent.  Taking what we know of independent events, we would be saying that the sample counts should show similar opinions on tax reform for democrats and republicans.  If you find the proportion for each cell by dividing a cell's expected count by its row total, you will discover that in the expected table each opinion proportion is the same for democrats and republicans.  That is, from the expected counts, 0.404 of the democrats and 0.404 of the republicans favor the bill; 0.3 of the democrats and 0.3 of the republicans are indifferent; and 0.296 of the democrats and 0.296 of the republicans are opposed.
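For readers who want to verify these expected counts outside of Minitab, here is a minimal sketch in Python/NumPy (added for illustration; it is not part of the original lesson or the Minitab workflow):

```python
# Minimal sketch: expected counts for the Party Affiliation and Opinion table.
# Assumes NumPy is available; values match the expected counts table above.
import numpy as np

# Observed counts: rows = democrat, republican; columns = favor, indifferent, opposed
observed = np.array([[138, 83, 64],
                     [64, 67, 84]])

row_totals = observed.sum(axis=1)    # [285, 215]
col_totals = observed.sum(axis=0)    # [202, 150, 148]
grand_total = observed.sum()         # 500

# Expected count for each cell = (row total x column total) / sample size
expected = np.outer(row_totals, col_totals) / grand_total
print(expected)
# [[115.14  85.5   84.36]
#  [ 86.86  64.5   63.64]]
```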

The statistical question becomes, "Are the observed counts so different from the expected counts that we can conclude a relationship between the two variables?"  To conduct this test we compute a Chi-square test statistic in which we compare each cell's observed count to its respective expected count.  This Chi-square test statistic is calculated as follows:

\[\chi^{2*}=\sum \frac{(O_i-E_i)^2}{E_i} \]

As we have done with other statistical tests, we make our decision by either comparing the value of the test statistic to a critical value (rejection region approach), or by finding the probability of getting this test statistic value or one more extreme (p-value approach).  The critical value for our Chi-square test is \(\chi^2_{\alpha}\) with degrees of freedom = (r - 1)(c - 1), while the p-value is found by \(P(\chi^2>\chi^{2*})\) with degrees of freedom = (r - 1)(c - 1), where r is the number of rows and c is the number of columns in the contingency table.

Calculating the test statistic by hand:

\[\chi^{2*}=\frac{(138-115.14)^2}{115.14}+\frac{(83-85.50)^2}{85.50}+\frac{(64-84.36)^2}{84.36}+\frac{(64-86.86)^2}{86.86}+\frac{(67-64.50)^2}{64.50}+\frac{(84-63.64)^2}{63.64}=22.152\]

with degrees of freedom equal to (2 - 1)(3 - 1) = 2.
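As a quick check on the by-hand arithmetic, the following Python sketch (illustrative only, assuming SciPy is installed) computes the same test statistic, degrees of freedom, critical value, and p-value:

```python
# Sketch of the Chi-square test of independence computed from the formulas above.
import numpy as np
from scipy.stats import chi2

observed = np.array([[138, 83, 64],
                     [64, 67, 84]], dtype=float)
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# Test statistic: sum over all cells of (O - E)^2 / E
chi_sq = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom = (r - 1)(c - 1)
r, c = observed.shape
df = (r - 1) * (c - 1)

critical_value = chi2.ppf(1 - 0.05, df)   # rejection region approach, alpha = 0.05
p_value = chi2.sf(chi_sq, df)             # p-value approach: P(chi-square > chi_sq)

print(round(chi_sq, 3), df)               # 22.152 2
print(round(critical_value, 3), p_value)  # 5.991 ~1.5e-05
```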

Example: Political Affiliation and Opinion on Tax Reform

Let's apply the Chi-square Test of Independence to our example where we have a random sample of 500 U.S. adults who are questioned regarding their political affiliation and opinion on a tax reform bill. We will test whether political affiliation and opinion on the tax reform bill are dependent at a 5% level of significance. The observed contingency table (political_affiliation.txt) is given below.  We also often want to include each cell's expected count and its contribution to the Chi-square test statistic, which the software can provide.

  favor indifferent opposed total
democrat 138 83 64 285
republican 64 67 84 215
total 202 150 148 500

Minitab Steps for Chi-square Test for Independence

  1. The command in Minitab is: Stat > Tables > Chi-Square Test for Association 
  2. If you have summarized data (i.e. observed counts), choose "Summarized data in a two-way table" from the drop-down box, then select and enter the columns that contain the observed counts. Otherwise, if you have the raw data, choose "Raw data (categorical variables)". Note that if using the raw data, your data will need to consist of two columns: one with the explanatory variable data (goes in the 'Rows' field) and one with the response variable data (goes in the 'Columns' field). 
  3. Labeling (Optional) When using the summarized data you can label the rows and columns if you have the variable labels in columns of the worksheet.  For example if we have a column with the two political party affiliations and a column with the three opinion choices we could use these columns to label the output.
  4. Click the Statistics tab.  Keep the four boxes that are already checked, but also check the box for "Each cell's contribution to the chi-square". Click OK.
  5. Click OK.

SPECIAL NOTE: If you have the observed counts in a table, you can copy/paste them into Minitab.  For instance, you can copy the entire observed counts table (excluding the totals!) for our example and paste it into Minitab starting with the first empty cell of a column. 

Cell Contents: Count, Expected count, Contribution to Chi-square

 
              favor   indiffer   opposed     All
  1             138         83        64     285
             115.14      85.50     84.36
             4.5386     0.0731    4.9138

  2              64         67        84     215
              86.86      64.50     63.64
             6.0163     0.0969    6.5137

  All           202        150       148     500

Pearson Chi-Sq = 4.539 + 0.073 + 4.914 + 6.016 + 0.097 + 6.514 = 22.152 DF = 2, P-Value = 0.000 

Likelihood Ratio Chi-Square (IGNORE THIS: ignore the likelihood ratio p-value. The p-value highlighted above is calculated using the methods we learned in this lesson; more specifically, the chi-square we learned is referred to as the Pearson Chi-square. The likelihood ratio test uses a different method than what we explained in this lesson to calculate a test statistic and p-value, one that incorporates a log of the ratio of observed to expected values.  It is just a different technique that is more complicated to do by hand.  Minitab automatically includes both results in its output.)

The Chi-square test statistic is 22.152 and is calculated by summing all of the individual cells' Chi-square contributions:

4.539 + 0.073 + 4.914 + 6.016 + 0.097 + 6.514 = 22.152

The p-value is found by \(P(\chi^2>22.152)\) with degrees of freedom = (2-1)(3-1) = 2.  Minitab calculates this p-value to be less than 0.001 and reports it as 0.000.  Since this p-value of 0.000 is less than our alpha of 0.05, we reject the null hypothesis that political affiliation and opinion on the tax reform bill are independent. We conclude that they are dependent, that is, there is an association between the two variables.
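If you do not have Minitab available, the same results can be reproduced in one call; the sketch below (an illustration, assuming SciPy) returns the Pearson test statistic, p-value, degrees of freedom, and expected counts reported above:

```python
# Sketch: the full test in one call with scipy.stats.chi2_contingency.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[138, 83, 64],
                     [64, 67, 84]])

stat, p_value, df, expected = chi2_contingency(observed)
print(round(stat, 3), df, p_value)  # 22.152 2 p < 0.001 (reported by Minitab as 0.000)
print(expected)                     # 115.14, 85.50, 84.36, 86.86, 64.50, 63.64
```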


Using Minitab

Click on this link to follow along with how to perform a Chi-Square test of independence from summarized data in Minitab.

Click on the 'Minitab Movie' icon to display a walkthrough of 'Using Minitab to Perform a Chi-Square Test of Independence from Summarized Data'.

Condition for Using the Chi-square Test

Exercise caution when there are small expected counts. Minitab will give a count of the number of cells that have expected frequencies less than five. Some statisticians hesitate to use the chi-square test if more than 20% of the cells have expected frequencies below five, especially if the p-value is small and these cells give a large contribution to the total chi-square value.

Example: Tire Quality

The operations manager of a company that manufactures tires wants to determine whether there are any differences in the quality of workmanship among the three daily shifts. She randomly selects 496 tires and carefully inspects them. Each tire is either classified as perfect, satisfactory, or defective, and the shift that produced it is also recorded. The two categorical variables of interest are: shift and condition of the tire produced. The data (shift_quality.txt) can be summarized by the accompanying two-way table. Do these data provide sufficient evidence at the 5% significance level to infer that there are differences in quality among the three shifts?

  Perfect Satisfactory Defective Total
Shift 1 106 124 1 231
Shift 2 67 85 1 153
Shift 3 37 72 3 112
Total 210 281 5 496

Minitab output:

Chi-square Test
Expected counts are printed below observed counts

 
             C1       C2      C3    Total
  1         106      124       1      231
          97.80   130.87    2.33

  2          67       85       1      153
          64.78    86.68    1.54

  3          37       72       3      112
          47.42    63.45    1.13

  Total     210      281       5      496

Chi-Sq = 8.647 DF = 4, P-Value = 0.071

Note: 3 cells with expected counts less than 5.0.

In the above example, we do not have a significant result at the 5% significance level since the p-value (0.071) is greater than 0.05. Even if we did have a significant result, we still could not trust it, because 3 of the 9 cells (33.3%) have expected counts less than 5.0.
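Checking this condition is easy to automate; the short sketch below (added for illustration, assuming SciPy) reproduces the tire quality test and counts the cells with expected frequencies below five:

```python
# Sketch: tire quality example with a check of the expected-count condition.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[106, 124, 1],    # Shift 1: perfect, satisfactory, defective
                     [67,  85,  1],    # Shift 2
                     [37,  72,  3]])   # Shift 3

stat, p_value, df, expected = chi2_contingency(observed)
print(round(stat, 3), df, round(p_value, 3))   # 8.647 4 0.071

n_small = (expected < 5).sum()
print(n_small, "of", expected.size, "cells have expected counts below 5")
# 3 of 9 cells (33.3%) -> more than 20%, so treat the chi-square result with caution
```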

CAUTION: Sometimes researchers will classify quantitative data into categories (e.g. take height measurements and categorize them as 'below average', 'average', and 'above average').  Doing so results in a loss of information - one cannot take the categories and reproduce the raw quantitative measurements.  Instead of categorizing, the data should be analyzed using quantitative methods.

The 2 × 2 Table: Chi-square Test Analogous to Z-Test of Two Independent Proportions

Say we have a study of two categorical variables, each with only two levels.  One of the response levels is considered the "success" response and the other the "failure" response.  A general 2 × 2 table of the observed counts would be as follows:

  Success Failure Total
Group 1 A B A + B
Group 2 C D C + D

The observed counts in this table represent the following proportions:

  Success Failure Total
Group 1 \(\frac{A}{A+B}=\hat{p_1}\) \(1-\hat{p_1}\) A + B
Group 2 \(\frac{C}{C+D}=\hat{p_2}\) \(1-\hat{p_2}\) C + D

Recall from our Z-test of two proportions that the null hypothesis is that the two population proportions, \(p_1\) and \(p_2\), are assumed equal, while the alternative hypothesis is that they are not equal.  This null hypothesis is analogous to the two variables being independent.  Also, if the two success proportions are equal, then the two failure proportions are also equal. Note as well that with our Z-test the conditions were that the number of successes and failures for each group was at least 5.  That equates to the Chi-square condition that all expected cells in a 2 × 2 table be at least 5.  (Remember that at least 80% of all cells need an expected count of at least 5.  With 80% of 4 equal to 3.2, this means all 4 cells must satisfy the condition.) 

When we run a Chi-square test of independence on a 2 × 2 table, the resulting Chi-square test statistic is equal to the square of the Z-test statistic from the Z-test of two independent proportions.  Consider the following example, where we form a 2 × 2 table for Political Party and Opinion by only considering the Favor and Opposed responses:

  favor oppose Total
democrat 138 64 202
republican 64 84 148
Total 202 148 350

The Chi-square test produces a test statistic of 22.00 with a p-value of 0.000.

The Z-test comparing the two sample proportions \(\hat{p_d}=\frac{138}{202}=0.683\) and \(\hat{p_r}=\frac{64}{148}=0.432\) results in a Z-test statistic of 4.69 with a p-value of 0.000.  If we square the Z-test statistic, we get \(4.69^2 = 21.99\), or 22.00 allowing for rounding error.
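To see this equivalence numerically, the sketch below (illustrative, assuming SciPy; the continuity correction is turned off to match the formula used in this lesson) runs both tests on the 2 × 2 table above and confirms that the squared Z statistic equals the Chi-square statistic:

```python
# Sketch: two-proportion Z-test vs. Chi-square test on the 2 x 2 table.
import numpy as np
from scipy.stats import chi2_contingency, norm

observed = np.array([[138, 64],      # democrat: favor, opposed
                     [64, 84]])      # republican: favor, opposed

# Two-proportion Z-test with the pooled proportion
n1, n2 = observed.sum(axis=1)        # 202, 148
x1, x2 = observed[:, 0]              # "successes" = favor: 138, 64
p1, p2 = x1 / n1, x2 / n2            # 0.683, 0.432
p_pool = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Pearson Chi-square without the Yates continuity correction
chi_sq, p_chi, df, _ = chi2_contingency(observed, correction=False)

print(round(z, 2), round(z ** 2, 2), round(chi_sq, 2))  # 4.69 22.0 22.0
print(2 * norm.sf(abs(z)), p_chi)                       # both p-values ~0.000003
```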

So which test is better when we have a 2 × 2 table?  They are the same from a statistical decision standpoint: a significant Chi-square test would be similar to concluding a difference in the two proportions.  The benefit of the two-proportion test is that we can calculate a confidence interval for the difference to generate an estimate of just how large the difference might be.