The chi squared test of independence is usually used to check whether two categorical (or nominal) variables are independent. However, when your dataset consists of several nominal variables, you can still apply the test to each pair of variables to answer the more general question "are any of these variables related to one another?"
KeyConnect, an online chi squared calculator, implements a solution to this in 5 steps for data consisting of several nominal or categorical variables. The table below shows an example of such a dataset.
Keep in mind that a typical application of the chi squared test requires building a contingency table from the data. KeyConnect automates this step and lets you view the contingency table for every pair of variables, which significantly reduces the effort of applying the chi squared test.
Step 1: Identify number of classes (C) in each category or variable
In the above example, we have 3 classes for variable 1, “Income Class”, thus C1 = 3. Similarly C2 = 4, C3 = 4, C4 = 4. KeyConnect identifies the number of classes (C) for each variable automatically.
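As a rough illustration of Step 1, the class count for each variable can be obtained by collecting the distinct values in each column. This is only a sketch: the rows, the column names "income" and "children", and the helper count_classes are made up for illustration and are not KeyConnect's data or code.

```python
# Hypothetical rows; the values are illustrative, not the article's dataset.
rows = [
    {"income": "L", "children": "0"},
    {"income": "M", "children": "1"},
    {"income": "H", "children": "2 or 3"},
    {"income": "L", "children": "4+"},
    {"income": "M", "children": "0"},
]


def count_classes(rows):
    """Return {variable name: number of distinct classes C}."""
    classes = {}
    for row in rows:
        for name, value in row.items():
            classes.setdefault(name, set()).add(value)
    return {name: len(values) for name, values in classes.items()}


class_counts = count_classes(rows)
print(class_counts)  # income has 3 classes, children has 4
```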
Step 2: Build contingency tables (T) for all pairs of variables
The next step of the computation is selecting a pair of variables (columns in the spreadsheet) and creating a contingency table for that pair. For a dataset with p variables, the total number of contingency tables is p * (p-1)/2. This example has 4 variables, so we need 4 * 3/2 = 6 contingency tables.
KeyConnect starts by building a contingency table between “Income Class” and “# of Children”. Before doing so, it computes the degrees of freedom, df = (C1 - 1) * (C2 - 1), which here is (3 - 1) * (4 - 1) = 6. Next it creates a blank table with C1 (= 3) columns and C2 (= 4) rows. This table is populated by counting the number of observations that fall in each cell, for example the number of cases where Income Class = L and # of Children = 2 or 3, and so on.
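The pair count is easy to check by enumerating the pairs directly. In this small sketch only the first two variable names come from the example. "Variable 3" and "Variable 4" are placeholders, since the other two names are not given here.

```python
from itertools import combinations

# Two names from the example plus two placeholder names (assumed).
variables = ["Income Class", "# of Children", "Variable 3", "Variable 4"]

# Every unordered pair of variables gets its own contingency table.
pairs = list(combinations(variables, 2))

p = len(variables)
assert len(pairs) == p * (p - 1) // 2  # 4 * 3 / 2 = 6

for first, second in pairs:
    print(first, "vs", second)
```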
Step 2 is repeated for all 6 contingency tables.
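For one pair of variables, the counting described in Step 2 might look like the sketch below. The observations and the helper contingency_table are assumptions made for illustration. The layout follows the description above, with the first variable's classes as columns and the second variable's classes as rows.

```python
from collections import Counter

# Hypothetical (income class, number of children) observations.
observations = [
    ("L", "0"), ("L", "0"), ("M", "1"),
    ("H", "2 or 3"), ("H", "4+"), ("M", "0"),
]


def contingency_table(pairs):
    """Count observations per cell: columns = first variable, rows = second."""
    counts = Counter(pairs)
    col_labels = sorted({first for first, _ in pairs})
    row_labels = sorted({second for _, second in pairs})
    table = [[counts[(col, row)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


row_labels, col_labels, table = contingency_table(observations)

# Degrees of freedom: (C1 - 1) * (C2 - 1)
df = (len(col_labels) - 1) * (len(row_labels) - 1)

print(col_labels)
for label, row in zip(row_labels, table):
    print(label, row)
print("df =", df)
```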
Step 3: Compute margin sums, frequencies and observed Chi squared values for each contingency table
This follows the standard chi squared test procedure as described in earlier articles and the eBook. At the end of this step, we will have 6 tables (for the above case) similar to the one shown below.
The sum of the numbers from all the cells in this table gives the observed chi squared value for the two variables: Income Class and # of Children. The next step is to check whether this observed value is less than a critical value.
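In the standard procedure, each cell contributes (observed - expected)^2 / expected, where expected = row sum * column sum / grand total, and the contributions are added up. The sketch below shows this under the assumption that every row and column has at least one observation, since otherwise an expected count would be zero. The function name chi_squared and the 2 x 2 counts are illustrative only.

```python
def chi_squared(table):
    """Observed chi squared value for a contingency table (list of rows)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_sums[i] * col_sums[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Illustrative counts: every expected count is 15, so each cell adds 25/15.
example = [[10, 20],
           [20, 10]]
ob = chi_squared(example)
print(round(ob, 4))
```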
Step 4: Compare Summation of Observed chi squared value (Ob) to critical chi squared value (Cr)
A user input taken along with the raw data is the significance level (alpha), sometimes expressed instead as a confidence level (1 - alpha). KeyConnect offers several commonly used alpha levels that the user can pick from a drop-down menu. This is done right after loading the raw data into KeyConnect.
For the chosen alpha level, KeyConnect then compares the observed chi squared summation (Ob) to the critical value (Cr), which is obtained from standard statistical tables stored internally within KeyConnect. The critical chi squared value depends on the degrees of freedom, df (computed earlier in Step 2), and the alpha level.
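A stored table of critical values can be as simple as a lookup keyed by alpha and df. The sketch below holds the standard values for alpha = 0.05 and 0.01 up to df = 9. How KeyConnect actually stores its tables is not described here, so the structure and the name critical_value are assumptions.

```python
# Standard chi squared critical values, indexed by alpha, then by df.
CRITICAL_VALUES = {
    0.05: {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
           6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919},
    0.01: {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277, 5: 15.086,
           6: 16.812, 7: 18.475, 8: 20.090, 9: 21.666},
}


def critical_value(df, alpha=0.05):
    """Look up Cr; raises KeyError if alpha or df is not in the table."""
    return CRITICAL_VALUES[alpha][df]


# Income Class (3 classes) vs # of Children (4 classes): df = 2 * 3 = 6
cr = critical_value(6, alpha=0.05)
print(cr)
```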
Step 5: Determine if a connection must be made between the variables
If the observed chi squared summation Ob < critical value Cr, then V1 and V2 are treated as NOT related, hence no connection is made. If Ob > Cr, then V1 and V2 are treated as related, hence a connection is established. These connections are visualized using the circle connect chart, similar to the one shown here.
As you can see, some of the lines may be thicker than others. This indicates the "strength of connection", or degree of dependence, calculated using this simple formula: (Ob - Cr)/Ob. Because a connection is drawn only when Ob > Cr, this value always lies between 0 and 1.
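The decision rule and the strength formula fit in a few lines. In this sketch the function name connection_strength and the sample numbers are illustrative. The case Ob = Cr is not covered by the rule above, so treating it as "no connection" is an assumption.

```python
def connection_strength(ob, cr):
    """Return (Ob - Cr) / Ob if a connection is made, otherwise None."""
    if ob <= cr:  # assumption: Ob == Cr counts as no connection
        return None
    return (ob - cr) / ob


# Illustrative values: Ob = 20.0 against Cr = 12.592 (df = 6, alpha = 0.05)
strength = connection_strength(20.0, 12.592)
print(strength)                           # about 0.37, so a line is drawn
print(connection_strength(5.0, 12.592))   # None, so no line is drawn
```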
KeyConnect also shows a table of all chi squared values computed as described in another article on this blog.