### 5 simple steps to apply the chi-square test for business analytics

In a previous article, we talked about the background of the **chi-square test of independence** and how it can address common business analytics problems. In this article, we demonstrate the actual usage with an example. Remember that the **chi-square test of independence** helps to find out if two business parameters, X and Y, are related to or are independent of each other.

**Business Problem:**

A specialty retail chain wants to determine whether its strategy for changing the product mix has resulted in increased revenues. Its products are categorized into eight types according to price range. The category prices range from $30 per item to $120+ per item. Management decided that, in order to increase sales, they needed to reduce their higher-priced inventory ($120+ range) by 50%.

**Question:**

Based on the data shown below, has their strategy worked?

**The 5-step solution process:**

**Step 1**: Identify the X's and Y's

This is the most important step, because the steps that follow are simply an algorithm that any tool can run through. Convention dictates that X's are usually the parameters that can be changed or controlled. In this case, the X is the strategy, and its data are the columns, which represent all sales before the strategy change and after the strategy change. The Y's are therefore the sales by category, whose data are the rows, which represent the different price categories.

**Step 2:** Calculate the margin summations.

Simply sum all rows and columns and enter these sums on the "margins".
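As a rough illustration of this step, the sketch below sums the rows and columns of a small table in plain Python. The counts are hypothetical (three price categories instead of eight) and are not the retailer's data; the variable names are assumptions made for this example.

```python
# Hypothetical unit sales, NOT the article's data.
# Rows: price categories. Columns: (before strategy change, after strategy change).
observed = [
    [10, 20],
    [30, 30],
    [60, 50],
]

row_totals = [sum(row) for row in observed]        # one total per price category
col_totals = [sum(col) for col in zip(*observed)]  # one total per period
grand_total = sum(row_totals)                      # total of all cells

print(row_totals)   # [30, 60, 110]
print(col_totals)   # [100, 100]
print(grand_total)  # 200
```

The row and column totals are the "margins"; the grand total is needed in the next step.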

**Step 3:** Complete the contingency table.

The contingency table has the same dimensions as the data. Each entry is the count we would expect if X and Y were independent, that is, (row total × column total) / grand total, as shown in the diagram.
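A minimal sketch of this step, assuming the standard chi-square convention that each entry is the expected count under independence (row total times column total, divided by the grand total). The function name `expected_counts` and the sample counts are hypothetical.

```python
def expected_counts(observed):
    """Return the table of counts expected if rows and columns were independent."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts, NOT the article's data.
observed = [[10, 20], [30, 30], [60, 50]]
expected = expected_counts(observed)
print(expected)  # [[15.0, 15.0], [30.0, 30.0], [55.0, 55.0]]
```

Note that the expected table keeps the same margins as the observed one; only the distribution inside the table changes.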

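**Step 4:** Calculate the observed chi-square value based on the contingency table.

For every cell, take (observed - expected)² / expected, then add up the results over all cells.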
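The calculation can be sketched as follows, assuming the usual Pearson form of the statistic: the sum over all cells of (observed - expected)² / expected. The function name and the numbers are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical observed counts and their expected counts under independence.
observed = [[10, 20], [30, 30], [60, 50]]
expected = [[15.0, 15.0], [30.0, 30.0], [55.0, 55.0]]

chi2_observed = chi_square_statistic(observed, expected)
print(round(chi2_observed, 3))  # 4.242
```

Cells where the observed count matches the expected count contribute nothing; the statistic grows only where the two tables disagree.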

**Step 5:** Use standard tables to compare the **observed chi-square** with the **critical value of chi-square** for the problem's **degrees of freedom** and significance level (alpha, which is 1 minus the confidence level).

The degrees of freedom are simply (number of rows - 1)*(number of columns - 1) in our original data table.

df = (8-1)*(2-1) = 7

Let us use a 90% level of confidence, which means alpha = 0.1. For df = 7 and alpha = 0.1, standard tables give a critical chi-square value of 12.017.

- If observed chi-square < critical chi-square, then we cannot reject independence (the data do not show that the variables are related).
- If observed chi-square > critical chi-square, then the variables are not independent (and hence may be related).
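The decision rule can be sketched in a few lines. The lookup table below holds critical values at alpha = 0.10 as listed in standard chi-square tables; the names `CRITICAL_ALPHA_010`, `degrees_of_freedom`, and `rejects_independence` are hypothetical, and the observed value of 4.242 is an illustrative number for a 3 x 2 table, not the retailer's result.

```python
# Critical chi-square values at alpha = 0.10, keyed by degrees of freedom,
# taken from a standard chi-square table.
CRITICAL_ALPHA_010 = {
    1: 2.706, 2: 4.605, 3: 6.251, 4: 7.779,
    5: 9.236, 6: 10.645, 7: 12.017,
}


def degrees_of_freedom(n_rows, n_cols):
    """(number of rows - 1) * (number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def rejects_independence(chi2_observed, df):
    """True if the observed statistic exceeds the critical value at alpha = 0.10."""
    return chi2_observed > CRITICAL_ALPHA_010[df]


# The article's table has 8 price categories and 2 periods.
df = degrees_of_freedom(8, 2)
print(df, CRITICAL_ALPHA_010[df])  # 7 12.017

# Illustrative statistic for a hypothetical 3 x 2 table (df = 2).
print(rejects_independence(4.242, degrees_of_freedom(3, 2)))  # False
```

If SciPy is available, `scipy.stats.chi2.ppf(1 - alpha, df)` should return the same critical values without a printed table.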

## Comments

In other words, X and Y are not related. More precisely, the data don't show that X and Y are related. Failure to reject the null is not the same as being able to accept the null; it might be the case that the experiment was simply inadequate. Perhaps the real relationship will emerge with more data.