A chi-square test for independence can give a clear answer about possible independence between two categorical variables. But by itself it tells you nothing beyond this, and that's a bit disappointing. In this video, I'll take the interpretation of the chi-squared test a step further and explain how you can learn more about the strength and pattern of the association between the two variables. You can interpret a value of the chi-squared statistic by comparing it with the chi-squared distribution with the appropriate choice of the degrees-of-freedom parameter. It answers the question whether the particular value of the statistic you found is exceptional. If that's the case, you find the null hypothesis of independence so unlikely that you reject it. Without further correction, however, the size of the chi-squared statistic cannot be interpreted by itself, and it doesn't tell you anything about the effect size, that is, the strength of the association. Several indices have been created to express the strength of association between two nominal variables, and the most popular is Cramér's V. This is the equation by which you can calculate it: you take the square root of the chi-squared statistic divided by the total number of observations times the index m, where m is the smaller of the number of rows minus one and the number of columns minus one. The value of Cramér's V ranges from zero to one, regardless of the size of the contingency table. A value of zero means that there is no association between the variables, and a value of one means that there is a perfect association. That is the case when you would know the category for one variable if you knew the category of the other. Let's calculate Cramér's V for a few contingency tables to see how it works out. This is a 3 by 2 table with no association: the chi-squared value is 0.5 and Cramér's V is 0.07. To learn about the pattern of the association, the residuals per cell can be used.
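The calculation just described can be sketched as a small NumPy function. The function name and the example tables below are my own illustration, not from the video; they just instantiate the formula V = sqrt(χ² / (n·m)) with n the total count and m the smaller of (rows − 1) and (columns − 1).

```python
import numpy as np

def cramers_v(table):
    """Cramér's V for a 2-D contingency table of observed counts."""
    table = np.asarray(table, dtype=float)
    n = table.sum()                            # total number of observations
    row = table.sum(axis=1, keepdims=True)     # row totals
    col = table.sum(axis=0, keepdims=True)     # column totals
    expected = row @ col / n                   # expected counts under independence
    chi2 = ((table - expected) ** 2 / expected).sum()
    m = min(table.shape[0] - 1, table.shape[1] - 1)
    return np.sqrt(chi2 / (n * m))
```

A perfectly associated table such as [[10, 0], [0, 10]] gives V = 1, while a table whose rows are proportional, such as [[5, 5], [5, 5]], gives V = 0.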
The residuals per cell are the differences between the observed and expected frequencies. However, these have to be standardized somehow, because a residual of 1 in a cell with an expected value of 2 is, relatively, a lot bigger than a residual of 1 in a cell with an expected value of 200. If we divide each residual by the standard error of the sampling distribution of the residual, we get what we want. The standard error is calculated by multiplying the expected count in the cell by one minus its marginal column probability and by one minus its marginal row probability, and subsequently taking the square root. The resulting standardized residuals follow a z-distribution, so their values can be interpreted directly as the number of standard deviations the observed frequency is away from the expected frequency. Let's apply the analysis with standardized residuals to this table. As you can see, the values in the second row are much higher than those in the first. Now, if you calculate the expected values, and next the residuals per cell, you can see that, for instance, this cell and the cell right below it have comparable residuals, whereas in this cell the residual is almost three times as small as here. Let's go on to calculate the standardized residuals by dividing each residual by its standard error. It turns out that in the first pair of cells we considered, with comparable residuals, the standardized residual in the second row is a lot smaller. In the second pair of cells, on the other hand, with quite different residuals, the standardized residuals are in fact quite comparable. The table with standardized residuals shows that especially these cells in the third row, with values of 1.6 and minus 2.1, have observed values that deviate a lot from what would be expected if the variables were independent. Let me summarize what I explained in this video.
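The standardization step can be sketched in the same style, assuming the standard error described above: sqrt(E · (1 − row probability) · (1 − column probability)). The function name and the example table are illustrative, not from the video.

```python
import numpy as np

def standardized_residuals(table):
    """Standardized (adjusted) residuals for a contingency table of counts."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)     # row totals
    col = table.sum(axis=0, keepdims=True)     # column totals
    expected = row @ col / n                   # expected counts under independence
    # Standard error per cell: sqrt(E * (1 - row prob) * (1 - col prob))
    se = np.sqrt(expected * (1 - row / n) * (1 - col / n))
    return (table - expected) / se
```

Because the result behaves like a z-score, cells with absolute values well above about 2 are the ones that deviate notably from independence.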
You can express the strength of association between the variables in a contingency table with indices that are based on the chi-squared statistic. For a two by two table, phi is a useful index. It has a value of zero for no association and a value of one for a perfect association. For larger tables, Cramér's V can be used in a similar way. However, this index is harder to interpret, because its maximum attainable value depends on the exact size of the table. The pattern of association in a contingency table can be analyzed by considering standardized residuals. These are calculated as the difference between observed and expected counts, divided by the standard error of the residual, and interpreted in the same way as z-values.
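For completeness, the phi index for a two by two table can be sketched as well, assuming the usual definition phi = sqrt(χ² / n); the function name and example tables are my own illustration.

```python
import numpy as np

def phi_coefficient(table):
    """Phi index for a 2x2 contingency table of counts."""
    table = np.asarray(table, dtype=float)
    if table.shape != (2, 2):
        raise ValueError("phi is defined for 2x2 tables")
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row @ col / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return np.sqrt(chi2 / n)   # phi = sqrt(chi-squared / n)
```

For a 2x2 table, phi and Cramér's V coincide, since m = 1 in that case.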