Hello and welcome to this video on Primary and Secondary Quantitative Data Analysis. Do you remember the analogy of preparing a car for a race? When we did the data preparation in the previous video, we were preparing the car for the race, which is the analysis. We looked into descriptive statistics and research assumptions, all of which are data preparation methods before embarking on the analysis. In this video, we learn about various inferential tests that you can perform on both primary and secondary data, and how to interpret the results of the analysis. Now, there are various ways you can conduct statistical analysis. Here, we will discuss four commonly used methods in urban research, namely: the Chi-Square Test, the T-Test, ANOVA, and the Regression Test. You may be wondering what determines which test you must use. Well, this depends on the research question that you have, and the level of measurement of the data. I will try to illustrate each of these tests, and hopefully you will be able to determine the most suitable test to answer your research question. Let us begin with the Chi-Square Test. Before performing this test, there are various conditions that should be met. The first is that your dependent variable should be nominal, or categorical. Secondly, your independent variable should consist of nominal groups, for example male or female, employed or unemployed, yes or no, and so on. Thirdly, you should have independence of observations. This means that each subject contributes data to only one cell, and therefore the sum of all cell frequencies in the table must be the same as the number of subjects in the sample. Let me give you an example. Let us say you are studying the differences between men and women in wanting a career, with a sample of 45 people. When you sum up the number of male and female responses in either wanting a career or not wanting a career, the total should be 45, as exemplified on the screen here.
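To make the 45-person example concrete, here is a minimal Python sketch. The cell counts are hypothetical (the on-screen table is not part of this transcript); they are chosen only so that they sum to 45. The sketch first checks the independence-of-observations condition, then computes the Pearson chi-square statistic as the sum of (observed - expected)² / expected over all cells, without a continuity correction. The p-value shortcut used here is valid only for a 2x2 table, which has one degree of freedom.

```python
import math

# Hypothetical counts (NOT from the video): rows are gender, columns are the answer.
observed = {
    "male":   {"wants career": 10, "no career": 12},
    "female": {"wants career": 15, "no career": 8},
}

SAMPLE_SIZE = 45  # number of subjects stated in the example


def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table (no continuity correction)."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())

    chi2 = 0.0
    for r in rows:
        for c in cols:
            # Expected frequency if gender and answer were unrelated.
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (table[r][c] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree
    # of freedom; this identity does not hold for larger tables.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return n, chi2, p_value


n, chi2, p_value = chi_square_2x2(observed)

# Independence of observations: every subject sits in exactly one cell.
assert n == SAMPLE_SIZE

print(f"n = {n}, chi-square = {chi2:.3f}, p = {p_value:.3f}")
print("reject null hypothesis" if p_value < 0.05 else "do not reject null hypothesis")
```

With these made-up counts the significance value comes out above 0.05, so the null hypothesis of no relationship would not be rejected. Statistical packages may also report a continuity-corrected value for 2x2 tables, which will differ slightly.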
The Chi-Square Test tests whether there is a relationship between two categorical variables. You test if the observed frequencies are significantly different from the expected frequencies with the statistical formula shown here. When performing an inferential test, you begin from a point of unknown, meaning that you assume there is no relationship between the variables. This is what is called the Null Hypothesis. The opposite, which would mean that there is a relationship between the variables, becomes the Alternative Hypothesis. Let me elaborate on this further using the research question about male and female career paths. In this example, the null hypothesis would be that men and women do not differ with respect to wanting a career. The alternative hypothesis would be that men and women do differ with respect to wanting a career. Similar to the test of normality mentioned in the previous video, a significance value greater than 0.05 would imply that we retain the null hypothesis (strictly speaking, we fail to reject it), and a value less than 0.05 would imply that we reject the null hypothesis and accept its alternative. In the example used, if the significance is less than 0.05, it would mean that there is a relationship between gender and wanting a career, and that the data support the alternative hypothesis. The next test we will look at is the T-Test. This tests whether the mean scores of a scale variable are equal for two groups. Just as in the chi-square test, there are some conditions that need to be met before performing the analysis. Firstly, the dependent variable should be measured on a continuous scale, that is, it is measured at the interval or ratio level. Secondly, the independent variable should consist of two categorical independent groups, for example, male or female, employed or unemployed, city A or city B, native or foreigner, and so on. Thirdly, there should be independence of observations, which means that there is no relationship between the observations in each group, or between the groups themselves.
For example, there must be different participants in each group, with no participant being in more than one group. Fourth, your dependent variable should be approximately normally distributed. Lastly, there should be homogeneity of variance, meaning that the variance within each of the populations is equal. We discussed variance in the previous video on descriptive statistics. The figure here presents three scenarios for differences between means. The first thing to notice about these three situations is that the difference between the means is the same in all three. But you should also notice that the three situations do not look the same; they tell very different stories. The top example shows a case with moderate variability of scores within each group. The second situation shows the high variability case, and the third shows the case with low variability. Clearly, we would conclude that the two groups appear more different or distinct in the bottom, or low variability, case. Why is this? Because there is relatively little overlap between the two bell-shaped curves. In the high variability case, the group difference appears least striking because the two bell-shaped distributions overlap so much. This leads us to a very important conclusion. When we are looking at the differences between the scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores. The t-test does just this. Let us assume that our research investigates whether men and women differ in terms of their height. Now, when performing the t-test, you would get two output tables. One is related to the assumption of homogeneity of variance that I mentioned earlier. The statistical test is called Levene's test, and it informs us whether or not equal variance is assumed. Looking at the example here, the first table shows the output of Levene's test and the second table shows the output of the t-test.
The output of Levene's test informs us whether to refer to the top row or the bottom row of the t-test table. In this example, we are testing whether men and women differ in terms of their height. The significance value of Levene's test is greater than 0.05, which means that we retain its null hypothesis that the variances are equal, so equal variance is assumed. We would then proceed to read from the top row, where we would look at the result of the t-test. Our null hypothesis for the t-test is that men and women do not differ in terms of their height. Reading from the two-tailed significance output, we see that the value is less than 0.05, which means that we reject the null hypothesis and we accept its alternative. We then conclude that men and women do differ in terms of their height. We have now looked into two kinds of inferential tests, the conditions that have to be met before we perform the tests, and the interpretation of the results using the significance levels. In the next video, we will discuss the two other tests that I mentioned at the beginning, namely: the ANOVA test and the Linear Regression test. Thank you for watching, and see you in the next video to learn more about inferential tests.
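The two-step reading of the t-test output described above (Levene's test first, then the matching t-test row) can be sketched in Python with SciPy. This is only an illustration under stated assumptions: the height values are made up, the function name `compare_heights` is hypothetical, and the software used in the video may compute or present things differently.

```python
from scipy import stats

# Hypothetical heights in centimetres (illustrative only, NOT from the video).
men = [178, 182, 175, 180, 185, 177, 183, 179, 181, 176]
women = [165, 168, 162, 170, 166, 163, 169, 164, 167, 161]

ALPHA = 0.05  # significance threshold used throughout the video


def compare_heights(group_a, group_b, alpha=ALPHA):
    """Run Levene's test, then the independent-samples t-test it points to."""
    # Levene's test: the null hypothesis is that both groups have equal variance.
    # center="mean" is the classic form of the test; SciPy defaults to the median.
    levene = stats.levene(group_a, group_b, center="mean")
    equal_var = bool(levene.pvalue > alpha)

    # equal_var=True runs the pooled-variance t-test (equal variance assumed);
    # equal_var=False runs Welch's t-test (equal variance not assumed).
    ttest = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

    return {
        "levene_p": float(levene.pvalue),
        "equal_var_assumed": equal_var,
        "t": float(ttest.statistic),
        "t_p": float(ttest.pvalue),  # two-tailed by default
        "reject_null": bool(ttest.pvalue < alpha),
    }


result = compare_heights(men, women)
print(result)
```

With these made-up values, Levene's test is not significant, so equal variance is assumed, and the two-tailed significance of the t-test falls below 0.05, which mirrors the conclusion reached in the video's example.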