Interacting a continuous and dummy variable. As I mentioned, it's relatively straightforward to implement a regression model with an interaction term, but interpreting the results can be tricky. This is because you can't simply interpret the coefficient on one variable while holding the others constant. The approach to interpreting a model with an interaction term depends on the type of interaction. By the end of this video, you should be able to interpret the results of a regression model that includes an interaction between a continuous variable and a dummy variable.
Let's begin by breaking down a regression model that includes an interaction between a continuous variable and a dummy variable. Suppose we have the model Y = alpha + beta1X + beta2D + beta3(X x D) + the error term. In this model, let X be a continuous variable and D be a dummy variable that takes on the value of zero or one. This model allows the effect of X to depend on the value of D, and vice versa.
Alpha is the intercept for the group D = 0. Why is this? Well, let's set D = 0. Notice that both terms involving D drop out, so the model reduces to Y = alpha + beta1X + the error term. So you can see that alpha is the intercept for the line that captures the relationship between X and Y when D = 0.
Alpha + beta2 is the intercept for the group D = 1. Why is this the case? Well, let's now set D = 1. In this case the model becomes Y = alpha + beta1X + beta2 + beta3X + the error term. All I've done is plug in one for D. Now let's combine like terms. When you do that, you get Y = (alpha + beta2) + (beta1 + beta3)X + the error term. So alpha + beta2 is the intercept for the line that captures the relationship between X and Y when D = 1.
Beta1 is the effect of a one-unit increase in X when D = 0. Why is this? Well, think about what the regression line looks like for the D = 0 group. The slope of this line, the coefficient on X, is beta1.
Finally, beta1 + beta3 is the effect of a one-unit increase in X when D = 1. This comes from the regression line for the D = 1 group. If you combine like terms, it becomes clear that the slope of this line, the coefficient on X, is beta1 + beta3. We'll continue to walk through this logic on the next slide, but feel free to pause the video here if you need more time to digest the content.
Visualizing the two OLS lines. Here I've once again placed the full model at the top of the slide. If we set the dummy variable equal to either 1 or 0, we can see how the model reduces. If we set the dummy variable equal to 1 and combine like terms, we can once again see that the intercept is equal to alpha + beta2 and the slope is equal to beta1 + beta3. If we set the dummy variable equal to 0, we see that the intercept is alpha and the slope is beta1. These two lines are plotted in the graph on the right. Notice that beta2 captures the difference in the intercepts between the two lines, and beta3 captures the difference in the slopes between the two lines.
Spend some time examining and working with this interactive model. Be sure you understand why and how the model is really composed of two lines. It might help to write out the regression lines yourself. Plug in 0 for D and then 1 for D, and see how the line reduces when you do this.
Let's walk through an example of a model that includes an interaction between a continuous variable and a dummy variable. Suppose we want to learn about the interactive effect of democracy and corruption on a country's economic performance. In other words, we want to know whether the effect of corruption on a country's economy depends on whether or not that country is a democracy. We might collect data and use the following three variables. The first is a country's per capita GDP, measured in thousands of US dollars. The second is a corruption index that measures the perceived level of corruption in a country on a scale of zero to ten.
The third is a dummy variable that indicates whether or not the country is a democracy. Our PRF is as follows: a country's per capita GDP = alpha + beta1 times corruption + beta2 times democracy + beta3 times the interaction of corruption and democracy + the error term. The next step would be to estimate the model using a dataset and then interpret the results.
The estimated model appears at the top of the slide. We can see that estimated per capita GDP = 35.58 - 2.11 times corruption + 16.8 times democracy - 5.33 times the interaction between corruption and democracy. We can break down this estimated model into two lines, one for the non-democratic countries and one for the democratic countries. By doing this, we'll be able to see if the effect of corruption on a country's economy is different between these two groups.
If we set the democracy dummy equal to 0, we can see that the estimated model reduces to the following: per capita GDP = 35.58 - 2.11 x Corruption. If we set the democracy dummy equal to 1, the intercept becomes 35.58 + 16.8 = 52.38, or about 52.4, and the slope becomes -2.11 - 5.33 = -7.44. So the estimated model reduces to the following: per capita GDP = 52.4 - 7.44 x Corruption.
Notice that both the intercepts and the slopes are quite different between the two lines. The difference in slope indicates that changes in perceived corruption have a larger effect in democratic countries than in non-democratic countries. We can easily see this difference by visualizing the two lines. The graph on this slide plots the estimated regression line for both the democratic countries and the non-democratic countries. Notice that the line for the democratic countries is much steeper. An increase in perceived corruption in democratic countries is associated with a sharper decline in economic performance than the same increase in perceived corruption in non-democratic countries. If you are an expert on this topic, you'd be able to offer insight as to why this might be.
And perhaps you'd be able to think of other variables that might be useful to consider as part of this analysis.
Another important way to interpret the results of a model that includes an interaction term is to calculate predicted values. At the top of the slide I've included the estimated line for the non-democratic countries and the estimated line for the democratic countries. We can calculate the predicted per capita GDP for a non-democratic country with a low level of perceived corruption, let's say one. Plugging in, we get 35.58 - 2.11 x 1 = 33.47, so a country like this has a predicted per capita GDP of $33,470. We can likewise calculate the predicted per capita GDP for a democratic country with the same low level of perceived corruption. Here we get 52.4 - 7.44 x 1 = 44.96, so a country like this has a predicted per capita GDP of $44,960.
If you were presenting the results from this model in a paper or to an audience, you could construct a table of predicted values, and even show how these predictions change for different levels of perceived corruption. The key takeaway is that developing and estimating a model should not be the end of an analysis. A good researcher will give careful thought to interpreting and presenting the results in a way that is meaningful.
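
The table of predicted values described above could be built along the following lines. This is only a minimal sketch in Python, not something shown in the video: the coefficients are the ones reported for the estimated model, while the function names (group_line, predict_gdp, prediction_table) and the corruption levels 1, 3, and 5 are hypothetical choices made for illustration.

```python
# Coefficients as reported for the estimated model in the example.
ALPHA = 35.58          # intercept for non-democratic countries (democracy = 0)
B_CORRUPTION = -2.11   # slope on corruption for non-democratic countries
B_DEMOCRACY = 16.8     # shift in the intercept for democratic countries
B_INTERACTION = -5.33  # shift in the corruption slope for democratic countries


def group_line(democracy):
    """Return (intercept, slope) of the fitted line for one group.

    democracy is the dummy variable: 0 for non-democratic, 1 for democratic.
    """
    intercept = ALPHA + B_DEMOCRACY * democracy
    slope = B_CORRUPTION + B_INTERACTION * democracy
    return intercept, slope


def predict_gdp(corruption, democracy):
    """Predicted per capita GDP, in thousands of US dollars."""
    intercept, slope = group_line(democracy)
    return intercept + slope * corruption


def prediction_table(levels):
    """One row per corruption level: (level, non-democratic, democratic)."""
    return [(c, predict_gdp(c, 0), predict_gdp(c, 1)) for c in levels]


# Illustrative corruption levels on the 0-10 index (a hypothetical choice).
print(f"{'Corruption':>10} {'Non-democratic':>15} {'Democratic':>11}")
for level, non_dem, dem in prediction_table([1, 3, 5]):
    print(f"{level:>10} {non_dem:>15.2f} {dem:>11.2f}")
```

One thing to keep in mind when comparing this to the slide: the sketch uses the unrounded democratic intercept, 35.58 + 16.8 = 52.38, so the democratic prediction at a corruption level of one comes out to 44.94 rather than the 44.96 you get from the rounded intercept of 52.4.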