So here's the example.

Categorical data like major can have more than two possibilities.

Here the adviser has broken the majors into four possibilities:

business, engineering, liberal arts, and ag.

So if you can code a student as either business, engineering,

liberal arts, or ag, all you need to do is just pick three.

And again, it doesn't matter which three you pick.

In this case I picked business, engineering, and liberal arts to code and

I left ag out, which means all of our data will be compared to the students

who are left out, and those are the students who majored in agriculture.

So if a student has a value of one for engineering, it means that they

are an engineering student; if they have a value of one for business, it means that they

are a business student, and as you can see the rest of the values are zero here.

We have assumed that they are not doing double majors in business and engineering,

and so on.

And we would have students like this one who has zero for

all the variables so this person must be an ag major.

So if they're not engineering, business, or

liberal arts, the only thing remaining is that they're ag.

So whenever you have a categorical variable you need to take the number of

possibilities minus one and include that many dummy variables in your models.

So here we have four majors, so we need four minus one, or three, dummy variables.

And then we code it as zeros and ones, and then you are ready to run the problem.
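The coding rule just described (number of categories minus one, with the left-out category showing up as all zeros) can be sketched in a few lines of Python. This is only an illustration; the names `MAJORS`, `REFERENCE`, `DUMMY_COLUMNS`, and `encode_major` are made up for this sketch and are not part of the spreadsheet.

```python
# The four majors from the lecture; ag is the left-out (reference) category.
MAJORS = ["business", "engineering", "liberal arts", "ag"]
REFERENCE = "ag"

# Four categories minus one gives three dummy variables.
DUMMY_COLUMNS = [m for m in MAJORS if m != REFERENCE]


def encode_major(major):
    """Return the 0/1 dummy values for one student, in DUMMY_COLUMNS order."""
    if major not in MAJORS:
        raise ValueError(f"unknown major: {major}")
    return [1 if major == column else 0 for column in DUMMY_COLUMNS]


# Each coded major has a single 1; the reference major is all zeros.
for major in MAJORS:
    print(major, encode_major(major))
```

Leaving a different major out would work just as well; it only changes which group the other coefficients are compared against.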

So we will go to Data, then Data Analysis,

pick Regression, and input the Y range. The prediction is for the starting salaries, so

the Y values are right here, and I'll pick all of the data that I have.

Then the X input is the GPA as well as the dummy variables for the person's major, and I pick all of that.

I make sure that it knows that I have labels included and then I say OK.
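For readers who want to see what the regression tool is doing, here is a rough Python sketch of the same least-squares fit. It is not the spreadsheet's own procedure, just one standard way to get the coefficients (the normal equations). The data are entirely hypothetical: the salaries are generated from a made-up rule, and only the 1142 per GPA point echoes the lecture. The names `fit_least_squares`, `rows`, and `MADE_UP_RULE` are assumptions of this sketch.

```python
def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per student (no intercept column).
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]  # add intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            # Typically happens when all categories are coded instead of k - 1.
            raise ValueError("predictors are collinear; drop one dummy variable")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical students: [GPA, business, engineering, liberal arts]; all zeros = ag.
rows = [
    [3.6, 0, 1, 0], [2.6, 0, 1, 0],
    [3.0, 1, 0, 0], [3.8, 1, 0, 0],
    [2.9, 0, 0, 1], [3.4, 0, 0, 1],
    [3.1, 0, 0, 0], [2.5, 0, 0, 0],
]

# Made-up rule used only to generate illustrative salaries:
# intercept, GPA, business, engineering, liberal arts.
MADE_UP_RULE = [30000.0, 1142.0, 4000.0, 9000.0, 1500.0]
salaries = [MADE_UP_RULE[0] + sum(c * v for c, v in zip(MADE_UP_RULE[1:], row))
            for row in rows]

coefficients = fit_least_squares(rows, salaries)
print([round(c, 2) for c in coefficients])
```

Because the illustrative salaries contain no noise, the fit simply recovers the made-up rule; real data would also come with the R square and p-value tables that the spreadsheet reports.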

And here's our analysis.

First of all, adjusted R square is

significantly higher than what it used to be.

So 82.4% of the variation in the starting salaries of these graduates can be

explained by their GPA and their major.
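For reference, adjusted R square is usually computed from the ordinary R square as shown here. The symbols are notation introduced for this note, not labels from the spreadsheet: n is the number of students in the data and k is the number of predictors (here k = 4, GPA plus the three dummy variables).

```latex
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```

The adjustment penalizes extra predictors, which is why it is the fairer number to compare against the earlier model that had fewer variables.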

So now let's look at the variables that we have and make sure that every single

variable is significant in our model and for that I will go to the last table.

So looking at the p-values, I see that all of them are less than 0.05.

So every variable that we have identified, GPA and major, has

a significant relationship with starting salary, which is what we are trying to predict.

And now we are ready to make the prediction.

So let's take these values like I like to do, copy them down here and now I'm going

to say student number one, student number two, number three and student number four.

I am going to compute a point estimate of the prediction, so

I'm going to make the prediction here for each type of student.

So let's say this student has a GPA of 3.6 and is an engineering student.

So that means they are not business or liberal arts, so

this is all I need to provide.

So I'm going to do the prediction right here, which starts with my intercept, and

because I'm going to copy my formula, I'm going to lock the cells for

the coefficients of my model.

So I'm going to press F4 to lock it.

Plus, and then I'm going to use the function I told you about, SUMPRODUCT.

So I'm going to pick SUMPRODUCT, and SUMPRODUCT is going to take these

coefficients, remember, from GPA onwards, and I am going to press F4 again to lock them.

The second array is basically what is here, the values for this student.

And if a cell is empty, it is just going to treat it as zero, so

you are not going to get any problems.
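The formula being built is the intercept plus a SUMPRODUCT of the coefficients and the student's values. A minimal Python sketch of that calculation follows. Only the GPA coefficient of 1142 comes from the lecture; the intercept and the three major coefficients are hypothetical placeholders, and `sumproduct` and `predict_salary` are names made up for this sketch.

```python
def sumproduct(xs, ys):
    """Multiply matching entries of two lists and add up the results."""
    return sum(x * y for x, y in zip(xs, ys))


# Hypothetical intercept; the real one comes from the regression output.
INTERCEPT = 30000.0

# Coefficients in order: GPA, business, engineering, liberal arts.
# 1142 for GPA is from the lecture; the other three are placeholders.
COEFFICIENTS = [1142.0, 4000.0, 9000.0, 1500.0]


def predict_salary(gpa, business=0, engineering=0, liberal_arts=0):
    """Point estimate: intercept + SUMPRODUCT(coefficients, student values).

    Dummy variables default to 0, mirroring empty cells counting as zero.
    """
    values = [gpa, business, engineering, liberal_arts]
    return INTERCEPT + sumproduct(COEFFICIENTS, values)


# An engineering student with a 3.6 GPA; business and liberal arts stay at 0.
print(predict_salary(3.6, engineering=1))
```

Keeping the coefficients in one fixed list plays the same role as locking the cells with F4: every student's prediction reuses exactly the same coefficients.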

This is the value we get.

Okay, let's now focus on this column, and then I will change the value so

you can see how these coefficients will matter.

So let's assume the next student has a GPA of 2.6 and is still an engineering student.

What would their expected salary be?

So if I just drag this out, you will see that their salary difference

is exactly the GPA coefficient.

So for every full point that the GPA drops, the student's salary goes down by $1,142, so

that's the difference you see between these two.
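Written out, the comparison between the two engineering students looks like this. The symbols are notation introduced for this note: b_0 is the intercept, b_GPA the GPA coefficient, and b_eng the engineering coefficient. Everything except the GPA term cancels, so the other coefficient values do not matter here.

```latex
\begin{aligned}
\hat{y}_1 - \hat{y}_2
  &= \bigl(b_0 + b_{\text{GPA}}(3.6) + b_{\text{eng}}(1)\bigr)
   - \bigl(b_0 + b_{\text{GPA}}(2.6) + b_{\text{eng}}(1)\bigr) \\
  &= b_{\text{GPA}}\,(3.6 - 2.6) \\
  &= 1142 \times 1.0 = 1142
\end{aligned}
```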

So if I just quickly do that you will see that that's the difference you will see