So we can see that moving even one data point to become an outlier will affect

the correlation coefficient greatly because it is so sensitive to outliers.
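
As a minimal sketch of that sensitivity, assuming NumPy is available and using made-up numbers rather than the data from the plot, the snippet below computes the correlation for ten points that lie exactly on a line, and then again after one of them is dragged far away. The values and variable names are hypothetical.

```python
import numpy as np

# Made-up illustration data (not the data shown in the lecture):
# ten points lying exactly on the line y = 2x + 1.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

# np.corrcoef returns the 2x2 correlation matrix; entry [0, 1] is r for x and y.
r_before = np.corrcoef(x, y)[0, 1]

# Move only the last point far below the line so that it becomes an outlier.
y_outlier = y.copy()
y_outlier[-1] = -50.0
r_after = np.corrcoef(x, y_outlier)[0, 1]

print(f"r with all points on the line: {r_before:.2f}")
print(f"r after moving one point:      {r_after:.2f}")
```

In this made-up case, a single moved point is enough to flip the sign of the coefficient.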

Let's take a look at this practice question.

Which of the following is the best guess for the correlation between percentage

living in poverty and percentage of high school graduates?

Note that we haven't provided a formula for the correlation coefficient.

There of course is one.

And you will get to use computation to calculate the correlation coefficient, but

there's absolutely no reason in this day and age to try to calculate that by hand.
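
For reference, here is a sketch of what that computation can look like in Python. The function name `pearson_r` and the sample values are assumptions made for illustration. The formula is the standard Pearson correlation, the sum of products of standardized x and y values divided by n - 1, and the result is checked against NumPy's built-in `np.corrcoef`.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: sum of products of z-scores divided by n - 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Standardize each variable using the sample standard deviation (ddof=1).
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    return float(np.sum(zx * zy) / (n - 1))

# Hypothetical values, NOT the real poverty / graduation data.
poverty = [8.0, 10.5, 12.0, 14.5, 17.0, 9.5]
hs_grad = [90.0, 88.0, 85.5, 82.0, 80.0, 87.0]

r_manual = pearson_r(poverty, hs_grad)
r_numpy = np.corrcoef(poverty, hs_grad)[0, 1]
print(round(r_manual, 3), round(r_numpy, 3))
```

The two printed values should agree, which is the point: the library call does the arithmetic so nobody has to do it by hand.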

However, given a bunch of choices, we should be able to pinpoint

which of the following sounds like a reasonable guesstimate for

the correlation between these two variables.

First off, we can get rid of 1.5 and -1.5 right off the bat,

because we know that the correlation coefficient can only be between

negative one and positive one.

We also see that the relationship between these two variables is negative.

Therefore, any positive correlation coefficient doesn't make sense here.

So next we need to choose between negative 0.75 and negative 0.1.
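
The two elimination steps so far can be sketched as simple filters. The choices 1.5, -1.5, -0.75 and -0.1 come from the question as described; the positive values 0.6 and 0.02 are hypothetical stand-ins, since the positive options are not named here.

```python
# Answer choices: 1.5, -1.5, -0.75 and -0.1 are named in the lecture;
# the positive values 0.6 and 0.02 are hypothetical stand-ins.
choices = [1.5, -1.5, 0.6, 0.02, -0.75, -0.1]

# Step 1: a correlation coefficient must lie between -1 and 1.
in_range = [r for r in choices if -1.0 <= r <= 1.0]

# Step 2: the plot shows a negative relationship, so keep negative values only.
remaining = [r for r in in_range if r < 0]

print(remaining)  # [-0.75, -0.1]
```

What is left cannot be decided by a rule like these; it takes a judgment about how strong the relationship looks.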

Note that negative 0.75 is much closer to negative 1,

meaning that it indicates a much stronger relationship.

So the question becomes, do we see a strong relationship here, or

a pretty weak relationship?

Sometimes it helps to look at the negative spaces on our plot, the regions with no points.

So we can see, for example, that there are some negative spaces on our plot, and

if we were to block those off it would be a little easier to see that there is

indeed a somewhat strong relationship between these two variables,

even with all the scatter around the line.

Therefore, the correct answer here is going to be -0.75.

A correlation coefficient of negative 0.1 would look much more like

a random scatter that takes up the entire plot, without leaving any negative

spaces for us to block off so that we can better see the linear relationship.
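
To get a feel for the difference between these two values, one option is to simulate data at each strength. The sketch below assumes NumPy and uses a hypothetical helper, `simulate`, that mixes a standard normal x with independent noise so that the population correlation equals the chosen value. It is simulated data, not the poverty and graduation data from the plot.

```python
import numpy as np

def simulate(rho, n=10_000, seed=0):
    """Draw n (x, y) pairs whose population correlation is rho."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Mixing x with independent noise in these proportions gives corr(x, y) = rho.
    y = rho * x + np.sqrt(1.0 - rho**2) * noise
    return x, y

for rho in (-0.75, -0.1):
    x, y = simulate(rho)
    r = np.corrcoef(x, y)[0, 1]
    print(f"target {rho:+.2f}, sample r = {r:+.3f}")
```

Drawing each simulated pair as a scatterplot should show a tilted band with empty corners for -0.75 and a nearly shapeless cloud for -0.1.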