Ideally, you would like to see similar variability at

every parent height.

You would also like to see no big outliers, and you would like to see the residuals centered

nicely around zero.

That means that the line is fitting pretty well.
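Those three checks can be sketched in a few lines of code. This is a minimal illustration, not the analysis from the lecture: the parent and child heights are simulated, and the names `fit_line` and `spread`, the simulation parameters, and the three-standard-deviation outlier cutoff are all assumptions made for the example.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def spread(values):
    """Sample standard deviation."""
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / (len(values) - 1)) ** 0.5

# Simulated heights in inches (an assumption, not the lecture's data set).
random.seed(0)
parent = [random.uniform(64, 73) for _ in range(200)]
child = [24 + 0.65 * p + random.gauss(0, 2.2) for p in parent]

intercept, slope = fit_line(parent, child)
residuals = [c - (intercept + slope * p) for p, c in zip(parent, child)]

# Check 1: residuals centered around zero.
mean_resid = sum(residuals) / len(residuals)

# Check 2: similar variability in the low, middle, and high thirds of
# parent height (the last two points are left out of the thirds).
ordered = [r for _, r in sorted(zip(parent, residuals))]
third = len(ordered) // 3
spreads = [spread(ordered[i * third:(i + 1) * third]) for i in range(3)]

# Check 3: no big outliers (here: more than 3 standard deviations out).
overall = spread(residuals)
outliers = [r for r in residuals if abs(r) > 3 * overall]

print("mean residual:", mean_resid)
print("spread by third of parent height:", spreads)
print("number of large residuals:", len(outliers))
```

With well-behaved data like this, the three spreads should be close to each other. A spread that grows or shrinks with parent height would be a sign that the line is not describing the data equally well everywhere.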

There's actually a whole set of residual diagnostics that you can do to

check to make sure that the line is fitting well.

But the things that you're definitely looking for are outliers,

distributions that are skewed, and any groups of points that

cluster away from the line when you're looking at the residuals.

You can color these dots by a whole bunch of

different variables and see whether one of them explains

why the linear regression isn't working very well.
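A numeric version of coloring the residuals by another variable is to average them within each level of that variable. The sketch below is only an illustration under stated assumptions: the data are simulated, and the variable `group` with levels "A" and "B" is hypothetical, built so that group B sits about 3 units above group A.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Simulated data (an assumption): a hidden variable shifts group B upward.
random.seed(1)
n = 200
x = [random.uniform(0, 10) for _ in range(n)]
group = ["A" if i % 2 == 0 else "B" for i in range(n)]
y = [2 + 0.5 * xi + (3 if g == "B" else 0) + random.gauss(0, 1)
     for xi, g in zip(x, group)]

# Fit a single line that ignores the group variable.
intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Average the residuals within each group. If the line fit equally well
# for both groups, both averages would be close to zero.
mean_by_group = {}
for label in sorted(set(group)):
    values = [r for r, g in zip(residuals, group) if g == label]
    mean_by_group[label] = sum(values) / len(values)
    print(label, "mean residual:", mean_by_group[label])
```

Here one group ends up mostly above the line and the other mostly below it, which is the kind of pattern that coloring the dots in a residual plot would make visible.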

Keep in mind again that you can always fit a line but

the line doesn't always make sense.

Here again is Anscombe's quartet, so all of these lines are the same exact

line with the same exact parameters and significance and everything else.

So you get the exact same intercept and slope estimates.

But here for example, you see a curvilinear relationship.

Here you see a crazy outlier and again a crazy outlier right here.
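To see this numerically, here is a small sketch that fits a least squares line to each of the four data sets. The values are the published Anscombe's quartet values; the helper `fit_line` is just an illustrative name, not something from the lecture.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Anscombe's quartet: data sets 1 to 3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
               7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
               6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
               6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
             5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Fit the same kind of line to each data set and compare the estimates.
fits = {}
for name, (x, y) in quartet.items():
    fits[name] = fit_line(x, y)
    intercept, slope = fits[name]
    print(f"data set {name}: intercept={intercept:.2f}, slope={slope:.2f}")
```

All four fits come out with an intercept of about 3.0 and a slope of about 0.5, even though only the first data set looks like the cloud of points a line is meant for.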

So what you're looking for when you're fitting a line,

this is what you're expecting to see: a scatter plot

that is just a cloud of points like that.

If you see more specific relationships, you know that you have to fit a more

specific model, a model that either accounts for quadratic variation or

a model that accounts for outliers,

to account for the fact that it's not really just a linear regression line that

you're actually supposed to be fitting there.
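As one hedged illustration of a more specific model, the sketch below fits both a straight line and a quadratic to the curved data set from Anscombe's quartet (the second one) and compares the residual sums of squares. The helpers `solve`, `fit_polynomial`, and `rss` are illustrative names written for this example; they solve the least squares normal equations directly, which is fine for a tiny problem like this but is not how a statistics package would normally do it.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - known) / m[r][r]
    return coef

def fit_polynomial(x, y, degree):
    """Least squares coefficients [b0, b1, ...] for y = b0 + b1*x + ..."""
    powers = range(degree + 1)
    a = [[sum(xi ** (i + j) for xi in x) for j in powers] for i in powers]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in powers]
    return solve(a, b)

def rss(x, y, coef):
    """Residual sum of squares for a fitted polynomial."""
    return sum(
        (yi - sum(c * xi ** k for k, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )

# Second data set of Anscombe's quartet (the curvilinear one).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

linear = fit_polynomial(x, y, 1)      # straight line
quadratic = fit_polynomial(x, y, 2)   # line plus a squared term

rss_linear = rss(x, y, linear)
rss_quadratic = rss(x, y, quadratic)
print("linear RSS:", rss_linear)
print("quadratic RSS:", rss_quadratic)
```

For this data set the quadratic model leaves almost nothing in the residuals, while the straight line leaves a clear curved pattern. That gap is the signal that a line alone was not the right model.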

So regression is actually a whole class on its own.

I've done a lecture on it here, and

I'll do a couple more lectures, but

this is only a very quick overview of regression models.

If you take the Regression Models course in the Johns Hopkins Data Science

Specialization, you'll cover a whole bunch of diagnostics and ideas.

We've covered the basics here so you'll know what to fit,

but the diagnostics require a lot more intuition and thinking.

The basic thing to keep in mind, though, is does the line fit?

Does it make sense?

Not just does it fit statistically but does it make sense to fit a line?

And then there are great additional notes in this book here and

in the corresponding class on linear models and

the class on statistics for the life sciences on edX.