So in a previous lecture,

we introduced some code

and some equations that allowed us

to balance model performance versus model complexity.

So we had some objective that measured

how accurate the model is

versus how complex the model is.

In-between those two things,

we had some trade-off parameter.

We didn't know how to set the trade-off parameter yet.

So the purpose of this lecture is going to be to

introduce the concept of a validation set,

which we might use to select those trade-off parameters,

and then we'll explore

the relationship between parameters

like Theta and hyperparameters like Lambda,

the trade-off parameter, and finally,

we'll introduce the complete training validation

and test pipeline.

Okay. So just to recap what we

saw in the last few lectures.

We saw how the training set can be

used to evaluate model performance,

but it can only do so on data that we've seen before.

We need to introduce a test set if we'd like to estimate

how well the model will actually

generalize to unseen data,

and we saw how

a regularizer can be used to mitigate overfitting.

In other words, how we can balance or

trade off model performance versus model complexity.

So specifically, how we did that in a previous lecture

was to optimize an equation that looks like the following.

We have on the left-hand side of this equation,

a mean squared error,

which says essentially how accurate is

a particular model defined by the parameter vector Theta,

and on the right-hand side,

we have this term that penalizes model complexity.

In this case, we're penalizing the sum of

squared parameter values, which would

encourage our model to choose parameters that

are approximately uniform or close to zero.

We could instead use something like

the sum of absolute values.

But in any case, we have one part of the model that

rewards accuracy and another part of

the model which penalizes complexity.

So we would like both of these things to be low.

The MSE should be low or the model should be accurate,

and the complexity should be low.

So the right-hand side of

the equation should be small as well.

Then, we have this value in the middle, Lambda,

which trades off how much accuracy we

want versus how much complexity we want.
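As a concrete sketch, the objective just described, a mean squared error plus Lambda times a sum of squared parameters, might look like the following in Python. The function and variable names here are illustrative, not from the lecture:

```python
import numpy as np

def objective(theta, X, y, lam):
    # Left-hand side: mean squared error -- how accurate the model is
    mse = np.mean((X @ theta - y) ** 2)
    # Right-hand side: sum of squared parameters -- how complex the model is
    complexity = np.sum(theta ** 2)
    # Lambda trades off accuracy versus complexity
    return mse + lam * complexity
```

Setting lam to zero recovers the plain mean squared error; larger values push the optimizer toward smaller parameter vectors.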

Okay. So what value should Lambda take?

We would like some value

of this regularization parameter

that gives us good model accuracy,

or low MSE, on the left-hand side.

We also want low complexity

or a low sum of

squared parameters on the right-hand side.
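Previewing the validation-set idea this lecture introduces, one way we might select Lambda is to fit the model for several candidate values and keep the one with the lowest error on held-out data. Below is a minimal sketch under assumptions of my own: a closed-form ridge-regression fit and made-up synthetic data, with illustrative names like fit_ridge and val_mse that are not from the lecture:

```python
import numpy as np

def fit_ridge(X, y, lam):
    # Closed-form minimizer of MSE + lam * sum of squared parameters:
    # theta = (X^T X / n + lam * I)^{-1} (X^T y / n)
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def val_mse(theta, X, y):
    # Mean squared error on held-out (validation) data
    return np.mean((X @ theta - y) ** 2)

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
X_tr, y_tr = X[:60], y[:60]    # training split: used to fit theta
X_va, y_va = X[60:], y[60:]    # validation split: used to pick Lambda

# Pick the candidate Lambda whose fitted model has the lowest validation MSE
candidates = [0.0, 0.01, 0.1, 1.0, 10.0]
best_lam = min(candidates,
               key=lambda lam: val_mse(fit_ridge(X_tr, y_tr, lam), X_va, y_va))
```

The key point is that theta is fit on the training split while Lambda is chosen on the validation split, which is exactly the division of labor between parameters and hyperparameters explored later in the lecture.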