So, some related problems. Cluster analysis is

useful for identifying these types of patterns, but we

can take a slightly more formal approach

that takes advantage of the matrix structure of the data.

And so there are two kinds of

problems you might want to look at. The first is, if

you have a lot of variables,

you want to create a new set of variables that

are uncorrelated and explain as much variance as possible.

So the idea is that we have a lot of different variables.

Suppose we have hundreds or maybe thousands

or tens of thousands of variables in

our data set and the idea is

that they're not all independent measurements of something.

Right?

So a lot of them will be related to each other.

They will be correlated with each other.

So for example, you might have two measurements

like height and weight, and

those will obviously be related to each

other, so they're not

independent factors.
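
To make the height and weight example concrete, here is a minimal sketch in Python with NumPy. The data are simulated, and the means, spreads, and the slope linking weight to height are made-up assumptions for illustration, not real measurements. It only shows that two related measurements end up strongly correlated.

```python
import numpy as np

def simulate_height_weight(n=500, seed=0):
    """Simulate two related measurements (all parameters are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # Height in cm, drawn around an assumed mean of 170.
    height = rng.normal(170.0, 10.0, size=n)
    # Weight in kg depends on height plus independent noise.
    weight = 70.0 + 0.9 * (height - 170.0) + rng.normal(0.0, 6.0, size=n)
    return height, weight

height, weight = simulate_height_weight()

# Pearson correlation between the two variables.
corr = np.corrcoef(height, weight)[0, 1]
print(f"correlation between height and weight: {corr:.2f}")
```

A correlation well above zero here is what it means for the two variables not to be independent measurements.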

And the idea is that we want to create a set of variables that is smaller

than the original set of variables that we

have and that are all uncorrelated with each other,

so that they represent different types of variation in your data set.

And similarly, we want this reduced set of variables to explain

as much of the variability in your data set as possible.
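
One common way to build such a set of variables is principal components analysis, computed here through the singular value decomposition of the centered data matrix. The sketch below is in Python with NumPy and runs on simulated data. The function name `principal_components`, the choice of ten observed variables driven by two hidden factors, and the noise level are all assumptions made for illustration.

```python
import numpy as np

def principal_components(X):
    """Return (scores, explained_variance_ratio) for X with rows as observations."""
    # Center each column so the SVD describes variation around the mean.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # The new variables: projections of the data onto the right singular vectors.
    scores = U * s
    # Variance of each new variable, ordered from largest to smallest.
    variances = s**2 / (X.shape[0] - 1)
    return scores, variances / variances.sum()

# Simulated data: 10 correlated variables driven by 2 hidden factors (assumed setup).
rng = np.random.default_rng(1)
n = 300
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, 10))
X = latent @ loadings + 0.1 * rng.normal(size=(n, 10))

scores, ratio = principal_components(X)

# Keep a reduced set of k new variables.
k = 2
reduced = scores[:, :k]

# The new variables are uncorrelated: off-diagonal covariances are numerically zero.
cov = np.cov(scores, rowvar=False)
off_diagonal = cov - np.diag(np.diag(cov))
print("largest off-diagonal covariance:", np.abs(off_diagonal).max())
print("share of variance explained by the first", k, "variables:", ratio[:k].sum())
```

In this simulated setup, the two retained variables stand in for ten original ones while keeping nearly all of the variability, which is the trade the passage above describes.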