So in this case we may look to stratify students at this university,

let's say by level of study.

That is, our strata are the different levels of programme:

the undergraduate students studying for a bachelor's degree, those studying for

a master's degree, and the PhD community as well.

With this stratification by programme level, we are dividing, or

partitioning, the total list of students into what we call MECE groups:

mutually exclusive and collectively exhaustive groups.

Mutually exclusive means that students will belong to at

most one of those groups.

For example, you cannot be studying, let's say, for an undergraduate qualification and

a postgraduate qualification at the same university at the same point in time.

And collectively exhaustive in that each student must belong to one or

other of those strata.

Hence, every student must be studying one of those degrees, and

at most one of those degrees.
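
To make the MECE idea concrete, here is a minimal sketch in Python of how one might check that a set of strata really does partition a student list. The student IDs, the stratum names, and the `is_mece` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical data: each stratum holds the IDs of students at that level.
strata = {
    "undergraduate": {"s01", "s02", "s03"},
    "masters": {"s04", "s05"},
    "phd": {"s06"},
}
all_students = {"s01", "s02", "s03", "s04", "s05", "s06"}


def is_mece(strata, population):
    """Return True if the strata partition the population."""
    # Count how many strata each student appears in.
    counts = Counter(s for group in strata.values() for s in group)
    # Mutually exclusive: nobody appears in more than one stratum.
    mutually_exclusive = all(c == 1 for c in counts.values())
    # Collectively exhaustive: the strata cover exactly the population.
    collectively_exhaustive = set(counts) == set(population)
    return mutually_exclusive and collectively_exhaustive


print(is_mece(strata, all_students))
```

If a student were listed under two levels, or missing from all of them, the check would return False.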

So once we've segregated our students out by the level of study, we would

then just take a series of simple random samples, one from each of those strata.

And hence this guarantees we get a representative sample in terms of drawing

from the full cross section of levels of study within the university.
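
As a rough sketch of that procedure, the following Python takes one simple random sample from each stratum. The data, the per-stratum sample sizes, and the function name `stratified_sample` are assumptions for illustration only.

```python
import random

# Hypothetical strata: student IDs grouped by level of study.
strata = {
    "undergraduate": {"u1", "u2", "u3", "u4", "u5"},
    "masters": {"m1", "m2", "m3", "m4"},
    "phd": {"p1", "p2"},
}


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample (without replacement) from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        # sorted() fixes the ordering so a given seed is reproducible.
        sample[name] = rng.sample(sorted(members), sizes[name])
    return sample


result = stratified_sample(
    strata, {"undergraduate": 2, "masters": 1, "phd": 1}, seed=0
)
print(result)
```

Because every stratum contributes at least one student here, every level of study appears in the final sample.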

Of course, a natural question to ask is, how many should we choose from

each of those groupings: the undergraduate, master's, and PhD students?

Well, for that we may wish to consider the relative size of the strata.

Let's imagine a university had 50% undergraduates,

40% master's students, and 10% PhD students.

So given that

the undergraduates would be the largest cohort of students at the institution,

it would make sense that they should make up, equivalently, 50%

of our overall sample size.

And similarly, 40% of the sample should be drawn from the master's community.

And the remaining 10% from the PhD community.

So this is so-called proportionate stratified sampling,

where we take into account the relative size of those groups.
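
The arithmetic for the 50%, 40%, 10% example can be sketched as follows. The stratum head counts and the overall sample size of 200 are assumed figures, and the rounding rule (largest remainder) is just one reasonable choice for when the shares do not come out as whole numbers.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to their size."""
    total = sum(stratum_sizes.values())
    exact = {k: n * size / total for k, size in stratum_sizes.items()}
    # Start from the rounded-down shares...
    alloc = {k: int(v) for k, v in exact.items()}
    leftover = n - sum(alloc.values())
    # ...then give any remaining units to the largest fractional parts.
    by_fraction = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_fraction[:leftover]:
        alloc[k] += 1
    return alloc


# Assumed head counts matching the 50% / 40% / 10% split.
stratum_sizes = {"undergraduate": 5000, "masters": 4000, "phd": 1000}
print(proportionate_allocation(stratum_sizes, 200))
# → {'undergraduate': 100, 'masters': 80, 'phd': 20}
```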

Of course, if each group were of the same size, then we would

take exactly the same number of members from each of those strata.

Of course, one further consideration might be the standard deviation

of the characteristic of interest within each of those groups.

Suppose we had a group which had quite a small standard deviation for

the characteristic of interest. Of course, that just means there's not

much variation of that characteristic within that particular natural grouping.

And hence there would be no need to sample a particularly large number

from that group, because if you just observe a handful,

that's likely to give you a fairly clear picture of the group as a whole.

And hence, so-called disproportionate stratified sampling

would take into account not just the size of the strata but

also the degree of variation which exists within each group, i.e.

by taking into account the standard deviation

of the characteristic of interest in each group.
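
One common way to combine stratum size and standard deviation is Neyman allocation, where each stratum's share of the sample is proportional to its size multiplied by its standard deviation. The lecture does not name a specific rule, so treat this as one possible instance. The head counts and standard deviations below are invented for illustration.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n in proportion to (stratum size x stratum standard deviation).

    Returns unrounded shares; in practice they would be rounded to integers.
    """
    weights = {k: stratum_sizes[k] * stratum_sds[k] for k in stratum_sizes}
    total = sum(weights.values())
    return {k: n * w / total for k, w in weights.items()}


# Assumed head counts, and assumed standard deviations of a satisfaction score.
stratum_sizes = {"undergraduate": 5000, "masters": 4000, "phd": 1000}
stratum_sds = {"undergraduate": 1.0, "masters": 1.0, "phd": 0.2}

for level, share in neyman_allocation(stratum_sizes, stratum_sds, 200).items():
    print(level, round(share, 1))
```

With these made-up figures the PhD stratum, which varies least, receives fewer than the 20 students it would get under proportionate allocation. If all the standard deviations were equal, the result would reduce to the proportionate split.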

And finally, we move on to cluster sampling.

So let's return to this case of investigating student satisfaction.

Now, of course, we may not be interested in a single university, but rather

in all students at any university in a particular country.

So we might choose different institutions,

different universities as so-called clusters.

Now, I agree that no two universities are identical, but

they will be fairly similar to each other.

Any two universities

are both going to have some undergraduate students, master's students,

and PhD students,

no doubt studying a wide range of different disciplines as well.

Now, there are different forms of cluster sampling.

But in the simplest case, we would consider dividing our total population,

for example, students, into these mutually exclusive and

collectively exhaustive clusters, i.e., the different universities.

And then to try and save some time and

money, we wouldn't necessarily consider students at all institutions, but

rather we could take a random sample of all of these clusters.

For example, using simple random sampling.

And once that particular subset of universities within a country has

been chosen, then in a one-stage cluster sample,

we would then consider all students within those chosen universities.
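
A one-stage cluster sample along those lines might be sketched like this. The university names, student IDs, and the function `one_stage_cluster_sample` are hypothetical.

```python
import random

# Hypothetical clusters: each university with its full list of students.
clusters = {
    "uni_a": ["a1", "a2", "a3"],
    "uni_b": ["b1", "b2"],
    "uni_c": ["c1", "c2", "c3", "c4"],
    "uni_d": ["d1"],
}


def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Pick clusters by simple random sampling, then keep every member."""
    rng = random.Random(seed)
    # Sample the cluster names, not the individual students.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # One-stage: all students in each chosen cluster are included.
    return {name: list(clusters[name]) for name in chosen}


print(one_stage_cluster_sample(clusters, 2, seed=1))
```

Note that the randomness enters only in the choice of universities; once a university is chosen, all of its students are taken.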

But of course we know universities can have large numbers of students, and

hence we may wish to have multiple stages in our sampling,

so-called multistage sampling,

whereby we then do some further stratification within the chosen clusters.

From those selected institutions,

we may then wish to stratify by undergraduate, master's, and PhD programmes.

And of course we could refine this process at many stages if so required.
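
As a sketch of a two-stage design of this kind, the code below first selects universities by simple random sampling and then, within each chosen university, takes a simple random sample from each level of study. All names, IDs, and sample sizes are assumptions for illustration.

```python
import random

# Hypothetical population: university -> level of study -> student IDs.
universities = {
    name: {
        "undergraduate": [f"{name}-ug{i}" for i in range(6)],
        "masters": [f"{name}-ms{i}" for i in range(4)],
        "phd": [f"{name}-phd{i}" for i in range(2)],
    }
    for name in ("uni_a", "uni_b", "uni_c")
}
per_level = {"undergraduate": 3, "masters": 2, "phd": 1}


def multistage_sample(universities, n_universities, per_level, seed=None):
    """Stage 1: sample universities. Stage 2: stratified sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(universities), n_universities)
    sample = {}
    for uni in chosen:
        sample[uni] = {}
        for level, students in universities[uni].items():
            # Never ask for more students than the stratum contains.
            k = min(per_level[level], len(students))
            sample[uni][level] = rng.sample(list(students), k)
    return sample


print(multistage_sample(universities, 2, per_level, seed=1))
```

Further stages, for example sampling departments within each university before sampling students, could be added in the same nested fashion.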

So that's just to give you a taste of the different kinds of sampling techniques

out there.

Please do review the online materials for this MOOC, which consider the relative

advantages and disadvantages of the random versus non-random forms of sampling,

and the different constituent parts thereof.
