As a Belt, you lead a project to improve customer satisfaction.

You want to know whether your customers are satisfied with the company's product,

and therefore you wish to measure customer satisfaction.

You know that your organization has 40,000 customers.

How would you go about investigating whether your customers are satisfied?

Well, you could try to ask every single customer the question,

are you satisfied with our product?

However, this will prove impractical, as it is simply too much work.

These 40,000 customers are all the customers that your organization sells to.

They are, therefore, the population of interest for

the question, are my customers satisfied?

To answer this question, you can select just a small group of,

say, 100 customers to ask this question. In statistics,

this small group is called a sample.

This sample is selected from the whole group of all customers,

called the population, and this is the topic of this video.

I will teach you the concepts of population and sample.

Furthermore, I will show you how statisticians

use a sample to make statements about the population.

A population is a collection of all units of interest, that is,

in this example, all customers, and the sample is a subset of the population,

in this example, a selection of 100 customers.

On a more abstract level, we can picture the population

as a bunch of question marks. These are all the customers,

and we have no clue about their opinion on our product.

From these unknowns, we select some units,

some customers. These customers form our sample, and

each customer in the sample is one measurement represented by an X.

So, X1 for the first customer, X2 for the second customer in

the sample, and so on until we have 100 customers in our sample, X100.

We will ask each customer in the sample the question,

are you satisfied with our product?

And their answers will be our data. You, as a researcher, will have to

select which units, or customers, you want to have in your sample,

and this should be done randomly to get a representative sample.
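
Drawing such a random sample can be sketched in a few lines of Python. This is only an illustration: the customer IDs 1 to 40,000 and the fixed seed are assumptions made for the example, not part of the project described here.

```python
import random

# Hypothetical customer IDs 1..40000 standing in for the population.
population = list(range(1, 40_001))

# Fixed seed only so that the sketch gives the same sample on every run.
rng = random.Random(42)

# Simple random sample without replacement: every customer has the same
# chance of being selected, and nobody is selected twice.
sample = rng.sample(population, k=100)

print(len(sample), "customers selected, for example:", sorted(sample)[:5])
```

Because every customer has the same chance of being picked, no group of customers is favored by the person doing the selecting.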

How to select representative samples is the topic of this video.

Let's go back to our 40,000 customers and a sample of 100.

Now you ask these 100 customers, are you satisfied with our product?

Imagine you find that 75 of the 100 customers are satisfied, that is, 75%.

Now, does this imply that we can conclude that 75% of our customers, that is,

30,000 customers in the population, will also be satisfied with our product?

This is what statistics is all about: making claims about

a population based on a sample. And how precise are these claims?

That will depend, of course, on various aspects, and

one of them is the sample size. This will all be discussed in this video.
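
To get a feeling for the role of the sample size, here is a small simulation sketch. It assumes, purely for illustration, that exactly 30,000 of the 40,000 customers are satisfied; in a real project this number is unknown. The function name spread_of_estimates and the choice of 500 repeats are also made up for this example.

```python
import random
import statistics

POPULATION_SIZE = 40_000
TRUE_SATISFIED = 30_000  # assumption for illustration only: 75% are satisfied

# Extrapolating the sample result (75 of 100) to the whole population.
estimated_satisfied = round(75 / 100 * POPULATION_SIZE)

# Hypothetical population: 1 = satisfied, 0 = not satisfied.
population = [1] * TRUE_SATISFIED + [0] * (POPULATION_SIZE - TRUE_SATISFIED)


def spread_of_estimates(sample_size, repeats=500, seed=1):
    """Draw many random samples and return the standard deviation
    of the resulting sample proportions."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        sample = rng.sample(population, k=sample_size)
        estimates.append(sum(sample) / sample_size)
    return statistics.stdev(estimates)


spread_100 = spread_of_estimates(100)
spread_1000 = spread_of_estimates(1000)

print("estimated satisfied customers:", estimated_satisfied)
print("spread of estimates, n=100: ", round(spread_100, 3))
print("spread of estimates, n=1000:", round(spread_1000, 3))
```

In this sketch, the estimates from samples of 1,000 customers vary less from sample to sample than the estimates from samples of 100, which is what is meant by a larger sample giving a more precise claim.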

I will visualize these steps of making a claim about a population using a diagram.

The population is represented by question marks, then we select our sample, and

then we collect data for each unit in the sample and analyze the sample data.

Finally, we use this analysis to make claims about the entire population.

Let's take a look at another example.

Remember that we were studying coffee beans and

you were interested in measuring the caffeine percentage in these beans.

We measured the caffeine percentage per batch.

What will be the population of coffee batches that I need to study?

Well, that will be the coffee batches produced this morning, this afternoon,

and during the night shift, but we would also have to include all batches

produced yesterday and tomorrow. So how will I describe the population?

The population would be all batches of coffee produced, and

to be produced. The batches of coffee that are selected to draw

conclusions about the population are called the sample.

Let's summarize. The population is all units of interest, all batches of coffee.

You select some of these batches to study; that's your sample.

You collect data on the units in your sample.

We can analyze this data using descriptive techniques, such as the histogram,

the mean, and the standard deviation. For that, see the earlier videos.

However, in the end, we want to draw conclusions about all produced

coffee batches, the entire population, and

not just about the 40 batches of coffee that we studied in our sample.

So, we will use the sample distribution to estimate the population distribution and

we will use the sample statistics to estimate the population parameters,

but this is a topic for another video.
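
As a rough preview of that idea, the sketch below compares sample statistics with population parameters. The real population also contains batches that have not been produced yet, so it cannot be listed. The 10,000 simulated batches, the average of 1.2% caffeine, and the spread of 0.1 are all invented stand-ins for this example.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Hypothetical stand-in for the population: the caffeine percentage of
# 10,000 batches. The values 1.2 and 0.1 are invented for illustration.
population = [rng.gauss(1.2, 0.1) for _ in range(10_000)]

# The sample: 40 randomly selected batches.
sample = rng.sample(population, k=40)

# Sample statistics, computed from the 40 batches we actually measured.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Population parameters, normally unknown; available here only because
# the population is simulated.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)

print("sample mean:", round(sample_mean, 3), "population mean:", round(population_mean, 3))
print("sample sd:  ", round(sample_sd, 3), "population sd:  ", round(population_sd, 3))
```

In practice only the two sample values are available, and they serve as estimates of the population values.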

For now, you should remember that a sample is a subset of the entire population,

and the sample is used to say something about the population.