Next, let's look at some examples of how we would calculate base weights. The first one I've got here is a stratified simple random sample of establishments. I've got five strata here: manufacturing, retail, wholesale, service, and finance. And I've got the number of establishments in each stratum listed here. They range from 600 up to 2,300. And you can see that the sample size, as shown in this column, is the same: 50 establishments from each stratum, and I'm drawing them with simple random sampling. So I calculate the base weight as follows. I take the population size, 600, that's cap N sub h, divided by the sample size, 50 in this case, which comes out to be 12, over in this column. So I go down through every stratum doing this. Finance, for example, is 500 divided by 50, which is 10. And we get the set of base weights in this column. That means each one of the 50 units in stratum 1 gets a base weight of 12, each one of the 50 units in stratum 2 gets a base weight of 24, and so on.

Now, one more thing to note about this is that if I sum up the base weights across all sample units and all strata, I'm going to get exactly cap N. Now, why does that happen? It's because I've got a base weight of 12 for each unit in the first stratum. There are 50 of them. 12 times 50 is 600. It works the same way in each of the strata, and when I sum across all of them, I'm going to recover the population count of 5,000. Now, that is not always exactly true. Generally, what we're going to have is that the sum of the base weights is an estimate of the population size. It may not be exactly cap N, although in stratified simple random sampling and simple random sampling, it will be exactly true.

So let's look at another example. This one is probability proportional to size, and I've got a really, really small example here. My entire population is four schools. Here they are, 1, 2, 3, 4, and I've listed off the number of students in each school, so these go 50, 30, 20, 100.
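Stepping back to the stratified example for a moment, the base weight arithmetic can be sketched in a few lines of Python. This is only an illustration, not something from the lecture: the function name `base_weights` and the stratum labels are made up for the sketch, and only the stratum counts that can be recovered from what was said (600, the 1,200 implied by a weight of 24, and 500 for finance) are used, so the total here is 2,300 rather than the full 5,000.

```python
def base_weights(pop_counts, sample_sizes):
    """Base weight per stratum under stratified SRS: N_h / n_h."""
    return {h: pop_counts[h] / sample_sizes[h] for h in pop_counts}

# Hypothetical labels; counts for the remaining two strata are not
# stated in the lecture, so they are left out rather than guessed.
pop_counts = {"stratum 1": 600, "stratum 2": 1200, "finance": 500}
sample_sizes = {h: 50 for h in pop_counts}  # 50 sampled per stratum

weights = base_weights(pop_counts, sample_sizes)

# Summing the base weight over every sampled unit (n_h units per stratum)
# gives back the population count of the strata that were included.
weight_total = sum(weights[h] * sample_sizes[h] for h in weights)

print(weights)
print(weight_total, sum(pop_counts.values()))
```

The last line prints the two totals side by side; under stratified simple random sampling they agree exactly, which is the point made above.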
And the first thing I do is calculate the proportionate size in column C here, so if I take 50 over 200, that's one-fourth or 0.25. 100 over 200 is 0.50. Now, in this particular PPS method of selecting samples, I calculate the selection probability as the sample size times the proportionate measure of size. So in my little example, what I'm saying is that we'll just do a sample size of 2 right here. So in this column, I take 2 times the proportionate size, so 2 times 0.25 is 0.5, and so forth. So you'll see, the last one has a selection probability of 1.00, which just means that it's so big that we're sure to hit it when we take a PPS sample. And I calculate the base weights as the inverse of these selection probabilities. So 1 over 0.5 is 2, 1 over 0.3 is 3.33, and I've done this for all 4 units in the population, but I'm only going to select a sample of 2. So suppose that schools 2 and 4 are the ones that I pick, so I've got one school with a base weight of 3.33 and another with a base weight of 1. If I add that up, I get 4.33, as shown here. Now, that's not equal to the population size of four. It's in the neighborhood but not exactly. This is typical in PPS samples. What it amounts to is that selecting proportional to size is typically not a very good way of estimating the total of a zero-one characteristic, and the population count is just the total of a variable that equals 1 for every unit.

All right, finally let's look at an example of a two-stage sample. So I'm going to take my school example from the last page and just add on another stage of sampling. So I'm just repeating my school selection probabilities here with the sample size of 2. There they are, and I say that I'm going to select the same two schools, number two and number four here. And then within each school, I'm going to select ten students by simple random sampling, as stated here. So I've got ten selected here and ten selected there. So I'm picking 10 out of the 30 that are in school 2, and 10 out of the 100 that are in school 4.
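Before carrying on with the two-stage example, the PPS arithmetic from a moment ago can be checked with a short Python sketch. This is an illustration under the lecture's numbers only (school sizes 50, 30, 20, 100 and a sample of 2 schools); the variable names are invented here, and the sketch does not handle a school whose sample size times proportionate size would go above 1.

```python
from fractions import Fraction

# School sizes from the lecture's four-school population.
students = {1: 50, 2: 30, 3: 20, 4: 100}
n_schools = 2                   # number of schools to select
total = sum(students.values())  # 200 students in the universe

# Selection probability = sample size * proportionate measure of size.
sel_prob = {s: Fraction(n_schools * m, total) for s, m in students.items()}

# Base weight = inverse of the selection probability.
base_weight = {s: 1 / p for s, p in sel_prob.items()}

# Suppose schools 2 and 4 are the ones selected, as in the lecture.
sample = [2, 4]
estimated_count = sum(base_weight[s] for s in sample)

for s in students:
    print(s, float(sel_prob[s]), round(float(base_weight[s]), 2))
print("sum of sampled base weights:", round(float(estimated_count), 2))
```

Exact fractions are used so that 1 over 0.3 stays as 10/3 until it is rounded for display. The sum for schools 2 and 4 is 13/3, about 4.33, compared with a true count of 4 schools.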
So the conditional selection probability of a student in school 2 is 10 over 30 or one-third, and the conditional selection probability of a student in school 4 is 10 over 100 or 0.1, as shown here. Now, when I take the overall student selection probability, I'm going to get 0.3 times a third, which is 0.1, in school 2, and 1 times 0.1, which is 0.1, in school 4. I invert those, and sure enough, you see, I get exactly the same base weight of 10 in these two cases. That's not an accident, so we can just sketch out the calculation here. If I look at school 2, let's do that one. I've got a selection probability of 2 times the proportionate size of school 2. So that's 30 out of, you see I've got the grand total here of 200 in the universe. So 2 times 30 over 200, and then within the school, I'm going to pick 10 out of the 30 that are there. So the 30s cancel out, and I get what? 20 over 200. And what is that? That's just one-tenth, as shown above. So you can write this out in symbols, but what happens is you're always going to get this cancellation. So regardless of whether the schools are different sizes or not, my 30 is going to cancel with the 30 down here. In school 4, I'm going to have 100 canceling with 100. If I picked school 3, I would have 20 canceling with 20. And in all cases, I'm going to get the same 2 times 10 over 200 calculation.

And this, as you may remember from the earlier course, is called a self-weighting sample, where you end up with everything having the same weight. So that's a convenient way to do things, and it can be efficient if there's no reason to think that certain students are worth more than others in the sense of reducing the variance of what you're trying to estimate.
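The cancellation described above can be checked for every school, not just the two that were picked, with a small Python sketch. It assumes the lecture's setup (PPS selection of 2 schools, then a simple random sample of 10 students in each selected school); the function name `overall_prob` is made up for the illustration.

```python
from fractions import Fraction

students = {1: 50, 2: 30, 3: 20, 4: 100}  # school sizes from the lecture
n_schools = 2      # first-stage sample size (schools, PPS)
n_students = 10    # second-stage sample size per school (SRS)
total = sum(students.values())  # 200

def overall_prob(school):
    """Overall student selection probability in a two-stage design."""
    size = students[school]
    first_stage = Fraction(n_schools * size, total)  # school is selected
    second_stage = Fraction(n_students, size)        # student, given school
    return first_stage * second_stage                # school size cancels

probs = {s: overall_prob(s) for s in students}
weights = {s: 1 / p for s, p in probs.items()}

for s in students:
    print(s, probs[s], weights[s])
```

Whichever schools end up in the sample, the overall probability works out to 2 times 10 over 200, which is one-tenth, so every sampled student carries a base weight of 10.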