We live in a complex world with diverse people, firms, and governments whose behaviors aggregate to produce novel, unexpected phenomena. We see political uprisings, market crashes, and a never-ending array of social trends. How do we make sense of it? Models. Evidence shows that people who think with models consistently outperform those who don't, and moreover, people who think with lots of models outperform people who use only one. Why do models make us better thinkers? Models help us to better organize information - to make sense of that fire hose or hairball of data (choose your metaphor) available on the Internet. Models improve our abilities to make accurate forecasts. They help us make better decisions and adopt more effective strategies. They can even improve our ability to design institutions and procedures. In this class, I present a starter kit of models: I start with models of tipping points. I move on to cover models that explain the wisdom of crowds, models that show why some countries are rich and some are poor, and models that help unpack the strategic decisions of firms and politicians.
The models covered in this class provide a foundation for future social science classes, whether they be in economics, political science, business, or sociology. Mastering this material will give you a huge leg up in advanced courses. They also help you in life. Here's how the course will work. For each model, I present a short, easily digestible overview lecture. Then I'll dig deeper: I'll go into the technical details of the model. Those technical lectures won't require calculus, but be prepared for some algebra. For all the lectures, I'll offer some questions, and we'll have quizzes and even a final exam. If you decide to do the deep dive and take all the quizzes and the exam, you'll receive a Course Certificate. If you just decide to follow along with the introductory lectures to gain some exposure, that's fine too. It's all free. And it's all here to help make you a better thinker!

From the lesson

Aggregation & Decision Models

In this section, we explore the mysteries of aggregation, i.e. adding things up. We start by considering how numbers aggregate, focusing on the Central Limit Theorem. We then turn to adding up rules. We consider the Game of Life and one-dimensional cellular automata models. Both models show how simple rules can combine to produce interesting phenomena. Last, we consider aggregating preferences. Here we see how individual preferences can be rational, but the aggregates need not be. There exist many great places on the web to read more about the Central Limit Theorem, the Binomial Distribution, Six Sigma, the Game of Life, and so on. I've included some links to get you started. The readings for cellular automata and for diverse preferences are short excerpts from my books Complex Adaptive Social Systems and The Difference, respectively.

Professor of Complex Systems, Political Science, and Economics; Center for the Study of Complex Systems

Hi. In this lecture we're going to talk about a really simple model of aggregation. Here's the thing I want to model: a situation where I've got a group of people -- it could be 100, it could be 1,000 -- and each one is independently going to make a decision to do something. It could be to, y'know, go to the gym. It could be to go to the beach. It could be to go to the grocery store. What I want to understand is this: we've got a whole bunch of people, each one making these independent decisions. What's the number of people that shows up?

Now, to characterize that, I'm going to use an idea called a probability distribution. To make this simple, let's suppose there's a small group of people, like my family, which has four people in it. And I want to know: what's the distribution of the number of those four people who go for a walk on a given Saturday? Well, think about what the numbers could be -- there could be 0 people that go, there could be 1, there could be 2, there could be 3, or it could be that all 4 of us decide to go for the walk, right? The dog would prefer that all four of us went, but, y'know, some number is going to go.

So, I could keep track of data. I could, y'know, chart this on my wall somewhere. And then you can ask, "What's the likelihood that nobody went for a walk?" Maybe that's 10%. What's the likelihood that 1 person went for a walk? Well, that might be 15%. What about 2 people? That might be 40%. And what about 3 people? That might also be 15%. And then, what's the likelihood that all 4 of us went for a walk? That might be, let's say, 20%.

Now, the thing to know about a probability distribution is that each one of these probabilities is at most one, right? And if we sum them up, we get 10 plus 15 is 25, plus 40 is 65, plus 15 is 80, plus 20 is 100. So we get a total of 100%. So a probability distribution tells us what the different things are that could happen -- 0, 1, 2, 3, and 4 -- and then it tells us the likelihood of each of those things.
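The family-walk distribution can be written down directly; here is a small Python sketch using the lecture's hypothetical percentages:

```python
# Hypothetical distribution over how many of the four family members
# go for a walk, using the lecture's made-up percentages.
dist = {0: 0.10, 1: 0.15, 2: 0.40, 3: 0.15, 4: 0.20}

# Each probability is at most one...
assert all(0 <= p <= 1 for p in dist.values())

# ...and together they sum to 100%.
total = sum(dist.values())
print(total)  # 1.0, up to floating-point rounding

# Bonus: the expected (average) number of walkers.
mean = sum(k * p for k, p in dist.items())
print(mean)  # = 0*0.10 + 1*0.15 + 2*0.40 + 3*0.15 + 4*0.20, about 2.2
```

Any dictionary whose values are nonnegative and sum to one works the same way; the keys are the outcomes and the values are their likelihoods.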


OK, so here's sort of the huge result that we're going to leverage to understand how things add up. There's a theorem called the Central Limit Theorem, and what the Central Limit Theorem tells us is what happens if I add up a whole bunch of individual, independent events. So what does 'independent' mean? It means my decision to go to the beach is independent of your decision to go to the beach, which is independent of your cousin Mary's decision to go to the beach. So, by independent, I mean not influenced: I don't care whether you're going to the beach or not. I'm going to make my decision on my own, completely independent of what you decide to do... or your cousin Mary.

So, what the Central Limit Theorem tells us is that if a whole bunch of people make a whole bunch of independent decisions, the distribution that we get has this nice bell-shaped curve. And this bell-shaped curve means that the most likely outcome is the one right in the middle. So there's a lot of structure to what happens, and that means we can predict a lot of things. We can tell a lot about what's going on in the world. And that's what we're going to learn about in this lecture. It's going to be a lot of fun.

To get an understanding of where these distributions come from, let's start really simple. Suppose I flip a coin twice, and I want to know: what are the odds of getting a head? What's the probability distribution over heads? Well, what could I get? I could get tails-tails, and that would be 0 heads. I could get tails-heads or heads-tails -- both of those would be 1 head. Or I could get heads-heads, and that would be 2 heads. So, what's the probability of each of these? The probability of getting tails-tails is just 1/4. The probability of getting 1 head is 1/2. And the probability of getting 2 heads is 1/4. So I get a probability distribution: over the outcomes 0, 1, 2 there's a 1/4 chance, a 1/2 chance, and a 1/4 chance. You notice it sort of looks like a little bell curve.

OK. Let's suppose I flip it 4 times. Well, it gets harder. I could think: what are the odds of getting no heads? I could get tails-tails-tails-tails. How do I figure out the probability of that? Well, 1/2 times 1/2 times 1/2 times 1/2 -- four one-halves -- and 2 times 2 times 2 times 2 is 16, so that's 1/16. What are the odds of getting one head? Well, I could get the head first and then 3 tails, I could get it second, I could get it third, or it could come last. So there are four places it could show up, which means there's a 4/16 chance. I could do all sorts of math again for the odds of getting two heads, and I'd actually get 6/16. And 3 heads? Well, that's really the same as getting 1 head, because heads and tails are interchangeable -- so I'd get 4/16 again. If I drew this distribution out, I'd get a peak at 2 heads, right? I'm going to get a nice bell curve: there's very little chance of getting no heads, not that much chance of getting 4 heads, but the most likely thing is getting 2 heads.
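We can check these counts by brute force -- a short sketch that enumerates all 16 equally likely four-flip sequences:

```python
from itertools import product
from collections import Counter

# All 2**4 = 16 equally likely sequences of four coin flips,
# tallied by how many heads ("H") each one contains.
counts = Counter("".join(flips).count("H")
                 for flips in product("HT", repeat=4))

print(dict(counts))
# 0 heads: 1 way, 1 head: 4 ways, 2 heads: 6, 3 heads: 4, 4 heads: 1
# -- so the probabilities are 1/16, 4/16, 6/16, 4/16, 1/16.
```

The same enumeration works for any number of flips, though the number of sequences doubles each time -- which is exactly the problem the lecture turns to next.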


So I can count all this stuff, and it's fun. But here is the problem: remember, we've talked about big data -- lots of data -- and we want to try and understand it. Often we have more than 2 or 4; we have n, and n can be a huge number. If we're talking about New York City, that can be 10 million people. If we're talking about Ann Arbor, where I live, that's still like a hundred thousand people. So I don't want to be sitting there writing tails, tails, tails, tails, tails a hundred thousand times. I want a model that will help me explain it.

So here's what you can do: if you have N things, the mean -- the expected number -- should be N/2, right? Half of N. But what we'd like is to understand what that distribution looks like. Well, what we know from statistics is that the distribution is actually going to be a nice bell curve: the mean, right in the middle, is going to be N/2, and the curve flows out nice and symmetrically on each side. Now, there's a fancy equation, a formula, that tells you exactly what this line looks like. We're not going to get into that, but if you take a statistics class -- which I'd encourage you to do; it's a lot of fun -- you can learn exactly what the formula is and how it works, OK? We just want to use it as a model for understanding how things aggregate. So we're going to take some leaps ahead in statistics.

Here's the trick, though: we've got to be a little bit careful. Flipping a coin is always equally likely -- it's either a head or a tail, each one 50/50. But if I'm worried about people going to the beach, or people going to the supermarket, or people showing up for their flight, that's not a 50/50 proposition, right? Maybe 90% of people show up for their flight, and maybe only 10% or 15% of people go to the beach. So I'd like to change that 1/2 into something else. Well, I can introduce something called the binomial distribution, where instead of 1/2 there's some probability p of doing the thing. Let's suppose going to the beach happens 15% of the time. Well then, if I had 1,000 people and p = 15%, then p times N is 150, so I expect 150 people to show up. That makes sense, but then I can ask: what's the distribution now? I mean, 150 is the average, but I could get 200, I could get 74. Again, what the Central Limit Theorem tells us is that we're going to get a nice bell curve with the mean at p times N -- provided N is big enough. If N is pretty large, you're going to get a nice bell curve, and the mean is going to be right at p times N.
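As a quick sketch (the 15% beach probability and the 1,000 people are just the lecture's hypothetical numbers), we can simulate a bunch of independent beach decisions and check that the average sits near p times N:

```python
import random

random.seed(0)  # make the sketch reproducible

N, p = 1000, 0.15  # 1,000 people, each goes with probability 15%

def beach_day():
    """One simulated day: how many independent people show up."""
    return sum(random.random() < p for _ in range(N))

# Averaged over many simulated days, attendance sits near p * N = 150.
days = [beach_day() for _ in range(2000)]
print(sum(days) / len(days))  # close to 150
```

Any single day can stray a fair distance from 150; how far it typically strays is exactly the standard-deviation question the next part of the lecture takes up.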


OK, there's more. Here's where it gets a little bit complicated but also interesting. There's something called the standard deviation -- this thing called sigma. Now, when I draw the normal curve, there's going to be a mean, the point right at the center, and then there's going to be a standard deviation, which basically tells us how far spread out that curve is -- that is, how far spread out the different outcomes are. It turns out there's this nice structure to any normal distribution. If you tell me the mean and you tell me the standard deviation, it's always going to be the case that 68% of all outcomes fall between -1 and +1 standard deviation of the mean. So if it's got a big standard deviation, that range could be really wide; if it's got a small standard deviation, that range is going to be really tight. But if you tell me the mean and the standard deviation, it's always going to be the case that 68% of the time I'm between -1 and +1 standard deviation. Now, since there's a rule like that for one standard deviation, there are similar rules for 2, 3, and 4. In particular, there's going to be a 95% chance I'm within 2 standard deviations.

So wait -- why do we care about this stuff? Here's why. Now I've got this model that says if I add up a bunch of independent events, here's what the mean is. In a second, I'm going to show you the formula for the standard deviation, so you'll know what sigma is too. Well, if you know the mean and you know sigma, then I can give you a range: I can tell you that 95% of the time, you're going to be between -2 sigma and +2 sigma of the mean. So if I said the mean number of people that showed up is a hundred, and the standard deviation is only 2, then you'd know that 95% of the time you're going to be between 96 and 104. So you'd know: OK, I should prepare for pretty much exactly 100 people. If I told you the standard deviation was 15, then you'd know it could be anywhere between 70 and 130. That's what we want to use this model to explain: how wide a range of outcomes we're likely to see in any particular setting.
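That planning logic fits in a couple of lines; a minimal sketch using the lecture's numbers:

```python
# Rules of thumb for a normal distribution: about 68% of outcomes fall
# within 1 standard deviation of the mean, about 95% within 2.
def normal_range(mean, sigma, k):
    """Range covered by mean +/- k standard deviations."""
    return (mean - k * sigma, mean + k * sigma)

# The lecture's event-planning example: 100 people expected on average.
print(normal_range(100, 2, 2))   # (96, 104): plan for ~100 people
print(normal_range(100, 15, 2))  # (70, 130): plan for a much wider crowd
```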


So let's go back to our simple binomial distribution, where the probability was 1/2. The mean, remember, is just N/2. The standard deviation -- you can do a little bit of math and show this -- is the square root of N, divided by 2. So let's suppose N = 100. That tells me the mean is going to be 50: if I flip a coin a hundred times, guess what, the average is 50 heads -- no surprise. But the standard deviation is the square root of N, over 2. What's the square root of 100? That's 10. So the standard deviation is 10 over 2, which gives 5.

So what that tells me, if I draw this binomial distribution out, is that I've got a mean of 50 and a standard deviation of 5. That means 68% of all outcomes fall between 45 and 55. So, if you want, you can do this at home -- it'll take a while. Flip a coin a hundred times and count how many heads you get. Flip it again; count how many heads again. Do that a whole bunch of times, and you'll find that about 68% of the time you get between 45 and 55 heads.
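If you'd rather not flip a real coin for hours, here's a simulated version of that at-home experiment:

```python
import random

random.seed(1)  # reproducible

def heads_in_100_flips():
    return sum(random.random() < 0.5 for _ in range(100))

trials = [heads_in_100_flips() for _ in range(10000)]

# Fraction of trials within one standard deviation (45 to 55 heads)
# of the mean of 50. Expect something near the 68% rule of thumb --
# a bit higher here, since the binomial is discrete and the endpoints
# 45 and 55 are themselves included.
within = sum(45 <= h <= 55 for h in trials) / len(trials)
print(within)
```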


So what this model gives us is a sense of how strange an outcome we're likely to get. We know that most of the time -- 68% of the time -- we'll be between 45 and 55. Our mean is 50, and 1 standard deviation out is 45 and 55, so 2 standard deviations out is 40 and 60. That tells us that 95% of the time, you're going to be between 40 and 60 heads. And more than 99% of the time, you're going to be between 35 and 65. So basically, you're almost never going to throw fewer than 35 heads, and almost never going to throw more than 65. And that's the sort of power the Central Limit Theorem has: it gives us a sense of not only the average, but also what the spread will be. OK, but remember, this is the simple case -- the p = 1/2 case. What we'd like is the more general case, where the probability of something happening can be anything: this p times N setting. It turns out we're OK there too, because the standard deviation is just p times (1 - p) times N, with the square root taken of the whole thing. In the case where p = 1/2, that's the square root of 1/2 times 1/2 times N. But notice I've got 1/2 squared inside the square root, so I can pull a 1/2 outside, and it's just 1/2 times the square root of N -- that's where that 'square root of N, over 2' came from. So now, for the binomial distribution, I've got this clean formula as well, and we can use it to model and understand stuff that's a little bit more interesting than flipping a coin.
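The general standard-deviation formula is a one-liner; a quick sketch that also confirms the p = 1/2 special case:

```python
import math

# Standard deviation of a binomial: sqrt(p * (1 - p) * N).
def binomial_sd(p, n):
    return math.sqrt(p * (1 - p) * n)

print(binomial_sd(0.5, 100))  # coin-flip case: sqrt(100) / 2 = 5
print(binomial_sd(0.9, 400))  # 90% show-up rate, 400 people: sqrt(36) = 6
```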


Let's do a real example -- let's have some fun. Most of us have probably been bumped off a plane before: you show up at the airport and too many people showed up for the plane. You think, why did they do this? The reason they sometimes have to bump people is that they oversell, and the reason they oversell tickets is that not everybody shows up. So if you're running an airline and you've got 400 seats, and you know people show up, say, 90% of the time, you want to sell more than those 400 seats so that your plane is pretty much full. So let's do an example. Let's make it simple and suppose we've got a Boeing 747 with 380 seats. And let's suppose that 90% of the time, people show up. We run an airline, we've gathered lots of data, and we pretty much know that 90% of the time people show up, and that it's independent: one person's decision to show up doesn't have anything to do with anybody else's. Now, that might not be true -- if it's snowy and I'm late, you're likely to be late too. But let's just suppose these things are independent. And let's suppose that we sell 400 tickets. Now we're trying to understand what that means. What's the likelihood that, if we sell 400 tickets, more than 380 people show up?

Here's where the model can help us: it can tell us what the mean is, and it can also tell us what the standard deviation is. The mean: if I sell 400 tickets and on average 90% of people show up, that means on average 360 people should show up. That's less than 380 seats, so on average we should be fine -- but what I care about is the chance that more than 380 people show up, because they're going to be like, "I paid for this to go to Florida, I want to go to Florida, I don't want to be bumped." If more than 380 show up, guess what, they're going to be mad. So the 360 doesn't tell us enough; we want to know something about the distribution. OK, well look, we've got a formula, remember. N was 400 and p was 0.9, so p times N is 360 -- that's our mean. The standard deviation we can solve for pretty easily: it's the square root of p, which is 0.9, times 1 - p, which is 0.1, times N, which is 400. If we multiply that out: 0.1 times 400 is 40, times 0.9 is 36. That gives the square root of 36, which is 6. So 6 is our standard deviation, and I get a bell curve with a mean of 360 and a standard deviation of 6. Well, that's useful -- let's go back and look. Our mean is 360 and our standard deviation is 6, so 68% of the time we're going to be between 354 and 366. That's great. It means 95% of the time we'll be between 348 and 372 -- also great. And it means about 99.7% of the time we'll be between 342 and 378. Well, how many seats do we have? We have 380 seats. So this means that more than 99.7% of the time, we won't overbook.
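A sketch of that overbooking calculation, using the normal approximation the lecture relies on (the normal CDF is computed from Python's built-in error function):

```python
import math

def normal_cdf(x, mean, sigma):
    """Normal cumulative distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))

N, p, seats = 400, 0.9, 380
mean = p * N                        # 360 expected passengers
sigma = math.sqrt(p * (1 - p) * N)  # 6

# Chance that more than 380 of the 400 ticket-holders show up.
# 380 is more than 3 standard deviations above the mean of 360,
# so this comes out to a very small number (well under 0.1%).
print(1 - normal_cdf(seats, mean, sigma))
```

Changing `N` lets an airline ask the reverse question: how many tickets can it sell before the overbooking risk grows past whatever it's willing to tolerate.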


So here's the Central Limit Theorem -- let's state it formally. The Central Limit Theorem says the following. We've got a whole bunch of random variables. Those could be decisions to show up for a flight or not -- in that case, the random variables are just 1s and 0s. Or they could be, you know, the weight of your bag: each person's bag weight is an independent random variable. We need those things to be independent, so that each person's decision doesn't depend on somebody else's, and how much stuff I jam in my bag doesn't affect how much stuff you jam in your bag. And we need them to have finite variance. What does that mean? It means they're bounded: we can't have super huge values, like my bag weighing billions and billions of pounds. So long as the possible range of values each one can take is bounded in some way, or doesn't take huge, huge values with high probability, then when you add those things up -- when you sum them -- you're going to get a normal distribution, which means a bell curve, which means we can predict stuff. We can use that model to make sense of how the world works.
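The theorem is easy to see in simulation. A sketch that sums independent uniform draws (a stand-in for any bounded, independent random variables) and checks the 68% rule:

```python
import random

random.seed(2)  # reproducible

# Sum 48 independent uniform(0, 1) draws. Each draw is bounded, so the
# CLT applies: the sums should pile up in a bell curve around the mean.
def one_sum(n=48):
    return sum(random.random() for _ in range(n))

sums = [one_sum() for _ in range(20000)]

# Each uniform has mean 0.5 and variance 1/12, so the sum has
# mean 48 * 0.5 = 24 and standard deviation sqrt(48 / 12) = 2.
mean = sum(sums) / len(sums)
within_1sd = sum(22 <= s <= 26 for s in sums) / len(sums)
print(mean)        # close to 24
print(within_1sd)  # close to 0.68
```

Swap the uniform draws for coin flips, bag weights, or show-up decisions and the same bell-curve behavior appears -- that's the point of the theorem.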


Now, let's step back for just a second and think about why this is so cool. Suppose it weren't true -- here's a little thought experiment. Suppose it were the case that when I added up a bunch of independent events, most of the time I got something nice, but there was some spiky probability of some huge event way out in the tail. What would that mean? Well, it would mean that sometimes you'd go to the grocery store and there'd be, like, 1,000 people there. Or sometimes you'd think, "I'm just going to run to the bathroom," and there'd be 300 people in line. A lot of the predictability of the world -- a lot of the predictability of these daily comings and goings -- stems from the fact that this can't happen, and that we get these nice bell curves. Because if individual people, individual firms, individual groups of people make decisions that don't depend on what other people decide -- independent decisions -- then what you're going to get is nice, regular stuff, according to a bell curve. Yeah, sure, there'll be traffic jams; sure, there'll be a lot of people at the mall. There will be days when there's a lot going on and days when nothing much is going on. But most of the time, you're going to get things in that narrow region, which is going to be predictable and understandable.

Now, is everything normally distributed? No, it's not. What about stock returns? If you look at stock returns, you'll actually see that there are far too many days where really nothing happens, far too many days where there are huge gains, and far too many days where there are huge losses. What's going on there is that the actions are no longer independent. For example, if prices are going up, a lot of people may buy, and that's going to cause prices to go up even further. And if prices start to fall, people may sell, and that can cause prices to fall even further. So when events fail to be independent -- fail to satisfy the independence assumption -- we can get more big events than we'd expect and more small events than we'd expect.

So let's wrap this up -- what have we got? We can use the Central Limit Theorem as a model to explain how, if we add up a bunch of independent events, we get a nice normal distribution. We can understand the mean, we can understand the standard deviation, and we can use those to predict how likely things are to occur. We also learned that it's the independence that gives us the normality. Without independence, we can get really big events, really small events -- all sorts of strange stuff happening. So where we're going to go next: there's a brief lecture on something called Six Sigma, which pushes this idea of the predictability of a system a little bit further than we've taken it here. But after that, we're going to start looking at systems where there are interdependent actions. And when we have those interdependent actions, we're no longer going to get these nice bell curves -- we're going to get all sorts of really interesting, strange stuff. It's going to be a lot of fun. Alright, thank you.