Welcome back. In the previous lecture, we talked about how we could use models to

become clearer thinkers. In this lecture, what we're gonna do is talk about how we

can use models with data. And this is an important reason why people use models. In

fact, when you talk to scientists about why they use models, whether they are social

scientists or natural scientists, what they'll typically say is, well, we use

models to take them to data, to basically use and understand data in better ways.

What I am going to do is unpack that in several directions. I wanna give some

specific reasons or ways in which people use models with data. Alright, so the first

one, the first real reason, is just to understand some basic patterns in the

data. So what do I mean? Well, you could look at data and it could just be a

straight line, and nothing could change. So if you look at a system where we know

that energy is neither lost nor gained, then

energy is a constant. And we have a model that explains why we see energy being a

constant. Alternatively, we can see something that's a straight line, an

increasing line, and we can have a model that explains that. And then we also

talked about how we can see patterns in data. So we could see things that go up

and down slowly like this, like business cycles and we have models that tell us why

we see these kinds of cyclic curves. We could have something that's much more

spiky. We could have a model that explains that. So, again, we talked about how there's

this sort of hairball of data, this firehose of data. There's tons of data out

there. That data's gonna have patterns to it. And what we can do is use models to

understand why we see those particular patterns. Okay. In addition to the

patterns, there's also the use of models to predict specific points. So suppose you

are looking for a house and you see this house that's for sale and you're

wondering, I wonder how much that house is gonna cost. Well, you can have a model

that says, okay, the price of the house depends on its size. So here's sort of the size of

the house in square feet. And here's the price. We just put a dollar sign there for

price. And maybe you get a linear model. And your linear model says basically for

every, you know, additional square foot, the price of the house goes up $100 or $200

or something like that. Well, then, if this is your model, and you've got

a house that's got this many square feet, it's 2,000 square feet, right, and you go

up here and find the point, and it's $100 per square foot, then your model would

predict that the house is $200,000. So we can use just a simple model to make some

sort of prediction about, just in the ballpark, how much a particular house

would cost. So this is, again, a common use of models: you construct a model,

and from that model, you predict a point value. Okay, third reason why we use

models. It's not so much to predict the points, but to produce bounds. So suppose

you're the economic advisor to the president, not a job you'd necessarily

want, [laugh], but suppose you are. And the president comes to you and says,

what's inflation gonna be next year or next month? Well, you know, inflation

doesn't move that quickly. You might be able to say to the president, well, you

know, I think it's gonna be 1.2%. And you might be pretty confident that it's 1.2%.

But suppose the president says, you know what? I'm just doing some long range

forecasts, so, what if, what's inflation gonna be ten years from now? Well, who

knows what inflation is going to be ten years from now? So you may have some fairly

sophisticated models, but they're not going to give you a point estimate. So,

instead, what they might say is that I can tell you with pretty high probability that

it's going to be between zero and three percent. So it gives you a range. Right?
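One way to picture how a model yields a range rather than a point is to simulate many possible futures and report the interval that contains most of them. The sketch below is only an illustration: the random-walk model, the 1.2% starting value, and the size of the yearly shocks are assumptions chosen for the example, not figures from the lecture or from any real forecasting model.

```python
import random


def simulate_final_rates(start=1.2, years=10, shock_sd=0.5, n_paths=5000, seed=0):
    """Simulate hypothetical inflation paths as a simple random walk.

    Every parameter here is an illustrative assumption. Returns the
    simulated inflation rate at the end of each path.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        rate = start
        for _ in range(years):
            rate += rng.gauss(0.0, shock_sd)  # one random yearly shock
        finals.append(rate)
    return finals


def interval(values, coverage=0.9):
    """Return (low, high) bounds that contain roughly `coverage` of the values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    low_index = int(tail * (len(ordered) - 1))
    high_index = int((1.0 - tail) * (len(ordered) - 1))
    return ordered[low_index], ordered[high_index]


one_year = interval(simulate_final_rates(years=1))
ten_year = interval(simulate_final_rates(years=10))
print("1-year 90% range:", one_year)
print("10-year 90% range:", ten_year)
```

Under these assumptions the ten-year range comes out much wider than the one-year range, which matches the point being made here: the further out you look, the less a point estimate means and the more useful a set of bounds becomes.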

So your model won't tell you exactly what's going to happen, 'cause there's too

many contingencies out there, there's too much complexity, too much uncertainty. You

can't say for sure, but your model might give you some bounds about what's going to

happen, and that can be really useful for making policy decisions. Okay. Reason

Four: retrodiction. What do I mean by that? Well, you can use models with data

to predict the past. Now, there's a couple reasons you might do this. One reason is

you might not have data from the past, you might want to sort of use models to

figure out, what do we think the past was like? And if you think about it, you know,

geologists do this. You know, biologists do this, anthropologists do this,

archaeologists do this. They use models and data to try and figure out, what do we

think you know, temperature was like, how many animals do you think there were, what

were these civilizations like, those sorts of things. Alternatively, if you have the data, then you

can use models to see how good they are. So you can actually retrodict data to

see if in fact your model would've worked. Let me explain what I mean. Now suppose

we're looking at some data stream. Perhaps it's, let's say,

unemployment. Suppose the unemployment data looks like this for some period of time.

Right. And now what you're doing is, is you're saying okay. We've got a model.

We're gonna ask how well that model will do. So what you do is you sort of fix

that. You give that model data up to here. So it's fitting pretty well. And then at

this point, right here, you say, hey, let's see how our model would predict from here

on out. Now, if you run your model and it sort of goes like this, if it goes like that, you

can say, you know, our model in the past, if we were using the same model in the

past, it wouldn't have worked. And so that makes you fairly dubious about whether the

model's gonna work now. So retrodiction, going back and testing on past data, is a

good way to test how well your model really works. Fifth reason: predicting

other stuff. So you might construct a model for one reason. Let's suppose you're

really interested in the unemployment rate. You know, you construct a model to

predict the unemployment rate. But out of that pops out the inflation rate, so you

get something else. This is a good way to tell, you know, how strong your model

is, 'cause typically, you construct a model for one reason and it gives you other stuff.

There's another type of predicting other stuff that's way cool about models.

So when they developed the first models of the solar system, right? The heliocentric

model, the sun in the center, right? So you've got the sun sitting here in the

center, and the planets orbiting. The math didn't quite work out right. And they figured

out there must be a big planet out here that's causing the orbits of the

other planets to be skewed a little bit. And the big planet was Neptune. They

couldn't see it. But their model predicted it. So the model predicted something,

something else, something other, that was evident in the data. So models can

predict stuff other than what you expect them to predict, which is really

cool. Alright, reason six: to inform data collection. So let's suppose that you're

interested in educational reform, which is something I'm interested in. You want to

think, okay, how do we make better schools? Well, remember in our last

lecture, about being a clearer, better thinker, one thing models force us to do

is name the parts. So, I want to think, how do we make better

schools? Well, there's a lot of data out there on school performance. So what I

want is some sort of model that explains why students do poorly and why

students do well. So you think, well, what are the parts of that model? Well, it might

be things like teacher quality, we call that TQ, right? There might be parental status,

we call that PS, whether your parents went to college, whether they got high school

degrees, whether they're doctors, lawyers, that sort of thing. There might be total

spending in the school district, that might matter, right? Things like class

size, just put CS for class size. Class size probably matters a lot. Right? You

might argue that, you know, technology matters: is there technology in the

classroom? You might even argue, you know, there's general health: is health a big

consideration? And you can even, you know, argue, what are the other

students like in the school? What are the peer effects? What is the effect of

what other students do? So if you don't have a model, you don't even know what

data to go get. So models help you to figure out, okay, what data should we get,

what data should be included, and what data should we go out there and

find. So that use of models can be very useful, since it tells you what data to go

out there and get. Our last two reasons for why you model are a little bit different, but

they're similar to one another. And that is that we can use data, right?

To sort of tell us more about the model, and then we can use the model to tell us

more about the world. So let me, let me explain what I mean a little bit.

[inaudible] confused. So, one thing that these models let us do is estimate hidden

parameters in the model. So, here's sort of a classic model from

epidemiology, the study of disease. It's called the SIR model. So there's three

types of people, there's susceptible people, there's infected people, and

there's recovered people, so there's a disease you could be susceptible to it,

you could be infected, or you could be recovered and when you're recovered then

you're immune. You're not gonna get it again. So let's suppose that you know, you

work for the Centers for Disease Control, and suddenly you see, oh my gosh, people

are getting sick. But you don't know, there's some sort of flu going on. But

you're not quite sure how this is spreading. Is it spreading, is it

airborne, right? Is this virus spreading, you know, through mucus or something?

You're not sure. And you're also not sure how virulent it is, so you're not sure how

many people are gonna get the disease. What you've got, let's draw a little graph

where you've got time on this axis, and you've got the number of people who have

the disease. And what you can do is you can sort of see, over time, exactly how

many people are getting the disease. Well, if you can see over time how many are

getting it from that data, you can predict how virulent the disease is. Like, how

likely it is to pass from one person to the other. And that's gonna allow you to

figure out, is the disease gonna go like this, or is it gonna go like that? And so,

from that data, you can estimate hidden parameters, right? Namely, how virulent

the disease is. Like, you can't tell directly how likely one person is

to get it from another. You know, you can't tell just by looking at the

world. But by looking at how many people get it, you can go back and estimate that

parameter. You can figure it out. That's what's really cool. Alright? Last reason:

calibration, so calibration refers to sort of constructing a model and then

calibrating it as close as possible to the real world. Let me give an example here.

So suppose I want to write a model of forest fires. So I'm going to draw some

really bad trees here. Here's a tree. Here's another tree, right. And I want to

know what's the probability, these are horrible trees, what's the probability

that the fire moves, right, from this tree to this tree. How fast does it move and

all that sort of stuff. Well, what I can do is gather, if the data

exists, tons of data about past forest fires, and with that past data I can

calibrate a really accurate model of forest fires. How likely are they to

spread? How, you know, does their speed depend on how dry the trees are, how much

precipitation there's been, what the wind speed is, all that sort of stuff? Once

I've got all that data, that would allow me then to figure out, you know, how

dangerous are particular forests? Right? I could say, oh my gosh, northern New Mexico

hasn't had rain in over two years. Here's how dry the soil is. Here's how dry the

trees are. Here is, you know, how many acres of forest we have, here's what the

wind speed is. And you can know exactly how dangerous a particular forest happens

to be at that particular moment in time. So you use all sorts of past indexes to

calibrate a particular model, you know, your big model, and then you can use that

model to construct policy. And that's what we're going to talk about in the next

lecture, right, how do we use models to make decisions, to strategize, right, and

to design things. Thank you.
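As a supplement to the SIR example from this lecture, here is a minimal sketch of what estimating a hidden parameter from case counts might look like. It rests on several assumptions that are not in the lecture: a simple discrete-time version of the SIR model, a recovery rate (`gamma`) treated as known, a transmission rate (`beta`) standing in for what the lecture calls how virulent the disease is, and synthetic "observed" data generated from the model itself. All function names and numbers are hypothetical.

```python
def simulate_sir(beta, gamma=0.1, population=1000.0, infected0=1.0, days=60):
    """Discrete-time SIR model; returns the number infected on each day."""
    s, i, r = population - infected0, infected0, 0.0
    infected = []
    for _ in range(days):
        infected.append(i)
        new_infections = beta * s * i / population  # susceptible -> infected
        new_recoveries = gamma * i                  # infected -> recovered
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return infected


def squared_error(observed, predicted):
    """Sum of squared differences between two equally long series."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def estimate_beta(observed, candidates):
    """Pick the candidate beta whose simulated curve is closest to the data."""
    return min(
        candidates,
        key=lambda b: squared_error(observed, simulate_sir(b, days=len(observed))),
    )


# Synthetic "observed" counts, generated with a beta we then pretend not to know.
true_beta = 0.3
observed = simulate_sir(true_beta, days=40)

# Try a grid of candidate transmission rates: 0.05, 0.10, ..., 1.00.
candidates = [round(0.05 * k, 2) for k in range(1, 21)]
estimated_beta = estimate_beta(observed, candidates)
print("estimated beta:", estimated_beta)
```

The grid search is the simplest possible fitting method and is used here only to keep the idea visible: you never observe the transmission rate directly, but the curve of how many people are sick over time pins it down. With real, noisy data you would expect an approximate estimate rather than an exact recovery.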