Hi. In this lecture we're gonna talk about bringing models to data. When I say data, what I mean is, you know, you pull all sorts of numbers off the web, or maybe for your business or your home you've got all this data out there on how much money you've spent each month, or something like that. We'll see how we can use models to understand that data. We're gonna start simple and build up.
So we're gonna start with something that I call categorical models. In a categorical model, what you do is you just place the different data in different boxes. So for example, suppose you had a bunch of data on how long people live, and you want to understand what allows someone to live a long and healthy life. We might create one box of people who exercise and one box of people who don't exercise. Then, just by looking at how much variation there is in each of those boxes and what the means are in those two boxes, you can figure out whether that categorization between exercising and not exercising actually helps explain the variation in the data.
After we do categorical models, we're gonna move on to linear models. Now, let me draw a distinction between a linear model and a line. A line is just something that we plot, right? Here's the Y variable, here's the X variable, and we draw this line. In a linear model, what we assume is that this Y variable depends on X. Right? So this is some sort of relationship. On this axis, X could literally be the amount of exercise you do, and Y could be how long you live. And so this line is saying that how long you live is a function of how much you exercise. So you're literally thinking Y is some function of X. Okay, so we do linear models, and I explain what they mean.
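To make the exercise and no-exercise boxes concrete, here is a minimal sketch of a categorical model in Python. The lifespans are invented purely for illustration, and the "share of variation explained" measure (one minus within-box variation over total variation) is one common way to score a categorization; it is an assumption here, not something stated in the lecture.

```python
# Hypothetical lifespans in years; these numbers are made up for illustration.
exercisers = [82, 85, 79, 88, 86]
non_exercisers = [70, 74, 68, 77, 71]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sum_squared_deviation(values, center):
    """Total squared distance of the values from a chosen center."""
    return sum((v - center) ** 2 for v in values)


# Variation with no boxes at all: everyone is measured against one overall mean.
everyone = exercisers + non_exercisers
total_variation = sum_squared_deviation(everyone, mean(everyone))

# Variation left over once each person is measured against their own box's mean.
within_variation = (
    sum_squared_deviation(exercisers, mean(exercisers))
    + sum_squared_deviation(non_exercisers, mean(non_exercisers))
)

# Share of the total variation that the two boxes account for.
explained_share = 1 - within_variation / total_variation

print("mean (exercise):   ", mean(exercisers))
print("mean (no exercise):", mean(non_exercisers))
print("share of variation explained:", round(explained_share, 3))
```

With these made-up numbers the two box means are 84 and 72, and the boxes account for roughly 78% of the variation. A categorization that did not matter would leave the within-box variation close to the total, so the share would be near zero.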
Then I have a short lecture describing how we fit data to linear models. So if you've got a bunch of data and you wanna construct a linear model, how do we do it? Here's a simple example. Suppose I've got a bunch of data here like this, and I want to ask, you know, what's the best line to go through that data? Well, clearly this would be a terrible line, because it's not near the data. This black line here looks like a pretty good line. It looks pretty close to the data. So we're going to see exactly how to draw the line and what criteria you use, right, to make sure you've got the best line.
After we've done linear models, we're gonna move to nonlinear models, and we'll show that the techniques are actually fairly similar. Now, what do I mean by nonlinear? Well, one simple way that something could be nonlinear is it could start out straight and then kind of flatten out. Or something could start out slow and then get big. Or it could sorta do both: it could start out slow, then get big, and then flatten out, right? So there are all sorts of different shapes a function could take. You know, John von Neumann one time said the set of nonlinear functions is like the set of non-elephants. And by that he meant that the number of nonlinear functions is enormous. So we'll talk a little bit about how we can use some of the same techniques used for linear models to create nonlinear models.
Now, after we do that, we're gonna conclude this unit by talking about something I call the big coefficient. So what do I mean? Well, when you have a linear model, right, you have something like Y = A1·X1 + A2·X2. X1 and X2 are the variables. These are the things that determine the outcome. So we'll talk about, for example, school quality. X1 might be what the class size is, and X2 might be how good the teacher is. And A1 and A2 are the coefficients, and these coefficients tell you how important each variable is.
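On the question of what makes one line "better" than another, here is a minimal sketch in Python. It assumes the criterion is least squares (pick the line with the smallest sum of squared vertical distances to the data), which is a standard choice but is not named in this lecture. The data points and the function names `fit_line` and `squared_error` are hypothetical.

```python
# Hypothetical data: hours of exercise per week (x) and lifespan in years (y).
hours = [0, 1, 2, 3, 4]
lifespan = [70, 73, 74, 77, 81]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = y_bar - slope * x_bar
    return slope, intercept


def squared_error(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the data and a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


slope, intercept = fit_line(hours, lifespan)
good_error = squared_error(hours, lifespan, slope, intercept)

# A "terrible" line for comparison: flat at 90 years, far from every point.
bad_error = squared_error(hours, lifespan, 0.0, 90.0)

print("fitted slope and intercept:", round(slope, 2), round(intercept, 2))
print("squared error, fitted line:", round(good_error, 2))
print("squared error, flat line:  ", round(bad_error, 2))
```

For this made-up data the fitted line has a slope of about 2.6 and an intercept of about 69.8, and its squared error is far smaller than that of the flat line, which is the numerical version of "it's not near the data."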
So the bigger the coefficient, the more important the variable. So when I say the big coefficient, what I mean is making policies or making decisions based on which one of these coefficients is biggest. Now, that makes a lot of sense. So what I'm gonna do is, I'm first going to argue that, boy, you know, it's better to use the big coefficient than to just do seat-of-the-pants thinking. We're going to see why that's the case. Linear models are just better, right? We've seen a little of this before, but we're going to go into more detail. Linear models are just better than thinking off the cuff.
Then I'm also going to criticize it a little bit, right? Because one problem with the big coefficient is that it only works in the area in which we've got the data. And oftentimes, if we want to make the world a much better place, we have to shift into an entirely new reality. We've got to shift to a place where there is no data. So I'm gonna draw a distinction between what I call the big coefficient and what I'm gonna call the new reality: brighter situations that are just maybe a lot better than what we currently have.
Alright, so that's a summary of where we're gonna go. We're gonna start out with categorical models, then linear models, and show how to fit lines to data, right? Then we're gonna do some nonlinear models, and then we'll wrap it all up by talking about this idea of the big coefficient. Alright, thank you.