[MUSIC] In this video, I'm going to show you how to use the Pandas library to load a CSV file into your notebook, and then we're going to plot some of that data and see how it looks. So first of all, I'm going to go to my Python notebook management environment, and you can see here, I've got my list of notebooks, but what I'm going to do is add a data file. So I'm going to go to this upload button over here and pull in the data file, and it's called happyscore_income.csv. So it should ask me if I want to upload it, like that, so I see this view here, and I hit Upload, and then you can see it's appeared in my set of files. So my next step is, I'm going to create a nice new notebook that I'm going to work in. And it's going to be a Python 3 notebook, as usual, and then I'm going to start writing my code. So here's my starter cell. So first of all we're going to import the Pandas library. And just as with the other imports, we give it a friendly name, which, in this case, is pd. So let's just run that, so it pulls in the Pandas library as pd. My next step is, I'm going to load the data into a variable, so data = pd.read_csv, and I can specify the file name, happyscore_income.csv. So you'll notice that I'm getting these dropdown menus. So that's a feature of the notebook environment: when I hit TAB, not only can I complete variable names, but I can also complete the names of functions, and it can even search in the current folder for files. So it's really fast to type in and find different functions inside those libraries. Okay, so I've run that, and then I've got some data. So let's just print out the data to see what it looks like. Okay, and you can see I've got, basically, a kind of table of data here, and it covers various fields. I've got the country, adjusted_satisfaction, other things in there. And there's something in there called happyScore, and that's what I'm interested in. 
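As a rough sketch of the loading step described above: the snippet below reads CSV text with `pd.read_csv` and prints the resulting table. So that it runs without the real file, it uses a three-row in-memory stand-in with made-up placeholder values; only the column names `country`, `avg_income`, and `happyScore` come from the walkthrough.

```python
import io

import pandas as pd

# Stand-in for happyscore_income.csv. The three rows below are made-up
# placeholder values, NOT real data; only the column names come from the video.
sample_csv = io.StringIO(
    "country,avg_income,happyScore\n"
    "A,1000.0,4.0\n"
    "B,5000.0,5.5\n"
    "C,20000.0,7.0\n"
)

# In the notebook, with the uploaded file in the current folder, this would be:
#   data = pd.read_csv("happyscore_income.csv")
data = pd.read_csv(sample_csv)

# Print the table and its column names to see what was loaded.
print(data)
print(data.columns.tolist())
```

In a notebook, leaving `data` as the last expression in a cell shows the same table in a formatted view.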
So what I want to do next is get that happiness score out. So I'm going to pull it out. And because I like working in NumPy, I'm just going to pull it off of the Pandas data there and convert it into a simple NumPy array. So I'll do happy = data, and I'm going to pull out the happiness field. So it's called happyScore. And then I'm going to pull out income. And that's called avg_income. Okay, and then, let's just have a quick look at those: happy looks like that. And so that I don't have to run this load CSV command every time I execute the cell, I'm just going to create a new cell and jump into that and work in there. So I've got happy as happyScore and income as avg_income. And then I'm going to print out happy. So happy is a bunch of values. So let's just see, can we do happy.max()? So the highest value is 7.5. And with happy.min(), the lowest value is 2.3, and I can do income.min() and income.max(). Okay, great, so that's just poking around the data a bit. Now what if I want to plot it? So let's create another new cell, and this time we're going to do the standard import of the plotting library. So import matplotlib.pyplot as plt. And I'm going to see if I can do some scatter plots. So what I'd like to see is the income plotted against the happiness. So we might say that the happiness is the dependent variable, in theory, and income is the independent variable. So as income increases, does happiness increase? So we can do that by saying plt.scatter, and I'm going to pass it... (x is going to be going on that way) so that's going to be the income, (and the happiness is going up that way) so that's going to be happy. Okay, so let's run that. So I should eventually see our plot. So if it doesn't run the first time, just run it again. Maybe it didn't quite come out, for whatever reason. So that's my plot. So let's just check that out a little bit. So you can see the income is going along here. In fact, I can label it, if I want to. 
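Pulling the two columns out and checking their range, as described above, might look like the sketch below. One detail worth knowing: indexing the table with a column name gives back a Pandas Series, and calling `.to_numpy()` on it produces a plain NumPy array. The small table here is made-up placeholder data standing in for the loaded file.

```python
import pandas as pd

# Placeholder table standing in for the loaded CSV; the values are made up.
data = pd.DataFrame(
    {
        "avg_income": [1000.0, 5000.0, 20000.0],
        "happyScore": [4.0, 5.5, 7.0],
    }
)

# data["happyScore"] is a Pandas Series; .to_numpy() converts it to a NumPy array.
happy = data["happyScore"].to_numpy()
income = data["avg_income"].to_numpy()

# Poke around the data: smallest and largest value in each column.
print(happy.min(), happy.max())
print(income.min(), income.max())
```

Keeping this in its own cell, separate from the `read_csv` call, means the file is not reloaded each time the cell is rerun.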
I can do plt.xlabel, and I'm going to call it ‘income’, and I guess it's in dollars. And then I'm going to do plt.ylabel, and that's going to be happiness, ‘happy score’. So now it should have those labels on there. So you can see now, I've got my plot with the labels on it. So let's just sum up what we've done there. So we've used the Pandas library to load a CSV data file into memory. And then we've pulled out two of the columns, the average income and the happy score column, and used those to draw a scatter plot. [MUSIC]
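To recap the plotting step in code, here is a minimal sketch of the labelled scatter plot, with income on the x axis and the happy score on the y axis. The three data points are made-up placeholders rather than values from happyscore_income.csv, and the label text follows what was typed in the video.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder table standing in for the loaded CSV; the values are made up.
data = pd.DataFrame(
    {
        "avg_income": [1000.0, 5000.0, 20000.0],
        "happyScore": [4.0, 5.5, 7.0],
    }
)
income = data["avg_income"].to_numpy()
happy = data["happyScore"].to_numpy()

# First argument is x (the independent variable), second is y (the dependent one).
plt.scatter(income, happy)
plt.xlabel("income")
plt.ylabel("happy score")
```

In a notebook the figure is drawn below the cell once it runs; in a standalone script, a call to `plt.show()` or `plt.savefig(...)` would be needed to see it.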