Last week you looked at creating a synthetic seasonal data set that contained trend, seasonality, and a bit of noise. You also looked at some statistical methods for analyzing the data set and making predictions from it. Some of the results you got were actually quite good, but no machine learning was applied yet. This week, you're going to look at using some machine learning methods with the same data. Let's see where machine learning can take us.

First of all, as with any other ML problem, we have to divide our data into features and labels. In this case our feature is effectively a number of values in the series, with our label being the next value. We'll call the number of values that we'll treat as our feature the window size, because we're taking a window of the data and training an ML model to predict the next value. So, for example, if we take our time series data 30 days at a time, we'll use 30 values as the feature and the next value as the label. Then, over time, we'll train a neural network to match the 30 features to the single label.

So let's, for example, use the tf.data.Dataset class to create some data for us. We'll make a range of 10 values, and when we print them we'll see a series of data from 0 to 9. Now let's make it a little bit more interesting. We'll use dataset.window to expand our data set using windowing. Its parameters are the size of the window and how much we want to shift by each time. So if we set a window size of 5 with a shift of 1, when we print it we'll see something like this: 0 1 2 3 4, which stops there because that's five values; then we see 1 2 3 4 5, and so on. Towards the end of the data set, the windows will contain fewer values, because the later ones simply don't exist. So we'll get 6 7 8 9, then 7 8 9, and so on.

So let's edit our window a little bit, so that we have regularly sized data. We can do that with an additional parameter on window called drop_remainder.
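As a sketch of the windowing step just described (assuming TensorFlow 2.x), the range and window calls might look like this:

```python
import tensorflow as tf

# Create a simple dataset of the values 0..9.
dataset = tf.data.Dataset.range(10)

# Expand it into windows of 5 values, shifting by 1 each time.
# Without drop_remainder, the trailing windows are shorter.
dataset = dataset.window(5, shift=1)

# Each window is itself a small dataset, so we iterate it to print the values.
for window_dataset in dataset:
    print([value.numpy() for value in window_dataset])
```

The first window printed is [0, 1, 2, 3, 4], and the windows near the end shrink down to [8, 9] and finally [9].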
And if we set this to True, it will truncate the data by dropping all of the remainders. Namely, this means it will only give us windows of five items. So when we print it, it will now look like this, starting at 0 1 2 3 4 and ending at 5 6 7 8 9.

Great, now let's put these into numpy lists so that we can start using them with machine learning. The good news is that this is super easy: we just call the .numpy method on each item in the data set, and when we print, we now see that we have a numpy list.

Okay, next up is to split the data into features and labels. For each item in the list, it makes sense for all of the values but the last one to be the feature, and then the last one can be the label. This can be achieved with mapping, like this, where we split into everything but the last value with [:-1], and then just the last value itself with [-1]. This gives us this output when we print, which now looks like a nice set of features and labels.

Typically, you would shuffle the data before training, and this is possible using the shuffle method. We call it with a buffer size of ten, and when we print the results, we'll see that our features and labels have been shuffled.

Finally, we can look at batching the data, and this is done with the batch method. It takes a size parameter, and in this case it's 2. So we'll batch the data into sets of two, and if we print them out, we'll see this: we now have three batches of two data items each. And if you look at the first set, you'll see the corresponding x and y. So when x is four, five, six and seven, our y is eight; or when x is zero, one, two, three, you'll see our y is four.

Okay, now that you've seen the tools that let us create a series of x's and y's, or features and labels, you have everything you need to work on a data set in order to get predictions from it.
We'll take a look at a screencast of this code next, before moving on to creating our first neural networks to run predictions on this data.