Now there's one more trick we can add to the ARMA model to make its predictions more accurate: the ARIMA model, with an extra I in the middle. What does it signify? The I stands for integrated. It refers to differencing: we model the differenced series and then sum (integrate) the predicted changes back to get predictions on the original scale, hence the name ARIMA. What is the I needed for? Remember the step in the beginning where we had to stationarize the data before modeling? The I helps with that. Let's see more about the I next.

In an ARIMA model, three parameters are needed. The p parameter indicates how many prior periods we take into consideration for explaining autocorrelation. The q parameter indicates how many prior periods we consider for observing sudden changes, through the model's past errors. The additional d parameter signifies the order of differencing: with d = 1, we predict the difference between one prior period and the new period rather than the new period's value itself. This is important because it helps us detrend our data and approach stationarity. Remember, d = 1 may make a data set that is not stationary become stationary before we build a model, while d = 2 may capture exponential movements in our time series, but it's not frequently used.

There are three questions to address in the ARIMA model. The first item is the I term: how many differences, if any, are needed to make the data stationary? Let's assume the data is already stationary, so none. The second item is the AR term: how many lags do we include in the autoregressive part of the equation? In this case, one. The third item is the MA term: how many lags do we include in the moving average part of the equation? In this case, two. We take d differences, p lags of AR, and q lags of MA, and putting them together, our model is ARIMA(p, d, q); the example here is ARIMA(1, 0, 2). We see that d is typically 0 but could be 1. We also realize that p can be 0, 1, 2, or higher, and that q can be 0, 1, 2, or higher. That's more than a dozen combinations.
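To make the d parameter and the count of combinations concrete, here is a minimal sketch in plain Python. The helper name `difference` and both small series are made up for illustration and are not from the lecture; the grid simply assumes p and q range over 0 to 2 and d over 0 to 1, as described above.

```python
from itertools import product


def difference(series, d=1):
    """Apply first-order differencing d times (hypothetical helper)."""
    values = list(series)
    for _ in range(d):
        # Replace each value with its change from the prior period.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Made-up series with a steady upward trend.
trend = [10, 12, 14, 16, 18, 20]
print(difference(trend, d=1))   # [2, 2, 2, 2, 2]

# Made-up series whose growth itself keeps growing.
curved = [1, 4, 9, 16, 25, 36]
print(difference(curved, d=1))  # [3, 5, 7, 9, 11]
print(difference(curved, d=2))  # [2, 2, 2, 2]

# Candidate (p, d, q) orders: p in 0..2, d in 0..1, q in 0..2.
candidate_orders = list(product(range(3), range(2), range(3)))
print(len(candidate_orders))    # 18
```

One difference flattens the steady trend, while the curved series needs two. Even this small grid gives 18 candidate models, and the ARIMA(1, 0, 2) example is one of them.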
This is how we can get a variety of models from a single time series. We can add many more variations to the above by including seasonality (SARIMA), or by using ARFIMA or SARIMAX, and you can add your own variations here. It should be clear by now that selecting parameters for an ARIMA model is an iterative exercise, hence evaluating each model with its unique set of p, d, q parameters is key. Choosing useful p, d, q values and adding seasonal effects is almost entirely a context-driven endeavor.

In addition, there are a few key tactics we can explore. One, plotting our residuals, as you can see in the model on the left and the chart on the right. If we do not observe a pattern in our residual error terms, we can stop iterating. In this instance, the model indeed seems good, since the residuals on the right don't have a clear pattern. Two, the Ljung-Box test: we can mathematically test the above assumption using the Ljung-Box test.
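As a rough illustration of what the Ljung-Box test computes, the sketch below builds the Q statistic by hand in plain Python. It is not the lecture's own code: the function names, the choice of 5 lags, and both residual series are assumptions for illustration. Q is compared against 11.070, the 5% chi-square critical value for 5 degrees of freedom. For residuals of a fitted ARIMA model, the degrees of freedom are usually reduced by p + q, which this sketch leaves out.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic: Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# 5% critical value of the chi-square distribution with 5 degrees of freedom.
CHI2_95_DF5 = 11.070

# Made-up "residuals" with an obvious alternating pattern.
patterned = [(-1) ** t for t in range(100)]

# Made-up "residuals" drawn as independent noise.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]

for name, residuals in [("patterned", patterned), ("noise", noise)]:
    q = ljung_box_q(residuals, max_lag=5)
    verdict = "pattern remains" if q > CHI2_95_DF5 else "no clear pattern"
    print(f"{name}: Q = {q:.2f} -> {verdict}")
```

A Q value above the critical value suggests the residuals are still autocorrelated, so we keep iterating; a value below it is consistent with the "no clear pattern" reading of the residual plot.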