14.5 Residuals are i.i.d.: autocorrelation

One of the typical characteristics of time series models is the dynamic relation between variables. Even if fundamentally, the sales of ice cream on Monday do not impact sales of the same ice cream on Tuesday, they might impact advertising expenses or sales of a competing product on Tuesday, Wednesday or next week. Missing this structure might lead to the autocorrelation of residuals, influencing the estimates of parameters and final forecasts. Autocorrelations might also arise due to wrong transformations of variables, where the model would systematically underforecast the actuals, producing autocorrelated residuals. In this section, we will see one of the potential ways for the regression diagnostics and try to improve the model stepwise, trying out different orders of the ARIMA model (Section 9).

As an example, we continue with the same seatbelts data, dropping the dynamic part to see what would happen in this case:

adamModelSeat09 <- adam(Seatbelts, "NNN", lags=12,
## [1] 2487.714

There are different ways to diagnose this model. We start with a basic plot of residuals over time (Figure 14.17):

plot(adamModelSeat09, 8, main="")
Standardised residuals vs time for the model 9.

Figure 14.17: Standardised residuals vs time for the model 9.

We see in Figure 14.17 that on one hand the residuals still contain seasonality and on the other, they do not look stationary. We could conduct ADF and / or KPSS tests (which will be discussed in one of the later Chapters of Svetunkov, 2021b) to get a formal answer to the stationarity question:

##  KPSS Test for Level Stationarity
## data:  resid(adamModelSeat09)
## KPSS Level = 0.69641, Truncation lag parameter = 4, p-value = 0.01387
## Warning in tseries::adf.test(resid(adamModelSeat09)): p-value smaller than
## printed p-value
##  Augmented Dickey-Fuller Test
## data:  resid(adamModelSeat09)
## Dickey-Fuller = -5.6449, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

The tests have opposite null hypotheses, and in our case, we would reject H\(_0\) for both of them on a 5% significance level. This means that they contradict each other, and we need to use our judgment to decide whether the residuals are stationary or not. First, I will see what happens with the model when we do take differences:

# SARIMAX(0,1,0)(0,0,0)_12
adamModelSeat10 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2390.028

This leads to an improvement in AICc in comparison with the previous model. The residuals of the model are now also much better behaved (Figure 14.18):

plot(adamModelSeat10, 8, main="")
Standardised residuals vs time for the model 10.

Figure 14.18: Standardised residuals vs time for the model 10.

In order to see whether there are any other dynamic elements left, we will plot ACF and PACF (discussed in Subsections 8.3.2 and 8.3.3) of residuals (Figure 14.19):

ACF and PACF of residuals of the model 10.

Figure 14.19: ACF and PACF of residuals of the model 10.

In the case of the adam() objects, these plots, by default, will always have the range for the y-axis from -1 to 1 and will start from lag one on the x-axis. The red horizontal lines represent the “non-rejection” region. If a point lies inside the region, it is not statistically different from zero on the selected confidence level (the uncertainty around it is so high that it covers zero). The points with numbers are those that are statistically significantly different from zero. So, the ACF / PACF analysis might show the statistically significant lags on the selected level (the default one is 0.95). Given that this is a statistical instrument, we expect that approximately (1-level)% (e.g. 5%) of lags lie outside these bounds just due to randomness, even if the null hypothesis is correct. So it is okay if we do not see all points lying inside them. However, we should not see any patterns there, and we might need to investigate the suspicious lags (low orders of up to 3 – 5 and the seasonal lags if they appear). In our example in Figure 14.19, we see that there is a suspicious lag one on ACF and a suspicious lag two on the PACF, which might indicate that some dynamic elements are missing (e.g. MA(1) or AR(2)). Furthermore, there are spikes in lag 12 for both ACF and PACF. While it is not clear what specifically is needed here, we can try out several models and see which one is better to determine the appropriate order of ARIMA. We should start with the seasonal part of the model, as it might obscure the non-seasonal one.

# SARIMAX(0,1,0)(1,0,0)_12
adamModelSeat11 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2390.919
# SARIMAX(0,1,0)(0,0,1)_12
adamModelSeat12 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2405.051
# SARIMAX(0,1,0)(1,0,1)_12
adamModelSeat13 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2349.809

Based on this analysis, we would be inclined to include both seasonal AR(1) and seasonal MA(1) in the model. Next step in our iterative process – another ACF / PACF plot of the residuals:

ACF and PACF of residuals of the model 13.

Figure 14.20: ACF and PACF of residuals of the model 13.

In this case, there is a big spike on ACF for lag 1, and a smaller one for the lag 2 with no significant lags afterwards, so we can try adding MA(1) and MA(2) components in the model:

# ARIMAX(0,1,1)(1,0,1)_12
adamModelSeat14 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2327.42
# ARIMAX(0,1,2)(1,0,1)_12
adamModelSeat15 <- adam(Seatbelts,"NNN",lags=12,
## [1] 2361.103

Choosing between the two models above, we should select model 14, which has a lower AICc than model 15. So, adding MA(1) to the model leads to further improvement in AICc compared to the previous models. Using this iterative procedure, we could continue our investigations to find the most suitable SARIMAX model for the data. Nevertheless, this example should suffice in providing a general idea of how it can be done. What we could do else to simplify the process is to use the automated ARIMA selection algorithm (see discussion in Section 15.2) in adam(), which is built on the principles discussed in this section:

adamModelSeat16 <-
## [1] "SARIMAX(0,1,1)[1](0,1,1)[12]"
## [1] 2200.467

This new constructed SARIMAX(0,1,1)(0,1,1)\(_{12}\) model has lower AICc than the previous one and should be used instead. In fact, it is even better than the ETSX(MNM) (model 5) from the previous section in terms of AICc. Its residuals are much better behaved than the ones of model 5 (we might need to analyse the residuals for the potential outliers (as discussed in Subsection 14.4) in this model, though):

plot(adamModelSeat16, c(8,10:11))
Residuals of the model 16.

Figure 14.21: Residuals of the model 16.

So for the purposes of analytics and forecasting, we would be inclined to use SARIMAX(0,1,1)(0,1,1)\(_{12}\) rather than ETSX(MNM).

As a final word for this section, we have focused our discussion on the visual analysis of time series, ignoring the statistical tests (we only used ADF and KPSS). Yes, there is Durbin-Watson (Wikipedia, 2021f) test for AR(1) in residuals, and yes, there are Ljung-Box (Wikipedia, 2021g), Box-Pierce and Breusch–Godfrey (Wikipedia, 2021h) tests for multiple AR elements. But visual inspection of time series is not less powerful than hypothesis testing. It makes you think and analyse the model and its assumptions, while the tests are the lazy way out that might lead to wrong conclusions because they have the standard limitations of any hypothesis tests (as discussed in Section 5.3 of Svetunkov, 2021b). After all, if you fail to reject H\(_0\), it does not mean that the effect does not exist. Having said that, the statistical tests become extremely useful when you need to process many time series simultaneously and cannot inspect them manually. So, if you are in that situation, I would recommend reading more about them, but I do not aim to retell the content of Wikipedia in this textbook.


• Svetunkov, I., 2021b. Statistics for business analytics. https://openforecast.org/sba/ (version: 01.10.2021)
• Wikipedia, 2021f. Durbin–Watson statistic. https://en.wikipedia.org/wiki/Durbin%E2%80%93Watson_statistic (version: 2021-05-13)
• Wikipedia, 2021g. Ljung–Box test. https://en.wikipedia.org/wiki/Ljung%E2%80%93Box_test (version: 2021-05-13)
• Wikipedia, 2021h. Breusch–Godfrey test. https://en.wikipedia.org/wiki/Breusch%E2%80%93Godfrey_test (version: 2021-05-13)