
14.7 Residuals are i.i.d.: zero expectation

In the ADAM framework, this assumption only applies to the additive error models. In the case of the multiplicative error models, it is changed to “expectation of the error term is equal to one.” It does not make sense to check this assumption unconditionally, because the in-sample mean of the residuals does not tell us anything. Yes, it will be equal to zero automatically in sample in the case of OLS estimation, and it might differ from zero in other cases, but neither outcome gives any useful information. In fact, when we work with exponential smoothing models, the in-sample mean of the residuals being equal to zero might imply for some of them that the final values of components are identical to the initial ones. For example, in the case of ETS(A,N,N), we can use the transition equation from (4.6) to express the final value of the level via the previous values back to \(t=0\): \[\begin{equation} \begin{aligned} \hat{l}_t &= \hat{l}_{t-1} + \hat{\alpha} e_t = \hat{l}_{t-2} + \hat{\alpha} e_{t-1} + \hat{\alpha} e_t = \\ & \hat{l}_0 + \hat{\alpha} \sum_{j=1}^t e_{j} . \end{aligned} \tag{14.3} \end{equation}\] If the mean of the residuals in sample is indeed equal to zero, then the sum in (14.3) is zero as well and the equation reduces to \(\hat{l}_t=\hat{l}_0\). So, this assumption cannot be checked in sample, meaning that it is all about the true model and the asymptotic behaviour rather than the model applied to the data.
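
The expansion in (14.3) can be verified numerically. The following is a minimal base R sketch rather than the adam() implementation: the series, the smoothing parameter `alpha` and the initial level `l0` are all hypothetical values chosen for illustration.

```r
# Hypothetical data and parameters, chosen only for illustration
set.seed(41)
y <- 100 + cumsum(rnorm(50, mean = 0, sd = 2))
alpha <- 0.3   # assumed smoothing parameter
l0 <- 95       # assumed initial level

obs <- length(y)
level <- numeric(obs)
e <- numeric(obs)
lPrevious <- l0
for (t in seq_len(obs)) {
  # One step ahead error of ETS(A,N,N): actual value minus previous level
  e[t] <- y[t] - lPrevious
  # Level update: l_t = l_{t-1} + alpha * e_t
  level[t] <- lPrevious + alpha * e[t]
  lPrevious <- level[t]
}

# The final level equals the initial one plus alpha times the sum of residuals
all.equal(level[obs], l0 + alpha * sum(e))

# The same statement expressed via the in-sample mean of the residuals
all.equal(level[obs] - l0, alpha * obs * mean(e))
```

The second comparison shows that, for a non-zero `alpha`, the in-sample mean of the residuals is zero only when the final level coincides with the initial one.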

The only part of this assumption that can be checked is whether the expectation of the residuals conditional on some variables is equal to zero (or one). In a way, this comes down to making sure that there are no patterns in the residuals and thus no consecutive parts of the data where the residuals have a systematically non-zero expectation.

There are different ways to diagnose the issue. The first one is the already discussed plot of standardised (or studentised) residuals vs fitted values from Section 14.3. The other one is the plot of residuals over time, something that we have already discussed in Section 14.5. In addition, you can also plot the residuals vs some of the variables in order to see if they cause a change in the mean. But patterns revealed by any of these plots might also mean that the residuals are autocorrelated and / or that some transformations of variables are needed.
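
The difference between the unconditional and the conditional mean of the residuals can be illustrated with a small base R sketch. It uses simulated data and a deliberately misspecified linear regression (the variables `x` and `y`, and the quadratic relation between them, are assumptions made for the example), so it is not tied to any of the models discussed in this chapter.

```r
# Simulated data: the true relation is quadratic (an assumption for the example)
set.seed(1)
x <- runif(200, min = 0, max = 10)
y <- 5 + 0.5 * x^2 + rnorm(200)

# Misspecified model: linear in x, estimated via OLS
linearModel <- lm(y ~ x)
res <- resid(linearModel)

# The unconditional mean is zero by construction of OLS with an intercept
mean(res)

# The mean conditional on x is not: split x into four bins and average
xBins <- cut(x, breaks = seq(0, 10, by = 2.5), include.lowest = TRUE)
binMeans <- tapply(res, xBins, mean)
binMeans

# Residuals vs the variable with a LOWESS line to highlight the pattern
plot(x, res, xlab = "x", ylab = "Residuals")
abline(h = 0, col = "red")
lines(lowess(x, res), col = "blue")
```

In this setting, the bin means change sign across the range of `x`, even though the overall mean of the residuals is numerically zero.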

There is also an effect related to this, which is called “endogeneity.” According to the econometrics literature, it implies that the residuals are correlated with some variables. This becomes equivalent to the situation when the expectation of the residuals changes with the change of a variable. The most prominent cause of this is omitted variables (discussed in Section 14.1), which can sometimes be diagnosed by looking at the correlations between the residuals and the omitted variables. While econometricians propose using other estimation methods (such as Instrumental Variables) in order to diminish the effect of endogeneity, forecasters cannot afford to do that, because we need to fix the problem in order to get more adequate forecasts. Unfortunately, there is no universal recipe for the solution of this problem, but in some cases transforming variables, adding the omitted ones or substituting them by proxies (if the variables are unavailable) might resolve the problem.
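
The omitted variable case can be sketched in base R as follows. The data is simulated, and the names `x1`, `x2`, `omittedModel` and `fullModel` are hypothetical; the point is only to show the diagnostic of correlating the residuals with a candidate variable that was left out.

```r
# Simulated data: y depends on x1 and x2, and x2 is related to x1
set.seed(7)
obs <- 200
x1 <- rnorm(obs)
x2 <- 0.5 * x1 + rnorm(obs)
y <- 1 + 2 * x1 + 3 * x2 + rnorm(obs)

# Model that omits x2
omittedModel <- lm(y ~ x1)
# Residuals are correlated with the omitted variable
corOmitted <- cor(resid(omittedModel), x2)

# Model that includes x2
fullModel <- lm(y ~ x1 + x2)
# OLS residuals are uncorrelated with the included variables by construction
corFull <- cor(resid(fullModel), x2)

c(omitted = corOmitted, full = corFull)

# The omission also distorts the estimate of the parameter for x1
c(omitted = unname(coef(omittedModel)["x1"]),
  full = unname(coef(fullModel)["x1"]))
```

Note that the residuals of an OLS regression are always uncorrelated with the variables already in the model, so this diagnostic is only informative for variables that were not included.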

14.7.1 Multistep forecast errors have zero mean

This follows from the previous assumption if the model is correctly specified and its residuals are i.i.d. In that situation, we would expect the multiple steps ahead forecast errors to have zero mean. In practice, this might be violated if there are some structural changes or level shifts that are not taken into account by the model. The only thing to note is that checking this assumption requires setting the forecast horizon \(h\). This should typically come from the task itself and the decisions made based on the forecasts.

The diagnostics of this assumption can be done using the rmultistep() method for adam(). This method applies the estimated model and produces multiple steps ahead forecasts from each in-sample observation to the horizon \(h\), stacking the forecast errors by rows. Whether we apply an additive or a multiplicative error model, the method will report the residual \(e_t\).

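# Multistep forecast errors for horizons 1 to 12, one row per forecast origin
adamModelSeat05ResidMulti <- rmultistep(adamModelSeat05, h=12)
# Label the columns by horizon
colnames(adamModelSeat05ResidMulti) <- c(1:12)
# Distribution of the errors for each horizon
boxplot(adamModelSeat05ResidMulti, xlab="horizon")
abline(h=0, col="red")
# Mean error for each horizon
points(apply(adamModelSeat05ResidMulti,2,mean), col="red", pch=16)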

As the plot in the Figure above demonstrates, the mean of the multistep forecast errors (the red dots) increases with the horizon, thus implying that some information has not been taken into account by the model.
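
To see how such a pattern can arise, here is a base R sketch that mimics the idea of stacking multistep forecast errors by rows. It is not the rmultistep() implementation: it assumes a simulated series with a linear trend, a simple ETS(A,N,N)-style level recursion with a fixed hypothetical smoothing parameter `alpha`, and flat forecasts from each origin.

```r
# Simulated series with a trend that the level model ignores (an assumption)
set.seed(123)
obs <- 120
y <- 100 + 0.5 * seq_len(obs) + rnorm(obs, mean = 0, sd = 2)
alpha <- 0.3   # assumed smoothing parameter
h <- 12        # forecast horizon

# Filter the level: l_t = l_{t-1} + alpha * e_t, initialised with y[1]
level <- numeric(obs)
lPrevious <- y[1]
for (t in seq_len(obs)) {
  level[t] <- lPrevious + alpha * (y[t] - lPrevious)
  lPrevious <- level[t]
}

# For each origin, the forecast for all horizons is the last level,
# so the error at horizon j is y[t+j] - level[t]
origins <- seq_len(obs - h)
errors <- t(sapply(origins, function(t) y[t + seq_len(h)] - level[t]))
colnames(errors) <- seq_len(h)

# Mean error for each horizon
horizonMeans <- colMeans(errors)
horizonMeans

boxplot(errors, xlab = "horizon")
abline(h = 0, col = "red")
points(horizonMeans, col = "red", pch = 16)
```

Because the trend is omitted, the flat forecasts fall further behind the data at each step, and the mean error grows with the horizon, similar in spirit to the behaviour described above.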