As discussed in Section 16.3, it is difficult to capture the impact of the uncertainty about the parameters on the states of the model and, as a result, difficult to take it into account at the forecasting stage. Furthermore, so far, we have only discussed pure additive models, for which it is at least possible to do some derivations. When it comes to models with multiplicative components, it becomes nearly impossible to demonstrate how the uncertainty propagates over time. To overcome these limitations, we develop a simulation-based approach that relies on the selected model form.
The idea of the approach is to get the covariance matrix of the parameters of the selected model (see Section 16.1) and then generate \(n\) sets of parameters randomly from a rectified multivariate normal distribution using this matrix and the estimated values of parameters. After that, the model is applied to the data with each generated set of parameters to get the states, fitted values, and residuals. This way, we propagate the uncertainty about the parameters from the first observation to the last. The final states can also be used to produce point forecasts and prediction intervals based on each set of parameters. These scenarios allow creating more adequate prediction intervals from the model and/or confidence intervals for the fitted values, states and conditional expectations. All of this is done without the additional assumptions that bagging requires, relying entirely on the model. However, the approach is computationally expensive, as it requires applying the model to the data \(n\) times, although no re-estimation is needed. If the uncertainty about the model needs to be taken into account, then a combination of models can be used, as described in Section 15.4.
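To make the mechanics concrete, here is a minimal sketch of the procedure in base R for the simplest case, ETS(A,N,N). The function sesApply(), the artificial series, the parameter values in BHat and the covariance matrix vcovHat are all hypothetical and are used for illustration only; this is not how the procedure is implemented in any package:

```r
set.seed(42)

# Simple exponential smoothing, i.e. ETS(A,N,N), applied with fixed parameters.
# B = c(alpha, initial level); no estimation is done inside this function
sesApply <- function(y, B){
    obs <- length(y)
    level <- numeric(obs + 1)
    level[1] <- B[2]
    for(t in 1:obs){
        # Error correction form: l_t = l_{t-1} + alpha * e_t
        level[t+1] <- level[t] + B[1] * (y[t] - level[t])
    }
    list(fitted=level[1:obs], states=level)
}

# Artificial series, hypothetical estimates and their covariance matrix
y <- 100 + cumsum(rnorm(100, 0, 1))
BHat <- c(alpha=0.3, level=100)
vcovHat <- matrix(c(0.01, 0, 0, 4), 2, 2)

# Generate n sets of parameters from the multivariate normal distribution
nsim <- 1000
z <- matrix(rnorm(nsim * 2), nsim, 2)
BRandom <- sweep(z %*% chol(vcovHat), 2, BHat, "+")
# Rectify: values of alpha outside the (0, 1) region are moved to its bounds
BRandom[,1] <- pmin(pmax(BRandom[,1], 0), 1)

# Apply the model with each set of parameters; the result is a T x n matrix
refitted <- sapply(1:nsim, function(i) sesApply(y, BRandom[i,])$fitted)

# 95% confidence bounds for the fitted values, calculated across the scenarios
bounds <- apply(refitted, 1, quantile, probs=c(0.025, 0.975))
```

Each column of refitted corresponds to one scenario, so the quantiles taken across the columns give the confidence bounds for the fitted line on every observation.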
The smooth package has the method reapply() that implements this approach for adam() models. This works with ADAM ETS, ARIMA, regression and any combination of the three. Here is an example in R with \(n=1000\):
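A minimal sketch of such a call, assuming that the number of parameter sets is set via the nsim argument of reapply() and that the last ten observations are kept in the holdout (the holdout settings are an assumption made for illustration):

```r
library(smooth)

# Estimate ETS(M,M,M) on AirPassengers; h=10 and holdout=TRUE are assumptions
adamETSAir <- adam(AirPassengers, "MMM", h=10, holdout=TRUE)
# Reapply the model with n=1000 randomly generated sets of parameters
adamETSAirRefitted <- reapply(adamETSAir, nsim=1000)
# Plot the fitted values together with the quantile-based bounds
plot(adamETSAirRefitted)
```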
## Warning: The covariance matrix of parameters is not positive semi-definite. I
## will try fixing this, but it might make sense re-estimating adam(), tuning the
## optimiser.
Figure 16.2 demonstrates how the approach works on the example of the AirPassengers data with the ETS(M,M,M) model. The grey areas around the fitted line show quantiles of the fitted values, forming confidence intervals with levels of 95%, 80%, 60%, 40% and 20%. They show how the fitted values would vary if the parameters differed from the estimated ones. Note that there was a warning about the covariance matrix of parameters, which typically arises if the optimal value of the loss function was not reached. If this happens, I would recommend tuning the optimiser (see Section 11.4). For example, we could try more iterations by setting the maxeval parameter to a higher value, or re-estimate the model, providing the estimates of parameters in B. If these fail and the bounds from reapply() are too wide, it might make sense to consider backcasting for the initialisation.
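The three remedies can be sketched as follows. The specific value of maxeval is arbitrary, the holdout settings are an assumption, and the object names are hypothetical:

```r
library(smooth)

adamETSAir <- adam(AirPassengers, "MMM", h=10, holdout=TRUE)

# Option 1: allow more iterations of the optimiser (10000 is an arbitrary value)
adamETSAirMaxeval <- adam(AirPassengers, "MMM", h=10, holdout=TRUE,
                          maxeval=10000)

# Option 2: re-estimate the model, starting from the previous estimates
adamETSAirB <- adam(AirPassengers, "MMM", h=10, holdout=TRUE,
                    B=adamETSAir$B)

# Option 3: initialise the states via backcasting
adamETSAirBackcasting <- adam(AirPassengers, "MMM", h=10, holdout=TRUE,
                              initial="backcasting")
```

Whichever option is used, the resulting model can then be passed to reapply() again to check whether the warning disappears and the bounds become narrower.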
The adamETSAirRefitted object contains several variables, including:
adamETSAirRefitted$states – the array of states of dimensions \(k \times (T+m) \times n\), where \(m\) is the maximum lag of the model, \(k\) is the number of components and \(T\) is the sample size;
adamETSAirRefitted$refitted – the distribution of fitted values of dimensions \(T \times n\);
adamETSAirRefitted$transition – the array of transition matrices of the size \(k \times k \times n\);
adamETSAirRefitted$measurement – the array of measurement matrices of the size \((T+m) \times k \times n\);
adamETSAirRefitted$persistence – the persistence matrix of the size \(k \times n\).
The last three will contain the random parameters (smoothing, damping and AR/MA parameters), which is why they are provided together with the other values.
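These elements can be inspected directly. The following is a minimal sketch, assuming the structure of the object described above; the model settings and the name fittedBounds are assumptions made for illustration:

```r
library(smooth)

adamETSAir <- adam(AirPassengers, "MMM", h=10, holdout=TRUE)
adamETSAirRefitted <- reapply(adamETSAir, nsim=1000)

# Dimensions of the array of states and of the matrix of fitted values
dim(adamETSAirRefitted$states)
dim(adamETSAirRefitted$refitted)

# 95% confidence bounds for the fitted values across the n scenarios
fittedBounds <- apply(adamETSAirRefitted$refitted, 1, quantile,
                      probs=c(0.025, 0.975), na.rm=TRUE)

# Range of each smoothing parameter across the scenarios
apply(adamETSAirRefitted$persistence, 1, range)
```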
As mentioned earlier, ADAM ARIMA also supports this approach. Here is an example on artificial, non-seasonal data (see Figure 16.3):
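A minimal sketch of such an example is shown below. The data generating process, the ARIMA(0,1,1) orders and the holdout settings are assumptions made for illustration, and the series is generated with arima.sim() from base R:

```r
library(smooth)
set.seed(41)

# Artificial non-seasonal data; the ARIMA(0,1,1) process is an assumption
y <- 100 + arima.sim(list(order=c(0,1,1), ma=-0.5), n=120)

# Pure ARIMA in ADAM: switch off the ETS part via "NNN" and set the orders
adamARIMA <- adam(y, "NNN", orders=c(0,1,1), h=10, holdout=TRUE)

# Reapply the model with randomly generated sets of parameters
adamARIMARefitted <- reapply(adamARIMA)
plot(adamARIMARefitted)
```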
Note that the more complicated the fitted model is, the more difficult it is to optimise it, and thus the more difficult it is to get accurate estimates of the covariance matrix of parameters. This might result in highly uncertain states and thus fitted values. The safer approach, in this case, is to use bootstrap for the estimation of the covariance matrix, but this is more computationally expensive and would only work on longer time series. See an example in R (and Figure 16.4):
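A minimal sketch of the bootstrap variant, assuming that reapply() accepts the bootstrap=TRUE argument; the artificial data and the ARIMA(0,1,1) orders are again assumptions made for illustration:

```r
library(smooth)
set.seed(41)

# Artificial non-seasonal data; the ARIMA(0,1,1) process is an assumption
y <- 100 + arima.sim(list(order=c(0,1,1), ma=-0.5), n=120)
adamARIMA <- adam(y, "NNN", orders=c(0,1,1), h=10, holdout=TRUE)

# The covariance matrix of parameters comes from bootstrap, not the Hessian
adamARIMARefitted <- reapply(adamARIMA, nsim=1000, bootstrap=TRUE)
plot(adamARIMARefitted)
```

Expect this to take noticeably longer than the default call, because the model is re-estimated on each bootstrapped series.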
The approach described in this section is still a work in progress. While it works in theory, there are computational difficulties with calculating the Hessian matrix. If the covariance matrix is not estimated accurately, it might contain high variances, making the uncertainty of the model higher than needed. This will result in unreasonable confidence bounds and lead to extremely wide prediction intervals.