Chapter 15 Model selection and combinations in ADAM

So far, we have managed to avoid discussing the topic of model selection and combinations. However, it is important to understand how to select the most appropriate model and capture the uncertainty around the selection (see discussion of sources of uncertainty in Section 1.3 of Svetunkov, 2022a). There are several ways to decide which model to use, and there are several dimensions in which a decision needs to be made:

  1. Which of the models to use: ETS / ARIMA / ETS+ARIMA / Regression / ETSX / ARIMAX / ETSX+ARIMA?
  2. What components of the ETS model to select?
  3. What order of ARIMA model to select?
  4. Which of the explanatory variables to use?
  5. What distribution to use?
  6. Should we select the model or combine forecasts from different ones?
  7. Do we need all models in the pool?
  8. How should we do all the above?
  9. What about the demand occurrence part of the model? (luckily, this question has already been answered in Subsection 13.1.6).

In this chapter, we discuss these questions. We start with principles based on information criteria (addressed in Chapter 13 of Svetunkov, 2022a) for ETS and ARIMA. We then move to selecting explanatory variables and finish with topics related to the combination of models.

Before we do that, we need to recall the distributional assumptions in ADAM, which play an essential role if the model is estimated via the maximisation of the likelihood function (Section 11.1). In this case, an information criterion (IC) can be calculated and used for the selection of the most appropriate model. Based on this, we can fit several ADAMs with different distributions and then select the one that leads to the lowest IC. Here is the list of the supported distributions in ADAM:

  • Normal;
  • Laplace;
  • S;
  • Generalised Normal;
  • Log-Normal;
  • Inverse Gaussian;
  • Gamma.

The function auto.adam() implements this automatic selection: it takes a vector of distributions provided by the user, fits the model with each of them, and selects the distribution with the lowest IC. This selection procedure can be combined with the techniques for selecting other elements of ADAM, discussed in the following sections of the monograph.
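To make the idea concrete, here is a minimal sketch of doing such a selection by hand and then delegating it to auto.adam(). It assumes that the smooth and greybox packages are installed. The ETS(A,A,N) model, the three candidate distributions, and the object names (candidates, fits, aiccValues, bestDistribution, autoFit) are chosen purely for illustration; this is not a description of how the function is implemented internally.

```r
library(smooth)   # adam() and auto.adam()
library(greybox)  # AICc()

# Illustrative subset of distributions to compare (an assumption of this sketch)
candidates <- c("dnorm", "dlaplace", "ds")

# Fit the same ETS(A,A,N) model under each candidate distribution
fits <- lapply(candidates, function(d) {
  adam(BJsales, model="AAN", distribution=d, h=10, holdout=TRUE)
})
names(fits) <- candidates

# Extract AICc for each fit and pick the distribution with the lowest value
aiccValues <- sapply(fits, AICc)
bestDistribution <- names(which.min(aiccValues))
aiccValues
bestDistribution

# The same comparison delegated to auto.adam() with the same vector
autoFit <- auto.adam(BJsales, model="AAN", distribution=candidates,
                     h=10, holdout=TRUE)
autoFit$distribution
```

The manual loop and the auto.adam() call should typically point to the same distribution, although small differences in estimation settings could in principle change the ranking when the AICc values are close.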

Here is an example of selecting the distribution for a specific model, ETS(M,M,N), on the Box-Jenkins sales data (BJsales) using auto.adam():

library(smooth)  # provides adam() and auto.adam()
adamAutoETSBJ <- auto.adam(BJsales, model="MMN", h=10, holdout=TRUE)
adamAutoETSBJ
## Time elapsed: 0.24 seconds
## Model estimated using auto.adam() function: ETS(MMN)
## Distribution assumed in the model: Log-Normal
## Loss function type: likelihood; Loss function value: 245.3716
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2406 
## 
## Sample size: 140
## Number of estimated parameters: 5
## Number of degrees of freedom: 135
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 500.7432 501.1909 515.4514 516.5577 
## 
## Forecast errors:
## ME: 3.219; MAE: 3.332; RMSE: 3.786
## sCE: 14.133%; Asymmetry: 91.6%; sMAE: 1.463%; sMSE: 0.028%
## MASE: 2.819; RMSSE: 2.484; rMAE: 0.925; rRMSE: 0.922

In this case, the function has applied the same model with different distributions, estimated each version via likelihood, and selected the one with the lowest AICc value. It looks like the Log-Normal is the most appropriate distribution for ETS(M,M,N) on this data.
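The information criteria reported in the output can be reproduced by hand from the loss value, the number of estimated parameters, and the sample size. The sketch below uses the standard formulae for AIC, AICc, BIC, and BICc, and assumes that the reported "Loss function value" is the negative log-likelihood; the variable names (negLogLik, k, n, ic) are illustrative.

```r
# Values copied from the printed output of the model
negLogLik <- 245.3716  # loss function value, assumed to be -log-likelihood
k <- 5                 # number of estimated parameters
n <- 140               # in-sample size

ic <- c(
  AIC  = 2 * k + 2 * negLogLik,
  # AICc adds a small-sample correction to AIC
  AICc = 2 * k + 2 * negLogLik + 2 * k * (k + 1) / (n - k - 1),
  BIC  = k * log(n) + 2 * negLogLik,
  # BICc inflates the BIC penalty by n / (n - k - 1)
  BICc = k * log(n) * n / (n - k - 1) + 2 * negLogLik
)
round(ic, 4)
```

The resulting values agree with the printed ones up to the rounding of the loss value. The correction terms in AICc and BICc shrink as the sample size grows relative to the number of parameters, so with 140 observations and 5 parameters all four criteria differ mainly in the size of the penalty rather than in the likelihood part.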

References

• Svetunkov, I., 2022a. Statistics for business analytics. https://openforecast.org/sba/ (version: 31.03.2022)