Chapter 15 Model selection and combinations in ADAM
So far, we have avoided the topic of model selection and combinations. We can avoid it no longer: we need to understand how to select the most appropriate model and how to capture the uncertainty around model selection (see the discussion of sources of uncertainty in Chapter 1 of Svetunkov (2021c)). There are several ways to decide which model to use, and there are several dimensions in which a decision needs to be made:
- Which of the models to use: ETS / ARIMA / ETS+ARIMA / Regression / ETSX / ARIMAX / ETSX+ARIMA?
- What components of the ETS model to select?
- What order of ARIMA model to select?
- Which of the explanatory variables to use?
- What distribution to use?
- Should we select a single model or combine forecasts from different ones?
- Do we need all models in the pool?
- How should we do all the above?
- What about the demand occurrence part of the model? (luckily, this question has already been answered in Section 13.1.6).
In this chapter, we discuss the aspects related to model selection and combinations in ADAM. We start with principles based on information criteria (discussed in Chapter 13 of Svetunkov (2021c)) for both ETS and ARIMA. We then move to the selection of explanatory variables and finish with topics related to combinations of models.
Before we do that, we need to recall the distributional assumptions in ADAM (see Chapter 3 of Svetunkov (2021c)), which play an important role if the model is estimated via the maximisation of the likelihood function. In this case, an information criterion (IC) can be calculated and used to select the most appropriate model. Based on this, we can fit several ADAM models with different distributions and then select the one that leads to the lowest IC. Here is the list of the supported distributions in ADAM:
- Generalised Normal;
- Log Normal;
- Inverse Gaussian;
auto.adam() implements this automatic selection of the distribution based on IC for the vector of distributions provided by the user. This selection procedure can be combined with other selection techniques for different elements of the ADAM model, discussed in the following sections of the textbook.
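To make the selection principle concrete, here is a minimal sketch of what such a procedure might look like if done by hand. It is not the internal implementation of auto.adam(), only an illustration under several assumptions: that the adam() function accepts the distribution codes "dgnorm", "dlnorm" and "dinvgauss", that AICc() from the greybox package works on the resulting objects, and that a subset of three candidate distributions is enough for the illustration. The object names (candidates, models, aiccValues, bestDistribution) are hypothetical.

```r
library(greybox)  # assumed to provide the AICc() generic
library(smooth)   # provides adam() and auto.adam()

# Candidate distributions, a subset picked for illustration only
candidates <- c("dgnorm", "dlnorm", "dinvgauss")

# Fit the same ETS(M,M,N) model under each candidate distribution
models <- lapply(candidates, function(d) {
  adam(BJsales, model = "MMN", h = 10, holdout = TRUE, distribution = d)
})
names(models) <- candidates

# Extract AICc for each fitted model
aiccValues <- sapply(models, AICc)
aiccValues

# The distribution with the lowest AICc is the selected one
bestDistribution <- names(which.min(aiccValues))
bestDistribution
```

In practice, the same result should be obtainable by passing the vector directly, for example auto.adam(BJsales, model="MMN", distribution=candidates, h=10, holdout=TRUE), which saves writing the loop.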
Here is an example of the selection of distribution for a specific model, ETS(M,M,N), on Box-Jenkins data using auto.adam():
adamModel <- auto.adam(BJsales, model="MMN", h=10, holdout=TRUE)
adamModel
## Time elapsed: 0.26 seconds
## Model estimated using auto.adam() function: ETS(MMN)
## Distribution assumed in the model: Log Normal
## Loss function type: likelihood; Loss function value: 245.3716
## Persistence vector g:
##  alpha   beta
## 1.0000 0.2406
##
## Sample size: 140
## Number of estimated parameters: 5
## Number of degrees of freedom: 135
## Information criteria:
##      AIC     AICc      BIC     BICc
## 500.7432 501.1909 515.4514 516.5577
##
## Forecast errors:
## ME: 3.219; MAE: 3.332; RMSE: 3.786
## sCE: 14.133%; Asymmetry: 91.6%; sMAE: 1.463%; sMSE: 0.028%
## MASE: 2.819; RMSSE: 2.484; rMAE: 0.925; rRMSE: 0.922
In this case, the function applied the same model with different distributions, estimated each of them via likelihood, and selected the one with the lowest AICc value.
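As a sanity check, the information criteria in the output can be reproduced by hand from the reported numbers. The sketch below uses base R only and rests on two assumptions: that the reported loss function value (245.3716) is the negative log-likelihood, and that the corrected criteria use the standard small-sample formulae shown in the comments. The variable names are hypothetical.

```r
# Values taken from the output of the model
negLogLik <- 245.3716  # loss function value, assumed to be -logLik
k <- 5                 # number of estimated parameters
n <- 140               # in-sample size

# Standard information criteria based on the likelihood
aic  <- 2 * k + 2 * negLogLik
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)
bic  <- k * log(n) + 2 * negLogLik
bicc <- 2 * negLogLik + k * log(n) * n / (n - k - 1)

# Compare with the reported values: 500.7432 501.1909 515.4514 516.5577
round(c(AIC = aic, AICc = aicc, BIC = bic, BICc = bicc), 4)
```

The values agree with the reported ones up to the rounding of the loss value. The penalty grows with the number of parameters, so a distribution with an extra parameter needs a visibly better likelihood to be selected.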