Chapter 15 Model selection and combinations in ADAM
When it comes to time series analysis and to forecasting a specific time series, there are several ways to decide which model to use, and there are several dimensions in which a decision needs to be made:
- Which of the models to use: ETS / ARIMA / ETS+ARIMA / Regression / ETSX / ARIMAX / ETSX+ARIMA?
- What components of the ETS model to select?
- What order of ARIMA model to select?
- Which of the explanatory variables to use?
- What distribution to use?
- Should we select a single model or combine forecasts from several?
- Do we need all models in the pool?
- How should we do all the above?
- What about the demand occurrence part of the model? (this question has already been answered in Section 13.1.6).
In this chapter, we discuss all aspects related to model selection and combinations in ADAM. We start with principles based on information criteria, then move to more complicated topics related to pooling, and finish with selection and combinations based on rolling origin.
Before we do that, we need to recall the distributional assumptions (see Chapter 3 of Svetunkov (2021c)) in ADAM, which play an important role if the model is estimated via maximisation of the likelihood function. In this case, an information criterion (IC) can be calculated and used to select the most appropriate model. Based on this, we can fit several ADAM models with different distributions and then select the one that leads to the lowest IC. Here is the list of the supported distributions in ADAM:
- Normal;
- Laplace;
- S;
- Generalised Normal;
- Log Normal;
- Inverse Gaussian;
- Gamma.
The auto.adam() function automates the selection of distribution based on an IC, choosing from the vector of distributions provided by the user. This selection procedure can be combined with the other selection techniques for different elements of the ADAM model, discussed in the following sections of the textbook.
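The procedure that auto.adam() automates can be sketched by hand: fit the same model under each candidate distribution and keep the fit with the lowest IC. The sketch below is only an illustration under stated assumptions. It assumes that the smooth package is installed, uses the built-in BJsales series with an ETS(M,M,N) model as an arbitrary example, and relies on AICc() from the greybox package. The object names (candidateDistributions, fits, icValues, bestDistribution) are hypothetical and are not part of the package. It is not necessarily how auto.adam() is implemented internally.

```r
# A minimal sketch of IC-based selection of distribution (not the internals of auto.adam())
library(smooth)

# Candidate distributions, named as in the distribution argument of adam()
candidateDistributions <- c("dnorm", "dlaplace", "ds", "dgnorm",
                            "dlnorm", "dinvgauss", "dgamma")

# Fit the same ETS(M,M,N) to the same in-sample data under each distribution
fits <- lapply(candidateDistributions, function(d) {
    adam(BJsales, model="MMN", distribution=d, h=10, holdout=TRUE)
})
names(fits) <- candidateDistributions

# Extract AICc for each fit; all models use the same sample, so the values are comparable
icValues <- vapply(fits, function(f) as.numeric(greybox::AICc(f)), numeric(1))

# The distribution with the lowest AICc is the selected one
bestDistribution <- names(which.min(icValues))
bestModel <- fits[[bestDistribution]]
```

Inspecting sort(icValues) shows how close the candidates are to each other, which is useful when deciding whether the difference between the best and the second best distribution is meaningful. To restrict the pool in auto.adam() itself, the same kind of character vector can be passed via its distribution argument.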
Here is an example of the selection of distribution for a specific model, ETS(M,M,N), on the Box-Jenkins sales data using auto.adam():
adamModel <- auto.adam(BJsales, model="MMN", silent=FALSE, h=10, holdout=TRUE)  # try each distribution, keep the best one
adamModel  # print the summary of the selected model
## Evaluating models with different distributions... dnorm , dlaplace , ds , dgnorm , dlnorm , dinvgauss , dgamma , Done!
## Time elapsed: 0.27 seconds
## Model estimated using auto.adam() function: ETS(MMN)
## Distribution assumed in the model: Log Normal
## Loss function type: likelihood; Loss function value: 245.3716
## Persistence vector g:
##  alpha   beta
## 1.0000 0.2406
##
## Sample size: 140
## Number of estimated parameters: 5
## Number of degrees of freedom: 135
## Information criteria:
##      AIC     AICc      BIC     BICc
## 500.7432 501.1909 515.4514 516.5577
##
## Forecast errors:
## ME: 3.219; MAE: 3.332; RMSE: 3.786
## sCE: 14.133%; Asymmetry: 91.6%; sMAE: 1.463%; sMSE: 0.028%
## MASE: 2.819; RMSSE: 2.484; rMAE: 0.925; rRMSE: 0.922
In this case, the function has applied the same model with each of the distributions, estimated every one of them via likelihood, and selected the one with the lowest AICc value (here, the Log Normal distribution).
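To see how the information criteria relate to the likelihood, the values in the output above can be approximately reconstructed from the reported loss value, the number of estimated parameters, and the in-sample size. The sketch below uses base R only and assumes that the reported loss function value is the negative log-likelihood, which the arithmetic is consistent with. The formulae are the standard definitions of AIC, AICc, BIC, and BICc, and the variable names are hypothetical.

```r
# Values copied from the printed output of the model
lossValue <- 245.3716   # loss function value, assumed to be the negative log-likelihood
k <- 5                  # number of estimated parameters
n <- 140                # in-sample size (150 observations minus the holdout of 10)

logLikValue <- -lossValue

# Standard definitions of the information criteria
aic  <- 2 * k - 2 * logLikValue
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)                  # small-sample correction of AIC
bic  <- k * log(n) - 2 * logLikValue
bicc <- -2 * logLikValue + k * log(n) * n / (n - k - 1)      # small-sample correction of BIC

reconstructed <- c(AIC=aic, AICc=aicc, BIC=bic, BICc=bicc)
reported <- c(AIC=500.7432, AICc=501.1909, BIC=515.4514, BICc=516.5577)

# Differences are due to the rounding of the printed loss value only
round(reconstructed - reported, 3)
```

Because the loss value is printed with four decimal places, the reconstructed criteria can differ from the reported ones in the last digit. The penalty grows from AIC to BICc, which is why the criteria differ in how strongly they favour models with fewer parameters.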