
15.3 Explanatory variables selection

There are different approaches for automatic variables selection, but not all of them are efficient in the context of dynamic models. For example, backward stepwise might either be infeasible in the case of small samples or take too much time to converge to an optimal solution (it has polynomial computational time). This is because the ADAMX model needs to be refitted and reestimated over and over again using the recursive relations based, for example, on the state space model (10.4). The classical stepwise forward might also be too slow, because it also has polynomial computational time. So, some simplifications are needed in order to make the variables selection in ADAMX doable in a reasonable time.

In order to make the mechanism feasible in a limited time, we rely on the approach of Sagaert and Svetunkov (2021): the stepwise trace forward selection of variables. It uses the partial correlations between variables in order to identify which of them to include on each iteration and, because of that, has linear computational time. Still, doing this in the proper ADAMX would take more time than needed, so one of the possible solutions is to do the variables selection in ADAMX in the following steps:

  1. Estimate and fit the ETS model;
  2. Extract the residuals of the ETS model;
  3. Select the most suitable variables, explaining the residuals, based on an information criterion;
  4. Estimate the ADAMX model with the selected explanatory variables.

The residuals in step (2) might vary from model to model, depending on the type of the error term and the selected distribution:

  • Normal, Laplace, S, Generalised Normal or Asymmetric Laplace: \(e_t\);
  • Additive error and Log-Normal, Inverse Gaussian or Gamma: \(\left(1+\frac{e_t}{\hat{y}_t} \right)\);
  • Multiplicative error and Log-Normal, Inverse Gaussian or Gamma: \(1+e_t\).

So, the extracted residuals should be formulated based on the distributional assumptions of each model.

In R, step (3) is done using the stepwise() function from the greybox package, which supports all the distributions discussed in the previous chapters.
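
To make these steps more tangible, here is a minimal sketch of how the procedure could be reproduced manually, assuming that the smooth and greybox packages are loaded. All object names here are illustrative, and in practice adam() carries out this selection internally when regressors="select" is used (as in the example below):

# The Seatbelts data used in the example later in this section
SeatbeltsData <- Seatbelts[,c("drivers","kms","PetrolPrice","law")]
# Step 1: estimate and fit the pure ETS model on the response variable
etsModel <- adam(SeatbeltsData[,"drivers"], "MNM")
# Step 2: extract the residuals; depending on the assumed distribution,
# one of the transformations discussed above might need to be applied
etsResiduals <- as.vector(residuals(etsModel))
# Step 3: select the variables explaining the residuals; stepwise()
# expects the response variable in the first column of the data
stepwiseModel <- stepwise(cbind(residuals=etsResiduals,
                                as.data.frame(SeatbeltsData[,-1])))
# Step 4: reestimate ADAMX with the selected explanatory variables only
selectedVariables <- names(coef(stepwiseModel))[-1]
adamModelManual <- adam(SeatbeltsData[,c("drivers",selectedVariables)],
                        "MNM")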

While the suggested approach has obvious limitations (e.g. the smoothing parameters can become higher than needed, absorbing variability that would otherwise be explained by the variables), it is efficient in terms of computational time.

In order to see how it works, we use the Seatbelts data:

# Take the variables of interest from the Seatbelts dataset
SeatbeltsData <- Seatbelts[,c("drivers","kms","PetrolPrice","law")]

We have already had a look at this data earlier, so we can move directly to the selection part:

# Estimate ETSX(M,N,M), selecting the explanatory variables
# via the stepwise procedure described above
adamModelETSXMNMSelect <- adam(SeatbeltsData, "MNM",
                               h=12, holdout=TRUE,
                               regressors="select")
# Produce and plot the forecasts with prediction intervals
plot(forecast(adamModelETSXMNMSelect, h=12, interval="prediction"))

summary(adamModelETSXMNMSelect)
## Warning: Observed Fisher Information is not positive semi-definite, which means
## that the likelihood was not maximised properly. Consider reestimating the model,
## tuning the optimiser or using bootstrap via bootstrap=TRUE.
## 
## Model estimated using adam() function: ETSX(MNM)
## Response variable: drivers
## Distribution used in the estimation: Gamma
## Loss function type: likelihood; Loss function value: 1117.189
## Coefficients:
##              Estimate Std. Error Lower 2.5% Upper 97.5%  
## alpha          0.2877     0.0856     0.1186      0.4565 *
## gamma          0.0000     0.0414     0.0000      0.0816  
## level       1655.9759    91.0281  1476.2378   1835.4961 *
## seasonal_1     1.0099     0.0155     0.9808      1.0459 *
## seasonal_2     0.9053     0.0153     0.8762      0.9413 *
## seasonal_3     0.9352     0.0156     0.9061      0.9712 *
## seasonal_4     0.8696     0.0147     0.8405      0.9056 *
## seasonal_5     0.9465     0.0162     0.9174      0.9825 *
## seasonal_6     0.9152     0.0155     0.8861      0.9513 *
## seasonal_7     0.9623     0.0160     0.9332      0.9983 *
## seasonal_8     0.9706     0.0159     0.9416      1.0067 *
## seasonal_9     1.0026     0.0169     0.9735      1.0386 *
## seasonal_10    1.0824     0.0178     1.0533      1.1184 *
## seasonal_11    1.2012     0.0183     1.1721      1.2372 *
## law            0.0200     0.1050    -0.1873      0.2271  
## 
## Sample size: 180
## Number of estimated parameters: 16
## Number of degrees of freedom: 164
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 2266.378 2269.715 2317.465 2326.131

Note that the function might complain about the observed Fisher Information. If it does, this only means that the estimated variances of the parameters might be lower than they should be in reality.
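
If this is a concern, the warning's own suggestion can be followed and the uncertainty of the parameters can be re-evaluated via bootstrap. A sketch, assuming that the bootstrap option mentioned in the warning is passed to summary() (this might take noticeable time to run):

# Re-evaluate the uncertainty of the parameters via bootstrap,
# as suggested in the warning above
summary(adamModelETSXMNMSelect, bootstrap=TRUE)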

Based on the summary of the model, we can see that neither kms nor PetrolPrice improves the model in terms of AICc. We can check this manually to see whether the selection worked well in our case, constructing the sink regression (the ETSX model with all the available explanatory variables) as a benchmark:

# Sink regression benchmark: by default, adam() uses
# all the explanatory variables provided in the data
adamModelETSXMNMSink <- adam(SeatbeltsData, "MNM",
                             h=12, holdout=TRUE)
summary(adamModelETSXMNMSink)
## Warning: Observed Fisher Information is not positive semi-definite, which means
## that the likelihood was not maximised properly. Consider reestimating the model,
## tuning the optimiser or using bootstrap via bootstrap=TRUE.
## 
## Model estimated using adam() function: ETSX(MNM)
## Response variable: drivers
## Distribution used in the estimation: Gamma
## Loss function type: likelihood; Loss function value: 1234.291
## Coefficients:
##             Estimate Std. Error Lower 2.5% Upper 97.5%  
## alpha         0.9508     2.5779     0.0000      1.0000  
## gamma         0.0000     0.0132     0.0000      0.0260  
## level        23.2952     1.2746    20.7783     25.8086 *
## seasonal_1    1.1340     0.0621     1.0115      4.2959 *
## seasonal_2    0.9924     0.9574     0.8698      4.1542 *
## seasonal_3    0.9248     0.8608     0.8023      4.0867 *
## seasonal_4    0.8342     0.7952     0.7117      3.9960 *
## seasonal_5    0.9068     0.9038     0.7843      4.0686 *
## seasonal_6    0.8625     0.9196     0.7400      4.0243 *
## seasonal_7    0.8396     0.8211     0.7171      4.0014 *
## seasonal_8    0.8477     0.7802     0.7252      4.0095 *
## seasonal_9    0.9798     1.0017     0.8573      4.1417 *
## seasonal_10   1.1417     1.3043     1.0192      4.3036 *
## seasonal_11   1.3273     1.6033     1.2047      4.4891 *
## kms           0.0000     0.0000    -0.0001      0.0001  
## PetrolPrice  -2.7216     3.1200    -8.8827      3.4312  
## law           0.0181     5.2239   -10.2976     10.3197  
## 
## Sample size: 180
## Number of estimated parameters: 18
## Number of degrees of freedom: 162
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 2504.581 2508.829 2562.054 2573.085

We can see that the sink regression model has a higher AICc value than the model with the selected variables, which means that the latter is closer to the “true model.” While adamModelETSXMNMSelect might not be the best possible model in terms of information criteria, it is still a reasonable one and can be used for forecasting and decision making.
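
The size of this difference can also be extracted directly, for example (AICc() comes with the greybox package):

# A positive difference in AICc speaks in favour of the model
# with the selected variables
AICc(adamModelETSXMNMSink) - AICc(adamModelETSXMNMSelect)

Based on the summaries above, this difference is roughly 239 points, which is substantial: the data does not support the inclusion of kms and PetrolPrice in the model.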

References

• Sagaert, Y.R., Svetunkov, I., 2021. Variables Selection Using Partial Correlations and Information Criteria.