
11.6 Examples of application

For this example, we will use the Road Casualties in Great Britain 1969–84 data, the Seatbelts dataset from the datasets package for R. It contains several variables, described in the documentation for the data (accessible via the ?Seatbelts command). The variable of interest in this case is drivers. The dataset contains more variables than we need, so we will restrict it to drivers, kms (distance driven), PetrolPrice and law, as the latter three seem to influence the number of injured / killed drivers in principle:
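A minimal sketch of this data preparation (the SeatbeltsData name is illustrative, not fixed by the text):

# Keep only the response and the three potential explanatory variables
SeatbeltsData <- Seatbelts[, c("drivers", "kms", "PetrolPrice", "law")]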

The dynamics of these variables over time are shown in Figure 11.1.

Figure 11.1: The time series dynamics of variables from Seatbelts dataset.

It is apparent that the drivers variable exhibits seasonality but does not seem to have a trend. The type of seasonality is difficult to determine visually, so we will assume that it is multiplicative. A simple ETS(M,N,M) model applied to the data produces the following (we withhold the last 12 observations for the forecast evaluation):
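One way to estimate such a model is via the adam() function from the smooth package; this is a sketch, with an illustrative object name:

library(smooth)
# ETS(M,N,M) on the drivers series, withholding the last 12 observations
adamETSMNM <- adam(SeatbeltsData[, "drivers"], "MNM", h=12, holdout=TRUE)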

This simple model already does a fine job of fitting and forecasting the data, although the forecast is biased and lower than needed because of the sudden drop in the level of the series. This drop can only be explained by the introduction of the new law in the UK in 1983, which made wearing seatbelts compulsory for drivers. Due to the sudden drop, the smoothing parameter for the level of the series is higher than needed, leading to wider intervals and less accurate forecasts. Here is the output of the model:

## Time elapsed: 0.14 seconds
## Model estimated using adam() function: ETS(MNM)
## Distribution assumed in the model: Inverse Gaussian
## Loss function type: likelihood; Loss function value: 1126.128
## Persistence vector g:
##  alpha  gamma 
## 0.4189 0.0000 
## 
## Sample size: 180
## Number of estimated parameters: 15
## Number of degrees of freedom: 165
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 2282.256 2285.182 2330.150 2337.749 
## 
## Forecast errors:
## ME: 117.633; MAE: 117.633; RMSE: 137.342
## sCE: 83.505%; sMAE: 6.959%; sMSE: 0.66%
## MASE: 0.682; RMSSE: 0.61; rMAE: 0.503; rRMSE: 0.541

In order to further explore the data, we will produce scatterplots and boxplots between the variables using the spread() function from the greybox package:
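A sketch of the call, assuming the SeatbeltsData matrix defined earlier:

library(greybox)
# Matrix of plots between all pairs of variables; the plot type
# (scatterplot / boxplot) depends on the types of the variables
spread(SeatbeltsData)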

Figure 11.2: The relation between variables from Seatbelts dataset.

The plot in Figure 11.2 shows that there is a negative relation between kms and drivers: the higher the distance driven, the lower the number of car drivers killed or seriously injured. A similar relation is observed between PetrolPrice and drivers (when prices are high, people tend to drive less, thus causing fewer incidents). Interestingly, the increase of both variables also causes the variance of the response variable to decrease (a heteroscedasticity effect). Using a multiplicative error model and including these variables in logarithms might address this potential issue. Note that we do not need to take the logarithm of drivers, as we already use a model with a multiplicative error. Finally, the introduction of the new law seems to have caused a decrease in the number of casualties. In order to have a better model in terms of explanatory and predictive power, we should include all three variables in the model. This is how we can do it in ADAM:
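A sketch of such a model, with the logarithms specified via the formula (the object name is illustrative):

# ETSX(M,N,M) with log-transformed kms and PetrolPrice
adamETSXMNM <- adam(SeatbeltsData, "MNM", h=12, holdout=TRUE,
                    formula=drivers~log(kms)+log(PetrolPrice)+law)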

The formula parameter is in general not compulsory and can either be substituted by formula=drivers~. or dropped completely, in which case the function would regress the first variable in the matrix on all the others. We need it in our case because we introduce log-transformations of some of the explanatory variables. The forecast from the second model is slightly more accurate and, more importantly, the prediction interval is narrower, because the model now takes the external information into account. Here is the summary of the second model:

## Time elapsed: 0.53 seconds
## Model estimated using adam() function: ETSX(MNM)
## Distribution assumed in the model: Inverse Gaussian
## Loss function type: likelihood; Loss function value: 1114.719
## Persistence vector g (excluding xreg):
##  alpha  gamma 
## 0.2133 0.0004 
## 
## Sample size: 180
## Number of estimated parameters: 18
## Number of degrees of freedom: 162
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 2265.438 2269.687 2322.911 2333.942 
## 
## Forecast errors:
## ME: 97.735; MAE: 99.106; RMSE: 126.019
## sCE: 69.38%; sMAE: 5.863%; sMSE: 0.556%
## MASE: 0.575; RMSSE: 0.559; rMAE: 0.424; rRMSE: 0.496

The model with explanatory variables is already more precise than the simple univariate ETS(M,N,M) (e.g. its MASE on the holdout is lower), but we could also try introducing the update of the parameters for the explanatory variables over time, just to see how it works (it might be unnecessary for this data):
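In adam() this can be requested via the regressors parameter; a sketch, reusing the illustrative names from above:

# ETSX(M,N,M){D}: parameters of explanatory variables are updated over time
adamETSXMNMD <- adam(SeatbeltsData, "MNM", h=12, holdout=TRUE,
                     formula=drivers~log(kms)+log(PetrolPrice)+law,
                     regressors="adapt")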

In this specific case, the difference between the ETSX and ETSX{D} models is negligible in terms of the accuracy of the final forecasts and prediction intervals. Here is the output of the model:

## Time elapsed: 0.31 seconds
## Model estimated using adam() function: ETSX(MNM){D}
## Distribution assumed in the model: Inverse Gaussian
## Loss function type: likelihood; Loss function value: 1116.1
## Persistence vector g (excluding xreg):
##  alpha  gamma 
## 0.2013 0.0000 
## 
## Sample size: 180
## Number of estimated parameters: 21
## Number of degrees of freedom: 159
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 2274.201 2280.049 2341.253 2356.437 
## 
## Forecast errors:
## ME: 94.913; MAE: 95.68; RMSE: 125.612
## sCE: 67.377%; sMAE: 5.66%; sMSE: 0.552%
## MASE: 0.555; RMSSE: 0.558; rMAE: 0.409; rRMSE: 0.495

We can spot that the error measures of the dynamic model are slightly lower than the ones of the static model (e.g., compare the MASE and RMSSE of the two models). However, the information criteria are lower for the static model, and the gain in accuracy is small, so we should probably use the static one for forecasting and analytical purposes. In order to see the effect of the explanatory variables on the number of incidents with drivers, we can look at the parameters for those variables:
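One way to extract these estimates from the static model (assuming the illustrative object name adamETSXMNM; the exact element holding them may differ between versions of the smooth package):

# Estimated initial parameters of the explanatory variables
adamETSXMNM$initial$xreg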

##         log.kms. log.PetrolPrice.              law 
##       -0.1198583       -0.2719426       -0.2387506

Based on that, we can point out that the introduction of the law reduced the number of incidents on average by approximately 24%, while an increase in the petrol price by 1% leads on average to a decrease in the number of incidents by approximately 0.27%. Finally, the distance driven has a negative impact on incidents as well, reducing them on average by approximately 0.12% for each 1% increase in the distance. All of this is the standard interpretation of the parameters, which we can use based on the estimated model. We will discuss how to do analysis using ADAM in future chapters, introducing the standard errors and confidence intervals for the parameters.

Finally, adam() has some shortcuts for cases when a matrix of variables is provided with no formula, assuming that the necessary expansion has already been done. This decreases the computational time of the function and becomes especially useful when working with large samples of data. Here is an example with ETSX(M,N,N):
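A sketch under the assumption that the expansion is done manually beforehand (the names are illustrative):

# Expand the explanatory variables in advance...
SeatbeltsExpanded <- cbind(drivers=SeatbeltsData[,"drivers"],
                           logKms=log(SeatbeltsData[,"kms"]),
                           logPetrolPrice=log(SeatbeltsData[,"PetrolPrice"]),
                           law=SeatbeltsData[,"law"])
# ...and let adam() use the columns as they are, with no formula processing
adamETSXMNN <- adam(SeatbeltsExpanded, "MNN", h=12, holdout=TRUE)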