6.6 Examples of application

6.6.1 Non-seasonal data

In order to see how the pure additive ADAM ETS works, we will try it out using the adam() function from the smooth package for R on the Box-Jenkins sales data. We start by plotting the data:

plot(BJsales)

The series seems to exhibit a trend, so we will apply the ETS(A,A,N) model:

adamModel <- adam(BJsales, "AAN")
adamModel
## Time elapsed: 0.05 seconds
## Model estimated using adam() function: ETS(AAN)
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 258.9257
## Persistence vector g:
##  alpha   beta 
## 1.0000 0.2469 
## 
## Sample size: 150
## Number of estimated parameters: 5
## Number of degrees of freedom: 145
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 527.8514 528.2681 542.9046 543.9485

The output summarises which specific model was estimated, what distribution was assumed, how the model was estimated, the values of the smoothing parameters, the sample size and the number of degrees of freedom, and reports the information criteria. We can compare this model with ETS(A,N,N) in order to see which of the two performs better in terms of information criteria (e.g. AICc):

adam(BJsales, "ANN")
## Time elapsed: 0.02 seconds
## Model estimated using adam() function: ETS(ANN)
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 276.0603
## Persistence vector g:
##  alpha 
## 0.9996 
## 
## Sample size: 150
## Number of estimated parameters: 3
## Number of degrees of freedom: 147
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 558.1207 558.2850 567.1526 567.5644
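Instead of eyeballing the printed outputs, the information criteria can also be extracted programmatically. Here is a minimal sketch, assuming that the AICc() method (from the greybox package, which smooth relies on) applies to adam objects:

# Refit ETS(A,N,N), since it was not stored above;
# adamModel still holds the ETS(A,A,N) fit
adamModelANN <- adam(BJsales, "ANN")
c(AAN=AICc(adamModel), ANN=AICc(adamModelANN))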

In this situation the information criteria for ETS(A,N,N) are higher than for ETS(A,A,N), so we should use the latter for forecasting purposes. We can produce point forecasts and prediction intervals (in this example we construct the 90% and 95% ones) and plot them:

plot(forecast(adamModel,h=10,interval="prediction",level=c(0.9,0.95)))
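The forecast object can also be inspected directly rather than only plotted. A small sketch, assuming that it carries the mean, lower and upper elements, as forecasts from smooth typically do:

adamForecast <- forecast(adamModel, h=10, interval="prediction", level=c(0.9,0.95))
# Point forecasts and the bounds of the 90% and 95% prediction intervals
adamForecast$mean
adamForecast$lower
adamForecast$upper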

Notice that the smoothing parameters of ETS(A,A,N) are very high, with \(\alpha=1\). This might mean that the maximum of the likelihood is only achieved in the wider admissible bounds. We can try them out and see what happens:

adamModel <- adam(BJsales, "AAN", bounds="admissible")
adamModel
## Time elapsed: 0.08 seconds
## Model estimated using adam() function: ETS(AAN)
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 258.5198
## Persistence vector g:
##  alpha   beta 
## 1.0381 0.2260 
## 
## Sample size: 150
## Number of estimated parameters: 5
## Number of degrees of freedom: 145
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 527.0397 527.4563 542.0929 543.1367
plot(forecast(adamModel,h=10,interval="prediction",level=c(0.9,0.95)))

Both smoothing parameters are now higher, which implies that the uncertainty about the future values of the states is higher as well; this is reflected in the slightly wider prediction interval. Although \(\alpha\) is now greater than one, the model is still stable. In order to see that, we can calculate the discount matrix using the objects returned by the function:

discountMatrix <- adamModel$transition - adamModel$persistence %*% adamModel$measurement[nobs(adamModel),,drop=FALSE]
eigen(discountMatrix)$values
## [1]  0.7844854 -0.0485393

Notice that the absolute values of both eigenvalues of the matrix are less than one, which means that the newer observations have higher weights than the older ones and that the absolute values of the weights decrease over time, making the model stable.
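To make this decay of weights tangible, we can raise the discount matrix to several powers: the spectral radius of the j-th power of the discount matrix shrinks geometrically in j. A minimal sketch, where matpow() is a small helper defined purely for this illustration:

# Multiply the discount matrix by itself j times
matpow <- function(A, j) Reduce(`%*%`, replicate(j, A, simplify=FALSE))
# Spectral radius of the discount matrix raised to powers 1, 5 and 10
sapply(c(1, 5, 10), function(j) max(abs(eigen(matpow(discountMatrix, j))$values)))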

If we want to test ADAM ETS with another distribution, this can be done using the distribution parameter (here we use the Generalised Normal distribution, estimating its shape together with the other parameters):

adamModel <- adam(BJsales, "AAN", distribution="dgnorm")
print(adamModel,digits=3)
## Time elapsed: 0.05 seconds
## Model estimated using adam() function: ETS(AAN)
## Distribution assumed in the model: Generalised Normal with shape=1.855
## Loss function type: likelihood; Loss function value: 258.739
## Persistence vector g:
## alpha  beta 
## 0.999 0.243 
## 
## Sample size: 150
## Number of estimated parameters: 6
## Number of degrees of freedom: 144
## Information criteria:
##     AIC    AICc     BIC    BICc 
## 529.478 530.066 547.542 549.014
plot(forecast(adamModel,h=10,interval="prediction"))

The prediction interval in this case is slightly wider than in the previous one, because the \(\mathcal{GN}\) distribution with \(\beta=\) 1.86 has fatter tails than the normal distribution.
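To see the effect of the fatter tails in isolation, we can compare upper quantiles of the two distributions at a common variance. A sketch, assuming that qgnorm() from the greybox package takes the arguments (p, mu, scale, shape) and using the relation \(\sigma^2 = s^2 \Gamma(3/\beta) / \Gamma(1/\beta)\) for the variance of the Generalised Normal distribution:

shape <- 1.86
# The scale that gives the Generalised Normal distribution unit variance
scale <- sqrt(gamma(1/shape) / gamma(3/shape))
# 97.5% quantiles: standard Normal vs unit-variance Generalised Normal
qnorm(0.975)
qgnorm(0.975, mu=0, scale=scale, shape=shape)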

6.6.2 Seasonal data

Now we will check what happens in the case of seasonal data. We use AirPassengers, which actually has multiplicative seasonality, but for demonstration purposes we will see what happens when we use the wrong model. We withhold 12 observations in order to take a closer look at the performance of the ETS(A,A,A) model in this case:

adamModel <- adam(AirPassengers, "AAA", lags=12, h=12, holdout=TRUE)

The lags parameter is not necessary in this specific case, because the function extracts the frequency from the ts object automatically. If we were to provide a vector of values instead of the ts object, we would need to specify the correct lag ourselves. Note that the lag of 1 (the lag of the level and trend components) does not need to be provided - the function will always use it anyway.
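To illustrate the point above, here is a sketch of the same model applied to a plain numeric vector, in which case the seasonal lag has to be provided explicitly:

# Without the ts object there is no frequency information, so lags=12 is needed;
# the lag of 1 for the level and trend is added automatically
adam(as.numeric(AirPassengers), "AAA", lags=12, h=12, holdout=TRUE)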

In some cases, the optimiser might converge to a local minimum, so if you find the results unsatisfactory, it might make sense to re-estimate the model, tuning the parameters of the optimiser. Here is an example, in which we increase the maximum number of iterations in the optimisation and set new starting values for the smoothing parameters:

adamModel$B[1:3] <- c(0.2,0.1,0.3)
adamModel <- adam(AirPassengers, "AAA", lags=12, h=12, holdout=TRUE,
                  B=adamModel$B, maxeval=1000)
adamModel
## Time elapsed: 0.21 seconds
## Model estimated using adam() function: ETS(AAA)
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 544.0244
## Persistence vector g:
##  alpha   beta  gamma 
## 0.6481 0.0000 0.3519 
## 
## Sample size: 132
## Number of estimated parameters: 17
## Number of degrees of freedom: 115
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 1122.049 1127.417 1171.056 1184.163 
## 
## Forecast errors:
## ME: 13.014; MAE: 26.798; RMSE: 34.574
## sCE: 59.495%; sMAE: 10.209%; sMSE: 1.735%
## MASE: 1.113; RMSSE: 1.103; rMAE: 0.353; rRMSE: 0.336

Notice that, because we fit the additive seasonal model to data with multiplicative seasonality, the smoothing parameter \(\gamma\) has become quite big - the seasonal component needs to be updated frequently in order to keep up with the changing seasonal profile. In addition, because we used the holdout parameter, the function now also reports the error measures for the point forecasts on the test part of the data. This can be useful when you want to compare the performance of several models on a time series. Here is how the forecast from ETS(A,A,A) looks on this data:

plot(forecast(adamModel,h=12,interval="prediction"))

While the fit to the data is far from perfect, by pure coincidence the point forecast from this model is quite decent.
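To put these error measures in context, we can fit a competing model on the same holdout and compare the accuracy. A sketch, assuming that the fitted object stores the holdout error measures in its accuracy element (the ETS(M,A,M) model is picked here purely for illustration):

adamModelMAM <- adam(AirPassengers, "MAM", lags=12, h=12, holdout=TRUE)
# Compare MASE of the two models on the withheld 12 observations
c(AAA=adamModel$accuracy["MASE"], MAM=adamModelMAM$accuracy["MASE"])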

In order to see how ADAM ETS decomposes the data into components, we can plot it via the plot() method with the which parameter:

plot(adamModel,which=12)

We can see on this graph that the residuals still contain some seasonality, so there is room for improvement. Most probably, this happened because the data exhibits multiplicative seasonality rather than the additive one. For now, we do not aim to fix this issue.
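One quick way to confirm the leftover seasonality is to inspect the autocorrelation function of the residuals; a sketch, assuming that the standard resid() method works on adam objects:

# Spikes around lag 12 in the ACF would indicate remaining seasonality
acf(resid(adamModel), lag.max=24)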