This book is in Open Review. I want your feedback to make the book better for you and other readers. To add your annotation, select some text and then click the on the pop-up menu. To see the annotations of others, click the button in the upper right hand corner of the page

9.5 Examples in R

9.5.1 Non-seasonal data

Using the time series from the Box-Jenkins textbook (Box and Jenkins 1976), we will fit ARIMA model to the data, but based on our judgment rather than their approach. Just a reminder, here how the data looks (series BJsales):

It seems to exhibit the trend in the data, so we can consider ARIMA(1,1,2), ARIMA(0,2,2) and ARIMA(1,1,1). We do not consider models with drift in this example, because they would imply the same slope over time, which does not seem to be the case here:

adamModelARIMA <- vector("list",3)
names(adamModelARIMA) <- c("ARIMA(1,1,2)", "ARIMA(0,2,2)", "ARIMA(1,1,1)")

Note that we need to tell adam to use model="NNN" in order to switch off the ETS part of the model. Comparing information criteria (we will use AICc) of the three models, we can select the most appropriate one:

sapply(adamModelARIMA, AICc)
## ARIMA(1,1,2) ARIMA(0,2,2) ARIMA(1,1,1)
##     503.8081     497.0114     504.9779

Note that this comparison is possible in adam() (and in ssarima() and msarima()) because the implemented ARIMA is formulated in state space form, sidestepping the issue of the conventional ARIMA (where taking differences reduces the sample size). Based on this comparison, it looks like the ARIMA(0,2,2) is the most appropriate model (among the three) to the data. Here how the fit and the forecast from the model look (figure 9.1):

plot(adamModelARIMA[[2]], 7)

Comparing this model with the ETS(A,A,N), we will see a slight difference, because the two models are initialised and estimated differently:

adam(BJsales, "AAN", h=10, holdout=TRUE, silent=FALSE)

## Time elapsed: 0.04 seconds
## Model estimated using adam() function: ETS(AAN)
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 243.4587
## Persistence vector g:
##  alpha   beta
## 0.9997 0.2403
##
## Sample size: 140
## Number of estimated parameters: 5
## Number of degrees of freedom: 135
## Information criteria:
##      AIC     AICc      BIC     BICc
## 496.9173 497.3651 511.6255 512.7319
##
## Forecast errors:
## ME: 3.225; MAE: 3.338; RMSE: 3.793
## sCE: 14.162%; sMAE: 1.465%; sMSE: 0.028%
## MASE: 2.824; RMSSE: 2.488; rMAE: 0.927; rRMSE: 0.923

If we are interested in a more classical Box-Jenkins approach, we can always analyse the residuals of the constructed model and try improving it further. Here is an example of ACF and PACF of the residuals of the ARIMA(0,2,2):

par(mfcol=c(1,2))
plot(adamModelARIMA[[2]], c(10,11), main="")

As we see from the plot above, all autocorrelation coefficients lie inside the confidence interval, implying that there are no significant AR / MA lags to include in the model.

9.5.2 Seasonal data

Similarly to the previous cases, we use Box-Jenkins AirPassengers data, which exhibits a multiplicative seasonality and a trend. We will model this using SARIMA(0,2,2)(0,1,1)$$_{12}$$, SARIMA(0,2,2)(1,1,1)$$_{12}$$ and SARIMA(0,2,2)(1,1,0)$$_{12}$$ models, which are selected to see what type of seasonal ARIMA is more appropriate to the data:

adamModelSARIMA <- vector("list",3)
orders=list(ar=c(0,0), i=c(2,1), ma=c(2,1)),
h=12, holdout=TRUE)
orders=list(ar=c(0,1), i=c(2,1), ma=c(2,1)),
h=12, holdout=TRUE)
orders=list(ar=c(0,1), i=c(2,1), ma=c(2,0)),
h=12, holdout=TRUE)
"SARIMA(0,2,2)(1,1,1)[12]",
"SARIMA(0,2,2)(1,1,0)[12]")

Note that now that we have seasonal component, we need to provide the SARIMA lags: 1 and $$m=12$$ and need to provide orders differently - as a list, specifying the values for AR, I and MA separately. This is done because the SARIMA implemented in adam() supports multiple seasonality (e.g. you can have lags=c(1,24,24*7) if you want). The resulting information criteria of models are:

sapply(adamModelSARIMA, AICc)
## SARIMA(0,2,2)(0,1,1)[12] SARIMA(0,2,2)(1,1,1)[12] SARIMA(0,2,2)(1,1,0)[12]
##                 1033.677                 1062.875                 1066.082

It looks like the first model is slightly better than the other two, so we will use it in order to produce forecasts:

plot(forecast(adamModelSARIMA[[1]], h=12, interval="prediction"))

This model is directly comparable with ETS models, so here, for example, AICc of ETS(M,A,M) on the same data:

AICc(adamModelETS <- adam(AirPassengers, "MAM", h=12, holdout=TRUE))
## [1] 974.6453

Which is lower than for the SARIMA model. This means that ETS(M,A,M) is more appropriate to the data in terms of information criteria. Let's see if there is a way to improve ETS(M,A,M) by adding some ARMA components (figure 9.3):

par(mfcol=c(1,2))
plot(adamModelETS, c(10,11), main="")

Judging by the plots, there are some significant correlation coefficients for some lags, but it is not clear, whether they appear due to the type I error or not. Just to check, we will see if adding SARIMA(0,0,0)(0,0,1)$$_12$$ helps (reduces AICc) in this case:

AICc(adam(AirPassengers, "MAM", h=12, holdout=TRUE,
order=list(ma=c(0,1)), lags=c(1,12)))
## [1] 1017.669

As we see, the increased complexity does not decrease the AICc (probably because now we need to estimate 13 parameters more than in just ETS(M,A,M)), so we should not add the SARIMA component. We could try adding other SARIMA elements to see if they improve the model or not, the reader is suggested to do that as an additional exercise.

References

Box, George, and Gwilym Jenkins. 1976. Time series analysis: forecasting and control. Holden-day, Oakland, California.