
4.3 Several examples of ETS and related exponential smoothing methods

There are other exponential smoothing methods, which include more components, as discussed in Section 3.1. These include, but are not limited to: Holt’s (Holt, 2004, originally proposed in 1957), Holt-Winters (Winters, 1960), multiplicative trend (Pegels, 1969), damped trend (Gardner and McKenzie, 1985), damped trend Holt-Winters (Gardner and McKenzie, 1989) and damped multiplicative trend (Taylor, 2003a) methods. We will not discuss them here one by one, as we will not use them further in this textbook. Instead, we will focus on the ETS models that underlie them.

We already understand that there can be different components in time series and that they can interact with each other either in an additive or a multiplicative way, which gives us the aforementioned taxonomy, discussed in Section 3.4. In this section, we consider several examples of ETS models and their relations to the conventional exponential smoothing methods.

4.3.1 ETS(A,A,N)

This is also sometimes known as the local trend model. It is formulated similarly to ETS(A,N,N), but with the addition of the trend equation. It underlies Holt’s method (Ord et al., 1997): \[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + b_{t-1} + \epsilon_t \\ & l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ & b_t = b_{t-1} + \beta \epsilon_t \end{aligned} , \tag{4.18} \end{equation}\] where \(\beta\) is the smoothing parameter for the trend component. The idea is similar to that of ETS(A,N,N): the states evolve over time, and the speed of their change depends on the values of \(\alpha\) and \(\beta\). The trend is not deterministic in this model: both the intercept and the slope change over time. The higher the smoothing parameters are, the more uncertain the future level and slope become, and thus the higher the uncertainty about the future values is.

Here is an example of the data that corresponds to the ETS(A,A,N) model:

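# sim.es() and es() come from the smooth package
library(smooth)
# 120 observations, one series, frequency 12
y <- sim.es("AAN", 120, 1, 12, persistence=c(0.3,0.1),
            initial=c(1000,20), mean=0, sd=20)
plot(y)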

Figure 4.10: Data generated from ETS(A,A,N) model.

The series in Figure 4.10 demonstrates a trend that changes over time. If we needed to produce forecasts for this data, we would capture the dynamics of the trend component and then use its last values for the prediction several steps ahead.
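To see how the states in (4.18) evolve, the recursion can also be written out by hand. The following is a minimal base R sketch, not the way `sim.es()` is implemented: `simulateAAN` is a hypothetical helper, and the seed and the normal errors are assumptions chosen to mimic the settings used above.

```r
# Hand-rolled simulation of the ETS(A,A,N) recursion (4.18).
# simulateAAN is a hypothetical helper, not part of the smooth package.
simulateAAN <- function(obs, alpha, beta, level0, trend0, errors) {
    y <- level <- trend <- numeric(obs)
    levelPrev <- level0
    trendPrev <- trend0
    for (t in seq_len(obs)) {
        # Measurement equation: y_t = l_{t-1} + b_{t-1} + e_t
        y[t] <- levelPrev + trendPrev + errors[t]
        # Transition equations for the level and the trend
        level[t] <- levelPrev + trendPrev + alpha * errors[t]
        trend[t] <- trendPrev + beta * errors[t]
        levelPrev <- level[t]
        trendPrev <- trend[t]
    }
    list(y = y, level = level, trend = trend)
}

# Assumed settings: same parameters as in the sim.es() call, arbitrary seed
set.seed(41)
errors <- rnorm(120, mean = 0, sd = 20)
simulated <- simulateAAN(120, alpha = 0.3, beta = 0.1,
                         level0 = 1000, trend0 = 20, errors = errors)
# The slope is a state of its own and drifts away from its initial value of 20
range(simulated$trend)
```

Setting `beta = 0` in this sketch keeps the slope fixed at its initial value, which makes the difference between a deterministic and a stochastic trend easy to inspect.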

The point forecast \(h\) steps ahead from this model is a straight line with a slope \(b_t\) (as shown in Table 3.1 from Section 3.5): \[\begin{equation} \mu_{y,t+h|t} = \hat{y}_{t+h} = l_{t} + h b_t. \tag{4.19} \end{equation}\] This becomes apparent if one takes the conditional expectations E\((l_{t+h}|t)\) and E\((b_{t+h}|t)\) in the second and third equations of (4.18) and then inserts them in the measurement equation. Graphically, it looks as shown in Figure 4.11:

esModel <- es(y$data, model="AAN", h=10, silent=FALSE)

Figure 4.11: ETS(A,A,N) and a point forecast produced from it.
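The steps behind (4.19) can be spelled out as follows. This is a sketch under the standard assumption that the future errors have zero conditional mean, E\((\epsilon_{t+j}|t)=0\) for all \(j \geq 1\): \[\begin{aligned} & \mathrm{E}(b_{t+h}|t) = \mathrm{E}(b_{t+h-1}|t) + \beta \mathrm{E}(\epsilon_{t+h}|t) = \dots = b_t \\ & \mathrm{E}(l_{t+h}|t) = \mathrm{E}(l_{t+h-1}|t) + \mathrm{E}(b_{t+h-1}|t) + \alpha \mathrm{E}(\epsilon_{t+h}|t) = \dots = l_t + h b_t \\ & \mathrm{E}(y_{t+h}|t) = \mathrm{E}(l_{t+h-1}|t) + \mathrm{E}(b_{t+h-1}|t) + \mathrm{E}(\epsilon_{t+h}|t) = l_t + (h-1) b_t + b_t = l_t + h b_t \end{aligned}\] The first line shows that the expected slope stays at its last value, the second that the expected level grows by \(b_t\) with each step, and the third inserts both into the measurement equation of (4.18).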

If you want to experiment with the model and see how its parameters influence the fit and forecast, you can use the following R code:

esModel <- es(y$data, model="AAN", h=10, silent=FALSE, persistence=c(0.2,0.1))

where persistence is the vector of smoothing parameters (first \(\hat\alpha\), then \(\hat\beta\)). By changing their values, we make the model less or more responsive to the changes in the data.

4.3.2 ETS(A,Ad,N)

This is the model that underlies the damped trend method (Roberts, 1982): \[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + \phi b_{t-1} + \epsilon_t \\ & l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ & b_t = \phi b_{t-1} + \beta \epsilon_t \end{aligned} , \tag{4.20} \end{equation}\] where \(\phi\) is the dampening parameter, typically lying between 0 and 1. If it is equal to zero, then the model reduces to ETS(A,N,N), (4.6). If it is equal to one, then it becomes equivalent to ETS(A,A,N), (4.18). The dampening parameter slows down the trend, making it non-linear. An example of data that corresponds to ETS(A,Ad,N) is provided in Figure 4.12.

# Same settings as for ETS(A,A,N), but with the dampening parameter phi=0.95
y <- sim.es("AAdN", 120, 1, 12, persistence=c(0.3,0.1),
            initial=c(1000,20), phi=0.95, mean=0, sd=20)
plot(y)

Figure 4.12: An example of ETS(A,Ad,N) data.

Visually, it is typically difficult to distinguish ETS(A,A,N) data from ETS(A,Ad,N) data. So, when applying ETS, it is recommended to rely on formal model selection techniques instead (see Section 15.1).

The point forecast from this model is a bit more complicated (see Section 3.5): \[\begin{equation} \mu_{y,t+h|t} = \hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t. \tag{4.21} \end{equation}\] It corresponds to a slowing-down trajectory, as shown in Figure 4.13.

Figure 4.13: A point forecast from ETS(A,Ad,N).

As can be seen in Figure 4.13, the forecast trajectory from ETS(A,Ad,N) slows down as the horizon grows. This is because \(\phi=0.95\) in our example, so each successive increment in (4.21) is 5% smaller than the previous one.
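The shape of this trajectory can be checked numerically by evaluating (4.21) directly. The following base R sketch uses illustrative values for the final states \(l_t\) and \(b_t\), which are assumptions rather than estimates from the data above, and compares the damped forecast with the linear one from (4.19).

```r
# Point forecasts (4.21) from ETS(A,Ad,N) for given final states.
# The state values below are illustrative assumptions, not estimates from data.
level <- 1000
trend <- 20
phi <- 0.95
h <- 1:10

# Direct evaluation of l_t + sum_{j=1}^h phi^j b_t
forecastDamped <- level + cumsum(phi^h) * trend
# Equivalent closed form of the geometric sum (valid for phi != 1)
forecastClosed <- level + phi * (1 - phi^h) / (1 - phi) * trend
# Undamped ETS(A,A,N) forecast (4.19) for comparison
forecastLinear <- level + h * trend
# Value that the damped trajectory approaches as h grows (for 0 < phi < 1)
forecastLimit <- level + phi / (1 - phi) * trend

cbind(h, forecastDamped, forecastLinear)
```

With these values, the damped forecast lies below the linear one at every horizon and flattens out towards `forecastLimit` instead of growing indefinitely.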

4.3.3 ETS(A,A,M)

Finally, this is an exotic model with additive error and trend, but multiplicative seasonality. Still, we list it here because it underlies the Holt-Winters method (Winters, 1960): \[\begin{equation} \begin{aligned} & y_{t} = (l_{t-1} + b_{t-1}) s_{t-m} + \epsilon_t \\ & l_t = l_{t-1} + b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ & b_t = b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ & s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1}+b_{t-1}} \end{aligned} , \tag{4.22} \end{equation}\] where \(s_t\) is the seasonal component with periodicity \(m\) and \(\gamma\) is its smoothing parameter. This is one of the potentially unstable models: because of the mix of components, it might produce unreasonable forecasts, as the seasonal component might become negative, while in a multiplicative model it should always be positive. Still, it might work on strictly positive, high-level data. Figure 4.14 shows what the data for this model can look like.

# Quarterly data (frequency 4) with four initial seasonal indices
y <- sim.es("AAM", 120, 1, 4, persistence=c(0.3,0.05,0.2),
            initial=c(1000,20), initialSeason=c(0.9,1.1,0.8,1.2),
            mean=0, sd=20)
plot(y)

Figure 4.14: An example of ETS(A,A,M) data.

The data in Figure 4.14 exhibits an additive trend with increasing seasonal amplitude, which are the two characteristics of the model.
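The instability of this model on low-level data can be seen from a single step of the recursion (4.22). The sketch below is written in base R; `updateAAM` is a hypothetical helper, and the state values and the error are assumptions picked purely for illustration.

```r
# One step of the ETS(A,A,M) recursion (4.22).
# updateAAM is a hypothetical helper written for illustration only.
updateAAM <- function(level, trend, seasonal, error, alpha, beta, gamma) {
    # One-step-ahead point value implied by the previous states
    fitted <- (level + trend) * seasonal
    list(y = fitted + error,
         level = level + trend + alpha * error / seasonal,
         trend = trend + beta * error / seasonal,
         seasonal = seasonal + gamma * error / (level + trend))
}

# High-level data: the seasonal index barely moves
highLevel <- updateAAM(1000, 20, 0.9, error = -40,
                       alpha = 0.3, beta = 0.05, gamma = 0.2)
# Low-level data with the same error: the seasonal index turns negative
lowLevel <- updateAAM(5, 0.5, 0.9, error = -40,
                      alpha = 0.3, beta = 0.05, gamma = 0.2)
c(highLevel$seasonal, lowLevel$seasonal)
```

The error enters the seasonal equation divided by \(l_{t-1}+b_{t-1}\), so the same error that is negligible at a level of around 1000 is large enough to flip the sign of the seasonal index at a level of around 5.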

Finally, the point forecast from this model builds upon the one from ETS(A,A,N), introducing the seasonal component: \[\begin{equation} \hat{y}_{t+h} = (l_{t} + h b_t) s_{t+h-m\lceil\frac{h}{m}\rceil}, \tag{4.23} \end{equation}\] where \(\lceil\frac{h}{m}\rceil\) is the value of the fraction in the brackets, rounded up. The point forecast from this model is shown in Figure 4.15.

Figure 4.15: A point forecast from ETS(A,A,M).
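The subscript of the seasonal component in (4.23) simply picks the most recent index for the respective season. A minimal base R sketch of this calculation is given below; the final states are illustrative assumptions, and the seasonal indices are assumed to be stored in the order \(s_{t-m+1}, \dots, s_t\).

```r
# Point forecasts (4.23) from ETS(A,A,M) for given final states.
# All state values are illustrative assumptions.
level <- 1000
trend <- 20
m <- 4
# Last m seasonal indices, ordered as s_{t-m+1}, ..., s_t
seasonal <- c(0.9, 1.1, 0.8, 1.2)
h <- 1:8

# Position of s_{t+h-m*ceiling(h/m)} in the vector above
seasonalIndex <- h - m * ceiling(h / m) + m
# Linear trajectory from (4.19), scaled by the respective seasonal index
forecastAAM <- (level + h * trend) * seasonal[seasonalIndex]

cbind(h, seasonalIndex, forecastAAM)
```

For \(h=5\) the formula returns to the first element of the vector, so the seasonal indices are recycled every \(m\) steps, while the trend part keeps growing linearly.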

Remark. The point forecasts produced from this model do not correspond to the conditional expectations. This was discussed in Section 3.5.

Hyndman et al. (2008) argue that in ETS models, the error term should be aligned with the seasonal component, because it is difficult to motivate why the amplitude of seasonality increases with the increase of level, while the variability of the error term stays the same. So, they recommend using ETS(M,A,M) instead of ETS(A,A,M) if you deal with positive high-volume data. This is a reasonable recommendation, but keep in mind that both models might break if you deal with low-volume data.

References

• Gardner, E.S., McKenzie, E., 1989. Seasonal Exponential Smoothing with Damped Trends. Management Science. 35, 372–376. https://doi.org/10.1287/mnsc.35.3.372
• Gardner, E.S., McKenzie, E., 1985. Forecasting trends in time series. Management Science. 31, 1237–1246. https://doi.org/10.1287/mnsc.31.10.1237
• Holt, C.C., 2004. Forecasting seasonals and trends by exponentially weighted moving averages. International Journal of Forecasting. 20, 5–10. https://doi.org/10.1016/j.ijforecast.2003.09.015
• Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D., 2008. Forecasting with Exponential Smoothing. Springer Berlin Heidelberg.
• Ord, J.K., Koehler, A.B., Snyder, R.D., 1997. Estimation and Prediction for a Class of Dynamic Nonlinear Statistical Models. Journal of the American Statistical Association. 92, 1621–1629. https://doi.org/10.1080/01621459.1997.10473684
• Pegels, C.C., 1969. Exponential Forecasting: Some New Variations. Management Science. 15, 311–315. https://www.jstor.org/stable/2628137
• Roberts, S.A., 1982. A General Class of Holt-Winters Type Forecasting Models. Management Science. 28, 808–820. https://doi.org/10.1287/mnsc.28.7.808
• Taylor, J.W., 2003a. Exponential smoothing with a damped multiplicative trend. International Journal of Forecasting. 19, 715–725. https://doi.org/10.1016/S0169-2070(03)00003-7
• Winters, P.R., 1960. Forecasting Sales by Exponentially Weighted Moving Averages. Management Science. 6, 324–342. https://doi.org/10.1287/mnsc.6.3.324