
## 4.2 Mathematical models in the ETS taxonomy

I hope that it becomes more apparent to the reader how the ETS framework is built upon the idea of time series decomposition (from Section 3.1). By introducing different components, defining their types, and adding the equations for their update, we can construct models that capture the key features of the time series better. The equations discussed in Section 3.1 represent the so-called “measurement” or “observation” equations of the ETS models. But we should also consider the potential change in components over time. The “transition” or “state” equations reflect this change: they explain how the level, trend, or seasonal components evolve.

As discussed in Section 4.1, given different types of components and their interactions, we end up with 30 models in the taxonomy. Tables 4.1 and 4.2 summarise mathematically all 30 ETS models shown graphically in Figures 4.1 and 4.2, presenting formulae for measurement and transition equations.
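
To make the count concrete, the taxonomy can be enumerated as all combinations of two error types, five trend types, and three seasonal types. The following is a minimal Python sketch; the names `ERRORS`, `TRENDS`, `SEASONALS`, and `ets_taxonomy` are introduced here for illustration and do not come from any package:

```python
from itertools import product

# Component codes as used in ETS names: N = none, A = additive, M = multiplicative,
# Ad / Md = damped additive / damped multiplicative trend.
ERRORS = ["A", "M"]
TRENDS = ["N", "A", "Ad", "M", "Md"]
SEASONALS = ["N", "A", "M"]


def ets_taxonomy():
    """Return all ETS model names as error + trend + seasonal codes."""
    return [e + t + s for e, t, s in product(ERRORS, TRENDS, SEASONALS)]


models = ets_taxonomy()
print(len(models))  # 30
print(models[:6])   # ['ANN', 'ANA', 'ANM', 'AAN', 'AAA', 'AAM']
```

Each name reads as error, trend, seasonality, so that, for example, MAdM is the model with multiplicative error, additive damped trend, and multiplicative seasonality.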

Table 4.1: Additive error ETS models.

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|:--|:--|:--|:--|
| No trend | $$\begin{aligned} &y_{t} = l_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ |
| Additive | $$\begin{aligned} &y_{t} = l_{t-1} + b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + b_{t-1}} \end{aligned}$$ |
| Additive damped | $$\begin{aligned} &y_{t} = l_{t-1} + \phi b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + \phi b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = \phi b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + \phi b_{t-1}} \end{aligned}$$ |
| Multiplicative | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}} \end{aligned}$$ |
| Multiplicative damped | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}^\phi} \end{aligned}$$ |

Table 4.2: Multiplicative error ETS models.

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|:--|:--|:--|:--|
| No trend | $$\begin{aligned} &y_{t} = l_{t-1}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} s_{t-m}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \end{aligned}$$ |
| Additive | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1} + \beta \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta (l_{t-1} + b_{t-1}) \epsilon_t \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \end{aligned}$$ |
| Additive damped | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1}) (1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta (l_{t-1} + \phi b_{t-1}) \epsilon_t \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \end{aligned}$$ |
| Multiplicative | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\mu_{y,t}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \end{aligned}$$ |
| Multiplicative damped | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1}^\phi (1 + \beta \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} b_{t-1}^\phi + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\mu_{y,t}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi \left(1 + \alpha \epsilon_t\right) \\ &b_t = b_{t-1}^\phi \left(1 + \beta \epsilon_t\right) \\ &s_t = s_{t-m} \left(1 + \gamma \epsilon_t\right) \end{aligned}$$ |
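
As an illustration of how the measurement and transition equations work together, here is a minimal sketch of one step of ETS(A,A,N) from Table 4.1 and of ETS(M,A,N) from Table 4.2. The function names, the starting states, and the parameter values are hypothetical and are chosen only to show the mechanics:

```python
def step_aan(level, trend, eps, alpha, beta):
    """One step of ETS(A,A,N): returns (y_t, l_t, b_t) given l_{t-1}, b_{t-1}."""
    mu = level + trend                  # one step ahead expectation
    y = mu + eps                        # measurement equation
    new_level = mu + alpha * eps        # transition equation for the level
    new_trend = trend + beta * eps      # transition equation for the trend
    return y, new_level, new_trend


def step_man(level, trend, eps, alpha, beta):
    """One step of ETS(M,A,N): returns (y_t, l_t, b_t) given l_{t-1}, b_{t-1}."""
    mu = level + trend
    y = mu * (1 + eps)                  # the error is relative to the expectation
    new_level = mu * (1 + alpha * eps)
    new_trend = trend + beta * mu * eps
    return y, new_level, new_trend


# Hypothetical states, errors and smoothing parameters, for illustration only.
print(step_aan(100.0, 2.0, eps=5.0, alpha=0.3, beta=0.1))
print(step_man(100.0, 2.0, eps=0.05, alpha=0.3, beta=0.1))
```

Note that the error term in the additive error model is in the units of the data, while in the multiplicative error model it is a proportion of the one step ahead expectation.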

From a statistical point of view, the formulae in Tables 4.1 and 4.2 correspond to the “true models” (see Section 1.4): they explain the models underlying potential data. When it comes to their construction and estimation, the $$\epsilon_t$$ is substituted by the estimated $$e_t$$ (which is calculated differently depending on the error type), and the time series components and smoothing parameters are also replaced by their estimates (e.g. $$\hat{\alpha}$$ instead of $$\alpha$$). However, if the values of these models’ parameters were known, it would be possible to produce point forecasts and conditional $$h$$ steps ahead expectations from these models, which are summarised in Table 4.3 with the following elements:

• Conditional one step ahead expectation $$\mu_{y,t} \equiv \mu_{y,t|t-1}$$;
• Multiple steps ahead point forecast $$\hat{y}_{t+h}$$;
• Conditional multiple steps ahead expectation $$\mu_{y,t+h|t}$$.
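
A minimal sketch of the substitution of $$\epsilon_t$$ by $$e_t$$ for the local level models, ETS(A,N,N) and ETS(M,N,N), assuming that the smoothing parameter and the initial level are known. The residual formulae follow from solving the respective measurement equations for the error term; the function name and the data are made up for illustration:

```python
def filter_local_level(y, level0, alpha, error="A"):
    """Apply ETS(A,N,N) or ETS(M,N,N) to a series with known alpha and l_0.

    Returns the one step ahead expectations, the residuals and the final level.
    """
    level, fitted, residuals = level0, [], []
    for obs in y:
        mu = level                          # mu_{y,t} = l_{t-1}
        if error == "A":
            e = obs - mu                    # from y_t = l_{t-1} + e_t
            level = level + alpha * e
        else:
            e = (obs - mu) / mu             # from y_t = l_{t-1} (1 + e_t)
            level = level * (1 + alpha * e)
        fitted.append(mu)
        residuals.append(e)
    return fitted, residuals, level


# Made-up data and parameter values, for illustration only.
y = [10.0, 12.0, 11.0]
for error in ("A", "M"):
    fitted, residuals, level = filter_local_level(y, level0=10.0, alpha=0.5, error=error)
    print(error, fitted, residuals, level)
```

In this example, the two models produce the same one step ahead expectations and the same final level (up to floating point rounding), while their residuals are on different scales. The final level is the point forecast for any horizon (see the no trend row of Table 4.3).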

In the case of the additive error models, the point forecasts correspond to the expectations only when the expectation of the error term is zero, i.e. $$\text{E}(\epsilon_t)=0$$. In contrast, in the case of the multiplicative error models, the condition is changed to $$\text{E}(1+\epsilon_t)=1$$.

Remark. Not all point forecasts of ETS models correspond to conditional expectations. This issue applies to the models with multiplicative trend and/or multiplicative seasonality. This is because the ETS model assumes that different states are correlated (they have the same source of error), and as a result, multiple steps ahead values of states (when $$h>1$$) introduce products of error terms. So, the conditional expectations in these cases might not have analytical forms (“n.c.f.” in Table 4.3 stands for “No Closed Form”), and simulations might be required when working with these models. This does not apply to the one step ahead forecasts, for which all the classical formulae work. This issue is discussed in Section 6.3.

Table 4.3: Point forecasts and expectations of ETS models. n.c.f. stands for “No Closed Form”.

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|:--|:--|:--|:--|
| No trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} \\ &\hat{y}_{t+h} = l_{t} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Additive | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + h b_t \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + h b_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = (l_{t-1} + b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + h b_{t}\right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Additive damped | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + \phi b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + \phi b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = (l_{t-1} + \phi b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + \sum_{j=1}^h \phi^j b_t \right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Multiplicative | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} \\ &\hat{y}_{t+h} = l_{t} b_t^h \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_{t}^h + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_{t}^h s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ |
| Multiplicative damped | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_{t}^{\sum_{j=1}^h \phi^j} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_{t}^{\sum_{j=1}^h \phi^j} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ -- n.c.f. for } h>1 \end{aligned}$$ |
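
The point forecast formulae of Table 4.3 can be sketched in code as follows. This is only an illustration covering two columns of the table, using the states at the forecast origin $$t$$. The helper names are hypothetical, and I assume that the seasonal states are stored as a list of the last $$m$$ values, $$s_{t-m+1}, \dots, s_t$$:

```python
def seasonal_position(h, m):
    """Position in [s_{t-m+1}, ..., s_t] of the state s_{t+h-m*ceil(h/m)}."""
    ceil_h_over_m = -(-h // m)              # integer ceiling of h / m
    return h - m * ceil_h_over_m + m - 1


def forecast_additive_trend(level, trend, seasonals, h, phi=1.0):
    """Additive (phi=1) or additive damped trend with additive seasonality."""
    damp = sum(phi ** j for j in range(1, h + 1))   # equals h when phi = 1
    return level + damp * trend + seasonals[seasonal_position(h, len(seasonals))]


def forecast_multiplicative_trend(level, trend, seasonals, h, phi=1.0):
    """Multiplicative (phi=1) or multiplicative damped trend with multiplicative seasonality."""
    damp = sum(phi ** j for j in range(1, h + 1))
    return level * trend ** damp * seasonals[seasonal_position(h, len(seasonals))]


# Hypothetical states for quarterly data (m = 4), for illustration only.
additive_states = [-5.0, 0.0, 5.0, 0.0]
print([forecast_additive_trend(100.0, 2.0, additive_states, h) for h in range(1, 6)])
# [97.0, 104.0, 111.0, 108.0, 105.0]
```

The seasonal states are reused cyclically: for $$h=5$$ and $$m=4$$ the forecast relies on the same state as for $$h=1$$.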

The multiplicative error models have the same one step ahead expectations and point forecasts as the additive error ones. However, due to the multiplication by the error term, the multiple steps ahead conditional expectations between the two types of models might differ, specifically for the multiplicative trend and multiplicative seasonal models. These values do not have closed forms and can only be obtained via simulations.
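
A minimal simulation sketch for ETS(M,M,N) illustrates the difference between the point forecast and the conditional expectation for $$h=2$$. I assume here normally distributed errors with zero mean, and all the values of states and parameters are made up; the function name is hypothetical as well:

```python
import random
import statistics


def simulate_mmn(level, trend, alpha, beta, sigma, h, rng):
    """Simulate one path of ETS(M,M,N) from Table 4.2 and return y_{t+h}."""
    y = None
    for _ in range(h):
        eps = rng.gauss(0.0, sigma)          # assumed error distribution
        y = level * trend * (1 + eps)        # measurement equation
        # Both updates use the previous states, so they are done simultaneously.
        level, trend = level * trend * (1 + alpha * eps), trend * (1 + beta * eps)
    return y


rng = random.Random(42)                       # fixed seed for reproducibility
level, trend, alpha, beta, sigma, h = 100.0, 1.05, 0.5, 0.3, 0.2, 2

point_forecast = level * trend ** h           # l_t b_t^h from Table 4.3
draws = [simulate_mmn(level, trend, alpha, beta, sigma, h, rng) for _ in range(200_000)]
simulated_mean = statistics.fmean(draws)

print(point_forecast, simulated_mean)
```

With these values, the simulated mean ends up above the point forecast, because the level and the trend are updated with the same error term, so their product contains a term with $$\epsilon_t^2$$, which has a non-zero expectation.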

Although there are 30 potential ETS models, not all of them are sensible. So, Rob Hyndman has reduced the pool of models under consideration in the ets() function of the forecast package to the following 19: ANN, AAN, AAdN, ANA, AAA, AAdA, MNN, MAN, MAdN, MNA, MAA, MAdA, MNM, MAM, MAdM, MMN, MMdN, MMM, and MMdM. In addition, the multiplicative trend models are unstable in data with outliers, so they are switched off in the ets() function by default, which reduces the pool of models further to the first 15.
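
The two pools can be reproduced with a simple filter over the full taxonomy. The rule in the sketch below is inferred from the list above rather than taken from the source code of ets(), and the function and variable names are hypothetical:

```python
from itertools import product

# All 30 models as (error, trend, seasonal) triplets.
all_models = list(product("AM", ["N", "A", "Ad", "M", "Md"], "NAM"))


def in_restricted_pool(error, trend, seasonal):
    """Assumed rule that reproduces the list of 19 models given in the text."""
    if error == "A" and (trend in ("M", "Md") or seasonal == "M"):
        return False    # additive error with any multiplicative component
    if trend in ("M", "Md") and seasonal == "A":
        return False    # multiplicative trend with additive seasonality
    return True


pool_19 = ["".join(model) for model in all_models if in_restricted_pool(*model)]
# Switching off the multiplicative trend leaves the default pool.
pool_15 = ["".join(model) for model in all_models
           if in_restricted_pool(*model) and model[1] not in ("M", "Md")]

print(len(pool_19), len(pool_15))  # 19 15
```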

The es() function from the smooth package implements the conventional ETS, supporting all 30 models and implementing some features discussed in Hyndman et al. (2008), e.g. explanatory variables and forecasts cumulative over the lead time.

### References

• Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D., 2008. Forecasting with Exponential Smoothing. Springer Berlin Heidelberg.