
## 3.5 Mathematical models in the ETS taxonomy

I hope it is becoming clearer to the reader how the ETS framework is built upon the idea of time series decomposition (from Section 3.1). By introducing different components, defining their types, and adding equations for their update, we can construct models that work better on the time series at hand. The equations discussed in Section 3.1 represent the so-called “measurement” or “observation” equations of the ETS models. But we should also take into account the potential change in components over time. The “transition” or “state” equations are supposed to reflect this change: they explain how the level, trend, or seasonal components evolve over time.
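
To make the distinction between the two kinds of equations concrete, here is a minimal Python sketch for one of the simplest cases: a model with additive error, additive trend, and no seasonality (its formulae are listed in Table 3.1). The function name `simulate_aan`, the normally distributed error, and the parameter values are assumptions made purely for illustration.

```python
import random

def simulate_aan(level, trend, alpha, beta, sigma, n, seed=42):
    """Generate n observations from an additive error, additive trend model."""
    rng = random.Random(seed)
    y = []
    for _ in range(n):
        # Assumption: the error term is normal with zero mean
        eps = rng.gauss(0.0, sigma)
        # Measurement equation: y_t = l_{t-1} + b_{t-1} + eps_t
        y.append(level + trend + eps)
        # Transition equations: both states are updated by the same error
        level, trend = level + trend + alpha * eps, trend + beta * eps
    return y, level, trend

y, last_level, last_trend = simulate_aan(100.0, 2.0, 0.3, 0.1, 5.0, 10)
print([round(value, 1) for value in y])
```

The measurement equation only reads the previous states, while the transition equations decide how much of the current error each state absorbs, which is what the smoothing parameters control.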

As discussed in Section 3.4, given different types of components and their interactions, we end up with 30 models in the taxonomy. Tables 3.1 and 3.2 summarise mathematically all 30 ETS models shown graphically in Figures 3.10 and 3.11, presenting the formulae for the measurement and transition equations.
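
The count of 30 follows from combining two error types, five trend types, and three seasonality types. The sketch below enumerates the combinations; the short labels (such as "Ad" for the additive damped trend) follow the naming used for the models later in this section, and the variable names are illustrative.

```python
from itertools import product

errors = ["A", "M"]                    # additive, multiplicative
trends = ["N", "A", "Ad", "M", "Md"]   # none, additive, additive damped,
                                       # multiplicative, multiplicative damped
seasonals = ["N", "A", "M"]            # none, additive, multiplicative

# Every model label is error + trend + seasonality, e.g. "MAdM"
models = [e + t + s for e, t, s in product(errors, trends, seasonals)]
print(len(models))  # 30
```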

Table 3.1: Additive error ETS models

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|---|---|---|---|
| No trend | $$\begin{aligned} &y_{t} = l_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ |
| Additive trend | $$\begin{aligned} &y_{t} = l_{t-1} + b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + b_{t-1}} \end{aligned}$$ |
| Additive damped trend | $$\begin{aligned} &y_{t} = l_{t-1} + \phi b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} + \phi b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = \phi b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + \phi b_{t-1}} \end{aligned}$$ |
| Multiplicative trend | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}} \end{aligned}$$ |
| Multiplicative damped trend | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}^\phi} \end{aligned}$$ |

Table 3.2: Multiplicative error ETS models

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|---|---|---|---|
| No trend | $$\begin{aligned} &y_{t} = l_{t-1}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} s_{t-m}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \end{aligned}$$ |
| Additive trend | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1} + \beta \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta (l_{t-1} + b_{t-1}) \epsilon_t \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \end{aligned}$$ |
| Additive damped trend | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1}) (1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta (l_{t-1} + \phi b_{t-1}) \epsilon_t \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \end{aligned}$$ |
| Multiplicative trend | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\mu_{y,t}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1} s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \end{aligned}$$ |
| Multiplicative damped trend | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1}^\phi (1 + \beta \epsilon_t) \end{aligned}$$ | $$\begin{aligned} &y_{t} = (l_{t-1} b_{t-1}^\phi + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \mu_{y,t} \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\mu_{y,t}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t} \epsilon_t \end{aligned}$$ | $$\begin{aligned} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi \left(1 + \alpha \epsilon_t\right) \\ &b_t = b_{t-1}^\phi \left(1 + \beta \epsilon_t\right) \\ &s_t = s_{t-m} \left(1 + \gamma \epsilon_t\right) \end{aligned}$$ |

From a statistical point of view, the formulae in Tables 3.1 and 3.2 correspond to the “true models” (see Section 1.2 of Svetunkov, 2021c): they explain the models underlying potential data. When it comes to their construction and estimation, however, $$\epsilon_t$$ is substituted by the estimated $$e_t$$ (which is calculated differently depending on the error type), and the time series components and smoothing parameters are also substituted by their estimated analogues (e.g. $$\hat{\alpha}$$ instead of $$\alpha$$). Still, if the values of the parameters of these models were known, it would be possible to produce point forecasts and conditional h steps ahead expectations from them. Table 3.3 summarises:

• Conditional one step ahead expectation $$\mu_{y,t} = \mu_{y,t|t-1}$$;
• Multiple steps ahead point forecast $$\hat{y}_{t+h}$$;
• Conditional multiple steps ahead expectation $$\mu_{y,t+h|t}$$.

In the case of additive error models, the point forecasts correspond to the expectations only when the expectation of the error term is zero, i.e. $$\text{E}(\epsilon_t)=0$$, while in the case of multiplicative error models the condition changes to $$\text{E}(1+\epsilon_t)=1$$.
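
As an illustration of a multiple steps ahead point forecast, the sketch below evaluates $$\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t + s_{t+h-m\lceil\frac{h}{m}\rceil}$$ for the additive damped trend with additive seasonality, using the states available at time t. The function name `point_forecast` and the convention that the list `seasonal` holds the last m seasonal states in chronological order are assumptions made for this example.

```python
import math

def point_forecast(level, trend, phi, seasonal, h):
    """h steps ahead point forecast: damped additive trend, additive seasonality."""
    m = len(seasonal)                 # seasonal = [s_{t-m+1}, ..., s_t]
    # Cumulative damping multiplier: phi + phi^2 + ... + phi^h
    damped = sum(phi ** j for j in range(1, h + 1))
    # Offset of the seasonal index relative to t, always in {-(m-1), ..., 0}
    lag = h - m * math.ceil(h / m)
    return level + damped * trend + seasonal[m - 1 + lag]

seasonal = [-5.0, 0.0, 5.0, 0.0]      # hypothetical quarterly indices
for h in range(1, 7):
    print(h, point_forecast(100.0, 2.0, 0.9, seasonal, h))
```

Setting `phi` to 1 reproduces the undamped additive trend, and the ceiling term simply recycles the most recent seasonal indices once h exceeds m.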

Remark. Not all point forecasts of ETS models correspond to conditional expectations. This issue applies to the models with multiplicative trend and/or multiplicative seasonality. This is because the ETS model assumes that different states are correlated (they have the same source of error), and as a result the multiple steps ahead values of states (when h>1) introduce products of error terms. So, the conditional expectations in these cases might not have analytical forms (“n.c.f.” in Table 3.3 stands for “no closed form”), and simulations might be required when working with these models. This does not apply to the one step ahead forecasts, for which all the classical formulae work.
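
The remark can be checked with a small simulation. The sketch below takes the nonseasonal model with multiplicative error and multiplicative trend from Table 3.2, treats the states and smoothing parameters as known, and compares the point forecast $$l_t b_t^h$$ with a Monte Carlo estimate of the conditional expectation. The function name, the parameter values, and the uniform error distribution (chosen only so that the error has zero mean and $$1+\epsilon_t$$ stays positive) are assumptions made for illustration.

```python
import random

def simulate_mmn_mean(level, trend, alpha, beta, h, n_paths=50000, seed=1):
    """Monte Carlo estimate of the conditional h steps ahead expectation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        l, b = level, trend
        y = 0.0
        for step in range(h):
            # Assumption: uniform error with zero mean, so that 1 + eps > 0
            eps = rng.uniform(-0.5, 0.5)
            # Measurement equation: y_t = l_{t-1} b_{t-1} (1 + eps_t)
            y = l * b * (1 + eps)
            # Transition equations share the same error term
            l, b = l * b * (1 + alpha * eps), b * (1 + beta * eps)
        total += y
    return total / n_paths

level, trend, alpha, beta = 100.0, 1.05, 0.9, 0.5
for h in (1, 2, 5):
    point = level * trend ** h
    simulated = simulate_mmn_mean(level, trend, alpha, beta, h)
    print(h, round(point, 2), round(simulated, 2))
```

In this setup the two values agree closely for h=1, while for longer horizons the simulated mean moves away from the point forecast, because the products of correlated state updates no longer average out to one.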

Table 3.3: Point forecasts and expectations of ETS models. n.c.f. stands for “No Closed Form.”

| Trend | Nonseasonal | Additive seasonality | Multiplicative seasonality |
|---|---|---|---|
| No trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} \\ &\hat{y}_{t+h} = l_{t} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Additive trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + h b_t \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + h b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = (l_{t-1} + b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + h b_t\right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Additive damped trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + \phi b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} + \phi b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = (l_{t-1} + \phi b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + \sum_{j=1}^h \phi^j b_t \right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{aligned}$$ |
| Multiplicative trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} \\ &\hat{y}_{t+h} = l_{t} b_t^h \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ |
| Multiplicative damped trend | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ | $$\begin{aligned} &\mu_{y,t} = l_{t-1} b_{t-1}^\phi s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h \vert t} \text{ - n.c.f. for } h>1 \end{aligned}$$ |

Although there are 30 potential ETS models, not all of them are stable. So, Rob Hyndman has reduced the pool of models under consideration in the ets() function of the forecast package to the following 19: ANN, AAN, AAdN, ANA, AAA, AAdA, MNN, MAN, MAdN, MNA, MAA, MAdA, MNM, MAM, MAdM, MMN, MMdN, MMM, MMdM. In addition, the multiplicative trend models are unstable in cases of data with outliers, so they are switched off in the ets() function by default, which reduces the pool of models further to the first 15.
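
The reduction from 19 to 15 models can be verified with a few lines of Python. This is only a sketch of the counting argument, not the code of the ets() function; the helper `trend_type` is a hypothetical name that extracts the middle part of a model label.

```python
pool = ["ANN", "AAN", "AAdN", "ANA", "AAA", "AAdA", "MNN", "MAN", "MAdN",
        "MNA", "MAA", "MAdA", "MNM", "MAM", "MAdM", "MMN", "MMdN", "MMM",
        "MMdM"]

def trend_type(name):
    """Return the trend part of a label: the first and last letters are
    the error and the seasonality types respectively."""
    return name[1:-1]

# Drop every model whose trend is multiplicative ("M" or "Md")
default_pool = [name for name in pool if not trend_type(name).startswith("M")]
print(len(pool), len(default_pool))  # 19 15
```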