
4.4 Mathematical models in the ETS taxonomy

I hope that it has become clearer to the reader how the ETS framework is built upon the idea of time series decomposition. By introducing different components, defining their types and adding the equations for their update, we can construct models that work better on the time series at hand. The equations discussed in the previous section represent the so-called "measurement" or "observation" equations of the ETS models. But we should also take into account the potential change in components over time. The "transition" or "state" equations are supposed to reflect this change: they explain how the level, trend or seasonal components change over time.
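As a minimal sketch of how the two kinds of equations interact, the snippet below generates data from the ETS(A,A,N) model listed in Table 4.1. The function name, the parameter values and the choice of normally distributed errors are assumptions made purely for illustration.

```python
import random


def simulate_aan(level, trend, alpha, beta, sigma, n, seed=0):
    """Generate n observations from ETS(A,A,N).

    Normal errors with zero mean are an assumption of this sketch.
    """
    rng = random.Random(seed)
    y = []
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)
        # Measurement equation: y_t = l_{t-1} + b_{t-1} + eps_t
        y.append(level + trend + eps)
        # Transition equations; the tuple assignment makes sure that
        # l_{t-1} and b_{t-1} are used on the right hand side of both
        level, trend = level + trend + alpha * eps, trend + beta * eps
    return y


# Illustrative values only
path = simulate_aan(level=100.0, trend=2.0, alpha=0.3, beta=0.1, sigma=5.0, n=24)
print([round(value, 1) for value in path[:5]])
```

The measurement equation produces the observation from the previous states, while the transition equations then move the states forward using the same error term.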

As discussed in the previous section, given different types of components and their interactions, we end up with 30 models in the taxonomy. Tables 4.1 and 4.2 summarise mathematically all 30 ETS models shown graphically in Figures 4.1 and 4.2 in the ETS Taxonomy chapter, presenting formulae for:

  • Measurement equation;
  • Transition equation;
  • Conditional one step ahead expectation \(\mu_{y,t|t-1}\);
  • Multiple steps ahead point forecast \(\hat{y}_{t+h}\);
  • Conditional multiple steps ahead expectation \(\mu_{y,t+h|t}\).

In the case of the additive error models, the point forecasts correspond to the expectations only when the expectation of the error term is zero, \(\text{E}(\epsilon_t)=0\), while in the case of the multiplicative error models the condition is typically that \(\text{E}(1+\epsilon_t)=1\).

However, note that not all the point forecasts correspond to the conditional expectations. This issue applies to the models with multiplicative trend and / or multiplicative seasonality. This is because single source of error (SSOE) models assume that different states are correlated (they have the same source of error), and as a result multiple steps ahead values of states (when \(h>1\)) introduce products of error terms. So, the conditional expectations in these cases might not have analytical forms, and simulations might be required when working with these models. This does not apply to the one step ahead forecasts, for which the classical formulae work.
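The effect can be sketched with a small simulation for the ETS(M,M,N) model from Table 4.2. The code compares the point forecast \(l_t b_t^h\) with the average of simulated future values. The function names, the parameter values and the normal distribution of the error term are assumptions of this illustration, not part of the model definition.

```python
import random


def point_forecast_mmn(level, trend, h):
    """Point forecast of ETS(M,M,N): l_t * b_t^h."""
    return level * trend ** h


def simulated_expectation_mmn(level, trend, alpha, beta, sigma, h,
                              nsim=50_000, seed=42):
    """Approximate the conditional h steps ahead expectation by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(nsim):
        l, b = level, trend
        y = 0.0
        for _step in range(h):
            eps = rng.gauss(0.0, sigma)  # assumed error distribution
            # Measurement equation
            y = l * b * (1 + eps)
            # Transition equations: the same eps enters both states,
            # so their product contains eps squared
            l, b = l * b * (1 + alpha * eps), b * (1 + beta * eps)
        total += y
    return total / nsim


# Illustrative values only
params = dict(level=100.0, trend=1.05, alpha=0.9, beta=0.5, sigma=0.2)
for h in (1, 2, 5):
    simulated = simulated_expectation_mmn(h=h, **params)
    classical = point_forecast_mmn(params["level"], params["trend"], h)
    print(h, round(classical, 2), round(simulated, 2))
```

For \(h=1\) the two numbers should agree up to simulation noise, while for larger horizons the simulated mean is expected to drift away from the point forecast, because the shared error term appears in products of the states.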

Table 4.1: Additive error ETS models
Nonseasonal Additive seasonality Multiplicative seasonality
No trend \(\begin{split} &y_{t} = l_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} \\ &\hat{y}_{t+h} = l_{t} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = l_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = l_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1}} \\ &\mu_{y,t|t-1} = l_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Additive trend \(\begin{split} &y_{t} = l_{t-1} + b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + h b_t \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = l_{t-1} + b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + h b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + b_{t-1}} \\ &\mu_{y,t|t-1} = (l_{t-1} + b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + h b_t\right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Additive damped trend \(\begin{split} &y_{t} = l_{t-1} + \phi b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + \phi b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = l_{t-1} + \phi b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \epsilon_t \\ &s_t = s_{t-m} + \gamma \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + \phi b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = \phi b_{t-1} + \beta \frac{\epsilon_t}{s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} + \phi b_{t-1}} \\ &\mu_{y,t|t-1} = (l_{t-1} + \phi b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + \sum_{j=1}^h \phi^j b_t \right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Multiplicative trend \(\begin{split} &y_{t} = l_{t-1} b_{t-1} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} \\ &\hat{y}_{t+h} = l_{t} b_t^h \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1} + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1} s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1} + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1} + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}} \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\)
Multiplicative damped trend \(\begin{split} &y_{t} = l_{t-1} b_{t-1}^\phi + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1}^\phi + s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}} \\ &s_t = s_{t-m} + \gamma \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} + \epsilon_t \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \frac{\epsilon_t}{s_{t-m}} \\ &b_t = b_{t-1}^\phi + \beta \frac{\epsilon_t}{l_{t-1}s_{t-m}} \\ &s_t = s_{t-m} + \gamma \frac{\epsilon_t}{l_{t-1} b_{t-1}^\phi} \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\)
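
The seasonal subscript \(t+h-m\lceil\frac{h}{m}\rceil\) in the point forecasts simply picks the most recent seasonal state that corresponds to the period being forecast. A minimal sketch for the point forecast of ETS(A,Ad,A), produced from the states available at time \(t\), is given below. The function names and the convention of storing the last \(m\) seasonal states in a list ordered from \(s_{t-m+1}\) to \(s_t\) are assumptions of this example.

```python
def seasonal_index(h, m):
    """Position of s_{t+h-m*ceil(h/m)} in a list [s_{t-m+1}, ..., s_t]."""
    return (h - 1) % m


def point_forecast_aada(level, trend, phi, seasonal, h):
    """Point forecast of ETS(A,Ad,A) from the states at time t."""
    m = len(seasonal)
    # Sum of phi^j for j = 1, ..., h (equals h when phi = 1)
    damping = sum(phi ** j for j in range(1, h + 1))
    return level + damping * trend + seasonal[seasonal_index(h, m)]


# Illustrative quarterly example (m = 4) with made-up states
states = dict(level=100.0, trend=2.0, phi=0.9, seasonal=[-5.0, 0.0, 5.0, 0.0])
for h in range(1, 7):
    print(h, round(point_forecast_aada(h=h, **states), 3))
```

Setting `phi=1` recovers the forecast of the model with the undamped additive trend, and after every \(m\) steps the seasonal states start being reused from the beginning of the list.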

The multiplicative error models have the same one step ahead expectations as the additive error ones, but due to the multiplication by the error term, the multiple steps ahead conditional expectations between the two models might differ, specifically for the multiplicative trend and multiplicative seasonal models.

Table 4.2: Multiplicative error ETS models
Nonseasonal Additive seasonality Multiplicative seasonality
No trend \(\begin{split} &y_{t} = l_{t-1}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} \\ &\hat{y}_{t+h} = l_{t} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \alpha \mu_{y,t|t-1} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = l_{t-1} s_{t-m}(1 + \epsilon_t) \\ &l_t = l_{t-1}(1 + \alpha \epsilon_t) \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Additive trend \(\begin{split} &y_{t} = (l_{t-1} + b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + h b_t \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + b_{t-1} + \alpha \mu_{y,t|t-1} \epsilon_t \\ &b_t = b_{t-1} + \beta \mu_{y,t|t-1} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + h b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} + \beta (l_{t-1} + b_{t-1}) \epsilon_t \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \\ &\mu_{y,t|t-1} = (l_{t-1} + b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + h b_t\right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Additive damped trend \(\begin{split} &y_{t} = (l_{t-1} + \phi b_{t-1})(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1})(1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + \phi b_{t-1} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + \phi b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} + \phi b_{t-1} + \alpha \mu_{y,t|t-1} \epsilon_t \\ &b_t = \phi b_{t-1} + \beta \mu_{y,t|t-1} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} + \phi b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} + \sum_{j=1}^h \phi^j b_t + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} + \phi b_{t-1}) s_{t-m}(1 + \epsilon_t) \\ &l_t = (l_{t-1} + \phi b_{t-1}) (1 + \alpha \epsilon_t) \\ &b_t = \phi b_{t-1} + \beta (l_{t-1} + \phi b_{t-1}) \epsilon_t \\ &s_t = s_{t-m}(1 + \gamma \epsilon_t) \\ &\mu_{y,t|t-1} = (l_{t-1} + \phi b_{t-1}) s_{t-m} \\ &\hat{y}_{t+h} = \left(l_{t} + \sum_{j=1}^h \phi^j b_t \right) s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} = \hat{y}_{t+h} \text{ only for } h \leq m \end{split}\)
Multiplicative trend \(\begin{split} &y_{t} = l_{t-1} b_{t-1} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} \\ &\hat{y}_{t+h} = l_{t} b_t^h \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} b_{t-1} + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} + \alpha \mu_{y,t|t-1} \epsilon_t \\ &b_t = b_{t-1} + \beta \frac{\mu_{y,t|t-1}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1} s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1} (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1} (1 + \beta \epsilon_t) \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1} s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^h s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\)
Multiplicative damped trend \(\begin{split} &y_{t} = l_{t-1} b_{t-1}^\phi (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1}^\phi (1 + \beta \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = (l_{t-1} b_{t-1}^\phi + s_{t-m})(1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi + \alpha \mu_{y,t|t-1} \epsilon_t \\ &b_t = b_{t-1}^\phi + \beta \frac{\mu_{y,t|t-1}}{l_{t-1}} \epsilon_t \\ &s_t = s_{t-m} + \gamma \mu_{y,t|t-1} \epsilon_t \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi + s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} + s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\) \(\begin{split} &y_{t} = l_{t-1} b_{t-1}^\phi s_{t-m} (1 + \epsilon_t) \\ &l_t = l_{t-1} b_{t-1}^\phi (1 + \alpha \epsilon_t) \\ &b_t = b_{t-1}^\phi (1 + \beta \epsilon_t) \\ &s_t = s_{t-m} (1 + \gamma \epsilon_t) \\ &\mu_{y,t|t-1} = l_{t-1} b_{t-1}^\phi s_{t-m} \\ &\hat{y}_{t+h} = l_{t} b_t^{\sum_{j=1}^h \phi^j} s_{t+h-m\lceil\frac{h}{m}\rceil} \\ &\mu_{y,t+h|t} \text{ - no closed form for } h>1 \end{split}\)

The formulae summarised above explain the models underlying potential data, but when it comes to their construction and estimation, the \(\epsilon_t\) is substituted by the estimated \(e_t\) (which is calculated differently depending on the error type), and time series components and smoothing parameters are also substituted by their estimated analogues (e.g. \(\hat{\alpha}\) instead of \(\alpha\)).
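
To illustrate how the estimated error depends on the error type, here is a minimal sketch of the local level models ETS(A,N,N) and ETS(M,N,N) applied to data. In both cases \(e_t\) is obtained by solving the respective measurement equation for the error term. The function name, the toy data and the fixed values of the smoothing parameter and the initial level are assumptions of this example (in practice they would be estimated).

```python
def filter_local_level(y, level0, alpha, error="A"):
    """Apply ETS(A,N,N) or ETS(M,N,N) to the data y.

    Returns one step ahead fitted values, residuals e_t and the final level.
    """
    level = level0
    fitted, residuals = [], []
    for obs in y:
        mu = level  # one step ahead expectation: mu_{y,t|t-1} = l_{t-1}
        if error == "A":
            # From y_t = l_{t-1} + e_t
            e = obs - mu
            level = level + alpha * e
        elif error == "M":
            # From y_t = l_{t-1} (1 + e_t)
            e = (obs - mu) / mu
            level = level * (1 + alpha * e)
        else:
            raise ValueError("error must be 'A' or 'M'")
        fitted.append(mu)
        residuals.append(e)
    return fitted, residuals, level


# Toy data and parameter values, for illustration only
data = [10.0, 12.0, 11.0, 13.0, 12.5]
for error_type in ("A", "M"):
    _, e, last_level = filter_local_level(data, level0=10.0, alpha=0.3,
                                          error=error_type)
    print(error_type, [round(v, 3) for v in e], round(last_level, 3))
```

With the same smoothing parameter and initial level, the two filters produce the same levels in this simple case, but the residuals are on different scales: absolute for the additive error and relative for the multiplicative one.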

Although there are 30 potential ETS models, not all of them are stable. So, Rob Hyndman has reduced the pool of models under consideration in the ets() function of the forecast package to the following 19: ANN, AAN, AAdN, ANA, AAA, AAdA, MNN, MAN, MAdN, MNA, MAA, MAdA, MNM, MAM, MAdM, MMN, MMdN, MMM, MMdM. In addition, the multiplicative trend models are difficult to work with and are unstable in cases of data with outliers, so they are switched off in the ets() function by default, which reduces the pool of models further to the first 15.
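
The pool can be reproduced by enumerating the taxonomy. The sketch below builds all 30 model names and filters them with a rule inferred from the list of 19 models above: drop additive error models with any multiplicative component, and drop multiplicative trend combined with additive seasonality. This rule is an assumption that happens to match the list, not a description of how ets() is implemented.

```python
from itertools import product

ERRORS = ["A", "M"]
TRENDS = ["N", "A", "Ad", "M", "Md"]
SEASONALS = ["N", "A", "M"]


def is_kept(error, trend, seasonal):
    """Inferred rule reproducing the pool of 19 models (an assumption)."""
    mult_trend = trend in ("M", "Md")
    if error == "A" and (mult_trend or seasonal == "M"):
        return False
    if mult_trend and seasonal == "A":
        return False
    return True


all_models = [e + t + s for e, t, s in product(ERRORS, TRENDS, SEASONALS)]
pool = [e + t + s for e, t, s in product(ERRORS, TRENDS, SEASONALS)
        if is_kept(e, t, s)]
# Pool without the multiplicative trend models
default_pool = [e + t + s for e, t, s in product(ERRORS, TRENDS, SEASONALS)
                if is_kept(e, t, s) and t not in ("M", "Md")]

print(len(all_models), len(pool), len(default_pool))
print(pool)
```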