4.6 State space form of ETS

One of the main advantages of the ETS model is its state space form, which gives it flexibility. Hyndman et al. (2008) use the following general formulation of the model, with the first equation called the “measurement equation” and the second one the “transition equation”: \[\begin{equation} \begin{aligned} & {y}_{t} = w(\mathbf{v}_{t-1}) + r(\mathbf{v}_{t-1}) \epsilon_t \\ & \mathbf{v}_{t} = f(\mathbf{v}_{t-1}) + g(\mathbf{v}_{t-1}) \epsilon_t \end{aligned}, \tag{4.21} \end{equation}\] where \(\mathbf{v}_t\) is the state vector, containing the components of the series (level, trend, and seasonal), \(w(\cdot)\) is the measurement function, \(r(\cdot)\) is the error function, \(f(\cdot)\) is the transition function, and \(g(\cdot)\) is the persistence function. Depending on the types of components, these functions take different forms:

Remark. Note that Hyndman et al. (2008) use \(\mathbf{x}_{t}\) instead of \(\mathbf{v}_{t}\). I do not use their notation because I find it confusing: \(x\) is typically used to denote explanatory variables (especially in the regression context), and when we use \(\mathbf{x}_{t}\) in the ETS context, the states are sometimes perceived as related to explanatory variables. However, this is not the case: they relate more to time-varying parameters than to exogenous variables in the regression context. This aspect is discussed using the example of a seasonal model in Section 10.5.

  1. Depending on the types of trend and seasonality, \(w(\mathbf{v}_{t-1})\) is either the sum or the product of the components. The special cases were presented in Tables 4.1 and ?? in Section 4.2. For example, in the case of ETS(M,M,M) it is: \(w(\mathbf{v}_{t-1}) = l_{t-1} b_{t-1} s_{t-m}\);
  2. If the error is additive, then \(r(\mathbf{v}_{t-1})=1\), otherwise (in the case of multiplicative error) it is \(r(\mathbf{v}_{t-1})=w(\mathbf{v}_{t-1})\). For example, for ETS(M,M,M) it will be \(r(\mathbf{v}_{t-1}) = l_{t-1} b_{t-1} s_{t-m}\);
  3. The transition function \(f(\cdot)\) produces values that depend on the types of trend and seasonality and correspond to the first parts of the transition equations in Tables 4.1 and ?? (dropping the error term). This function records how components interact with each other and how they change from one observation to another (thus the term “transition”). An example is the ETS(M,M,M) model, for which the transition function produces three values: \(l_{t-1}b_{t-1}\), \(b_{t-1}\), and \(s_{t-m}\), respectively, for the level, trend, and seasonal components. So, if we drop the persistence function \(g(\cdot)\) and the error term \(\epsilon_t\) for a moment, the second equation in (4.21) becomes: \[\begin{equation} \begin{aligned} & {l}_{t} = l_{t-1} b_{t-1} \\ & b_t = b_{t-1} \\ & s_t = s_{t-m} \end{aligned} . \tag{4.22} \end{equation}\]
  4. Finally, the persistence function differs from one model to another, but in two special cases it has a simple form: \(g(\mathbf{v}_{t-1})=\mathbf{g}\) if all components are additive, or \(g(\mathbf{v}_{t-1})=f(\mathbf{v}_{t-1})\mathbf{g}\) (with element-wise multiplication) if they are all multiplicative. Here \(\mathbf{g}\) is the vector of smoothing parameters, called the “persistence vector” in the ETS context. For example, for the ETS(M,M,M) model the persistence function produces \(l_{t-1}b_{t-1}\alpha\), \(b_{t-1}\beta\), and \(s_{t-m}\gamma\), respectively, for the level, trend, and seasonal components. Combining this with the transition function (4.22), we get the equation from Table ??: \[\begin{equation} \begin{aligned} & {l}_{t} = l_{t-1} b_{t-1} + l_{t-1} b_{t-1} \alpha\epsilon_t \\ & b_t = b_{t-1} + b_{t-1} \beta\epsilon_t \\ & s_t = s_{t-m} + s_{t-m} \gamma\epsilon_t \end{aligned}, \tag{4.23} \end{equation}\] which can be simplified to: \[\begin{equation} \begin{aligned} & {l}_{t} = l_{t-1}b_{t-1} (1+\alpha\epsilon_t)\\ & b_t = b_{t-1} (1+\beta\epsilon_t)\\ & s_t = s_{t-m} (1+\gamma\epsilon_t) \end{aligned} . \tag{4.24} \end{equation}\] Some of the mixed models have more complicated persistence functions. For example, for ETS(A,A,M) it is: \[\begin{equation} g(\mathbf{v}_{t-1}) = \begin{pmatrix} \alpha \frac{1}{s_{t-m}} \\ \beta \frac{1}{s_{t-m}} \\ \gamma \frac{1}{l_{t-1} + b_{t-1}} \end{pmatrix} , \end{equation}\] which results in the state space model discussed in Subsection 4.4.3.
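
As a minimal sketch of how the four functions in (4.21) fit together, the following Python code performs one step of the ETS(M,M,M) recursion. The function names, the layout of the state as a tuple \((l_{t-1}, b_{t-1}, s_{t-m})\), and all numerical values are assumptions made for illustration; they do not come from any package.

```python
# Hypothetical illustration of the general form (4.21) for ETS(M,M,M).
# The state is a tuple (l_{t-1}, b_{t-1}, s_{t-m}); all names are assumptions.

def w(v):
    """Measurement function: product of the components."""
    level, trend, seasonal = v
    return level * trend * seasonal

def r(v):
    """Error function: equal to w(v) because the error is multiplicative."""
    return w(v)

def f(v):
    """Transition function: the three values from (4.22)."""
    level, trend, seasonal = v
    return (level * trend, trend, seasonal)

def g(v, persistence):
    """Persistence function: element-wise product of f(v) and (alpha, beta, gamma)."""
    return tuple(fi * gi for fi, gi in zip(f(v), persistence))

def step(v, persistence, epsilon):
    """Apply both equations of (4.21) once and return (y_t, new state)."""
    y = w(v) + r(v) * epsilon
    new_state = tuple(
        fi + gi * epsilon for fi, gi in zip(f(v), g(v, persistence))
    )
    return y, new_state

v_prev = (100.0, 1.02, 0.9)    # assumed l_{t-1}, b_{t-1}, s_{t-m}
persistence = (0.3, 0.1, 0.2)  # assumed alpha, beta, gamma
epsilon = 0.05                 # assumed value of the error term

y_t, v_t = step(v_prev, persistence, epsilon)
print(y_t, v_t)
```

The returned tuple holds \(l_t\), \(b_t\), and \(s_t\), and each element agrees with the simplified form (4.24). A full implementation would need to store the last \(m\) seasonal indices rather than a single one, which this sketch omits for brevity.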

The compact form (4.21) is thus convenient: it underlies all 30 ETS models discussed in Sections 4.1 and 4.2. Unfortunately, it cannot be used directly for deriving conditional values, so it serves mainly for the general understanding of ETS and as a basis for programming.

4.6.1 Pure additive state space model

A more useful state space model in the ETS framework is the pure additive one, which, based on the discussion above, is formulated as: \[\begin{equation} \begin{aligned} & {y}_{t} = \mathbf{w}' \mathbf{v}_{t-1} + \epsilon_t \\ & \mathbf{v}_{t} = \mathbf{F} \mathbf{v}_{t-1} + \mathbf{g} \epsilon_t \end{aligned}, \tag{4.25} \end{equation}\] where \(\mathbf{w}\) is the measurement vector, showing how the components form the structure, \(\mathbf{F}\) is the transition matrix, showing how components interact with each other and change over time (e.g. level is equal to the previous level plus trend), and \(\mathbf{g}\) is the persistence vector, containing smoothing parameters. The conditional expectation and variance can be derived based on (4.25), together with bounds on the smoothing parameters for any model that can be formulated in this way. And, as mentioned above, any pure additive ETS model can be written in the form (4.25), which means that all of them have relatively simple analytical formulae for the statistics mentioned above. For example, the \(h\) steps ahead conditional expectation and variance of the model (4.25) are (Hyndman et al., 2008, chap. 6): \[\begin{equation} \begin{aligned} \mu_{y,t+h} = \mathrm{E}(y_{t+h}|t) = & \mathbf{w}^\prime \mathbf{F}^{h-1} \mathbf{v}_{t} \\ \sigma^2_{h} = \mathrm{V}(y_{t+h}|t) = & \left(\sum_{j=1}^{h-1} \mathbf{w}^\prime \mathbf{F}^{j-1} \mathbf{g} \mathbf{g}^\prime (\mathbf{F}^\prime)^{j-1} \mathbf{w} + 1 \right) \sigma^2 \end{aligned}, \tag{4.26} \end{equation}\] where \(\sigma^2\) is the variance of the error term. The formulae in (4.26) can be used for the generation of respective moments from any pure additive ETS model. The conditional expectation can also be used for some mixed models as an approximation for the true conditional mean.
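
To make the conditional moments concrete, here is a minimal Python sketch for ETS(A,A,N), for which \(\mathbf{w}' = (1, 1)\), \(\mathbf{F} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), and \(\mathbf{g} = (\alpha, \beta)'\). The expectation is computed as \(\mathbf{w}' \mathbf{F}^{h-1} \mathbf{v}_t\), and the variance as \(\sigma^2\) times one plus the sum of the squared terms \(\mathbf{w}' \mathbf{F}^{j-1} \mathbf{g}\) for \(j = 1, \dots, h-1\). The function names and numerical values are assumptions for illustration only.

```python
# Hypothetical sketch of the conditional moments of a pure additive ETS model.
# Uses plain lists instead of a linear algebra library to stay self-contained.

def mat_vec(F, v):
    """Multiply matrix F (list of rows) by vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in F]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def conditional_moments(w, F, g, v_t, sigma2, h):
    """Return (mean, variance) of y_{t+h} given the state v_t."""
    Fv = list(v_t)  # holds F^{j-1} v_t, starting from F^0 v_t
    Fg = list(g)    # holds F^{j-1} g, starting from F^0 g
    multiplier = 1.0
    for _ in range(h - 1):
        multiplier += dot(w, Fg) ** 2  # add (w' F^{j-1} g)^2
        Fv = mat_vec(F, Fv)
        Fg = mat_vec(F, Fg)
    mean = dot(w, Fv)  # w' F^{h-1} v_t
    return mean, multiplier * sigma2

# ETS(A,A,N) with assumed values
w = [1.0, 1.0]
F = [[1.0, 1.0],
     [0.0, 1.0]]
alpha, beta = 0.3, 0.1
g = [alpha, beta]
v_t = [100.0, 2.0]  # assumed level l_t and trend b_t
sigma2 = 4.0        # assumed error variance

mean_3, var_3 = conditional_moments(w, F, g, v_t, sigma2, h=3)
print(mean_3, var_3)
```

For this model the expectation reduces to \(l_t + h b_t\), and each term \(\mathbf{w}' \mathbf{F}^{j-1} \mathbf{g}\) equals \(\alpha + \beta j\), which gives a quick way to check the output by hand.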


• Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D., 2008. Forecasting with Exponential Smoothing: The State Space Approach. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-71918-2