9.1 State space ARIMA
9.1.1 An example of state space ARIMA
In order to understand how the state space ADAM ARIMA can be formulated, we consider an arbitrary example of SARIMA(1,1,2)(0,1,0)\(_4\): \[\begin{equation*} {y}_{t} (1- \phi_1 B)(1-B)(1-B^4) = \epsilon_t (1 + \theta_1 B + \theta_2 B^2), \end{equation*}\] which can be rewritten in the expanded form after opening the brackets: \[\begin{equation*} {y}_{t} (1-\phi_1 B -B + \phi_1 B^2 -B^4 +\phi_1 B^5 + B^5 -\phi_1 B^6) = \epsilon_t (1 + \theta_1 B + \theta_2 B^2), \end{equation*}\] and after moving all the lagged values to the right-hand side as: \[\begin{equation*} {y}_{t} = (1+\phi_1) {y}_{t-1} -\phi_1 {y}_{t-2} + {y}_{t-4} -(1+\phi_1) {y}_{t-5} + \phi_1 {y}_{t-6} + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \epsilon_t . \end{equation*}\] Now we can define the states of the model for each of the indices \(t-j\): \[\begin{equation} \begin{aligned} & v_{1,t-1} = (1+\phi_1) y_{t-1} + \theta_1 \epsilon_{t-1} \\ & v_{2,t-2} = -\phi_1 y_{t-2} + \theta_2 \epsilon_{t-2} \\ & v_{3,t-3} = 0 \\ & v_{4,t-4} = y_{t-4} \\ & v_{5,t-5} = -(1+\phi_1) y_{t-5} \\ & v_{6,t-6} = \phi_1 y_{t-6} \end{aligned} . \tag{9.1} \end{equation}\] In our example, all the MA parameters are zero for \(j>2\), which is why they disappear from the states above. Furthermore, there are no elements for lag 3, so that state can be dropped. The measurement equation of the ARIMA model in this situation can be written as: \[\begin{equation*} {y}_{t} = \sum_{j=1,2,4,5,6} v_{j,t-j} + \epsilon_t , \end{equation*}\] based on which the actual value at some lag \(i\) can also be written as: \[\begin{equation} {y}_{t-i} = \sum_{j=1,2,4,5,6} v_{j,t-j-i} + \epsilon_{t-i}. \tag{9.2} \end{equation}\] Inserting (9.2) into (9.1) and shifting the lags from \(t-i\) to \(t\) in every equation, we get the transition equations of the state space ARIMA: \[\begin{equation*} \begin{aligned} & v_{1,t} = (1+\phi_1) \sum_{j=1,2,4,5,6} v_{j,t-j} + (1+\phi_1+\theta_1) \epsilon_t \\ & v_{2,t} = -\phi_1 \sum_{j=1,2,4,5,6} v_{j,t-j} + (-\phi_1+\theta_2) \epsilon_t \\ & v_{4,t} = \sum_{j=1,2,4,5,6} v_{j,t-j} + \epsilon_t \\ & v_{5,t} = -(1+\phi_1) \sum_{j=1,2,4,5,6} v_{j,t-j} -(1+\phi_1) \epsilon_t \\ & v_{6,t} = \phi_1 \sum_{j=1,2,4,5,6} v_{j,t-j} + \phi_1 \epsilon_t \end{aligned} . \end{equation*}\] This model can then be applied to the data, and forecasts can be produced similarly to how it was done for the pure additive ETS model (see Section 5.1). Furthermore, it can be shown that any ARIMA model can be written in the compact form (5.5), meaning that the same principles as for ETS can be applied to ARIMA and that the two models can be united in one framework.
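To make this derivation more tangible, here is a minimal numerical sketch in Python (not from the original text; the parameter values, the simulation setup, and all variable names are purely illustrative). It defines the states as in (9.1) and checks numerically that the measurement and transition equations above hold on a simulated series:

```python
import numpy as np

# Illustrative check (not from the original text): the states defined in (9.1)
# satisfy the measurement and transition equations of SARIMA(1,1,2)(0,1,0)_4.
# Parameter values and the simulation setup below are arbitrary.
rng = np.random.default_rng(42)
phi1, theta1, theta2 = 0.5, -0.3, 0.2
T = 200
eps = rng.normal(0, 1, T)
y = np.zeros(T)

# Expanded recursion of SARIMA(1,1,2)(0,1,0)_4
for t in range(6, T):
    y[t] = ((1 + phi1) * y[t - 1] - phi1 * y[t - 2] + y[t - 4]
            - (1 + phi1) * y[t - 5] + phi1 * y[t - 6]
            + theta1 * eps[t - 1] + theta2 * eps[t - 2] + eps[t])

# States as defined in (9.1), indexed by their lags (lag 3 is dropped)
eta = {1: 1 + phi1, 2: -phi1, 4: 1, 5: -(1 + phi1), 6: phi1}
psi = {1: theta1, 2: theta2, 4: 0, 5: 0, 6: 0}
v = {j: eta[j] * y + psi[j] * eps for j in eta}

# Measurement equation: y_t = sum_j v_{j,t-j} + eps_t
idx = np.arange(20, T)
print(np.allclose(y[idx], sum(v[j][idx - j] for j in eta) + eps[idx]))  # True

# First transition equation: v_{1,t} = (1+phi1) sum_j v_{j,t-j} + (1+phi1+theta1) eps_t
v1_check = (1 + phi1) * sum(v[j][idx - j] for j in eta) + (1 + phi1 + theta1) * eps[idx]
print(np.allclose(v[1][idx], v1_check))  # True
```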
9.1.2 Additive ARIMA
In a more general case, in order to develop the state space ARIMA, we will use the Multiple Seasonal ARIMA, discussed in Subsection 8.2.3: \[\begin{equation*} y_t \prod_{j=0}^n \Delta^{D_j} (B^{m_j}) \varphi^{P_j}(B^{m_j}) = \epsilon_t \prod_{j=0}^n \vartheta^{Q_j}(B^{m_j}) . \end{equation*}\] This model can be represented in an easier-to-digest form by expanding the polynomials and moving all the previous values to the right-hand side. In the general case we will have: \[\begin{equation} y_t = \sum_{j=1}^K \eta_j y_{t-j} + \sum_{j=1}^K \psi_j \epsilon_{t-j} + \epsilon_t , \tag{9.3} \end{equation}\] where \(\eta_j\) and \(\psi_j\) are the parameters of the expanded ARI and MA polynomials respectively. In our example with SARIMA(1,1,2)(0,1,0)\(_4\) in the previous subsection they were: \[\begin{equation*} \begin{aligned} & \eta_1 = 1+\phi_1 \\ & \eta_2 = -\phi_1 \\ & \eta_3 = 0 \\ & \eta_4 = 1 \\ & \eta_5 = -(1+\phi_1) \\ & \eta_6 = \phi_1 \\ & \psi_1 = \theta_1 \\ & \psi_2 = \theta_2 \end{aligned} . \end{equation*}\] In equation (9.3), \(K\) is the order of the highest polynomial, calculated as \(K=\max\left(\sum_{j=0}^n (P_j + D_j)m_j, \sum_{j=0}^n Q_j m_j\right)\). If, for example, the MA order is higher than the sum of ARI orders, then \(\eta_i=0\) for \(i>\sum_{j=0}^n (P_j + D_j)m_j\). The same holds in the opposite situation, when the sum of ARI orders is higher than the MA order: \(\psi_i=0\) for all \(i>\sum_{j=0}^n Q_j m_j\). Using this idea, we can define states for each of the previous elements: \[\begin{equation} v_{i,t-i} = \eta_i y_{t-i} + \psi_i \epsilon_{t-i}, \tag{9.4} \end{equation}\] leading to the following model based on (9.4) and (9.3): \[\begin{equation} y_t = \sum_{j=1}^K v_{j,t-j} + \epsilon_t . \tag{9.5} \end{equation}\] This can be considered the measurement equation of the state space ARIMA. Now, if we consider the previous values of \(y_t\) based on (9.5), then \(y_{t-i}\) will be equal to: \[\begin{equation} y_{t-i} = \sum_{j=1}^K v_{j,t-j-i} + \epsilon_{t-i} . \tag{9.6} \end{equation}\] The value (9.6) can then be inserted into (9.4) to get the set of transition equations for all \(i=1,2,\dots,K\): \[\begin{equation} v_{i,t-i} = \eta_i \sum_{j=1}^K v_{j,t-j-i} + (\eta_i + \psi_i) \epsilon_{t-i}. \tag{9.7} \end{equation}\] This leads to the SSOE state space model based on (9.6) and (9.7): \[\begin{equation} \begin{aligned} &{y}_{t} = \sum_{j=1}^K v_{j,t-j} + \epsilon_t \\ &v_{i,t} = \eta_i \sum_{j=1}^K v_{j,t-j} + (\eta_i + \psi_i) \epsilon_{t} \text{ for each } i=\{1, 2, \dots, K \} \end{aligned}, \tag{9.8} \end{equation}\] which can be formulated in the conventional form as a pure additive ADAM (Section 5.1): \[\begin{equation*} \begin{aligned} &{y}_{t} = \mathbf{w}^\prime \mathbf{v}_{t-\boldsymbol{l}} + \epsilon_t \\ &\mathbf{v}_{t} = \mathbf{F} \mathbf{v}_{t-\boldsymbol{l}} + \mathbf{g} \epsilon_t \end{aligned}, \end{equation*}\] with the following values for matrices: \[\begin{equation} \begin{aligned} \mathbf{F} = \begin{pmatrix} \eta_1 & \eta_1 & \dots & \eta_1 \\ \eta_2 & \eta_2 & \dots & \eta_2 \\ \vdots & \vdots & \ddots & \vdots \\ \eta_K & \eta_K & \dots & \eta_K \end{pmatrix}, & \mathbf{w} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \\ \mathbf{g} = \begin{pmatrix} \eta_1 + \psi_1 \\ \eta_2 + \psi_2 \\ \vdots \\ \eta_K + \psi_K \end{pmatrix}, & \mathbf{v}_{t} = \begin{pmatrix} v_{1,t} \\ v_{2,t} \\ \vdots \\ v_{K,t} \end{pmatrix}, & \boldsymbol{l} = \begin{pmatrix} 1 \\ 2 \\ \vdots \\ K \end{pmatrix} \end{aligned}.
\tag{9.9} \end{equation}\] I should point out that the states in this model do not have any specific meaning; they just represent combinations of lagged actual values and error terms. Furthermore, there are zero states in this model, corresponding to the zero coefficients of the ARI and MA polynomials. These can be dropped to make the model even more compact.
In general, the state space ARIMA looks more complicated than the model in its conventional form, but it brings ARIMA to the same ground as ETS in ADAM (Chapter 5). This makes the two models directly comparable via information criteria, allows us to combine them easily, makes it possible to compare ARIMA models of any orders (e.g. with different orders of integration), and permits the introduction of multiple seasonality and explanatory variables. Several examples of ARIMA models in the ADAM framework are provided in Subsection 9.1.5.
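As an illustration of equation (9.9), the following Python sketch builds the transition matrix, measurement vector, persistence vector, and lag vector from the expanded polynomial parameters. The function name `build_ssarima` and all numeric values are hypothetical; a complete implementation would also need to keep track of each state at its individual lag when running the recursion over time.

```python
import numpy as np

# Hypothetical helper (not from any package): matrices of the pure additive
# state space ARIMA (9.9), given the expanded parameters eta_1..eta_K, psi_1..psi_K.
def build_ssarima(eta, psi):
    eta, psi = np.asarray(eta, float), np.asarray(psi, float)
    K = len(eta)
    F = np.tile(eta.reshape(-1, 1), K)   # row i of F contains eta_i in every column
    w = np.ones(K)                       # measurement vector
    g = eta + psi                        # persistence vector
    lags = np.arange(1, K + 1)           # lag vector l
    return F, w, g, lags

# Example: ARIMA(1,1,2), for which eta = (1+phi1, -phi1) and psi = (theta1, theta2)
phi1, theta1, theta2 = 0.6, -0.4, 0.1
F, w, g, lags = build_ssarima([1 + phi1, -phi1], [theta1, theta2])

# One step of the recursion, given the lagged states v_{t-l} and an error eps_t
v_lagged = np.array([100.0, -40.0])      # v_{1,t-1} and v_{2,t-2}, arbitrary values
eps_t = 0.5
y_t = w @ v_lagged + eps_t               # measurement equation
v_t = F @ v_lagged + g * eps_t           # transition equation
```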
9.1.3 State space ARIMA with constant
If we want to add the constant to the model (similar to how it was done in Section 8.1.4), we need to modify equation (9.3): \[\begin{equation} y_t = \sum_{j=1}^K \eta_j y_{t-j} + \sum_{j=1}^K \psi_j \epsilon_{t-j} + a_0 + \epsilon_t . \tag{9.10} \end{equation}\] This leads to the appearance of a new state: \[\begin{equation} v_{K+1,t} = a_0 , \tag{9.11} \end{equation}\] and the modified measurement equation: \[\begin{equation} y_t = \sum_{j=1}^{K+1} v_{j,t-j} + \epsilon_t , \tag{9.12} \end{equation}\] with the following transition equations: \[\begin{equation} \begin{aligned} & v_{i,t} = \eta_i \sum_{j=1}^{K+1} v_{j,t-j} + (\eta_i + \psi_i) \epsilon_{t} , \text{ for } i=\{1, 2, \dots, K\} \\ & v_{K+1, t} = v_{K+1, t-1} . \end{aligned} \tag{9.13} \end{equation}\] The state space equations (9.12) and (9.13) lead to the following matrices: \[\begin{equation} \begin{aligned} \mathbf{F} = \begin{pmatrix} \eta_1 & \dots & \eta_1 & \eta_1 \\ \eta_2 & \dots & \eta_2 & \eta_2 \\ \vdots & \ddots & \vdots & \vdots \\ \eta_K & \dots & \eta_K & \eta_K \\ 0 & \dots & 0 & 1 \end{pmatrix}, & \mathbf{w} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix}, \\ \mathbf{g} = \begin{pmatrix} \eta_1 + \psi_1 \\ \eta_2 + \psi_2 \\ \vdots \\ \eta_K + \psi_K \\ 0 \end{pmatrix}, & \mathbf{v}_{t} = \begin{pmatrix} v_{1,t} \\ v_{2,t} \\ \vdots \\ v_{K,t} \\ v_{K+1,t} \end{pmatrix}, & \boldsymbol{l} = \begin{pmatrix} 1 \\ 2 \\ \vdots \\ K \\ 1 \end{pmatrix} \end{aligned}. \tag{9.14} \end{equation}\]
Remark. Note that the meaning of the constant term introduced in this model changes depending on the order of differences of the model. For example, if \(D_j=0\) for all \(j\), it acts as an intercept, while for \(D_0=d=1\), it acts as a drift.
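A possible extension of the earlier sketch to the matrices in (9.14) is shown below; again, the function name and all values are only illustrative assumptions, not part of any existing implementation.

```python
import numpy as np

# Hypothetical builder of the matrices (9.14) for the state space ARIMA with a
# constant a_0. The extra state v_{K+1,t} = a_0 occupies the last position.
def build_ssarima_constant(eta, psi, a0):
    eta, psi = np.asarray(eta, float), np.asarray(psi, float)
    K = len(eta)
    F = np.zeros((K + 1, K + 1))
    F[:K, :] = eta.reshape(-1, 1)               # eta_i in every column of row i
    F[K, K] = 1.0                                # the constant state carries over
    w = np.ones(K + 1)
    g = np.concatenate([eta + psi, [0.0]])       # the constant is never updated
    lags = np.concatenate([np.arange(1, K + 1), [1]])
    v_constant = a0                              # initial value of v_{K+1,t}
    return F, w, g, lags, v_constant
```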
9.1.4 Multiplicative ARIMA
In order to connect ARIMA with ETS, we also need to define the multiplicative version of the model. This implies that the error term \((1+\epsilon_t)\) is multiplied by the components of the model. The state space ARIMA in this case can be formulated using logarithms in the following way: \[\begin{equation} \begin{aligned} &{y}_{t} = \exp \left( \sum_{j=1}^K \log v_{j,t-j} + \log(1+\epsilon_t) \right) \\ &\log v_{i,t} = \eta_i \sum_{j=1}^K \log v_{j,t-j} + (\eta_i + \psi_i) \log(1+\epsilon_t) \text{ for each } i=\{1, 2, \dots, K \} \end{aligned}. \tag{9.15} \end{equation}\] The model (9.15) can be written in the following more general form: \[\begin{equation} \begin{aligned} &{y}_{t} = \exp \left( \mathbf{w}^\prime \log \mathbf{v}_{t-\boldsymbol{l}} + \log(1+\epsilon_t) \right) \\ &\log \mathbf{v}_{t} = \mathbf{F} \log \mathbf{v}_{t-\boldsymbol{l}} + \mathbf{g} \log(1+\epsilon_t) \end{aligned}, \tag{9.16} \end{equation}\] where \(\mathbf{w}\), \(\mathbf{F}\), \(\mathbf{v}_t\), \(\mathbf{g}\), and \(\boldsymbol{l}\) are defined as before for the pure additive ARIMA (Subsection 9.1.2). This model is equivalent to applying ARIMA to the log-transformed data, but at the same time it shares some similarities with the pure multiplicative ETS from Section 6.1. The main advantage of this formulation is that the model has analytical solutions for the conditional moments and a well-defined \(h\) steps ahead conditional distribution, as long as the distribution of \(\log(1+\epsilon_t)\) is closed under convolution. This substantially simplifies working with the model in contrast with the pure multiplicative ETS.
To distinguish the additive ARIMA from the multiplicative one, we will use the notation “Log-ARIMA” in this book, pointing out that such a model is equivalent to applying ARIMA to the log-transformed data.
Finally, it is worth mentioning that due to the logarithmic transform, the Log-ARIMA model is suitable for cases of time-varying variance in the data, similar to the multiplicative error ETS models.
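To show how the matrices from the additive case are reused here, the following brief Python sketch runs one step of the recursion (9.16) for an ARIMA(1,1,2) example; all numbers are arbitrary and only illustrate that the recursion operates on the logarithms of the states.

```python
import numpy as np

# One step of the Log-ARIMA recursion (9.16) for an ARIMA(1,1,2) example.
# The matrices are the same as in the additive case; only logs are used.
phi1, theta1, theta2 = 0.6, -0.4, 0.1
eta = np.array([1 + phi1, -phi1])
psi = np.array([theta1, theta2])
F = np.tile(eta.reshape(-1, 1), len(eta))
w = np.ones(len(eta))
g = eta + psi

v_lagged = np.array([105.0, 0.8])        # v_{1,t-1}, v_{2,t-2}; must be positive
eps_t = 0.02                             # relative error, so that 1 + eps_t > 0
y_t = np.exp(w @ np.log(v_lagged) + np.log1p(eps_t))      # measurement equation
v_t = np.exp(F @ np.log(v_lagged) + g * np.log1p(eps_t))  # transition equation
```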
9.1.5 Several examples of state space ARIMA in ADAM
There are several important special cases of the ARIMA model that are often used in practice. We provide their state space formulations in this subsection.
9.1.5.1 ARIMA(0,1,1)
\[\begin{equation*} \begin{aligned} &(1-B) y_t = (1+\theta_1 B)\epsilon_t , \\ &\text{or} \\ &y_{t} = y_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t , \end{aligned} \end{equation*}\] which is equivalent to: \[\begin{equation} \begin{aligned} &{y}_{t} = v_{1,t-1} + \epsilon_t \\ &v_{1,t} = v_{1,t-1} + (1 + \theta_1) \epsilon_{t} \end{aligned}. \tag{9.17} \end{equation}\]
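The recursion (9.17) coincides with the simple exponential smoothing recursion with smoothing parameter \(1+\theta_1\). A small Python sketch (with arbitrary data and an arbitrary parameter value) makes this explicit:

```python
import numpy as np

# Illustrative sketch: running the recursion (9.17) is the same as simple
# exponential smoothing with smoothing parameter alpha = 1 + theta_1.
rng = np.random.default_rng(0)
y = 100 + np.cumsum(rng.normal(0, 1, 100))   # arbitrary random walk data

theta1 = -0.7
alpha = 1 + theta1

v = y[0]                       # naive initialisation of v_{1,0}
for t in range(1, len(y)):
    eps_t = y[t] - v           # one step ahead error, y_t - v_{1,t-1}
    v = v + alpha * eps_t      # v_{1,t} = v_{1,t-1} + (1 + theta_1) eps_t
print(v)                       # the last state, i.e. the one step ahead forecast
```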
9.1.5.2 ARIMA(0,1,1) with drift
\[\begin{equation*} \begin{aligned} &(1-B) y_t = a_0 + (1+\theta_1 B) \epsilon_t, \\ &\text{or} \\ &y_{t} = y_{t-1} + a_0 + \theta_1 \epsilon_{t-1} + \epsilon_t, \end{aligned} \end{equation*}\] which is in state space: \[\begin{equation} \begin{aligned} &{y}_{t} = v_{1,t-1} + v_{2,t-1} + \epsilon_t \\ &v_{1,t} = v_{1,t-1} + v_{2,t-1} + (1 + \theta_1) \epsilon_{t} \\ &v_{2,t} = v_{2,t-1} \end{aligned}, \tag{9.18} \end{equation}\] where \(v_{2,0}=a_0\).
9.1.5.3 ARIMA(0,2,2)
\[\begin{equation*} \begin{aligned} &(1-B)^2 y_t = (1 + \theta_1 B + \theta_2 B^2) \epsilon_t, \\ &\text{or} \\ &y_{t} = 2 y_{t-1} - y_{t-2} + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \epsilon_t . \end{aligned} \end{equation*}\] In ADAM, this is formulated as: \[\begin{equation} \begin{aligned} &{y}_{t} = v_{1,t-1} + v_{2,t-2} + \epsilon_t \\ &v_{1,t} = 2(v_{1,t-1} + v_{2,t-2}) + (2 + \theta_1) \epsilon_{t} \\ &v_{2,t} = -(v_{1,t-1} + v_{2,t-2}) + (-1 + \theta_2) \epsilon_{t} \\ \end{aligned}. \tag{9.19} \end{equation}\]
9.1.5.4 ARIMA(1,1,2)
\[\begin{equation*} \begin{aligned} &(1-B) (1-\phi_1 B) y_t = (1 + \theta_1 B + \theta_2 B^2) \epsilon_t , \\ &\text{or} \\ &y_{t} = (1+\phi_1) y_{t-1} - \phi_1 y_{t-2} + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \epsilon_t, \end{aligned} \end{equation*}\] which is equivalent to: \[\begin{equation} \begin{aligned} &{y}_{t} = v_{1,t-1} + v_{2,t-2} + \epsilon_t \\ &v_{1,t} = (1+\phi_1)(v_{1,t-1} + v_{2,t-2}) + (1 + \phi_1 + \theta_1) \epsilon_{t} \\ &v_{2,t} = -\phi_1(v_{1,t-1} + v_{2,t-2}) + (-\phi_1 + \theta_2) \epsilon_{t} \\ \end{aligned}. \tag{9.20} \end{equation}\]
9.1.5.5 Log-ARIMA(0,1,1)
This model is equivalent to ARIMA(0,1,1) applied to \(\log y_t\). It can be written as: \[\begin{equation*} \begin{aligned} &(1-B) \log y_t = (1+\theta_1 B) \log(1+\epsilon_t), \\ &\text{or} \\ &\log y_{t} = \log y_{t-1} + \theta_1 \log(1+\epsilon_{t-1}) + \log(1+\epsilon_t) . \end{aligned} \end{equation*}\] In ADAM, it becomes: \[\begin{equation} \begin{aligned} &{y}_{t} = \exp (\log v_{1,t-1} + \log(1+\epsilon_t)) \\ &\log v_{1,t} = \log v_{1,t-1} + (1 + \theta_1) \log(1+\epsilon_t) \end{aligned}. \tag{9.21} \end{equation}\]
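As a quick numerical illustration of this equivalence (a sketch with arbitrary values, not from the original text), running the additive recursion (9.17) on \(\log y_t\) produces the same sequence of states, after exponentiation, as running (9.21) on \(y_t\) directly:

```python
import numpy as np

# Sketch: the Log-ARIMA(0,1,1) recursion (9.21) applied to y gives the same
# states as the additive ARIMA(0,1,1) recursion (9.17) applied to log(y).
rng = np.random.default_rng(1)
y = np.exp(np.cumsum(rng.normal(0, 0.05, 100)) + 2)   # arbitrary positive series
theta1 = -0.6

# Additive recursion (9.17) on log(y)
v_add = np.log(y[0])
states_add = [v_add]
for t in range(1, len(y)):
    eps_t = np.log(y[t]) - v_add
    v_add = v_add + (1 + theta1) * eps_t
    states_add.append(v_add)

# Multiplicative (Log-ARIMA) recursion (9.21) on y itself
v_mult = y[0]
states_mult = [v_mult]
for t in range(1, len(y)):
    eps_t = y[t] / v_mult - 1           # relative error: log(1+eps_t) = log(y_t) - log(v_{1,t-1})
    v_mult = np.exp(np.log(v_mult) + (1 + theta1) * np.log1p(eps_t))
    states_mult.append(v_mult)

print(np.allclose(np.exp(states_add), states_mult))  # True
```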