10.3 Dynamic X in ADAMX

Remark. The model discussed in this section assumes particular dynamics of parameters, aligning with what the conventional ETS assumes: parameters are correlated with the states of the model. It does not treat parameters as independent as, for example, MSOE state space models do, which makes this model restrictive in its application. But this type of model works well with categorical variables, as I show later in this section.

As discussed in Section 10.1, the parameters of the explanatory variables in ADAMX can be assumed either to be constant over time or to vary according to some mechanism. In the SSOE framework, the most reasonable mechanism relies on the same error term for all components of the model, because this aligns with the structure of the model itself. Osman and King (2015) proposed one such mechanism, relying on the differences of the data. The primary motivation of their approach was to make the dynamic ADAMX model stable, which is a challenging task. However, this mechanism relies on the assumption of non-stationarity of the explanatory variables, which does not always make sense (for example, it is unreasonable in the case of promotional data). An alternative approach discussed in this section is the one initially proposed by Svetunkov (1985), based on the stochastic approximation mechanism and further developed in Svetunkov and Svetunkov (2014).

We start with the following linear regression model: \[\begin{equation} y_{t} = a_{0,t-1} + a_{1,t-1} x_{1,t} + \dots + a_{n,t-1} x_{n,t} + \epsilon_t , \tag{10.16} \end{equation}\] where all parameters vary over time and \(a_{0,t}\) represents the value from the conventional additive error ETS model. The updating mechanism for the parameters is straightforward and relies on the ratio of the error term to the respective explanatory variable: \[\begin{equation} a_{i,t} = a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\epsilon_t}{x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. , \tag{10.17} \end{equation}\] where \(\delta_i\) is the smoothing parameter of the \(i\)-th explanatory variable. The same model can be represented in the state space form, based on equations similar to (10.4): \[\begin{equation} \begin{aligned} & {y}_{t} = \mathbf{w}'_t \mathbf{v}_{t-\mathbf{l}} + \epsilon_t \\ & \mathbf{v}_t = \mathbf{F} \mathbf{v}_{t-\mathbf{l}} + \mathbf{z}_t \mathbf{g} \epsilon_t \end{aligned} \tag{10.18} \end{equation}\] where \(\mathbf{z}_t = \mathrm{diag}\left(\mathbf{w}_t\right)^{-1}=\left(\mathbf{I}_{k+n} \odot (\mathbf{w}_t \mathbf{1}'_{k+n})\right)^{-1}\) is the diagonal matrix consisting of inverses of the elements of the measurement vector (including the explanatory variables), \(\mathbf{I}_{k+n}\) is the identity matrix for \(k\) ADAM components and \(n\) explanatory variables, \(\mathbf{1}_{k+n}\) is the vector of ones, and \(\odot\) is the Hadamard product for element-wise multiplication. In this inverse, those values that cannot be inverted (due to division by zero) are substituted by zeroes in order to reflect the condition in (10.17). In addition to what (10.4) contained, we add the smoothing parameters \(\delta_i\) to the persistence vector \(\mathbf{g}\) for each of the explanatory variables.
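
To make the updating mechanism in (10.16) and (10.17) more concrete, here is a minimal sketch of one update step for a model with a level and \(n\) explanatory variables. The function name `dynamic_x_step` and the argument names are hypothetical. The level update with the smoothing parameter `alpha` is an assumption taken from the conventional ETS(A,N,N) model, and this is not the implementation used in any specific package.

```python
def dynamic_x_step(level, coeffs, x, y, alpha, deltas):
    """One update of an ETSX(A,N,N){D}-style model, following (10.16)-(10.17).

    level  : a_{0,t-1}, treated as the ETS level (assumption)
    coeffs : [a_{1,t-1}, ..., a_{n,t-1}]
    x      : [x_{1,t}, ..., x_{n,t}]
    deltas : smoothing parameters delta_i of the explanatory variables
    """
    # Measurement equation (10.16): one-step-ahead fitted value and error
    fitted = level + sum(a * xi for a, xi in zip(coeffs, x))
    error = y - fitted
    # Assumed ETS(A,N,N) update for the level
    new_level = level + alpha * error
    # Update (10.17): the parameter stays unchanged when x_{i,t} is zero
    new_coeffs = [
        a + (d * error / xi if xi != 0 else 0.0)
        for a, d, xi in zip(coeffs, deltas, x)
    ]
    return fitted, error, new_level, new_coeffs


# Illustrative numbers only: the second variable is zero on this observation
fitted, error, new_level, new_coeffs = dynamic_x_step(
    level=10.0, coeffs=[2.0, -1.0], x=[4.0, 0.0], y=20.0,
    alpha=0.3, deltas=[0.1, 0.2],
)
```

In this example the first parameter moves by \(\delta_1 \epsilon_t / x_{1,t}\), while the second one is left as is because \(x_{2,t}=0\), which is exactly the "otherwise" branch of (10.17).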

If the error term is multiplicative, then the model changes to: \[\begin{equation} \begin{aligned} & y_{t} = \exp \left(a_{0,t-1} + a_{1,t-1} x_{1,t} + \dots + a_{n,t-1} x_{n,t} + \log(1+ \epsilon_t) \right) \\ & a_{i,t} = a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\log(1+\epsilon_t)}{x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \end{aligned} . \tag{10.19} \end{equation}\] The formulation (10.19) differs from the conventional pure multiplicative ETS model because the smoothing parameter \(\delta_i\) is not included inside the error term \(1+\epsilon_t\), which simplifies some derivations and makes the model easier to work with. Mixed ETS models can also have explanatory variables, but I suggest aligning the type of explanatory variable model with the error term.
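
The parameter update in (10.19) can be sketched in the same spirit. The only change from the additive case is that the update is driven by \(\log(1+\epsilon_t)\), which equals the difference between \(\log y_t\) and the linear predictor. The function name `dynamic_x_step_mult` is hypothetical, the response is assumed to be strictly positive, and the intercept \(a_{0,t-1}\) is held fixed because its update is not specified in (10.19).

```python
import math


def dynamic_x_step_mult(intercept, coeffs, x, y, deltas):
    """Parameter update of (10.19) for one observation (y must be positive).

    intercept : a_{0,t-1}, not updated in this sketch
    coeffs    : [a_{1,t-1}, ..., a_{n,t-1}]
    """
    log_fitted = intercept + sum(a * xi for a, xi in zip(coeffs, x))
    # y_t = exp(log_fitted) * (1 + epsilon_t), hence:
    log_error = math.log(y) - log_fitted   # log(1 + epsilon_t)
    epsilon = math.expm1(log_error)        # epsilon_t itself
    # delta_i multiplies log(1 + epsilon_t), not epsilon_t inside the log
    new_coeffs = [
        a + (d * log_error / xi if xi != 0 else 0.0)
        for a, d, xi in zip(coeffs, deltas, x)
    ]
    return epsilon, new_coeffs
```

Because \(\delta_i\) sits outside the logarithm, the update is linear in \(\log(1+\epsilon_t)\), which is the simplification mentioned above.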

Note that if it is suspected that the explanatory variables exhibit non-stationarity and are not cointegrated with the response variable, then their differences can be used instead of \(x_{i,t}\) in (10.18) and (10.19). In this case, the model would coincide with the one proposed by Osman and King (2015). Whether to take differences in specific parts of the model should be decided case by case. Here is an example of the ETSX(A,N,N) model with differenced explanatory variables: \[\begin{equation} \begin{aligned} & y_{t} = a_{0,t-1} + a_{1,t-1} \Delta x_{1,t} + \dots + a_{n,t-1} \Delta x_{n,t} + \epsilon_t , \\ & a_{i,t} = a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\epsilon_t}{\Delta x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } \Delta x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. , \end{aligned} \tag{10.20} \end{equation}\] where \(\Delta x_{i,t} = x_{i,t} -x_{i,t-1}\) is the first difference of the \(i\)-th exogenous variable.

Finally, to distinguish the ADAMX with static parameters from the ADAMX with dynamic ones, we will use the letters “S” and “D” in the names of models. So, the model (10.9) can be called ETSX(A,N,N){S}, while the model (10.19), assuming that \(a_{0,t-1}=l_{t-1}\), would be called ETSX(M,N,N){D}. We use curly brackets to split the ETS states from the type of X. Furthermore, given that the model with static regressors is assumed in many contexts to be the default one, the ETSX(*,*,*){S} model can also be denoted as just ETSX(*,*,*).

10.3.1 Recursion for dynamic ADAMX

Similar to how it was discussed in Section 10.2.2, we can have two cases in the dynamic model: (1) deterministic explanatory variables, (2) stochastic explanatory variables. For illustrative purposes, we will use a non-seasonal model for which the lag vector \(\mathbf{l}\) contains ones only, keeping in mind that other pure additive models can be easily used instead. The cases of non-additive ETS models are not discussed in this part in detail – the moments for these models need to be calculated based on simulations. So, as discussed previously, the model can be written in the following general way, assuming that all elements of \(\mathbf{l}\) are equal to one: \[\begin{equation} \begin{aligned} & {y}_{t} = \mathbf{w}'_t \mathbf{v}_{t-1} + \epsilon_t \\ & \mathbf{v}_t = \mathbf{F} \mathbf{v}_{t-1} + \mathbf{z}_t \mathbf{g} \epsilon_t \end{aligned} . \tag{10.21} \end{equation}\] Based on this model, we can get the recursive relation for \(h\) steps ahead, similar to how it was done in Section 5.2: \[\begin{equation} \begin{aligned} & {y}_{t+h} = \mathbf{w}'_{t+h} \mathbf{v}_{t+h-1} + \epsilon_{t+h} \\ & \mathbf{v}_{t+h-1} = \mathbf{F} \mathbf{v}_{t+h-2} + \mathbf{z}_{t+h-1} \mathbf{g} \epsilon_{t+h-1} \end{aligned} , \tag{10.22} \end{equation}\] where the second equation can be represented based on the values available on observation \(t\): \[\begin{equation} \mathbf{v}_{t+h-1} = \mathbf{F}^{h-1} \mathbf{v}_{t} + \sum_{j=1}^{h-1} \mathbf{F}^{h-1-j} \mathbf{z}_{t+j} \mathbf{g} \epsilon_{t+j} . \tag{10.23} \end{equation}\] Substituting the equation (10.23) in the measurement equation of (10.22) leads to the final recursion: \[\begin{equation} {y}_{t+h} = \mathbf{w}'_{t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} + \mathbf{w}'_{t+h} \sum_{j=1}^{h-1} \mathbf{F}^{h-1-j} \mathbf{z}_{t+j} \mathbf{g} \epsilon_{t+j} + \epsilon_{t+h} . \tag{10.24} \end{equation}\]
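
The equivalence between iterating the transition equation in (10.21) and using the closed form (10.24) can be checked numerically. The sketch below does this in plain Python for a hypothetical local trend model with one explanatory variable (\(k=2\), \(n=1\)). All the numbers and the helper names (`mat_vec`, `z_g`, `y_iterative`, `y_closed_form`) are made up for illustration.

```python
def mat_vec(A, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def mat_pow_vec(A, p, v):
    """Return A^p v by repeated multiplication."""
    for _ in range(p):
        v = mat_vec(A, v)
    return v


def z_g(w, g):
    """z_t g, with z_t = diag(w_t)^{-1} and zeroes where w is not invertible."""
    return [gi / wi if wi != 0 else 0.0 for wi, gi in zip(w, g)]


def y_iterative(F, v, g, w_future, eps_future):
    """Apply the transition equation of (10.21) step by step."""
    h = len(w_future)
    for j in range(h - 1):  # observations t+1, ..., t+h-1
        shock = z_g(w_future[j], g)
        v = [fv + s * eps_future[j] for fv, s in zip(mat_vec(F, v), shock)]
    return sum(wi * vi for wi, vi in zip(w_future[-1], v)) + eps_future[-1]


def y_closed_form(F, v, g, w_future, eps_future):
    """Evaluate the recursion (10.24) directly."""
    h = len(w_future)
    state = mat_pow_vec(F, h - 1, v)
    for j in range(1, h):
        shock = [s * eps_future[j - 1] for s in z_g(w_future[j - 1], g)]
        term = mat_pow_vec(F, h - 1 - j, shock)
        state = [a + b for a, b in zip(state, term)]
    return sum(wi * vi for wi, vi in zip(w_future[-1], state)) + eps_future[-1]


# Hypothetical setup: level, trend and one regressor, h = 4
F = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v_t = [10.0, 1.0, 2.0]
g = [0.3, 0.1, 0.2]
x_future = [2.0, 0.0, 4.0, 5.0]
w_future = [[1.0, 1.0, x] for x in x_future]
eps_future = [0.5, -1.0, 0.25, 0.1]

gap = abs(y_iterative(F, v_t, g, w_future, eps_future)
          - y_closed_form(F, v_t, g, w_future, eps_future))
```

The two functions should agree up to floating point error, so `gap` is expected to be negligibly small. The zero in `x_future` shows how the corresponding element of \(\mathbf{z}_{t+j}\) is set to zero rather than causing a division by zero.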

10.3.2 Conditional moments for deterministic explanatory variables in ADAMX{D}

Based on this recursion, we can calculate the conditional mean and variance for the model. First, we assume that the explanatory variables are controlled by an analyst and are known for \(j=1, \dots, h\). Given that the error terms in (10.24) are uncorrelated, with zero mean and variance \(\sigma^2\), each of them contributes its squared coefficient to the variance: \[\begin{equation} \begin{aligned} & \mu_{y,t+h} = \text{E}(y_{t+h}|t) = \mathbf{w}'_{t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} \\ & \text{V}(y_{t+h}|t) = \sum_{j=1}^{h-1} \left(\mathbf{w}'_{t+h} \mathbf{F}^{h-1-j} \mathbf{z}_{t+j} \mathbf{g} \right)^2 \sigma^2 + \sigma^2 \end{aligned} . \tag{10.25} \end{equation}\] The formulae for conditional moments in this case look similar to the ones from the pure additive ETS model in Section 5.3, with the only difference being the interaction with the time-varying measurement vector.
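
As an illustration, consider the special case of a local level model with dynamic parameters, for which \(\mathbf{F}\) is the identity matrix (an assumption made here for simplicity). Taking the expectation of the recursion (10.24) gives the mean \(\mathbf{w}'_{t+h}\mathbf{v}_t\). Assuming that the error terms are uncorrelated, the variance of the linear combination of errors in (10.24) is \(\sigma^2\) times the sum of the squared coefficients \(c_j = \alpha + \sum_{i=1}^n \delta_i x_{i,t+h} / x_{i,t+j}\), plus \(\sigma^2\) for \(\epsilon_{t+h}\). The function name `etsx_ann_d_moments` and all the values below are hypothetical.

```python
def etsx_ann_d_moments(level, coeffs, alpha, deltas, x_future, sigma2):
    """Conditional mean and variance of y_{t+h} when F is the identity matrix.

    x_future : list of h lists, x_future[j-1] = [x_{1,t+j}, ..., x_{n,t+j}]
    sigma2   : variance of the error term
    """
    x_h = x_future[-1]
    # With F = I, the conditional mean reduces to w'_{t+h} v_t
    mean = level + sum(a * xi for a, xi in zip(coeffs, x_h))
    variance = sigma2  # contribution of epsilon_{t+h}
    for x_j in x_future[:-1]:
        # c_j = w'_{t+h} z_{t+j} g, with zero entries where x_{i,t+j} = 0
        c_j = alpha + sum(
            d * xh / xj if xj != 0 else 0.0
            for d, xh, xj in zip(deltas, x_h, x_j)
        )
        variance += c_j ** 2 * sigma2
    return mean, variance


# h = 3, one explanatory variable that equals zero at t+2
mean, variance = etsx_ann_d_moments(
    level=10.0, coeffs=[2.0], alpha=0.3, deltas=[0.1],
    x_future=[[2.0], [0.0], [4.0]], sigma2=1.0,
)
```

Here \(c_1 = 0.3 + 0.1 \times 4 / 2 = 0.5\) and \(c_2 = 0.3\), so the variance is \(1 + 0.25 + 0.09 = 1.34\), while the mean is \(10 + 2 \times 4 = 18\).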

10.3.3 Conditional mean for stochastic explanatory variables in ADAMX{D}

In the case of stochastic explanatory variables, the conditional expectation is straightforward and is similar to the one in the static ADAMX model: \[\begin{equation} \mu_{y,t+h} = \text{E}(y_{t+h}|t) = \boldsymbol{\mu}'_{w,t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} , \tag{10.26} \end{equation}\] where \(\boldsymbol{\mu}'_{w,t+h}\) is the vector of conditional \(h\) steps ahead expectations for each element of \(\mathbf{w}_{t+h}\). For the ETS components, the respective elements of this vector are equal to one. The conditional variance, however, is more complicated because it involves complex interactions between the variances of different variables and the error term. As a result, it is easier to get the correct variance based on simulations, assuming that the explanatory variables and the error term change according to some assumed models.
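
A minimal sketch of such a simulation is given below for a local level model with one explanatory variable. It assumes, purely for illustration, that the future values of the explanatory variable are i.i.d. Normal and independent of the Normal error term. Any other model for the variable could be plugged in instead. The function name `simulate_y` and all the parameter values are hypothetical.

```python
import random
import statistics


def simulate_y(level, coeff, alpha, delta, h, mu_x, sd_x, sigma,
               n_paths=20000, seed=42):
    """Simulate y_{t+h} from a local level model with one dynamic parameter.

    Assumed models: x_{t+j} ~ N(mu_x, sd_x^2), epsilon_{t+j} ~ N(0, sigma^2).
    Returns the sample mean and the sample variance of y_{t+h}.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_paths):
        l, a = level, coeff
        y = None
        for _ in range(h):
            x = rng.gauss(mu_x, sd_x)
            eps = rng.gauss(0.0, sigma)
            y = l + a * x + eps        # measurement equation
            l += alpha * eps           # level update
            if x != 0:                 # parameter update, as in (10.17)
                a += delta * eps / x
        draws.append(y)                # the last y is y_{t+h}
    return statistics.mean(draws), statistics.variance(draws)


sim_mean, sim_var = simulate_y(
    level=10.0, coeff=2.0, alpha=0.3, delta=0.1,
    h=3, mu_x=10.0, sd_x=1.0, sigma=1.0,
)
```

With these settings, the simulated mean should be close to \(10 + 2 \times 10 = 30\), in line with (10.26), while the simulated variance exceeds \(\sigma^2\) because it also reflects the uncertainty about the explanatory variable and the accumulated updates of the states.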

References

• Osman, A.F., King, M.L., 2015. A new approach to forecasting based on exponential smoothing with independent regressors. http://econpapers.repec.org/paper/mshebswps/2015-2.htm
• Svetunkov, I., Svetunkov, S., 2014. Forecasting methods. Textbook for universities. Urait, Moscow.
• Svetunkov, S., 1985. Adaptive methods in the process of optimisation of regimes of electricity consumption.