10.2 Conditional expectation and variance of ADAMX
10.2.1 ADAMX with deterministic explanatory variables
The conventional ETS and ARIMA models have a severe limitation, which will be discussed in Chapter 16: they assume that the model’s parameters are known, i.e. that there is no variability in them and that the in-sample estimates stay fixed no matter how the sample size changes. While this is not important for point forecasts, it affects the conditional variance and prediction intervals, which appear too narrow in many studies (e.g. Athanasopoulos et al., 2011). In the case of regression, this limitation is lifted: the uncertainty of its parameters translates to the final uncertainty in the confidence/prediction interval. ADAMX, having both dynamic (ETS/ARIMA) and static (regression) parts, inherits the limitation of the former, which can be resolved only with more complicated approaches (Section 16.5). As a result, the conditional mean and variance of the conventional ADAMX assume that the parameters \(a_0, \dots, a_n\) are known, leading to the following formulae in the case of the pure additive model, based on what was discussed in Section 5.3: \[\begin{equation} \begin{aligned} & \mu_{y,t+h} = \mathrm{E}(y_{t+h}|t) = \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \mathbf{F}_{m_i}^{\lceil\frac{h}{m_i}\rceil-1} \right) \mathbf{v}_{t} \\ & \mathrm{V}(y_{t+h}|t) = \left( \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \sum_{j=1}^{\lceil\frac{h}{m_i}\rceil-1} \mathbf{F}_{m_i}^{j-1} \mathbf{g}_{m_i} \mathbf{g}'_{m_i} (\mathbf{F}_{m_i}')^{j-1} \mathbf{w}_{m_i,t} \right) + 1 \right) \sigma^2 \end{aligned}, \tag{10.11} \end{equation}\] the main difference from the moments of the conventional model (from Section 5.3) being the index \(t\) in the measurement vector \(\mathbf{w}_t\).
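To make the structure of (10.11) more tangible, here is a minimal Python sketch for the simplest special case only, where all lags equal one (\(d=1\), \(m_1=1\)). The function and helper names are hypothetical and not part of any package. The example assumes a local level model with one explanatory variable written in state space form with \(\mathbf{v}_t = (l_t, a_1)'\), an identity transition matrix, \(\mathbf{g} = (\alpha, 0)'\), and a measurement vector that contains the known future value of the regressor; that representation is an assumption made for illustration.

```python
def mat_vec(F, u):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(f_ij * u_j for f_ij, u_j in zip(row, u)) for row in F]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def additive_moments(w, F, g, v, sigma2, h):
    """Conditional mean and variance h steps ahead for a pure additive
    model in which all lags equal one (d = 1, m_1 = 1).

    Assumption: w is the measurement vector used for the forecast,
    i.e. it already holds the known future values of the regressors.
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    # Mean: w' F^(h-1) v
    state = list(v)
    for _ in range(h - 1):
        state = mat_vec(F, state)
    mean = dot(w, state)
    # Variance: (sum_{j=1}^{h-1} (w' F^(j-1) g)^2 + 1) * sigma^2,
    # using w' F^(j-1) g g' (F')^(j-1) w = (w' F^(j-1) g)^2
    u = list(g)
    acc = 0.0
    for _ in range(h - 1):
        acc += dot(w, u) ** 2
        u = mat_vec(F, u)
    return mean, (acc + 1.0) * sigma2


# Illustrative numbers only (not taken from any dataset)
F = [[1.0, 0.0], [0.0, 1.0]]   # level and parameter both persist
g = [0.5, 0.0]                 # alpha = 0.5; the parameter is not updated
v = [100.0, -2.0]              # l_t = 100, a_1 = -2
w = [1.0, 10.0]                # 1 for the level, x_{1,t+h} = 10
mean, variance = additive_moments(w, F, g, v, sigma2=4.0, h=3)
print(mean, variance)
```

Because the element of \(\mathbf{g}\) that corresponds to the regression parameter is zero, the explanatory variable affects only the mean in this sketch and contributes nothing to the variance.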
As an example, here is how the two statistics will look in the case of ETSX(A,N,N): \[\begin{equation} \begin{aligned} \mu_{y,t+h} = \mathrm{E}(y_{t+h}|t) = & l_{t} + \sum_{i=1}^n a_i x_{i,t+h} \\ \mathrm{V}(y_{t+h}|t) = & \left((h-1) \alpha^2 + 1 \right) \sigma^2 \end{aligned}, \tag{10.12} \end{equation}\] where the variance ignores the potential variability arising from the explanatory variables because of the limitations discussed above. The formulae assume that the future values of the explanatory variables \(x_{i,t+h}\) are known (the variables are deterministic). As a result, the prediction and confidence intervals for the ADAMX would typically be narrower than expected and would only be adequate in cases of large samples, where the Law of Large Numbers would start working (Section 6.1 of Svetunkov, 2022), reducing the variance of parameters (this is assuming that the typical assumptions of the model from Subsection 1.4.1 hold).
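The closed forms in (10.12) are simple enough to evaluate directly. The following Python sketch does that, treating the parameters and the future values of the explanatory variables as known; the function name and the numbers are hypothetical and serve only as an illustration.

```python
def etsx_ann_moments(level, a, x_future, alpha, sigma2, h):
    """Conditional mean and variance of ETSX(A,N,N) h steps ahead,
    following (10.12), with deterministic explanatory variables.

    level    -- the last available level l_t
    a        -- parameters a_1, ..., a_n (assumed known)
    x_future -- known values x_{1,t+h}, ..., x_{n,t+h}
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    if len(a) != len(x_future):
        raise ValueError("a and x_future must have the same length")
    mean = level + sum(a_i * x_i for a_i, x_i in zip(a, x_future))
    # The variance does not depend on the explanatory variables
    variance = ((h - 1) * alpha ** 2 + 1) * sigma2
    return mean, variance


# Illustrative numbers only
mean, variance = etsx_ann_moments(
    level=100.0, a=[-2.0, 5.0], x_future=[10.0, 1.0],
    alpha=0.5, sigma2=4.0, h=3,
)
print(mean, variance)
```

Changing `x_future` in this sketch shifts the mean but leaves the variance untouched, which is exactly the limitation described above.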
10.2.2 ADAMX with stochastic explanatory variables
Note that the ADAMX works well in cases when the future values of \(x_{i,t+h}\) are known. It is a realistic assumption when we control the explanatory variables (e.g. prices and promotions for our product). But when the variables are out of our control, they need to be forecasted somehow. In this case, we assume that each \(x_{i,t}\) is a stochastic variable with some dynamic conditional one step ahead expectation \(\mu_{x_{i,t}}\) and a one step ahead variance \(\sigma^2_{x_{i,1}}\).
Remark. In this case, we treat the available explanatory variables as models on their own, not just as values given to us from above. This assumption of randomness will change the conditional moments of the model.
Here is what we will have for the moments of the model in the case of ETSX(A,N,N) (given that the typical assumptions from Subsection 1.4.1 hold): \[\begin{equation} \begin{aligned} & \mu_{y,t+h} = \mathrm{E}(y_{t+h}|t) = l_{t} + \sum_{i=1}^n a_i \mu_{x_{i,t+h}} \\ & \mathrm{V}(y_{t+h}|t) = \left((h-1) \alpha^2 + 1 \right) \sigma^2 + \sum_{i=1}^n a^2_i \sigma^2_{x_{i,h}} + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n a_i a_j \sigma_{x_{i,h},x_{j,h}} \end{aligned}, \tag{10.13} \end{equation}\] where \(\sigma^2_{x_{i,h}}\) is the conditional variance of \(x_{i}\) \(h\) steps ahead, \(\sigma_{x_{i,h},x_{j,h}}\) is the \(h\) steps ahead covariance between the explanatory variables \(x_{i,t+h}\) and \(x_{j,t+h}\), both conditional on the information available at the observation \(t\). Similarly, if we are interested in one step ahead point forecast from the model, it should take the randomness of explanatory variables into account and become: \[\begin{equation} \begin{aligned} \mu_{y,t |t-1} = & \left. \mathrm{E}\left(l_{t-1} + \sum_{i=1}^n a_i x_{i,t} + \epsilon_{t} \right| t-1 \right) = \\ = & l_{t-1} + \sum_{i=1}^n a_i \mu_{x_{i,t}} \end{aligned}. \tag{10.14} \end{equation}\] So, in the case of ADAMX with random explanatory variables, the model should be constructed based on the expectations of those variables, not the random values themselves. This explains, for example, why Athanasopoulos et al. (2011) found that some models with predicted explanatory variables work better than the models with the original variables. 
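The effect of random explanatory variables on the variance in (10.13) can be checked numerically. The Python sketch below evaluates both moments given the \(h\) steps ahead conditional means and the conditional covariance matrix of the explanatory variables. How those moments are obtained is left open here: the function name, its inputs, and the numbers are hypothetical and used only for illustration.

```python
def etsx_ann_moments_stochastic(level, a, mu_x, cov_x, alpha, sigma2, h):
    """Conditional mean and variance of ETSX(A,N,N) h steps ahead,
    following (10.13), with stochastic explanatory variables.

    mu_x  -- conditional means mu_{x_{i,t+h}} of the regressors
    cov_x -- n x n conditional covariance matrix of the regressors
             h steps ahead (variances on the diagonal)
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    n = len(a)
    mean = level + sum(a[i] * mu_x[i] for i in range(n))
    # Part coming from the dynamic (ETS) element of the model
    var_ets = ((h - 1) * alpha ** 2 + 1) * sigma2
    # Variances of the explanatory variables
    var_x = sum(a[i] ** 2 * cov_x[i][i] for i in range(n))
    # Covariances between each pair of explanatory variables
    cov_terms = 2 * sum(
        a[i] * a[j] * cov_x[i][j]
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return mean, var_ets + var_x + cov_terms


# Illustrative numbers only
mean, variance = etsx_ann_moments_stochastic(
    level=100.0, a=[-2.0, 5.0], mu_x=[10.0, 1.0],
    cov_x=[[1.0, 0.25], [0.25, 0.25]],
    alpha=0.5, sigma2=4.0, h=3,
)
print(mean, variance)
```

With a zero covariance matrix the function returns the same variance as in the deterministic case, so the additional terms are entirely due to the uncertainty about the explanatory variables.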
This means that when estimating the model, such as ETSX(A,N,N), the following should be constructed: \[\begin{equation} \begin{aligned} & \hat{y}_{t} = \hat{l}_{t-1} + \sum_{i=1}^n \hat{a}_{i,t-1} \hat{x}_{i,t} \\ & e_t = y_t -\hat{y}_{t} \\ & \hat{l}_{t} = \hat{l}_{t-1} + \hat{\alpha} e_t \\ & \hat{a}_{i,t} = \hat{a}_{i,t-1} \text{ for each } i \in \{1, \dots, n\} \end{aligned}, \tag{10.15} \end{equation}\] where \(\hat{x}_{i,t}\) is the in-sample conditional one step ahead mean for the explanatory variable \(x_i\).
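The recursion in (10.15) can be sketched in a few lines of Python. The sketch only applies the recursion for given values of the initial level, the parameters \(a_i\), and the smoothing parameter; it does not estimate them. The function name is hypothetical, and `x_hat` is assumed to already contain the in-sample one step ahead conditional means of the explanatory variables, produced by whatever model is used for them.

```python
def etsx_ann_filter(y, x_hat, level0, a, alpha):
    """Apply the ETSX(A,N,N) recursion (10.15) to the sample.

    y      -- actual values y_1, ..., y_T
    x_hat  -- list of length T; x_hat[t] holds the one step ahead
              conditional means of the n explanatory variables
    level0 -- initial level
    a      -- parameters a_1, ..., a_n (constant over time)
    alpha  -- smoothing parameter for the level
    """
    if len(y) != len(x_hat):
        raise ValueError("y and x_hat must have the same length")
    level = level0
    fitted, errors = [], []
    for y_t, x_t in zip(y, x_hat):
        # One step ahead fitted value based on expected regressors
        y_fit = level + sum(a_i * x_i for a_i, x_i in zip(a, x_t))
        e_t = y_t - y_fit
        # Only the level is updated; the parameters a_i stay fixed
        level = level + alpha * e_t
        fitted.append(y_fit)
        errors.append(e_t)
    return fitted, errors, level


# Illustrative numbers only
fitted, errors, last_level = etsx_ann_filter(
    y=[12.0, 13.0, 11.0],
    x_hat=[[1.0], [2.0], [1.0]],
    level0=10.0, a=[1.0], alpha=0.5,
)
print(fitted, errors, last_level)
```

In practice, the returned errors would feed a loss function minimised over the initial level, the parameters, and the smoothing parameter; that step is outside the scope of this sketch.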
Finally, as discussed previously, the conditional moments for the pure multiplicative and mixed models do not generally have closed forms, so simulations need to be carried out. The situation becomes more challenging in the case of random explanatory variables because that randomness needs to be introduced in the model itself and propagated throughout the time series. This is a non-trivial task that has not been resolved yet.