## 12.2 Estimation of multiple seasonal models

While the main principles of model estimation discussed in Chapter 11 apply to multiple seasonal models as well, some specific aspects require additional attention. They mainly concern ADAM ETS and ADAM ARIMA.

### 12.2.1 ADAM ETS issues

Estimating a multiple seasonal ETS model is challenging because it implies a large optimisation task. In general, the number of parameters related to seasonal components is \(\sum_{j=1}^n m_j + n\): \(\sum_{j=1}^n m_j\) initial values and \(n\) smoothing parameters. For example, in the case of hourly data, a triple seasonal model for hours of day, hours of week, and hours of year will have \(m_1 = 24\), \(m_2 = 24 \times 7 = 168\), and \(m_3 = 24 \times 365 = 8760\), resulting overall in \(24 + 168 + 8760 + 3 = 8955\) parameters related to seasonal components. This is not a trivial task, and the optimiser would take hours to converge to the optimum unless the pre-initials (Section 11.4) are already close to it. So, if you want to construct a multiple seasonal ADAM ETS model, it makes sense to use a different initialisation (see the discussion in Section 11.4), reducing the number of estimated parameters. A possible solution in this case is backcasting (Subsection 11.4.1), which in our example would reduce the number of estimated parameters from 8955 to 3 (the smoothing parameters), substantially speeding up the model estimation process.
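The arithmetic above can be verified with a short calculation (illustrative Python, not part of any forecasting package):

```python
# Parameters related to seasonal components in a multiple seasonal
# ETS model: sum(m_j) initial seasonal states plus one smoothing
# parameter per seasonal component.
def seasonal_parameter_count(lags):
    return sum(lags) + len(lags)

# Triple seasonal model for hourly data: day, week, and year cycles.
lags = [24, 24 * 7, 24 * 365]
print(seasonal_parameter_count(lags))  # 24 + 168 + 8760 + 3 = 8955

# With backcasting, the initial states are not estimated directly,
# leaving only the smoothing parameters.
print(len(lags))  # 3
```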

Another consideration is the fitting of the model to the data. In conventional ETS, the size of the transition matrix is equal to the number of initial states, which makes it too slow to be practical on high-frequency data (multiplying an \(8952 \times 8952\) matrix by a vector is a challenging task even for modern computers). But due to the lagged structure of ADAM (discussed in Section 5), the construction of multiple seasonal models does not take as much time in ADAM ETS, because we end up multiplying a \(3 \times 3\) matrix by a vector with three rows (skipping level and trend, which would add two more elements). So, in ADAM, the main computational burden comes from the recursive relation in the state space model's transition equation, which needs to be repeated at least \(T\) times, where \(T\) is the sample size. This is still a computationally expensive task, so you would want to reach the optimum in as few iterations as possible. This gives another motivation for reducing the number of parameters to estimate, and thus for using backcasting.
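The computational advantage of the lagged structure can be illustrated with a minimal sketch (this is not the actual ADAM implementation; it assumes a pure additive model with three seasonal components, no level or trend, and arbitrary illustrative smoothing parameters):

```python
import random

# Each seasonal component is updated from its own lagged value, so the
# per-step cost is O(n) for n components, instead of multiplying a huge
# transition matrix by the full state vector.
lags = [24, 24 * 7, 24 * 365]
n = len(lags)
maxlag = max(lags)
T = 1000

random.seed(42)
y = [random.gauss(0, 1) for _ in range(T)]  # placeholder series
alphas = [0.05] * n                         # illustrative smoothing parameters

# One buffer of past values per component, pre-initialised with zeros
# (in practice these would come from pre-initials or backcasting).
states = [[0.0] * (maxlag + T) for _ in range(n)]

for t in range(T):
    # Fetch each component's value from its own lag...
    lagged = [states[j][maxlag + t - lags[j]] for j in range(n)]
    e = y[t] - sum(lagged)                  # one-step-ahead error
    # ...and update only those n elements.
    for j in range(n):
        states[j][maxlag + t] = lagged[j] + alphas[j] * e
```

The loop over \(t\) is the recursive relation mentioned above: it cannot be avoided, which is why reducing the number of optimiser iterations matters.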

Another potential simplification is to use deterministic seasonality for some of the seasonal frequencies. A possible solution in this case is to use explanatory variables (Section 10) for the higher frequency states (see the discussion in Section 12.3) or to use a multiple seasonal ETS model, setting some of the smoothing parameters to zero.
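One common way to encode deterministic seasonality as explanatory variables is via Fourier terms; a sketch of how such regressors could be generated for the annual cycle of hourly data (the function name and interface are illustrative, not from any package):

```python
import math

# K pairs of sin/cos regressors for a seasonal period m, one row per
# observation. A handful of smooth Fourier terms can replace the
# estimation of m stochastic seasonal states (m = 8760 for the annual
# cycle of hourly data).
def fourier_terms(T, m, K):
    return [[f(2 * math.pi * k * t / m)
             for k in range(1, K + 1)
             for f in (math.sin, math.cos)]
            for t in range(T)]

X = fourier_terms(T=100, m=8760, K=3)  # 100 observations, 6 regressors
```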

Finally, given that we deal with large samples, some states of ETS might become more reactive than needed, having higher than required smoothing parameters. One possible way to overcome this limitation is to use multistep loss functions (Section 11.3). For example, Kourentzes and Trapero (2018) showed that using loss functions such as TMSE (Subsection 11.3.2) in the estimation of ETS models on high-frequency data leads to improvements in accuracy due to the shrinkage of parameters towards zero, mitigating the potential overfitting issue. The only problem with this approach is that it is more computationally expensive than the conventional likelihood and would take at least \(h\) times longer, where \(h\) is the length of the forecast horizon.
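The idea behind TMSE can be sketched as follows: sum the mean squared errors over forecast horizons 1 to \(h\) instead of relying on one-step-ahead errors only. The `forecast_fun(history, step)` argument below is a placeholder for any forecasting method, not a function from an actual package:

```python
# A sketch of a trace-style multistep loss: for each horizon step,
# compute the in-sample mean squared step-ahead error, then sum the
# MSEs across horizons 1..h.
def tmse(y, forecast_fun, h):
    T = len(y)
    losses = []
    for step in range(1, h + 1):
        errors = [y[t + step] - forecast_fun(y[:t + 1], step)
                  for t in range(T - h)]
        losses.append(sum(e ** 2 for e in errors) / len(errors))
    return sum(losses)

# Example with a naive forecast (last observed value repeated): on a
# linear trend, the step-ahead error is always equal to the step, so
# TMSE = 1 + 4 + 9.
naive = lambda history, step: history[-1]
print(tmse(list(range(20)), naive, h=3))  # 14.0
```

The loop over horizons is what makes the loss roughly \(h\) times more expensive to evaluate than a one-step-ahead criterion.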

### 12.2.2 ADAM ARIMA issues

It is also possible to fit a Multiple Seasonal ARIMA (discussed partially in Subsection 8.2.3) to high-frequency data: for example, Taylor (2010) used a triple seasonal ARIMA on two time series and demonstrated that it produced more accurate forecasts than the other ARIMAs under consideration, even slightly outperforming ETS. The main issue with ARIMA arises at the order selection stage. While in the case of ETS one can decide which model to use based on judgment (e.g. there is no apparent trend, and the amplitude increases with the level, so we will fit the ETS(M,N,M) model), ARIMA requires more careful consideration of the possible orders of the model. Selecting appropriate ARIMA orders is not a trivial task on its own, but choosing the orders on high-frequency data (where correlations might appear significant just because of the sample size) becomes even more challenging than usual. Furthermore, while on monthly data we typically set the maximum AR and MA orders to 3 or 5, this has no particular merit in the case of high-frequency data. If the first seasonal component has a lag of 24, then, in theory, anything up to 24 might be helpful for the model. Long story short, be prepared for a lengthy investigation of appropriate ARIMA orders. While ADAM ARIMA implements an efficient order selection mechanism (see Section 15.2), it does not guarantee that the most appropriate model will be applied to the data. Inevitably, you would need to analyse the residuals of the applied model, add higher ARIMA orders, and see whether the model's performance improves.

A related issue in the context of ADAM ARIMA (Section 9.1) is the dimensionality problem: the more orders you introduce in the model, the bigger the transition matrix becomes. This leads to the same issues as in ADAM ETS, discussed in the previous subsection. There is no unique recipe in this challenging situation, but backcasting (Subsection 11.4.1) addresses some of these issues. You might also want to fine-tune the optimiser to strike a balance between speed and accuracy in the estimation of parameters (see the discussion in Subsection 11.4.1).