10.5 Dealing with categorical variables in ADAMX

In a regression context, categorical variables are typically expanded into a set of dummy variables (see Chapter 10 of Svetunkov, 2022a). For example, a variable “promotions” that can be “light”, “medium” or “heavy” for different observations \(t\) would be expanded into three dummy variables, promoLight, promoMedium and promoHeavy, each of which is equal to 1 when the respective promotion type happens and to 0 otherwise. When including these variables in the model, we would typically drop one of them (promoHeavy here, sometimes called the pivot variable) and have a model with two dummy variables of the type:

\[\begin{equation} y_t = a_0 + a_1 x_{1,t} + \dots + a_n x_{n,t} + d_1 promoLight_t + d_2 promoMedium_t + \epsilon_t, \tag{10.31} \end{equation}\]

where \(d_i\) is the parameter for the \(i\)-th dummy variable. The same procedure can be done in the context of ADAMX, and the principles will be exactly the same for ADAMX{S}. However, when it comes to the dynamic model, the parameters have time indices, and there can be different ways of formulating the model. Here is the first one:

\[\begin{equation} \begin{aligned} & y_{t} = a_{0,t-1} + a_{1,t-1} x_{1,t} + \dots + a_{n,t-1} x_{n,t} + d_{1,t-1} promoLight_t + d_{2,t-1} promoMedium_t + \epsilon_t \\ & a_{i,t} = a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\epsilon_t}{x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \\ & d_{1,t} = d_{1,t-1} + \left \lbrace \begin{aligned} &\delta_{n+1} \epsilon_t, \text{ if } promoLight_t\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \\ & d_{2,t} = d_{2,t-1} + \left \lbrace \begin{aligned} &\delta_{n+2} \epsilon_t, \text{ if } promoMedium_t\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \end{aligned} . 
\tag{10.32} \end{equation}\]

Here we assume that the parameter for each category of the promotion variable changes over time on its own, with its own smoothing parameter (\(\delta_{n+1}\) and \(\delta_{n+2}\), respectively). Alternatively, we can assume that they share the same smoothing parameter, implying that the parameters change in a similar way across the categories of the variable:

\[\begin{equation} \begin{aligned} & d_{1,t} = d_{1,t-1} + \left \lbrace \begin{aligned} &\delta_{n+1} \epsilon_t, \text{ if } promoLight_t\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \\ & d_{2,t} = d_{2,t-1} + \left \lbrace \begin{aligned} &\delta_{n+1} \epsilon_t, \text{ if } promoMedium_t\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \end{aligned} . \tag{10.33} \end{equation}\]

The rationale for such a restriction is that we might expect the adaptation mechanism to apply to the promo variable as a whole, not to its specific values. This case also becomes useful in connecting the ETSX and the conventional seasonal ETS model. Let’s assume that we deal with quarterly data with no trend, and we have a categorical variable quarterOfYear, which can be First, Second, Third or Fourth, depending on the specific observation. For convenience, I will call the parameters for the dummy variables created from this categorical variable \(s_{1,t}, s_{2,t}, s_{3,t} \text{ and } s_{4,t}\). Based on (10.33), the model can then be formulated as:

\[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + s_{1,t-1} quarterOfYear_{1,t} + s_{2,t-1} quarterOfYear_{2,t} \\ & + s_{3,t-1} quarterOfYear_{3,t} + s_{4,t-1} quarterOfYear_{4,t} + \epsilon_t \\ & l_t = l_{t-1} + \alpha \epsilon_t \\ & s_{i,t} = s_{i,t-1} + \left \lbrace \begin{aligned} &\delta \epsilon_t \text{ for each } i \in \{1, \dots, 4\}, \text{ if } quarterOfYear_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \end{aligned} . 
\tag{10.34} \end{equation}\]

We intentionally added all four dummy variables in (10.34) to separate the seasonal effect from the level component. While this does not make much sense in the regression and ETSX{S} contexts, in ETSX{D} we avoid the dummy variable trap due to the dynamic update of parameters. Having done that, we have just formulated the conventional ETS(A,N,A) model using a set of dummy variables and one smoothing parameter, the difference being that the conventional model relies on the lagged seasonal component instead:

\[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + s_{t-4} + \epsilon_t \\ & l_t = l_{t-1} + \alpha \epsilon_t \\ & s_t = s_{t-4} + \gamma \epsilon_t \end{aligned} . \tag{10.35} \end{equation}\]

In (10.34), exactly one dummy variable is non-zero on each observation, so the parameter of a given quarter is updated once every four observations and stays unchanged in between. This is the same update as the one for \(s_t\) in (10.35), with \(\delta\) playing the role of \(\gamma\).

So, this comparison shows on the one hand that the mechanism of ADAMX{D} is natural for ADAM, and on the other hand that using the same smoothing parameter for all categories of a categorical variable can be a reasonable idea, especially in cases when we do not have grounds to assume that each category of the variable should evolve independently.
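
The correspondence between (10.34) and (10.35) can be checked numerically. The Python snippet below is only a sketch under stated assumptions: the function names `ets_ana` and `etsx_dynamic` are hypothetical (they are not part of any forecasting package), and the initial states, smoothing parameters and normally distributed errors are arbitrary values chosen for illustration. Both recursions are driven by the same error sequence, with \(\delta\) set equal to \(\gamma\).

```python
import numpy as np


def ets_ana(errors, level0, seasonal0, alpha, gamma):
    """ETS(A,N,A) recursion as in (10.35), driven by a given error sequence."""
    m = len(seasonal0)
    n = len(errors)
    # s[t] stores s_{t-m}; the first m entries are the initial seasonal states
    s = np.empty(n + m)
    s[:m] = seasonal0
    level = level0
    y = np.empty(n)
    for t in range(n):
        y[t] = level + s[t] + errors[t]
        level = level + alpha * errors[t]
        s[t + m] = s[t] + gamma * errors[t]
    return y


def etsx_dynamic(errors, level0, params0, dummies, alpha, delta):
    """Level plus dummy variables with dynamic parameters, as in (10.34)."""
    params = np.array(params0, dtype=float)
    level = level0
    n = len(errors)
    y = np.empty(n)
    for t in range(n):
        x = dummies[t]
        y[t] = level + params @ x + errors[t]
        level = level + alpha * errors[t]
        # only the parameter of the active category is updated
        params = params + delta * errors[t] * (x != 0)
    return y


# Illustrative (assumed) settings: quarterly data, 10 years
rng = np.random.default_rng(42)
n, m = 40, 4
errors = rng.normal(0.0, 1.0, n)
quarter = np.arange(n) % m              # 0, 1, 2, 3, 0, 1, ...
dummies = np.eye(m)[quarter]            # all four dummies, no pivot dropped
seasonal0 = np.array([-3.0, 1.0, 4.0, -2.0])

y_ets = ets_ana(errors, 100.0, seasonal0, alpha=0.3, gamma=0.1)
y_etsx = etsx_dynamic(errors, 100.0, seasonal0, dummies, alpha=0.3, delta=0.1)

print(np.allclose(y_ets, y_etsx))
```

Under these assumptions the two generated series coincide, because each quarter's parameter in `etsx_dynamic` is touched only when its dummy is active, which is every fourth observation. Giving each category its own smoothing parameter, as in (10.32), would break this equivalence with the single-\(\gamma\) seasonal model.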


• Svetunkov, I., 2022a. Statistics for business analytics. https://openforecast.org/sba/ (version: 31.03.2022)