Good news! The smooth package v4.3.0 is now on CRAN. And there are several things worth mentioning, so I have written this post.
New default initialisation mechanism
Since the beginning of the package, the smooth
functions supported three ways for initialising the state vector (the vector that includes level, trend, seasonal indices): optimisation, backcasting and values provided by user. The former has been considered the standard way of estimating ETS, while the backcasting was originally proposed by Box & Jenkins (1970) and was only implemented in the smooth
(at least, I haven’t seen it anywhere else). The main advantage of the latter is in computational time, because you do not need to estimate every single value of the state vector. The new ADAM core that I developed during COVID lockdown, had some improvements for the backcasting, and I noticed that adam()
produced more accurate forecasts with it than with the optimisation. But I needed more testing, so I have not changed anything back then.
However, my recent work with Kandrika Pritularga on capturing uncertainty in ETS, have demonstrated that backcasting solves some fundamental problems with the variance of states – the optimisation cannot handle so many parameters, and asymptotic properties of ETS do not make sense in that case (we’ll release the paper as soon as we finish the experiments). So, with this evidence on hands and additional tests, I have made a decision to switch from the optimisation to backcasting as the default initialisation mechanism for all the smooth functions.
The final users should not feel much difference, but it should work faster now and (hopefully) more accurately. If this is not the case, please get in touch or file an issue on github.
Also, rest assured the initial="optimal"
is available and will stay available as an option in all the smooth
functions, so, you can always switch back to it if you don’t like backcasting.
Finally, I have introduce a new initialisation mechanism called “two-stage”, the idea of which is to apply backcasting first and then to optimise the obtained state values. It is slower, but is supposed to be better than the standard optimisation.
ADAM core
Every single function in the smooth
package now uses ADAM C++ core, and the old core will be discontinued starting from v4.5.0 of the package. This applies to the functions: es()
, ssarima()
, msarima()
, ces()
, gum()
, sma()
. There are now the legacy versions of these functions in the package with the prefix “_old” (e.g. es_old()
), which will be removed in the smooth v4.5.0. The new engine also helped ssarima()
, which now became slightly more accurate than before. Unfortunately, there are still some issues with the initialisation of the seasonal ssarima()
, which I have failed to solve completely. But I hope that over time this will be resolved as well.
smooth performance update
I have applied all the smooth functions together with the ets()
and auto.arima()
from the forecast
package to the M1, M3 and Tourism competition data and have measured their performances in terms of RMSSE, scaled Cumulative Error (sCE) and computational time. I used the following R code for that:
The table below shows the distribution of RMSSE, the mean sCE and mean Time. The boldface shows the best performing model.
min Q1 median Q3 max mean sCE Time ETS 0.024 0.677 1.181 2.376 51.616 1.970 0.299 0.385 Auto ARIMA 0.025 0.680 1.179 2.358 51.616 1.986 0.124 1.467 ADAM ETS Back 0.015 0.666 1.175 2.276 51.616 1.921 0.470 0.218 ADAM ETS Opt 0.020 0.666 1.190 2.311 51.616 1.937 0.299 0.432 ADAM ETS Two 0.025 0.666 1.179 2.330 51.616 1.951 0.330 0.579 ES Back 0.015 0.672 1.174 2.284 51.616 1.921 0.464 0.219 ES Opt 0.020 0.672 1.186 2.316 51.616 1.943 0.302 0.497 ES Two 0.024 0.668 1.181 2.346 51.616 1.952 0.346 0.562 MSARIMA Back 0.025 0.710 1.188 2.383 51.616 2.028 0.067 0.780 MSARIMA Opt 0.025 0.724 1.242 2.489 51.616 2.083 0.088 1.905 MSARIMA Two 0.025 0.718 1.250 2.485 51.906 2.075 0.083 2.431 SSARIMA Back 0.045 0.738 1.248 2.383 51.616 2.063 0.167 1.747 SSARIMA Opt 0.025 0.774 1.292 2.413 51.616 2.040 0.178 7.324 SSARIMA Two 0.025 0.742 1.241 2.414 51.616 2.027 0.183 8.096 CES Back 0.046 0.695 1.189 2.355 51.342 1.981 0.125 0.185 CES Opt 0.030 0.698 1.218 2.327 49.480 2.001 -0.135 0.834 CES Two 0.025 0.696 1.207 2.343 51.242 1.993 -0.078 1.006 GUM Back 0.046 0.707 1.215 2.399 51.134 2.049 -0.285 3.575 GUM Opt 0.026 0.795 1.381 2.717 240.143 2.932 -0.549 4.668 GUM Two 0.026 0.803 1.406 2.826 240.143 3.041 -0.593 4.703
Several notes:
- ES is a wrapper of ADAM ETS. The main difference between them is that the former uses the Gamma distribution for the multiplicative error models, while the latter relies on the Normal one.
- MSARIMA is a wrapper for ADAM ARIMA, which is why I don't report the latter in the results.
One thing you can notice from the output above, is that the models with backcasting consistently produce more accurate forecasts across all measures. I explain this with the idea that they tend not to overfit the data as much as the optimal initialisation does.
To see the stochastic dominance of the forecasting models, I conducted the modification of the MCB/Nemenyi test, explained in this post:
par(mar=c(10,3,4,1)) greybox::rmcb(t(testResults20250603[c(1:8,12:23),,"RMSSE"]), outplot="mcb")
The image shows mean ranks for each of the models and whether the performance of those is significant on the 5% level or not. It is apparent that ADAM ETS has the lowest rank, no matter what the initialisation is used, but its performance does not differ significantly from the es()
, ets()
and auto.arima()
. Also, auto.arima()
significantly outperforms msarima()
and ssarima()
on this data, which could be due to their initialisation. Still, backcasting seems to help all the functions in terms of accuracy in comparison with the "optimal" and "two-stage" initials.
What's next?
I am now working on a modified formulation for ETS, which should fix some issues with the multiplicative trend and make the ETS safer. This is based on Section 6.6 of the online version of the ADAM monograph (it is not in the printed version). I am not sure whether this will improve the accuracy further, but I hope that it will make some of the ETS models more resilient than they are right now. I specifically need the multiplicative trend model, which sometimes behave like crazy due to its formulation.
I also plan to translate all the simulation functions to the ADAM core. This applies to sim.es()
, sim.ssarima()
, sim.gum()
and sim.ces()
. Currently they rely on the older one, and I want to get rid of it. Having said that, the method simulate()
applied to the new smooth
functions already uses the new core. It just lacks the flexibility that the other functions have.
Furthermore, I want to rewrite the oes()
function and substitute it with oadam()
, which would use a better engine, supporting more features, such as multiple frequencies and ARIMA for the occurrence. This is a lot of work, and I probably will need help with that.
Finally, Filotas Theodosiou, Leonidas Tsaprounis, and I are working on the translation of the R code of the smooth
to Python. You can read a bit more about this project here. There are several other people who decided to help us, but the progress so far has been a bit slow, because of the code translation. If you want to help, please get in touch.