The smooth package has reached version 3.2.0 and is now on CRAN. While the version change from 3.1.7 to 3.2.0 looks small, this release introduces several substantial changes and represents a first step in moving to the new C++ code in the core of the functions. In this short post, I will outline the main new features of smooth 3.2.0.
New engines for ETS, MSARIMA and SMA
The first and one of the most important changes is the new engine for ETS (Error-Trend-Seasonal exponential smoothing model), MSARIMA (Multiple Seasonal ARIMA) and SMA (Simple Moving Average), implemented respectively in the es(), msarima() and sma() functions. The new engine was developed for adam(), and the three models above can be considered as special cases of it. You can read more about ETS in the ADAM monograph, starting from Chapter 4; MSARIMA is discussed in Chapter 9, while SMA is briefly discussed in Subsection 3.3.3.
The es() function now implements ETS close to the conventional one, assuming that the error term follows a normal distribution. It still supports explanatory variables (discussed in Chapter 10 of the ADAM monograph) and advanced estimators (Chapter 11), and it has the same syntax as the previous version of the function, but it now acts as a wrapper for adam(). This means that it is now faster, more accurate and requires less memory than it used to. msarima(), being a wrapper of adam() as well, is now also faster and more accurate than it used to be. In addition, both functions now support the methods that were developed for adam(), including vcov(), confint(), summary(), rmultistep(), reapply(), plot() and others. So, now you can do a more thorough analysis and improve the models using all these advanced instruments (see, for example, Chapter 14 of ADAM).
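As a rough illustration of the point above, the sketch below fits ETS via es() and then applies a few of the listed methods. It assumes that smooth 3.2.0 or later is installed; the BJsales data and the "AAdN" model are chosen only to mirror the other examples in this post, and the object name etsModel is arbitrary.

```r
library(smooth)

# Fit ETS(A,Ad,N) via es(), which is now a wrapper for adam()
etsModel <- es(BJsales, model="AAdN")

# Covariance matrix of the estimated parameters
vcov(etsModel)

# Confidence intervals for the estimated parameters
confint(etsModel, level=0.95)

# Summary table with standard errors and intervals
summary(etsModel)

# Diagnostic plots for the fitted model
plot(etsModel)
```

The same calls should work for a model estimated via msarima(), given that it relies on the same engine.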
The main reason why I moved the functions to the new engine was to clean up the code and remove the old chunks that were developed when I only started learning C++. A side effect, as you see, is that the functions have now been improved in a variety of ways.
And to be on the safe side, the old versions of the functions are still available in smooth under the names es_old(), msarima_old() and sma_old(). They will be removed from the package if it ever reaches v4.0.0.
New methods for ADAM
There are two new methods for adam() that can be used in a variety of cases. The first one is simulate(), which will generate data based on the estimated ADAM, whatever the original model is (e.g. a mixture of ETS, ARIMA and regression on data with multiple frequencies). Here is how it can be used:
adam(BJsales, "AAdN") |> simulate() |> plot()
which will plot the data simulated from the estimated model.
This can be used for research, when a more controlled environment is needed. If you want to fine tune the parameters of ADAM before simulating the data, you can save the output in an object and amend its parameters. For example:
testModel <- adam(BJsales, "AAdN")
testModel$persistence <- c(0.5, 0.2)
simulate(testModel)
The second new method is xtable() from the respective xtable package. It produces a LaTeX version of the table from the summary of ADAM. Here is an example of a summary from ADAM ETS:
adam(BJsales, "AAdN") |> summary()
Model estimated using adam() function: ETS(AAdN)
Response variable: BJsales
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 256.1516
Coefficients:
      Estimate Std. Error Lower 2.5% Upper 97.5%
alpha   0.9514     0.1292     0.6960      1.0000 *
beta    0.3328     0.2040     0.0000      0.7358
phi     0.8560     0.1671     0.5258      1.0000 *
level 203.2835     5.9968   191.4304    215.1289 *
trend  -2.6793     4.7705   -12.1084      6.7437

Error standard deviation: 1.3623
Sample size: 150
Number of estimated parameters: 6
Number of degrees of freedom: 144
Information criteria:
     AIC     AICc      BIC     BICc
524.3032 524.8907 542.3670 543.8387
As you can see in the output above, the function generates the confidence intervals for the parameters of the model, including the smoothing parameters, dampening parameter and the initial states. This summary can then be used to generate the LaTeX code for the main part of the table:
adam(BJsales, "AAdN") |> xtable()
which will produce the LaTeX code for the table of coefficients shown above.
Other improvements
First, one of the major changes in smooth functions is the new backcasting mechanism for adam(), es() and msarima() (this is discussed in Section 11.4 of the ADAM monograph). The main difference from the old one is that it no longer backcasts the parameters for the explanatory variables and instead estimates them separately via optimisation. This feature turned out to be important for some users who wanted to try MSARIMAX/ETSX (a model with explanatory variables) with backcasting as the initialisation. These users then wanted to get a summary, analysing the uncertainty around the estimates of parameters for exogenous variables, but could not, because the previous implementation would not estimate them explicitly. This is now available. Here is an example:
cbind(BJsales, BJsales.lead) |> adam(model="AAdN", initial="backcasting") |> summary()
Model estimated using adam() function: ETSX(AAdN)
Response variable: BJsales
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 255.1935
Coefficients:
             Estimate Std. Error Lower 2.5% Upper 97.5%
alpha          0.9724     0.1108     0.7534      1.0000 *
beta           0.2904     0.1368     0.0199      0.5607 *
phi            0.8798     0.0925     0.6970      1.0000 *
BJsales.lead   0.1662     0.2336    -0.2955      0.6276

Error standard deviation: 1.3489
Sample size: 150
Number of estimated parameters: 5
Number of degrees of freedom: 145
Information criteria:
     AIC     AICc      BIC     BICc
520.3870 520.8037 535.4402 536.4841
As you can see in the output above, the initial level and trend of the model are not reported, because they were estimated via backcasting. However, we get the value of the parameter for BJsales.lead and the uncertainty around it. The old backcasting approach is now called "complete", implying that all values of the state vector are produced via backcasting.
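To make the difference between the two initialisation options more tangible, here is a minimal sketch that estimates the same ETSX model with initial="backcasting" and with initial="complete". It assumes smooth 3.2.0 or later; the object names are arbitrary, and the printed coefficients are left for you to inspect rather than asserted here.

```r
library(smooth)

# The response and the explanatory variable in one object
xregData <- cbind(BJsales, BJsales.lead)

# New backcasting: the states are backcasted, while the parameter
# for BJsales.lead is estimated via optimisation
modelBackcasting <- adam(xregData, model="AAdN", initial="backcasting")

# The old approach, now called "complete": all values of the
# state vector are produced via backcasting
modelComplete <- adam(xregData, model="AAdN", initial="complete")

# Compare the explicitly estimated parameters of the two models
coef(modelBackcasting)
coef(modelComplete)
```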
Second, forecast.adam() now has a parameter scenarios, which, when TRUE, will return the simulated paths from the model. This only works when interval="simulated" and can be used for the analysis of possible forecast trajectories.
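A possible way of calling it is sketched below. It assumes smooth 3.2.0 or later; the horizon h=10 is an arbitrary choice, and the sketch deliberately does not assume the name of the element that holds the simulated paths, so it only lists what the returned object contains.

```r
library(smooth)

model <- adam(BJsales, "AAdN")

# scenarios=TRUE only has an effect together with interval="simulated"
modelForecast <- forecast(model, h=10, interval="simulated", scenarios=TRUE)

# Check which elements were returned and look for the simulated paths
names(modelForecast)
```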
Third, the plot() method can now also produce ACF/PACF for the squared residuals for all smooth functions. This becomes useful if you suspect that your data has ARCH elements and want to see if they need to be modelled separately. Such modelling can be done using adam() and sm() and is discussed in Chapter 17 of the monograph.
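For reference, the same kind of diagnostics can be produced by hand with base R. The sketch below does not use the new option of the plot() method (the respective argument value is not given in this post, so it is not guessed here); instead, it extracts the residuals and passes their squares to acf() and pacf() from the stats package.

```r
library(smooth)

model <- adam(BJsales, "AAdN")

# Squared residuals of the estimated model
squaredResiduals <- residuals(model)^2

# Significant spikes in these plots would hint at ARCH-type effects
acf(squaredResiduals, main="ACF of squared residuals")
pacf(squaredResiduals, main="PACF of squared residuals")
```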
Finally, the sma() function now has the fast parameter, which, when TRUE, will use a modified ternary search for the best order based on information criteria. It might not give the global minimum, but it works much faster than the exhaustive search.
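To give an idea of why this is faster, here is a sketch of a plain ternary search over integer values. This is not the modified version used inside sma(), only the textbook idea: at each step, the criterion is evaluated at two interior points and roughly a third of the search interval is discarded. The function ternarySearchMin() and the toy criterion are hypothetical and written for illustration only.

```r
# Plain ternary search for the minimum of f over the integers lower..upper.
# It is guaranteed to find the minimum only if f is unimodal on that range.
ternarySearchMin <- function(f, lower, upper) {
  while (upper - lower > 2) {
    third <- (upper - lower) %/% 3
    m1 <- lower + third
    m2 <- upper - third
    if (f(m1) < f(m2)) {
      # The minimum cannot be at m2 or to the right of it
      upper <- m2 - 1
    } else {
      # The minimum cannot be at m1 or to the left of it
      lower <- m1 + 1
    }
  }
  # At most three candidates are left, so check them directly
  candidates <- lower:upper
  values <- vapply(candidates, f, numeric(1))
  candidates[which.min(values)]
}

# Toy criterion with the minimum at the order 17
toyCriterion <- function(order) (order - 17)^2

bestOrder <- ternarySearchMin(toyCriterion, 1, 150)
bestOrder
# 17
```

In the SMA setting, the criterion would be an information criterion of the model with a given order, which is not necessarily unimodal. This is why such a search might stop in a local minimum, while needing far fewer model fits than the exhaustive search over all orders.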
Conclusions
These are the main new features in the package. I feel that the main job in smooth is already done, and all I can do now is just tune the functions and improve the existing code. I want to move all the functions to the new engine and ditch the old one, but this requires much more time than I have. So, I don't expect to finish this any time soon, but I hope I'll get there someday. On the other hand, I'm not sure that spending much time on developing an R package is a wise idea, given that nowadays people tend to use Python. I would develop a Python analogue of the smooth package, but currently I don't have the necessary expertise and time to do that. Besides, there already exist great libraries, such as statsforecast from Nixtla and sktime. I am not sure that another library implementing ETS and ARIMA is needed in Python. What do you think?
Great work; stick with R…
Nice work.
I’m not sure there is good evidence for people who do good time series work favouring python over R, is there?
Hi Mark,
I only judge by LinkedIn posts and the general rise in popularity of Python over R in business. Python's statistical methods seem to be lagging behind the R ones, but I think this will change soon. Nonetheless, I plan to continue supporting my packages anyway :).
Cheers,
Ivan