smooth v3.2.0: what’s new?

The smooth package has reached version 3.2.0 and is now on CRAN. While the version change from 3.1.7 to 3.2.0 looks small, this release introduces several substantial changes and represents a first step in moving to the new C++ code in the core of the functions. In this short post, I will outline the main new features of smooth 3.2.0.

New engines for ETS, MSARIMA and SMA

The first and one of the most important changes is the new engine for ETS (Error-Trend-Seasonal exponential smoothing model), MSARIMA (Multiple Seasonal ARIMA) and SMA (Simple Moving Average), implemented respectively in the es(), msarima() and sma() functions. The new engine was developed for adam(), and the three models above can be considered as special cases of it. You can read more about ETS in the ADAM monograph, starting from Chapter 4; MSARIMA is discussed in Chapter 9, while SMA is briefly discussed in Subsection 3.3.3.

The es() function now implements an ETS close to the conventional one, assuming that the error term follows a normal distribution. It still supports explanatory variables (discussed in Chapter 10 of the ADAM monograph) and advanced estimators (Chapter 11), and it has the same syntax as the previous version of the function, but it now acts as a wrapper for adam(). This means that it is now faster, more accurate and requires less memory than it used to. msarima(), being a wrapper for adam() as well, is also faster and more accurate than it used to be. In addition, both functions now support the methods that were developed for adam(), including vcov(), confint(), summary(), rmultistep(), reapply(), plot() and others. So you can now do a more thorough analysis and improve the models using all these advanced instruments (see, for example, Chapter 14 of ADAM).
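As a rough illustration of what this gives you, here is a minimal sketch that applies some of the methods listed above to a model estimated with es(). The object name esModel and the 90% confidence level are arbitrary choices made for this example:

```r
library(smooth)

# Fit a damped-trend ETS via es(), which now acts as a wrapper for adam()
esModel <- es(BJsales, model="AAdN")

# Parameter estimates with standard errors and confidence intervals
summary(esModel)

# Confidence intervals on their own, here at the 90% level
esIntervals <- confint(esModel, level=0.9)
esIntervals

# Covariance matrix of the estimated parameters
esCovariance <- vcov(esModel)
esCovariance
```

The same calls should work for a model estimated with msarima(), given that it relies on the same engine.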

The main reason why I moved the functions to the new engine was to clean up the code and remove the old chunks that were developed when I only started learning C++. A side effect, as you see, is that the functions have now been improved in a variety of ways.

And to be on the safe side, the old versions of the functions are still available in smooth under the names es_old(), msarima_old() and sma_old(). They will be removed from the package if it ever reaches v4.0.0.

New methods for ADAM

There are two new methods for adam() that can be used in a variety of cases. The first one is simulate(), which will generate data based on the estimated ADAM, whatever the original model is (e.g. a mixture of ETS, ARIMA and regression on data with multiple frequencies). Here is how it can be used:

adam(BJsales, "AAdN") |>
     simulate() |>
     plot()

which will produce a plot similar to the following:

Simulated data based on adam() applied to Box-Jenkins sales data


This can be used for research, when a more controlled environment is needed. If you want to fine-tune the parameters of ADAM before simulating the data, you can save the output in an object and amend its parameters. For example:

testModel <- adam(BJsales, "AAdN")
testModel$persistence <- c(0.5, 0.2)
simulate(testModel)

The second new method is xtable() from the respective xtable package. It produces a LaTeX version of the table from the summary of ADAM. Here is an example of a summary from ADAM ETS:

adam(BJsales, "AAdN") |>
     summary()
Model estimated using adam() function: ETS(AAdN)
Response variable: BJsales
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 256.1516
Coefficients:
      Estimate Std. Error Lower 2.5% Upper 97.5%  
alpha   0.9514     0.1292     0.6960      1.0000 *
beta    0.3328     0.2040     0.0000      0.7358  
phi     0.8560     0.1671     0.5258      1.0000 *
level 203.2835     5.9968   191.4304    215.1289 *
trend  -2.6793     4.7705   -12.1084      6.7437  

Error standard deviation: 1.3623
Sample size: 150
Number of estimated parameters: 6
Number of degrees of freedom: 144
Information criteria:
     AIC     AICc      BIC     BICc 
524.3032 524.8907 542.3670 543.8387

As you can see in the output above, the function generates the confidence intervals for the parameters of the model, including the smoothing parameters, dampening parameter and the initial states. This summary can then be used to generate the LaTeX code for the main part of the table:

adam(BJsales, "AAdN") |>
     xtable()

which will look something like this:

Summary of adam()


Other improvements

First, one of the major changes in smooth functions is the new backcasting mechanism for adam(), es() and msarima() (this is discussed in Section 11.4 of the ADAM monograph). The main difference from the old one is that it no longer backcasts the parameters for the explanatory variables and instead estimates them separately via optimisation. This feature turned out to be important for some users who wanted to try MSARIMAX/ETSX (a model with explanatory variables) with backcasting as the initialisation. These users then wanted to get a summary and analyse the uncertainty around the estimates of parameters for exogenous variables, but could not, because the previous implementation did not estimate them explicitly. This is now available. Here is an example:

cbind(BJsales, BJsales.lead) |>
    adam(model="AAdN", initial="backcasting") |>
    summary()
Model estimated using adam() function: ETSX(AAdN)
Response variable: BJsales
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 255.1935
Coefficients:
             Estimate Std. Error Lower 2.5% Upper 97.5%  
alpha          0.9724     0.1108     0.7534      1.0000 *
beta           0.2904     0.1368     0.0199      0.5607 *
phi            0.8798     0.0925     0.6970      1.0000 *
BJsales.lead   0.1662     0.2336    -0.2955      0.6276  

Error standard deviation: 1.3489
Sample size: 150
Number of estimated parameters: 5
Number of degrees of freedom: 145
Information criteria:
     AIC     AICc      BIC     BICc 
520.3870 520.8037 535.4402 536.4841

As you can see in the output above, the initial level and trend of the model are not reported, because they were estimated via backcasting. However, we get the value of the parameter for BJsales.lead and the uncertainty around it. The old backcasting approach is now called "complete", implying that all values of the state vector are produced via backcasting.
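To see the difference between the two initialisations, the following sketch fits the same ETSX model twice. It assumes that "complete" is passed via the same initial argument as "backcasting"; the object names are made up for this example:

```r
library(smooth)

# Response in the first column, explanatory variable in the second
xData <- cbind(BJsales, BJsales.lead)

# New default backcasting: the parameter for BJsales.lead is estimated via optimisation
modelBackcasting <- adam(xData, model="AAdN", initial="backcasting")

# Old behaviour (assumed to be requested via initial="complete"):
# all values of the state vector are obtained via backcasting
modelComplete <- adam(xData, model="AAdN", initial="complete")

# Compare which parameters were estimated explicitly in each case
coef(modelBackcasting)
coef(modelComplete)
```

If the description above holds, the parameter for BJsales.lead should appear among the coefficients of the first model only.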

Second, forecast.adam() now has a parameter scenarios, which, when TRUE, will return the simulated paths from the model. This only works when interval="simulated" and can be used for the analysis of possible forecast trajectories.
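A minimal sketch of how this could be used is shown below. The post does not say under which name the simulated paths are stored in the returned object, so the example only lists the elements of the forecast rather than assuming a specific one:

```r
library(smooth)

# Estimate the model, keeping the last 10 observations as a holdout
testModel <- adam(BJsales, "AAdN", h=10, holdout=TRUE)

# Ask for simulated intervals and keep the simulated paths
testForecast <- forecast(testModel, h=10, interval="simulated", scenarios=TRUE)

# Inspect what the forecast object contains to locate the simulated paths
names(testForecast)
```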

Third, the plot() method can now also produce ACF/PACF for the squared residuals for all smooth functions. This becomes useful if you suspect that your data has ARCH elements and want to see if they need to be modelled separately. This can also be done using adam() and sm() and is discussed in Chapter 17 of the monograph.
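For reference, the same kind of diagnostic can be reproduced by hand with base R. This is only a sketch of the idea and not the code used inside plot(); check ?plot.adam for the value of the which argument that selects these plots:

```r
library(smooth)

testModel <- adam(BJsales, "AAdN")

# Squared residuals of the model: autocorrelation in them hints at ARCH effects
squaredResiduals <- residuals(testModel)^2

# ACF and PACF of the squared residuals using the standard stats functions
acf(squaredResiduals, main="ACF of squared residuals")
pacf(squaredResiduals, main="PACF of squared residuals")
```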

Finally, the sma() function now has the fast parameter, which, when TRUE, will use a modified ternary search for the best order based on information criteria. It might not give the global minimum, but it works much faster than the exhaustive search.
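A quick way to compare the two search strategies is to fit the model both ways and time the calls. This is a sketch with made-up object names; on a series as short as BJsales the difference in time is likely to be small:

```r
library(smooth)

# Modified ternary search over the order
timeFast <- system.time(smaFast <- sma(BJsales, fast=TRUE))

# Exhaustive search over the order
timeFull <- system.time(smaFull <- sma(BJsales, fast=FALSE))

# Print the selected models and the elapsed times
smaFast
smaFull
timeFast["elapsed"]
timeFull["elapsed"]
```

If the two models end up with different orders, the fast search has stopped at a local minimum of the information criterion.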

Conclusions

These are the main new features in the package. I feel that the main job in smooth is already done, and all I can do now is tune the functions and improve the existing code. I want to move all the functions to the new engine and ditch the old one, but this requires much more time than I have. So, I don't expect to finish this any time soon, but I hope I'll get there someday. On the other hand, I'm not sure that spending much time on developing an R package is a wise idea, given that nowadays people tend to use Python. I would develop a Python analogue of the smooth package, but currently I don't have the necessary expertise and time to do that. Besides, there already exist great libraries, such as statsforecast from Nixtla and sktime. I am not sure that another library implementing ETS and ARIMA is needed in Python. What do you think?

Comments (3):

    • Hi Mark,

      I only judge by LinkedIn posts and the general rise in popularity of Python over R in business. Python's statistical methods seem to be lagging behind the R ones, but I think this will change soon. Nonetheless, I plan to continue supporting my packages anyway :).

      Cheers,
      Ivan
