smooth forecasting with the smooth package in Python

Here is another piece of news I have been hoping to deliver for quite some time now. We have finally created the first release of the smooth package for Python. Anyone interested? Read more!

Why does “smooth” exist?

There are lots of implementations of ETS and ARIMA (dynamic models) out there, both in Python and in R (and also in Julia now, see Durbyn). So, why bother creating yet another one?

The main philosophy of the smooth package in R is flexibility. It is not just an implementation of ETS or ARIMA – it is there to give you more control over what you can do with these models in different situations. The main function in the package is called “ADAM” – the Augmented Dynamic Adaptive Model. It is a single-source-of-error state space model that unites ETS, ARIMA, and regression, and supports the following features (taken from the Introduction to the book):

  1. ETS;
  2. ARIMA;
  3. Regression;
  4. TVP regression;
  5. Combination of (1), (2), and either (3) or (4), i.e. ARIMAX/ETSX;
  6. Automatic selection/combination of states for ETS;
  7. Automatic orders selection for ARIMA;
  8. Variables selection for regression;
  9. Normal and non-normal distributions;
  10. Automatic selection of the most suitable distribution;
  11. Multiple seasonality;
  12. Occurrence part of the model to handle zeroes in data (intermittent demand);
  13. Modelling scale of distribution (GARCH and beyond);
  14. A variety of ways to produce forecasts for different situations;
  15. Advanced loss functions for model estimation;

All of these features come with the ability to fine-tune the optimiser (e.g. how the parameters are estimated) and to manually adjust any model parameters you want. This allows, for example, fitting an iETSX(M,N,M) model with multiple frequencies and the Gamma distribution to better capture hourly emergency department arrivals (intermittent demand) and thus produce more accurate forecasts. No other package, in either R or Python, gives users such flexibility with dynamic models.

At the same time, over the years I have managed to iron out the R functions so that they handle almost any real-life situation without breaking. They also run quite fast and produce accurate forecasts, sometimes outperforming the other existing R implementations.

So, here we are. We want to bring this flexibility, robustness, and speed to Python.

A bit of history

The smooth package for R was first released on CRAN in 2016, when I finished my PhD. It has gone from v1.4.3 to v4.4.0 over the last 10 years. It saw a rise in popularity, but then an inevitable decline due to the decreasing number of R users in business. So, back in 2021, Rebecca Killick and I applied for an EPSRC grant to develop Python packages for forecasting and time series analysis. The idea was to translate what we have in R (including the greybox, smooth, forecast, and changepoint detection packages) to Python with the help of professional programmers. Unfortunately, we did not receive the funding (it went to sktime for a good reason – they already had an existing codebase in Python).

At the beginning of 2023, Leonidas Tsaprounis got in touch with me, offering to help with the development and translation of the smooth package to Python. The idea was to use the existing C++ core and simply create a wrapper in Python. “Simply” is actually an oversimplification here, because I am not a programmer, so my functions are messy and hard to read. Nonetheless, we started cooking. Leo helped set up pybind11 and carma and create the necessary files for compiling the C++ code. Just to test whether that worked, we managed to create a basic function for the simple moving average, based on sma() from the R version. Our progress was a bit slow, because we were both busy with other projects. All of that changed when, in July 2023, Filotas Theodosiou joined our small team and started working on the translation. We decided to implement the hardest thing first – ADAM.

What Fil did was use LLMs to translate the code from R to Python. Fil will write about this work at some point; the only thing I can say here is that it was not an easy process, because thousands of lines of R code needed to be translated to Python and then refactored. I only helped with suggestions and explanations of what is happening inside, and Leo provided guidance regarding the coding philosophy. It was Fil and his AI tools that did the main heavy lifting. By Summer 2025, we had a basic working ADAM() function, but it worked slightly differently from the one in R due to differences in initialisation and optimisation. He presented his work at the IIF Open Source Forecasting Software Workshop, explaining his experience with LLMs and code translation. Because everyone was pretty busy, it took a bit more time to reach the first proper release of smooth in Python.

In December 2025, I bought a Claude AI subscription and started vibe coding my way through the existing Python code. Between the three of us, we managed to progress the project, and finally in January 2026 we reached v1.0.0 of smooth in Python. Now ADAM works in exactly the same way in both R and Python: if you give it the same time series, it will select the same model and produce the same parameter estimates across languages. It took us time and effort to reach this, but we feel it is a critically important step – ensuring that users working in different languages have the same experience.

How to install

We are extremely grateful to Gustavo Niemeyer, who gave us the name smooth on PyPI. It had belonged to him since 2020, but the project was abandoned, and he agreed to transfer it to us. So, now you can install smooth simply by running:

pip install smooth

There is also a development version of the package, which you can install by following the instructions in our Installation wiki on GitHub.

What works

The package does not yet have the full functionality of its R counterpart, but some things already work:

  1. ADAM() – the main function that supports ETS with:
    • components selection via model="ZXZ" or other pools (see wiki on GitHub for details);
    • forecast combination using AIC weights via model="CCC" or other pools (again, explained in the wiki);
    • multiple seasonalities via lags=[24,168];
    • ability to provide some smoothing parameters or initial values (e.g. only alpha), letting the function estimate the rest;
    • different distributions;
    • advanced loss functions;
    • several options for model initialisation;
    • fine-tuning of the optimiser via nlopt_kargs (read more in the wiki);
  2. ES() – a wrapper of ADAM() that assumes the normal distribution. It supports the same functionality, but is a simplification of ADAM.
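The combination behind model="CCC" is based on AIC weights. As a toy sketch of the mechanics (standard Akaike weights in plain Python; the AIC values and forecasts below are made up for illustration and are not produced by the package):

```python
import math

def aic_weights(aics):
    """Akaike weights: exp(-0.5 * delta_i), normalised to sum to one,
    where delta_i = AIC_i - min(AIC)."""
    best = min(aics)
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]

# Combine point forecasts of two hypothetical models over h=2 steps
aics = [1745.4, 1747.4]           # made-up AIC values of two fitted models
forecasts = [[100.0, 110.0],      # model 1 point forecasts
             [104.0, 112.0]]      # model 2 point forecasts
w = aic_weights(aics)
combined = [sum(wi * f[j] for wi, f in zip(w, forecasts)) for j in range(2)]
# The lower-AIC model receives the larger weight
```

A difference of 2 AIC units translates into roughly a 73/27 split between the two models, so the combined forecast sits much closer to the first model.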

We also have the standard methods for fitting and forecasting, and many attributes that allow extracting information from the model, all of which are explained in the wiki of the project.

An example

Here is an example of how to work with ADAM in Python. For this example to work, you will need to install the fcompdata package from pip:
pip install fcompdata

The example:

from smooth import ADAM
from fcompdata import M3

model = ADAM(model="ZXZ", lags=12)
model.fit(M3[2568].x)

print(model)

This is what you should see as the result of print:

Time elapsed: 0.21 seconds
Model estimated using ADAM() function: ETS(MAM)
With backcasting initialisation
Distribution assumed in the model: Gamma
Loss function type: likelihood; Loss function value: 868.7085
Persistence vector g:
 alpha   beta  gamma
0.0205 0.0203 0.1568
Damping parameter: 1.0000
Sample size: 116
Number of estimated parameters: 4
Number of degrees of freedom: 112
Information criteria:
      AIC      AICc       BIC      BICc
1745.4170 1745.7774 1756.4314 1757.2879

which is exactly the same output as in R (see, for example, some explanations here). We can then produce a forecast from this model:

model.predict(h=18, interval="prediction", level=[0.9, 0.95])

The predict method currently supports analytical (aka "parametric"/"approximate") and simulated prediction intervals. Setting interval="prediction" tells the function to choose between the two depending on the type of model (multiplicative ETS models do not have analytical formulae for the multistep conditional variance and, as a result, do not have proper analytical prediction intervals). The level parameter accepts either a vector (which produces several quantiles) or a scalar. What I get after running this is:

             mean    lower_0.05   lower_0.025    upper_0.95   upper_0.975
116  11234.592643  10061.150811   9853.119370  12443.904763  12686.143870
117   8050.810544   7228.709864   7064.356429   8896.078713   9080.867866
118   7658.608163   6886.165498   6746.881596   8475.545469   8633.149796
119  10552.382933   9452.306261   9236.493206  11679.816814  11892.381046
120  10889.816551   9768.665327   9559.580233  12066.628218  12313.872205
121   7409.545388   6643.378080   6495.237349   8232.558913   8363.550598
122   7591.183726   6800.878319   6650.514904   8425.167556   8576.257297
123  14648.452226  13089.346824  12793.226771  16263.206997  16582.786939
124   6953.045603   6206.829418   6079.126523   7730.260301   7892.917098
125  11938.941882  10650.759513  10427.172989  13307.563498  13579.662925
126   8299.626845   7379.550080   7200.075336   9280.095498   9468.327654
127   8508.558884   7530.987557   7367.541698   9534.117611   9734.218257
128  11552.654541  10162.615284   9907.770755  13039.710057  13313.534412
129   8286.727505   7273.279560   7100.450038   9401.374461   9627.922116
130   7889.999721   6879.771040   6710.494741   8948.052817   9186.279622
131  10860.671447   9438.536335   9174.365147  12353.444279  12664.983138
132  11218.330395   9690.780430   9453.114931  12849.545829  13201.679610
133   7620.922782   6564.497975   6391.080974   8748.561119   8977.120673

The separate wiki on Fitted Values and Forecasts explains all the parameters accepted by the predict method and what is returned by it.
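The idea behind the simulated intervals is to generate many future sample paths from the fitted state space model and take empirical quantiles of them. Here is a toy numpy sketch for an ETS(A,N,N) model with known parameters – not the actual implementation, which works from the fitted ADAM states and estimated parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy ETS(A,N,N): known level, smoothing parameter and error st. deviation
level, alpha, sigma, h, n_sim = 100.0, 0.3, 5.0, 4, 10000

paths = np.empty((n_sim, h))
for i in range(n_sim):
    l = level
    for j in range(h):
        e = rng.normal(0.0, sigma)
        paths[i, j] = l + e      # simulated observation at step j+1
        l = l + alpha * e        # update of the local level state
mean_fc = paths.mean(axis=0)
lower, upper = np.quantile(paths, [0.025, 0.975], axis=0)
```

The empirical quantiles widen with the horizon because the uncertainty about the level accumulates, which is exactly why simulated intervals are needed for models without analytical multistep variance formulae.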

Evaluation

To see how the developed function works, I decided to conduct exactly the same evaluation that I did for the recent R release of the smooth package, running the functions on the M1, M3, and Tourism competition data (5,315 time series) using the fcompdata package.

Setup

I selected the same set of models for Python as in R. Here are several options for the model parameter of ADAM, to see how the specific pools impact accuracy (this is discussed in detail in Section 15.1 of ADAM):

  1. XXX – select between pure additive ETS models only;
  2. ZZZ – select from the pool of all 30 models, but use branch-and-bound to remove the less suitable models;
  3. ZXZ – same as (2), but without the multiplicative trend models. This is used in the smooth functions by default;
  4. FFF – select from the pool of all 30 models (exhaustive search);
  5. SXS – the pool of models used by default in ets() from the forecast package in R.

I also tested three types of ETS initialisation (read more about them here):

  1. Back – initial="backcasting" – this is the default initialisation method;
  2. Opt – initial="optimal";
  3. Two – initial="two-stage".
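To illustrate what backcasting means, here is a toy SES version (not the actual smooth implementation): instead of estimating the initial state, the filter is run backwards through the series, and the final backward-smoothed value becomes the initial level for the forward pass:

```python
def backcast_initial_level(y, alpha=0.5):
    """Toy illustration of backcasting for a local-level (SES) model:
    smooth the reversed series, so that the last backward-smoothed
    value can serve as the initial level for the forward pass."""
    l = y[-1]                       # start from the last observation
    for obs in reversed(y[:-1]):    # move backwards through the series
        l = alpha * obs + (1 - alpha) * l
    return l

y = [10.0, 12.0, 11.0, 13.0, 12.0]
l0 = backcast_initial_level(y, alpha=0.5)
```

This avoids adding the initial states to the vector of estimated parameters, which is why backcasting tends to be faster than the optimal initialisation.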

I have also found the following implementations of ETS in Python and included them in my evaluation:

  1. sktime AutoETS
  2. skforecast AutoETS
  3. statsforecast AutoETS

There is also a darts implementation of AutoETS, which is actually a wrapper of the statsforecast one. So I ran it just to check how it works, and found that it failed in 1,518 cases. I filed the issue, and it turned out that their implementation does not deal with short time series (10 observations or fewer), which is their design decision. They are now considering what to do about that, if anything.

I used RMSSE (from the M5 competition, motivated by Athanasopoulos & Kourentzes (2023)) and the SAME error measure, together with the computational time for each time series:
\begin{equation*}
\mathrm{RMSSE} = \frac{1}{\sqrt{\frac{1}{T-1} \sum_{t=1}^{T-1} \Delta_t^2}} \mathrm{RMSE},
\end{equation*}
where \(\mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{j=1}^h e^2_{t+j}}\) is the Root Mean Squared Error of the point forecasts, and \(\Delta_t\) are the first differences of the in-sample actual values.

\begin{equation*}
\mathrm{SAME} = \frac{1}{\frac{1}{T-1} \sum_{t=1}^{T-1} |\Delta_t|} \mathrm{AME},
\end{equation*}
where \(\mathrm{AME}= \left| \frac{1}{h} \sum_{j=1}^h e_{t+j} \right|\) is the Absolute Mean Error.
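Both measures translate directly into code. A plain-Python transcription of the formulas above, with errors defined as \(e_{t+j} = y_{t+j} - \hat{y}_{t+j}\):

```python
import math

def rmsse(insample, actuals, forecasts):
    """RMSSE: RMSE of the forecast errors, scaled by the root mean
    square of the in-sample first differences."""
    h = len(actuals)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / h)
    diffs = [insample[t + 1] - insample[t] for t in range(len(insample) - 1)]
    scale = math.sqrt(sum(d ** 2 for d in diffs) / len(diffs))
    return rmse / scale

def same(insample, actuals, forecasts):
    """SAME: absolute mean error, scaled by the mean absolute
    in-sample first difference."""
    h = len(actuals)
    ame = abs(sum(a - f for a, f in zip(actuals, forecasts)) / h)
    diffs = [insample[t + 1] - insample[t] for t in range(len(insample) - 1)]
    scale = sum(abs(d) for d in diffs) / len(diffs)
    return ame / scale
```

For example, with insample=[10, 12, 11, 13], actuals=[14, 15] and forecasts=[13, 13], the scale terms are \(\sqrt{3}\) and \(5/3\), giving RMSSE ≈ 0.913 and SAME = 0.9.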

All of this was implemented in a Jupyter notebook, which is available here in case you want to reproduce the results.

sktime’s non-collaborative stance

In the first run of this (on 28th January 2026), I encountered several errors in AutoETS in sktime: it took an extremely long time to compute (see the table below – on average around 30 seconds per time series) and produced ridiculous forecasts (mean RMSSE was 106,951). I filed an issue in their repo and sent a courtesy message to Franz Kiraly on LinkedIn the same day, saying that I would be happy to rerun the results if this were fixed. I then received an insulting email from him, blaming me for not collaborating and trying to diminish sktime. They then closed the issue, claiming that it was fixed. I reran the experiment with their development version from GitHub on 21st March (two months later!), only to get exactly the same results. I do not think their fix is working, but given Franz’s toxic behaviour, I am not going to rerun this any further or help him in any way. The Jupyter notebook with the experiment is here, so he can investigate on his own.

Results

So, here are the summary results for the tested models:

====================================================================================
EVALUATION RESULTS for RMSSE (All Series)
====================================================================================
               Method       Min        Q1       Med        Q3       Max      Mean
        ADAM ETS Back  0.018252  0.663358  1.161473  2.301861  50.25854  1.928086
         ADAM ETS Opt  0.024155  0.670682  1.185932  2.365498  51.61599  1.943729
         ADAM ETS Two  0.024599  0.669522  1.182516  2.342385  51.61599  1.947715
              ES Back  0.018252  0.667225  1.160971  2.313932  50.25854  1.927436
               ES Opt  0.024155  0.673575  1.185756  2.364915  51.61599  1.947180
               ES Two  0.024467  0.671771  1.187368  2.346343  51.61599  1.955076
               ES XXX  0.018252  0.677717  1.170823  2.306197  50.25854  1.961318
               ES ZZZ  0.011386  0.670211  1.179916  2.353334  115.5442  2.053459
               ES FFF  0.011386  0.680956  1.211736  2.449626  115.5442  2.100899
               ES SXS  0.018252  0.674537  1.169187  2.353334  50.25854  1.939847
statsforecast AutoETS  0.024468  0.673157  1.189209  2.326650  51.61597  1.923925
   skforecast AutoETS  0.074744  0.747200  1.344916  2.721083  50.54339  2.273724
       sktime AutoETS  0.024467  0.676191  1.190093  2.456184 565753200  106951.7

Things to note:

  1. The best performing ETS on average is from Nixtla’s statsforecast package. The second best is our implementation (ADAM/ES) with backcasting;
  2. Given that I ran exactly the same experiment for the R packages, we can conclude that Nixtla’s implementation is even better than the one in the forecast package in R;
  3. In terms of median RMSSE, ES with backcasting outperforms all other implementations;
  4. ADAM ETS is the best in terms of the first and third quartiles of RMSSE;
  5. ADAM ETS and ES perform quite similarly. This is expected, because ES is a wrapper of ADAM ETS that assumes normality for the error term, while ADAM ETS switches between the Normal and Gamma distributions based on the type of error term;
  6. Backcasting leads to the most accurate forecasts on these datasets. This does not mean it is a universal rule, and I am sure the situation will change for other datasets;
  7. ES XXX gives exactly the same results as the one implemented in the R version of the package. This is important because we were aiming to reproduce results between R and Python with 100% precision, and we did. The reason the other ETS flavours differ between R and Python is that, since smooth v4.4.0 for R, we changed how point forecasts are calculated for models with multiplicative components: previously, they relied on simulations; now we produce them directly from the model equations. While this is not entirely statistically accurate, it is pragmatic because it avoids explosive trajectories.

The results for SAME are qualitatively similar to those for RMSSE:

===================================================================================
EVALUATION RESULTS for SAME (All Series)
===================================================================================
               Method       Min        Q1       Med        Q3       Max      Mean
        ADAM ETS Back  0.001142  0.374466  0.995272  2.402342  52.34177  1.951070
         ADAM ETS Opt  0.000106  0.373551  1.021661  2.485596  55.10179  1.962040
         ADAM ETS Two  0.000782  0.380398  1.029629  2.451008  55.10186  1.970422
              ES Back  0.001142  0.372777  0.994547  2.412503  53.45041  1.946517
               ES Opt  0.000217  0.372725  1.024666  2.478435  54.68603  1.967217
               ES Two  0.000095  0.384795  1.028561  2.454543  54.68558  1.982261
               ES XXX  0.000094  0.373315  1.005006  2.425682  53.16973  1.992656
               ES ZZZ  0.000760  0.386673  1.017732  2.467912  145.7604  2.107522
               ES FFF  0.000597  0.401426  1.048395  2.566151  145.7604  2.173559
               ES SXS  0.000597  0.375438  1.005603  2.450490  53.45041  1.964322
statsforecast AutoETS  0.000228  0.374821  1.015205  2.434952  53.61359  1.938682
   skforecast AutoETS  0.000993  0.457066  1.217751  2.954003  59.84443  2.409900
       sktime AutoETS  0.000433  0.392802  1.029433  2.571555 385286900  72931.90

Finally, I also measured the computational time and got the following summary. Note that I moved the ES flavours to the bottom because they are not directly comparable with the others (they use special pools of models):

================================================================================
EVALUATION RESULTS for Computational Time in seconds (All Series)
================================================================================
               Method       Min        Q1       Med        Q3        Max      Mean
        ADAM ETS Back  0.008679  0.086219  0.140929  0.241768   0.923601  0.181546
         ADAM ETS Opt  0.051302  0.229400  0.315379  0.792827   2.638193  0.550378
         ADAM ETS Two  0.051314  0.287382  0.455149  1.080276   3.535120  0.715090
              ES Back  0.009274  0.085299  0.139511  0.247114   0.958868  0.182547
               ES Opt  0.053176  0.224293  0.312200  0.772161   2.888662  0.541243
               ES Two  0.048539  0.279598  0.446404  1.058847   3.553156  0.703183
statsforecast AutoETS  0.001770  0.007702  0.081189  0.167040   1.271202  0.102575
   skforecast AutoETS  0.021553  0.243102  0.302078  1.478667   7.482101  0.820576
       sktime AutoETS  0.128021  6.227344 19.170921 41.494513 229.793067 30.712191

================================================================================
               ES XXX  0.008793  0.054139  0.093792  0.144012   0.480976  0.104800
               ES ZZZ  0.010502  0.127297  0.184561  0.447649   1.794031  0.313157
               ES FFF  0.046921  0.215119  1.110657  1.557724   3.489375  1.071004
               ES SXS  0.024909  0.116427  0.434594  0.564973   1.251257  0.403988

Things to note:

  1. Nixtla’s implementation is actually the fastest and very hard to beat. As far as I understand, they did great work implementing some code in C++ and then using numba (I have not yet looked into how that works);
  2. Our implementation does not use numba, but our ETS with backcasting is still faster than the skforecast and sktime implementations;
  3. ADAM ETS with backcasting also has the lowest maximum time, implying that in difficult situations it finds a solution relatively quickly compared with the others;

So, overall, I would argue that the smooth implementation of ETS is competitive with other implementations. But it has one important benefit: it supports more features. And we plan to expand it further to make it even more useful across a wider variety of cases.

What’s next

There are still a lot of features that we have not managed to implement yet. Here is a non-exhaustive list:

  1. Explanatory variables to have ETSX/ARIMAX;
  2. ARIMA;
  3. Occurrence model for intermittent demand forecasting;
  4. Scale model for ADAM;
  5. Simulation functions;
  6. Model diagnostics;
  7. CES, MSARIMA, SSARIMA, GUM, and SMA – functions that are available in R and not yet ported to Python.

So, lots of work to do. I am sure we will be quite busy well into 2026.

Summary

It has been a long and winding road, but Filotas and Leo did an amazing job to make this happen. The existing ETS implementation in smooth already works quite well and quite fast. It does not fail as some other implementations do, and it is quite reliable. I have actually spent many years testing the R version on different time series to make sure that it produces something sensible no matter what. The code was translated to Python one-to-one, so I am fairly confident that the function will work as expected in 99.9% of cases (there is always a non-zero probability that something will go wrong). Both ADAM and ES already support a variety of features that you might find useful.

One thing I will kindly ask of you is that if you find a bug or an issue when running experiments on your datasets, please file it in our GitHub repo here – we will try to find the time to fix it. Also, if you would like to contribute by translating some features from R to Python or implementing something additional, please get in touch with me.

Finally, I am always glad to hear success stories. If you find the smooth package useful in your work, please let us know. One way of doing that is via the Discussions on GitHub, or you can simply send me an email.
