smooth in python: ETS with explanatory variables

We continue our series of posts on the functions from the smooth package for Python/R. Today we will see how to enhance your exponential smoothing with explanatory variables. What? Yes, you heard me! Let’s dive in!

We all know that in real life sales don’t just evolve over time on their own. Any univariate model, such as ARIMA or ETS, is just a way to approximate a complex reality. In practice, there are many factors affecting the demand for your product. What would happen if the price of your product increases? What if you run a promotion (e.g. “Buy One, Get One Free”)? Your competitor’s strategy impacts the demand for your product as well… There are lots of different factors, and some of them can be quite useful in demand forecasting. But can we combine dynamic univariate models with regression?

Yes, we can! Although ETS is usually thought of as a purely univariate model, it is easy to extend it to include explanatory variables. There are several great papers showing how it works (e.g. Kourentzes & Petropoulos, 2016), and in fact the es() function from the smooth package for R was used as a benchmark in the M5 competition.
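To make the idea concrete, here is a minimal sketch of the simplest additive case, ETSX(A,N,N), where the regression part is added to the level. It is an illustration only, not what smooth does internally: the function name etsx_ann, the fixed coefficients and the toy data are all assumptions, and the multiplicative ETS(M,N,M) used later in this post is more involved.

```python
from typing import List, Sequence, Tuple


def etsx_ann(
    y: Sequence[float],
    x: Sequence[Sequence[float]],
    alpha: float,
    coefs: Sequence[float],
    level0: float,
) -> Tuple[List[float], float]:
    """One-step-ahead fitted values and final level for a simplified ETSX(A,N,N).

    Assumed form (an illustration, not the smooth implementation):
        y_t = l_{t-1} + sum_j a_j * x_{j,t} + e_t
        l_t = l_{t-1} + alpha * e_t
    """
    level = level0
    fitted = []
    for y_t, x_t in zip(y, x):
        # Regression part: effect of the explanatory variables at time t
        x_effect = sum(a * v for a, v in zip(coefs, x_t))
        y_hat = level + x_effect
        fitted.append(y_hat)
        # The error updates only the level; the coefficients stay fixed here
        level += alpha * (y_t - y_hat)
    return fitted, level


# Toy data: flat demand of 100 with a promotion in period 3 adding 20 units
y = [100.0, 100.0, 120.0, 100.0]
promo = [[0.0], [0.0], [1.0], [0.0]]

# With the promotional dummy, the spike is explained by the regression part
fitted_x, level_x = etsx_ann(y, promo, alpha=0.3, coefs=[20.0], level0=100.0)

# Without it (coefficient set to zero), the spike leaks into the level
fitted_plain, level_plain = etsx_ann(y, promo, alpha=0.3, coefs=[0.0], level0=100.0)
```

In this toy case the version with the dummy keeps the level at 100 throughout, while the version without it pushes the fitted value for the fourth period up to about 106, because the promotional spike is treated as a change in the level.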

So, consider a situation where you have weekly sales of a product with some recorded promotions (encoded as dummy variables). We will use a time series from the fcompdata package for Python. The first image shows what the series looks like; the vertical lines show when promotions happen. The series itself seems to be seasonal, roughly repeating peaks and troughs every 52 observations (every year). We also see that there are two types of promotions, and that sales tend to increase when they happen. So, including them should improve the model fit, and if the company decides to run promotions again, the model will forecast demand better. I will start by fitting the ETS(M,N,M) to the data:

from smooth import ES
from fcompdata import PromoData

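# Full series: in-sample part followed by the holdout
y = PromoData.y

# Multiplicative error, no trend, multiplicative seasonality with lag 52;
# holdout=True withholds the last h=13 observations from estimation
model = ES(model="MNM", lags=52, holdout=True, h=13)
model.fit(y)
model.predict(h=13)
# Plot number 7: the fit and point forecasts shown in the image below
model.plot(7)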

NOTE: PromoData has a specific structure with several attributes. PromoData.x contains the in-sample data and PromoData.xx contains the holdout, which is consistent with the Mcomp package for R. The new features in Python are:

PromoData.y – concatenated training and test sets,
PromoData.xregx – matrix of explanatory variables for the training set,
PromoData.xregxx – matrix of explanatory variables for the test set,
PromoData.xreg – the full (concatenated) matrix of explanatory variables.
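The relation between these attributes can be sketched with a small stand-in object. PromoLike below is hypothetical and uses plain Python lists with made-up numbers; the real fcompdata object may well store arrays or data frames instead, so treat this only as a picture of the layout described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromoLike:
    """Hypothetical stand-in mirroring the attribute layout described above."""

    x: List[float]             # in-sample sales
    xx: List[float]            # holdout sales
    xregx: List[List[int]]     # explanatory variables, in-sample rows
    xregxx: List[List[int]]    # explanatory variables, holdout rows

    @property
    def y(self) -> List[float]:
        # Full series: training part followed by the holdout
        return self.x + self.xx

    @property
    def xreg(self) -> List[List[int]]:
        # Full matrix: training rows followed by holdout rows
        return self.xregx + self.xregxx


# Made-up numbers: three in-sample weeks, two holdout weeks, two promo types
toy = PromoLike(
    x=[10.0, 12.0, 30.0],
    xx=[11.0, 28.0],
    xregx=[[0, 0], [0, 0], [1, 0]],
    xregxx=[[0, 0], [0, 1]],
)

h = len(toy.xx)
# The last h rows of the full objects are exactly the holdout parts
assert toy.y[-h:] == toy.xx
assert toy.xreg[-h:] == toy.xregxx
```

The point of the layout is that the full series and the full matrix of explanatory variables have the same number of rows, so they can be passed together when the holdout is cut off by the model itself.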

The following image shows the model fit and the point forecasts from the ETS(M,N,M):

ETS(M,N,M) fit and forecast for the promotional data example

As expected, because the model does not take promotions into account, it fits the data as well as it can and produces forecasts that are oblivious to the potential external effects on sales. We can improve it by including the promotional dummies:

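# Full matrix of promotional dummies, aligned with the full series y
# (holdout=True means the last h rows are withheld from estimation)
X_train = PromoData.xreg
# Dummies for the 13 holdout weeks, used when forecasting
X_test = PromoData.xregxx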

model = ES(model="MNM", lags=52, holdout=True, h=13)
model.fit(y, X_train)
model.predict(h=13, X=X_test)
model.plot(7)

ETS(M,N,M) with explanatory variables

The image above shows the fit and the point forecasts from the ETSX(M,N,M) model, which now takes the promotions into account. This is quite an improvement in comparison with the previous one. Furthermore, if we can control when to have promotions and what types of promotions to run, we can change the values in the `X_test` matrix and see what demand to expect in that situation. So, this gives an analyst a tool for a more advanced sensitivity analysis.
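A sensitivity analysis of this kind starts with a set of alternative dummy matrices for the forecast horizon. The sketch below builds a few of them; the helper promo_scenario, the scenario names and the column order (column 0 for the first promotion type, column 1 for the second) are assumptions made for illustration, so check them against the actual layout of `X_test` before using anything like this.

```python
from typing import Dict, List

H = 13          # forecast horizon used in this post
N_PROMOS = 2    # two promotion types, as in the example series


def promo_scenario(
    weeks_by_type: Dict[int, List[int]],
    h: int = H,
    n_types: int = N_PROMOS,
) -> List[List[int]]:
    """Build an h-by-n_types dummy matrix; weeks are 1-based within the horizon."""
    matrix = [[0] * n_types for _ in range(h)]
    for promo_type, weeks in weeks_by_type.items():
        for week in weeks:
            matrix[week - 1][promo_type] = 1
    return matrix


# Hypothetical promotional plans for the next 13 weeks
scenarios = {
    "no promotions": promo_scenario({}),
    "type 1 in weeks 3-4": promo_scenario({0: [3, 4]}),
    "type 2 in week 10": promo_scenario({1: [10]}),
}

for name, X_scenario in scenarios.items():
    # Each matrix would replace X_test in model.predict(h=13, X=...),
    # possibly after conversion to the array type the model expects
    promo_weeks = sum(1 for row in X_scenario if any(row))
    print(f"{name}: {promo_weeks} promotional week(s)")
```

Comparing the forecasts produced under each plan then shows how much demand the model attributes to each promotion.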

Read more about the ETSX here.
Install smooth: pip install smooth
ETSX wiki on GitHub.

Open Forecasting
