<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Archives regression - Open Forecasting</title>
	<atom:link href="https://openforecast.org/tag/regression/feed/" rel="self" type="application/rss+xml" />
	<link>https://openforecast.org/tag/regression/</link>
	<description>How to look into the future</description>
	<lastBuildDate>Mon, 28 Jul 2025 15:51:00 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2015/08/cropped-usd-05-32x32.png&amp;nocache=1</url>
	<title>Archives regression - Open Forecasting</title>
	<link>https://openforecast.org/tag/regression/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>The first draft of &#8220;Forecasting and Analytics with ADAM&#8221;</title>
		<link>https://openforecast.org/2022/04/11/the-first-draft-of-forecasting-and-analytics-with-adam/</link>
					<comments>https://openforecast.org/2022/04/11/the-first-draft-of-forecasting-and-analytics-with-adam/#respond</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Mon, 11 Apr 2022 15:30:26 +0000</pubDate>
				<category><![CDATA[adam()]]></category>
		<category><![CDATA[ARIMA]]></category>
		<category><![CDATA[ETS]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Regression]]></category>
		<category><![CDATA[Theory of forecasting]]></category>
		<category><![CDATA[ADAM]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[smooth]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=2817</guid>

					<description><![CDATA[<p>After working on this for more than a year, I have finally prepared the first draft of my online monograph &#8220;Forecasting and Analytics with ADAM&#8221;. This is a monograph on the model that unites ETS, ARIMA and regression and introduces advanced features in univariate modelling, including: ETS in a new State Space form; ARIMA in [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2022/04/11/the-first-draft-of-forecasting-and-analytics-with-adam/">The first draft of &#8220;Forecasting and Analytics with ADAM&#8221;</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div id="attachment_2819" style="width: 222px" class="wp-caption aligncenter"><a href="/wp-content/uploads/2022/03/Adam-Title-web.jpg"><img fetchpriority="high" decoding="async" aria-describedby="caption-attachment-2819" src="/wp-content/uploads/2022/03/Adam-Title-web-212x300.jpg" alt="Forecasting and Analytics with ADAM" width="212" height="300" class="size-medium wp-image-2819" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/Adam-Title-web-212x300.jpg&amp;nocache=1 212w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/Adam-Title-web-724x1024.jpg&amp;nocache=1 724w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/Adam-Title-web-768x1087.jpg&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/Adam-Title-web.jpg&amp;nocache=1 1000w" sizes="(max-width: 212px) 100vw, 212px" /></a><p id="caption-attachment-2819" class="wp-caption-text">Forecasting and Analytics with ADAM</p></div>
<p>After working on this for <a href="/en/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/" rel="noopener">more than</a> <a href="/en/2021/02/28/after-the-creation-of-adam-smooth-v3-1-0/">a year</a>, I have finally prepared the first draft of my online monograph &#8220;<a href="https://openforecast.org/adam/" rel="noopener" target="_blank">Forecasting and Analytics with ADAM</a>&#8221;. This is a monograph on the model that unites ETS, ARIMA and regression and introduces advanced features in univariate modelling, including:</p>
<ol>
<li>ETS in a new State Space form;</li>
<li>ARIMA in a new State Space form;</li>
<li>Regression;</li>
<li>TVP regression;</li>
<li>Combinations of (1), (2) and either (3), or (4);</li>
<li>Automatic selection/combination for ETS;</li>
<li>Automatic orders selection for ARIMA;</li>
<li>Variables selection for regression part;</li>
<li>Normal and non-normal distributions;</li>
<li>Automatic selection of most suitable distribution;</li>
<li>Multiple seasonality;</li>
<li>Occurrence part of the model to handle zeroes in data (intermittent demand);</li>
<li>Modelling scale of distribution (GARCH and beyond);</li>
<li>Handling uncertainty of estimates of parameters.</li>
</ol>
<p>The model and all its features are already implemented in the <code>adam()</code> function from the <code>smooth</code> package for R (you need v3.1.6 from CRAN for all the features listed above). The function supports many options that let you experiment with univariate forecasting and build complex models by combining elements from the list above. The monograph explaining how the models underlying ADAM work and how to use them is <a href="https://openforecast.org/adam/" rel="noopener" target="_blank">available online</a>, and I plan to produce several physical copies of it after refining the text. Furthermore, I have already asked two well-known academics to act as reviewers of the monograph so that I can collect feedback and improve it. If you want to act as a reviewer as well, please let me know.</p>
<h3>Examples in R</h3>
<p>Just to give you a flavour of ADAM, I decided to provide a couple of examples on the time series <code>AirPassengers</code> (included in the <code>datasets</code> package in R). The first one is the ADAM ETS.</p>
<p>Building and selecting the most appropriate ADAM ETS comes down to running the following line of code:</p>
<pre class="decode">adamETSAir <- adam(AirPassengers, h=12, holdout=TRUE)</pre>
<p>In this case, ADAM will select the most appropriate ETS model for the data, creating a holdout of the last 12 observations. We can see the details of the model by printing the output:</p>
<pre class="decode">adamETSAir</pre>
<pre>Time elapsed: 0.75 seconds
Model estimated using adam() function: ETS(MAM)
Distribution assumed in the model: Gamma
Loss function type: likelihood; Loss function value: 467.2981
Persistence vector g:
 alpha   beta  gamma 
0.7691 0.0053 0.0000 

Sample size: 132
Number of estimated parameters: 17
Number of degrees of freedom: 115
Information criteria:
      AIC      AICc       BIC      BICc 
 968.5961  973.9646 1017.6038 1030.7102 

Forecast errors:
ME: 9.537; MAE: 20.784; RMSE: 26.106
sCE: 43.598%; Asymmetry: 64.8%; sMAE: 7.918%; sMSE: 0.989%
MASE: 0.863; RMSSE: 0.833; rMAE: 0.273; rRMSE: 0.254</pre>
<p>The output above provides plenty of detail on what was estimated and how. Some of these elements have been discussed in <a href="/en/2016/11/02/smooth-package-for-r-es-function-part-ii-pure-additive-models/">one of my previous posts</a> on the <code>es()</code> function. The new thing is the information about the assumed distribution for the response variable. By default, ADAM works with the Gamma distribution in the case of a multiplicative error model. This is done to make the model more robust in cases of low volume data, where the Normal distribution might produce negative numbers (see <a href="/en/2021/06/30/isf2021-how-to-make-multiplicative-ets-work-for-you/">my presentation</a> on this issue). In the case of high volume data, the Gamma distribution will perform similarly to the Normal one. The pure multiplicative ADAM ETS is discussed in <a href="https://openforecast.org/adam/ADAMETSPureMultiplicativeChapter.html" rel="noopener" target="_blank">Chapter 6 of ADAM monograph</a>. If Gamma is not suitable, then another distribution can be selected via the <code>distribution</code> parameter. There is also an automated distribution selection approach in the function <code>auto.adam()</code>:</p>
<pre class="decode">adamETSAutoAir <- auto.adam(AirPassengers, h=12, holdout=TRUE)
adamETSAutoAir</pre>
<pre>Time elapsed: 3.86 seconds
Model estimated using auto.adam() function: ETS(MAM)
Distribution assumed in the model: Normal
Loss function type: likelihood; Loss function value: 466.0744
Persistence vector g:
 alpha   beta  gamma 
0.8054 0.0000 0.0000 

Sample size: 132
Number of estimated parameters: 17
Number of degrees of freedom: 115
Information criteria:
      AIC      AICc       BIC      BICc 
 966.1487  971.5172 1015.1564 1028.2628 

Forecast errors:
ME: 9.922; MAE: 21.128; RMSE: 26.246
sCE: 45.36%; Asymmetry: 65.4%; sMAE: 8.049%; sMSE: 1%
MASE: 0.877; RMSSE: 0.838; rMAE: 0.278; rRMSE: 0.255</pre>
<p>As we see from the output above, the Normal distribution is more appropriate for the data in terms of AICc than the other ones tried out by the function (by default the list includes Normal, Laplace, S, Generalised Normal, Gamma, Inverse Gaussian and Log Normal distributions, but this can be amended by providing a vector of names via <code>distribution</code> parameter). The selection of ADAM ETS and distributions is discussed in <a href="https://openforecast.org/adam/ADAMSelection.html" rel="noopener" target="_blank">Chapter 15 of the monograph</a>.</p>
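<p>The forecast errors at the bottom of the outputs above are calculated on the 12 holdout observations. As a rough illustration of what some of them measure, here is a minimal base-R sketch that computes ME, MAE, RMSE and two scaled measures for a seasonal naive forecast on the same split. The seasonal naive benchmark, the variable names and the scaling by the in-sample mean are assumptions made for the sake of the example, so the numbers will not match the ADAM output:</p>
<pre class="decode"># Same split as above: 132 in-sample observations, 12 in the holdout
y <- as.numeric(AirPassengers)
inSample <- y[1:132]
holdout <- y[133:144]

# Seasonal naive forecast (assumed benchmark): repeat the last observed year
forecastNaive <- inSample[121:132]
errors <- holdout - forecastNaive

# Basic error measures
ME <- mean(errors)
MAE <- mean(abs(errors))
RMSE <- sqrt(mean(errors^2))

# Scaled versions, dividing by the in-sample mean (assumed scaling)
sMAE <- MAE / mean(inSample)
sCE <- sum(errors) / mean(inSample)

round(c(ME=ME, MAE=MAE, RMSE=RMSE, sMAE=sMAE, sCE=sCE), 4)</pre>
<p>With this scaling, the sMAE of 7.918% in the first output is consistent with its MAE of 20.784 divided by the in-sample mean of roughly 262.5.</p>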
<p>Having obtained the model, we can diagnose it using <code>plot.adam()</code> function:</p>
<pre class="decode">par(mfcol=c(3,3))
plot(adamETSAutoAir,which=c(1,4,2,6,7,8,10,11,13))</pre>
<p>The <code>which</code> parameter specifies which plots to produce; you can find the list of plots in the documentation for <code>plot.adam()</code>. The code above will result in:<br />
<div id="attachment_2824" style="width: 310px" class="wp-caption aligncenter"><a href="/wp-content/uploads/2022/03/adamETSAirDiagnostics.png"><img decoding="async" aria-describedby="caption-attachment-2824" src="/wp-content/uploads/2022/03/adamETSAirDiagnostics-300x175.png" alt="Diagnostics plots for ADAM ETS on AirPassengers data" width="300" height="175" class="size-medium wp-image-2824" /></a><p id="caption-attachment-2824" class="wp-caption-text">Diagnostics plots for ADAM ETS on AirPassengers data</p></div>
The diagnostic plots are discussed in <a href="https://openforecast.org/adam/diagnostics.html" rel="noopener" target="_blank">Chapter 14 of the ADAM monograph</a>. The plot above does not show any serious issues with the model.</p>
<p>Just for comparison, we could also try fitting the most appropriate ADAM ARIMA to the data (this model is discussed in <a href="https://openforecast.org/adam/ADAMARIMA.html" rel="noopener" target="_blank">Chapter 9</a>). The code in this case is slightly more complicated, because we need to switch off the ETS part of the model and define the maximum orders of ARIMA to try:</p>
<pre class="decode">adamARIMAAir <- adam(AirPassengers, model="NNN", h=12, holdout=TRUE,
                     orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2),select=TRUE))</pre>
<p>This results in the following <a href="https://openforecast.org/adam/ARIMASelection.html" rel="noopener" target="_blank">automatically selected</a> ARIMA model:</p>
<pre>Time elapsed: 3.54 seconds
Model estimated using auto.adam() function: SARIMA(0,1,1)[1](0,1,1)[12]
Distribution assumed in the model: Normal
Loss function type: likelihood; Loss function value: 491.7117
ARMA parameters of the model:
MA:
 theta1[1] theta1[12] 
   -0.1952    -0.0720 

Sample size: 132
Number of estimated parameters: 16
Number of degrees of freedom: 116
Information criteria:
     AIC     AICc      BIC     BICc 
1015.423 1020.154 1061.548 1073.097 

Forecast errors:
ME: -13.795; MAE: 16.65; RMSE: 21.644
sCE: -63.064%; Asymmetry: -79.4%; sMAE: 6.343%; sMSE: 0.68%
MASE: 0.691; RMSSE: 0.691; rMAE: 0.219; rRMSE: 0.21</pre>
<p>Given that ADAM ETS and ADAM ARIMA are formulated in the same framework, they are directly comparable using information criteria. Comparing the AICc of the models <code>adamETSAutoAir</code> and <code>adamARIMAAir</code>, we can conclude that the former is more appropriate for the data than the latter. However, the default ARIMA works with the Normal distribution, which might not be appropriate for the data, so we can revert to <code>auto.adam()</code> to select a better one:</p>
<pre class="decode">adamAutoARIMAAir <- auto.adam(AirPassengers, model="NNN", h=12, holdout=TRUE,
                              orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2),select=TRUE))</pre>
<p>This will take more computational time, but will result in a model with a different distribution and a lower AICc (which is still higher than the one in ADAM ETS):</p>
<pre>Time elapsed: 25.46 seconds
Model estimated using auto.adam() function: SARIMA(0,1,1)[1](0,1,1)[12]
Distribution assumed in the model: Log-Normal
Loss function type: likelihood; Loss function value: 472.923
ARMA parameters of the model:
MA:
 theta1[1] theta1[12] 
   -0.2785    -0.5530 

Sample size: 132
Number of estimated parameters: 16
Number of degrees of freedom: 116
Information criteria:
      AIC      AICc       BIC      BICc 
 977.8460  982.5764 1023.9708 1035.5197 

Forecast errors:
ME: -12.968; MAE: 13.971; RMSE: 19.143
sCE: -59.285%; Asymmetry: -91.7%; sMAE: 5.322%; sMSE: 0.532%
MASE: 0.58; RMSSE: 0.611; rMAE: 0.184; rRMSE: 0.186</pre>
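<p>The information criteria in the outputs above can be reproduced from the printed loss value, the number of estimated parameters and the sample size. The sketch below assumes that the reported loss is the negative log-likelihood and uses the textbook formulae for AIC, AICc and BIC; the helper <code>infoCriteria()</code> is hypothetical and is not part of <code>smooth</code>:</p>
<pre class="decode"># Hypothetical helper: information criteria from the negative log-likelihood
infoCriteria <- function(loss, k, n){
    AIC <- 2*k + 2*loss
    AICc <- AIC + 2*k*(k+1)/(n-k-1)
    BIC <- k*log(n) + 2*loss
    c(AIC=AIC, AICc=AICc, BIC=BIC)
}

# Values copied from the printed outputs above
icETS <- infoCriteria(loss=466.0744, k=17, n=132)
icARIMA <- infoCriteria(loss=472.923, k=16, n=132)

round(rbind(ETS=icETS, ARIMA=icARIMA), 4)</pre>
<p>Up to the rounding of the printed loss, this gives the AICc of 971.5172 for ADAM ETS and 982.5764 for ADAM ARIMA with the Log-Normal distribution, matching the values reported above.</p>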
<p>Note that although the AICc is higher for ARIMA than for ETS, the former has lower error measures than the latter. So, the higher AICc does not necessarily mean that the model is not good. But if we rely on the information criteria, then we should stick with ADAM ETS and we can then produce the forecasts for the next 12 observations (see <a href="https://openforecast.org/adam/ADAMForecasting.html" rel="noopener" target="_blank">Chapter 18</a>):</p>
<pre class="decode">adamETSAutoAirForecast <- forecast(adamETSAutoAir, h=12, interval="prediction",
                                   level=c(0.9,0.95,0.99))
par(mfcol=c(1,1))
plot(adamETSAutoAirForecast)</pre>
<div id="attachment_2839" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast.png&amp;nocache=1"><img decoding="async" aria-describedby="caption-attachment-2839" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast-300x175.png&amp;nocache=1" alt="Forecast from ADAM ETS" width="300" height="175" class="size-medium wp-image-2839" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/03/adamETSAirForecast.png&amp;nocache=1 1200w" sizes="(max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2839" class="wp-caption-text">Forecast from ADAM ETS</p></div>
<p>Finally, if we want to do a more in-depth analysis of parameters of ADAM, we can also produce the summary, which will create the confidence intervals for the parameters of the model:</p>
<pre class="decode">summary(adamETSAutoAir)</pre>
<pre>Model estimated using auto.adam() function: ETS(MAM)
Response variable: data
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 466.0744
Coefficients:
            Estimate Std. Error Lower 2.5% Upper 97.5%  
alpha         0.8054     0.0864     0.6343      0.9761 *
beta          0.0000     0.0203     0.0000      0.0401  
gamma         0.0000     0.0382     0.0000      0.0755  
level        96.2372     6.8596    82.6496    109.7919 *
trend         2.0901     0.3955     1.3068      2.8716 *
seasonal_1    0.9145     0.0077     0.9003      0.9372 *
seasonal_2    0.8999     0.0081     0.8857      0.9227 *
seasonal_3    1.0308     0.0094     1.0165      1.0535 *
seasonal_4    0.9885     0.0077     0.9743      1.0112 *
seasonal_5    0.9856     0.0072     0.9713      1.0083 *
seasonal_6    1.1165     0.0093     1.1023      1.1392 *
seasonal_7    1.2340     0.0115     1.2198      1.2568 *
seasonal_8    1.2254     0.0105     1.2112      1.2481 *
seasonal_9    1.0668     0.0094     1.0526      1.0896 *
seasonal_10   0.9256     0.0087     0.9113      0.9483 *
seasonal_11   0.8040     0.0075     0.7898      0.8268 *

Error standard deviation: 0.0367
Sample size: 132
Number of estimated parameters: 17
Number of degrees of freedom: 115
Information criteria:
      AIC      AICc       BIC      BICc 
 966.1487  971.5172 1015.1564 1028.2628 </pre>
<p>Note that the <code>summary()</code> function might complain about the Observed Fisher Information. This is because the covariance matrix of parameters is calculated numerically and sometimes the likelihood is not maximised properly. I have not been able to fully resolve this issue yet, but I hope to do so at some point. The summary above shows, for example, that the smoothing parameters \(\beta\) and \(\gamma\) are not significantly different from zero (at the 5% level), while \(\alpha\) is expected to vary between 0.6343 and 0.9761 in 95% of the cases. You can read more about the uncertainty of parameters in ADAM in <a href="https://openforecast.org/adam/ADAMUncertainty.html" rel="noopener" target="_blank">Chapter 16</a> of the monograph.</p>
<p>As for the other features of ADAM, here is a brief guide:</p>
<ul>
<li>If you work with multiple seasonal data, then you might need to specify the seasonality via the <code>lags</code> parameter, for example as <code>lags=c(24,7*24)</code> in case of hourly data. This is discussed in <a href="https://openforecast.org/adam/multiple-frequencies-in-adam.html" rel="noopener" target="_blank">Chapter 12</a>;</li>
<li>If you have intermittent data, then you should read <a href="https://openforecast.org/adam/ADAMIntermittent.html" rel="noopener" target="_blank">Chapter 13</a>, which explains how to work with the <code>occurrence</code> parameter of the function;</li>
<li>Explanatory variables are discussed in <a href="https://openforecast.org/adam/ADAMX.html" rel="noopener" target="_blank">Chapter 10</a> and are handled in the <code>adam()</code> function via the <code>formula</code> parameter;</li>
<li>In cases of heteroscedasticity (time varying or induced by some explanatory variables), there is a scale model (which is discussed in <a href="https://openforecast.org/adam/ADAMscaleModel.html" rel="noopener" target="_blank">Chapter 17</a> and implemented as the <code>sm()</code> method for the <code>adam</code> class).</li>
</ul>
<p>You can also experiment with advanced estimators (<a href="https://openforecast.org/adam/ADAMETSEstimation.html" rel="noopener" target="_blank">Chapter 11</a>, including custom loss functions) via the <code>loss</code> parameter and forecast combinations (<a href="https://openforecast.org/adam/ADAMCombinations.html" rel="noopener" target="_blank">Section 15.4</a>).</p>
<p>Long story short, if you are interested in univariate forecasting, then do give ADAM a try - it might have the flexibility you need for your experiments. If you are worried about its accuracy, check out <a href="/en/2021/02/28/after-the-creation-of-adam-smooth-v3-1-0/">this post</a>, where I compared ADAM with other models.</p>
<p>And, as a friend of mine says, "Happy forecasting!"</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2022/04/11/the-first-draft-of-forecasting-and-analytics-with-adam/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Introducing scale model in greybox</title>
		<link>https://openforecast.org/2022/01/23/introducing-scale-model-in-greybox/</link>
					<comments>https://openforecast.org/2022/01/23/introducing-scale-model-in-greybox/#respond</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Sun, 23 Jan 2022 18:04:33 +0000</pubDate>
				<category><![CDATA[Package greybox for R]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Regression]]></category>
		<category><![CDATA[Univariate models]]></category>
		<category><![CDATA[greybox]]></category>
		<category><![CDATA[regression]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=2670</guid>

					<description><![CDATA[<p>At the end of June 2021, I released the greybox package version 1.0.0. This was a major release, introducing new functionality, but I did not have time to write a separate post about it because of the teaching and lack of free time. Finally, Christmas has arrived, and I could spend several hours preparing the [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2022/01/23/introducing-scale-model-in-greybox/">Introducing scale model in greybox</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>At the end of June 2021, I released the <code>greybox</code> package version 1.0.0. This was a major release, introducing new functionality, but I did not have time to write a separate post about it because of teaching and a lack of free time. Finally, Christmas has arrived, and I could spend several hours preparing the post about it. In this post, I want to tell you about the new major feature in the <code>greybox</code> package.</p>
<h3>Scale Model</h3>
<p>The Scale Model is a regression-like model that captures the relation between the scale of a distribution (for example, the variance in the Normal distribution) and a set of explanatory variables. It is implemented in the <code>sm()</code> method in the <code>greybox</code> package. The motivation for this comes from <a href="https://www.gamlss.com/">GAMLSS</a>, the Generalised Additive Model for Location, Scale and Shape. While I have decided not to bother with the &#8220;GAM&#8221; part of this (there are <code>gam</code> and <code>gamlss</code> packages in R that do that), I liked the idea of being able to predict the scale (for example, variance) of a distribution. This becomes especially useful when one suspects heteroscedasticity in the model but does not think that variable transformations are appropriate.</p>
<p>To understand what the function does, it is necessary first to discuss the underlying model. We will start the discussion with an example of a linear regression model with two explanatory variables, assuming Normally distributed residuals \(\xi_t\) with zero mean and a fixed variance \(\sigma^2\), \(\xi_t \sim \mathcal{N}(0,\sigma^2)\), which can be formulated as:<br />
\begin{equation} \label{eq:model1}<br />
    y_t = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + \xi_t ,<br />
\end{equation}<br />
where \(y_t\) is the response variable, \(x_{1,t}\) and \(x_{2,t}\) are the explanatory variables on observation \(t\), \(\beta_0\), \(\beta_1\) and \(\beta_2\) are the parameters of the model and \(\xi_t \sim \mathcal{N}\left(0, \sigma^2 \right)\). Recalling the basic properties of Normal distribution, we can rewrite the same model as a model with standard normal residuals \(\epsilon_t \sim \mathcal{N}\left(0, 1 \right)\) by inserting \(\xi_t = \sigma \epsilon_t\) in \eqref{eq:model1}:<br />
\begin{equation} \label{eq:model2}<br />
    y_t = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + \sigma \epsilon_t .<br />
\end{equation}<br />
Now if we suspect that the variance of the model might not be constant, we can substitute the standard deviation \(\sigma\) with some function, transforming the model into:<br />
\begin{equation} \label{eq:model3}<br />
    y_t = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + f\left(\gamma_0 + \gamma_2 x_{2,t} + \gamma_3 x_{3,t}\right) \epsilon_t ,<br />
\end{equation}<br />
where \(x_{2,t}\) and \(x_{3,t}\) are the explanatory variables (as you see, not necessarily the same as in the first part of the model) and \(\gamma_0\), \(\gamma_2\) and \(\gamma_3\) are the parameters of the scale part of the model. The idea here is that there is a regression model for the conditional mean of the distribution \(\beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t}\), and that there is another one that will regulate the standard deviation via \(f\left(\gamma_0 + \gamma_2 x_{2,t} + \gamma_3 x_{3,t}\right)\). The main thing to keep in mind about the latter is that the function \(f(\cdot)\) needs to be strictly positive because the standard deviation cannot be zero or negative. The simplest way to guarantee this is to use the exponential function as \(f(\cdot)\). Furthermore, in our example with the Normal distribution, the scale corresponds to the variance, so we should be introducing the model for variance: \(\sigma^2_t = \exp\left(\gamma_0 + \gamma_2 x_{2,t} + \gamma_3 x_{3,t}\right)\). This leads to the following model:<br />
\begin{equation} \label{eq:model4}<br />
    y_t = \beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t} + \sqrt{\exp\left(\gamma_0 + \gamma_2 x_{2,t} + \gamma_3 x_{3,t}\right)} \epsilon_t .<br />
\end{equation}<br />
The model above would not only have the conditional mean depending on the values of explanatory variables (the conventional regression) but also the conditional variance, which would change depending on the values of variables. Note that this model assumes linearity in the conditional mean: an increase of \(x_{1,t}\) by one leads to an increase of \(y_t\) by \(\beta_1\) on average. At the same time, it assumes non-linearity in the variance: an increase of \(x_{2,t}\) by one multiplies the variance by \(\exp(\gamma_2)\), i.e. changes it by \((\exp(\gamma_2)-1)\times 100\)%. If we want a non-linear change in the conditional mean, we can use a model in logarithms. Alternatively, we could assume a different distribution for the response variable \(y_t\). To understand how the latter would work, we need to represent the same model \eqref{eq:model4} in a more general form. For the Normal distribution, the same model \eqref{eq:model4} can be rewritten as:<br />
\begin{equation} \label{eq:model5}<br />
    y_t \sim \mathcal{N}\left(\beta_0 + \beta_1 x_{1,t} + \beta_2 x_{2,t}, \exp\left(\gamma_0 + \gamma_2 x_{2,t} + \gamma_3 x_{3,t}\right)\right).<br />
\end{equation}<br />
This representation allows introducing a scale model for many other distributions, such as Laplace, Generalised Normal, Gamma, Inverse Gaussian etc. All that we need to do in those cases is to substitute the distribution \(\mathcal{N}(\cdot)\) with a distribution of interest. The <code>sm()</code> function supports the same list of distributions as <code>alm()</code> (see <a href="https://cran.r-project.org/web/packages/greybox/vignettes/alm.html" rel="noopener" target="_blank">the vignette</a> for the function on CRAN or in R using the command <code>vignette()</code>). Each specific formula for scale would differ from one distribution to another, but the principles will be the same.</p>
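<p>To make the interpretation of the scale parameters concrete, here is a small base-R check with assumed values \(\gamma_0=0.3\), \(\gamma_2=0.5\) and \(\gamma_3=-0.4\) (the function name <code>scaleVariance()</code> is hypothetical). Because the variance is the exponent of a linear function, increasing \(x_{2,t}\) by one multiplies the variance by \(\exp(\gamma_2)\), whatever the values of the other variables:</p>
<pre class="decode"># Assumed parameters of the scale part of the model
gamma0 <- 0.3
gamma2 <- 0.5
gamma3 <- -0.4

# Conditional variance implied by the scale model
scaleVariance <- function(x2, x3){
    exp(gamma0 + gamma2*x2 + gamma3*x3)
}

# Ratio of variances when x2 increases by one and x3 stays fixed
varianceRatio <- scaleVariance(11, 10) / scaleVariance(10, 10)

# Percentage change in the variance
varianceChange <- (varianceRatio - 1) * 100
c(ratio=varianceRatio, change=varianceChange)</pre>
<p>Here the ratio equals \(\exp(0.5) \approx 1.65\), so the variance increases by roughly 65%.</p>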
<h3>Demonstration in R</h3>
<p>For demonstration purposes, we will use an example with artificial data, generated according to the model \eqref{eq:model4}:</p>
<pre class="decode">xreg <- matrix(rnorm(300,10,3),100,3)
xreg <- cbind(1000-0.75*xreg[,1]+1.75*xreg[,2]+
              sqrt(exp(0.3+0.5*xreg[,2]-0.4*xreg[,3]))*rnorm(100,0,1),xreg)
colnames(xreg) <- c("y",paste0("x",c(1:3)))</pre>
<p>The scatterplot of the generated data will look like this:</p>
<pre class="decode">spread(xreg)</pre>
<div id="attachment_2789" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smExampleSpread.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2789" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smExampleSpread-300x263.png&amp;nocache=1" alt="Scatterplot matrix for the generated data" width="300" height="263" class="size-medium wp-image-2789" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smExampleSpread-300x263.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smExampleSpread-768x672.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smExampleSpread.png&amp;nocache=1 800w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2789" class="wp-caption-text">Scatterplot matrix for the generated data</p></div>
<p>We can then fit a model, specifying the location and scale parts of it in <code>alm()</code>. In this case, <code>alm()</code> will call <code>sm()</code> and estimate both parts via likelihood maximisation. To make things closer to a forecasting task, we will withhold the last 10 observations for the test set:</p>
<pre class="decode">ourModel <- alm(y~x1+x2+x3, scale=~x2+x3, xreg, subset=c(1:90), distribution="dnorm")</pre>
<p>The returned model contains both parts. The scale part of the model can be accessed via <code>ourModel$scale</code>. It is an object of class "scale", supporting several methods, such as<br />
<code>actuals()</code>, <code>residuals()</code>, <code>fitted()</code>, <code>summary()</code> and <code>plot()</code> (and several others). Here is how the summary of the model looks in my case:</p>
<pre class="decode">summary(ourModel)</pre>
<pre>Response variable: y
Distribution used in the estimation: Normal
Loss function used in estimation: likelihood
Coefficients:
             Estimate Std. Error Lower 2.5% Upper 97.5%  
(Intercept) 1000.2850     2.9698   994.3782   1006.1917 *
x1            -0.8350     0.1435    -1.1204     -0.5497 *
x2             1.8656     0.1714     1.5246      2.2065 *
x3            -0.0228     0.1776    -0.3761      0.3305  

Coefficients for scale:
            Estimate Std. Error Lower 2.5% Upper 97.5%  
(Intercept)   0.0436     0.7012    -1.3510      1.4382  
x2            0.4705     0.0413     0.3883      0.5527 *
x3           -0.3355     0.0487    -0.4324     -0.2385 *

Error standard deviation: 4.52
Sample size: 90
Number of estimated parameters: 7
Number of degrees of freedom: 83
Information criteria:
     AIC     AICc      BIC     BICc 
391.0191 392.3849 408.5177 411.5908</pre>
<p>The summary above shows parameters for both parts of the model. They are not far from the ones used in the generation of the data, which indicates that the implemented model works as intended. The only issue here is that the standard errors in the location part of the model (first four coefficients) <strong>do not take the heteroscedasticity into account and thus are biased</strong>. The <a href="https://www.econometrics-with-r.org/15.4-hac-standard-errors.html" rel="noopener" target="_blank">HAC standard errors</a> are not yet implemented in <code>alm()</code>.</p>
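<p>To give an idea of what estimating both parts via likelihood maximisation involves, here is a minimal base-R sketch that fits the same kind of model with <code>optim()</code>. This is not how <code>alm()</code> and <code>sm()</code> are implemented; the seed, the starting values, the location part without \(x_3\) and the names <code>negLogLik</code> and <code>fitScale</code> are assumptions made for illustration:</p>
<pre class="decode"># Generate data in the same way as in the example above
set.seed(41)
x <- matrix(rnorm(300, 10, 3), 100, 3)
y <- 1000 - 0.75*x[,1] + 1.75*x[,2] +
    sqrt(exp(0.3 + 0.5*x[,2] - 0.4*x[,3])) * rnorm(100, 0, 1)

# Negative log-likelihood of the Normal model with location and scale parts
negLogLik <- function(par){
    mu <- par[1] + par[2]*x[,1] + par[3]*x[,2]
    sigma <- sqrt(exp(par[4] + par[5]*x[,2] + par[6]*x[,3]))
    -sum(dnorm(y, mean=mu, sd=sigma, log=TRUE))
}

# Starting values: OLS for the location, constant variance for the scale
olsModel <- lm(y ~ x[,1] + x[,2])
start <- c(coef(olsModel), log(mean(residuals(olsModel)^2)), 0, 0)

# Maximise the likelihood by minimising its negative
fitScale <- optim(start, negLogLik, method="BFGS", control=list(maxit=1000))
parameters <- setNames(fitScale$par, c("b0","b1","b2","g0","g2","g3"))
round(parameters, 4)</pre>
<p>If the optimiser converges, the estimates should be in the neighbourhood of the values used in the data generation, although with 100 observations some deviation is to be expected.</p>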
<p>Just to see the effect of the scale model, here are the diagnostic plots for the original model (which returns the \(\xi_t\) residuals) and for the scale model (\(\epsilon_t\) residuals):</p>
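<pre class="decode">par(mfcol=c(1,2))
# Squared residuals vs fitted values for the location model
plot(ourModel, 5)
# The same plot for the scale model
plot(ourModel$scale, 5)</pre>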
<div id="attachment_2794" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2794" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics-300x175.png&amp;nocache=1" alt="Diagnostics plots for sm" width="300" height="175" class="size-medium wp-image-2794" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smDiagnostics.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2794" class="wp-caption-text">Diagnostics plots for sm</p></div>
<p>The Figure above shows squared residuals vs fitted values for the location (the plot on the left) and the scale (the plot on the right) models. The former is agnostic of the scale model and demonstrates that there is heteroscedasticity of residuals (the variance increases with the increase of the fitted values). The latter shows that the scale model managed to resolve the issue. While the LOWESS line demonstrates some non-linearity, the distribution of residuals conditional on fitted values looks random.</p>
<p>Finally, we can produce forecasts from such a model, similarly to how it is done for any other model estimated with <code>alm()</code>:</p>
<pre class="decode">ourForecast <- predict(ourModel,xreg[-c(1:90),],interval="pred")
plot(ourForecast)</pre>
<div id="attachment_2800" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2800" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast-300x175.png&amp;nocache=1" alt="Forecast from the model" width="300" height="175" class="size-medium wp-image-2800" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2022/01/smForecast.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2800" class="wp-caption-text">Forecast from the model</p></div>
<p>In this case, the function will first predict the scale part of the model, then it will use the predicted variance and the covariance matrix of parameters to calculate the prediction intervals, shown in the Figure above. Given the independence of the location and scale parts of the model, the conditional expectation (point forecast) will not change if we drop the scale model. It is all about variance.</p>
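<p>To make the mechanics of that calculation more tangible, here is a minimal sketch in base R that builds a 95% prediction interval by hand for one new observation, using the estimates from the summary above. It rests on several assumptions that are not verified here: that the scale model for the Normal distribution is log-linear in the variance (so that the variance is the exponent of the linear predictor), that the uncertainty of parameters can be ignored (which <code>predict()</code> does not do), and that the values in <code>xNew</code> and <code>zNew</code> are purely hypothetical. So the numbers will not coincide with the ones produced by <code>predict()</code>:</p>
<pre class="decode"># Estimates of parameters copied from the summary above
locationB <- c("(Intercept)"=1000.2850, x1=-0.8350, x2=1.8656, x3=-0.0228)
scaleB <- c("(Intercept)"=0.0436, x2=0.4705, x3=-0.3355)

# Hypothetical values of explanatory variables for one new observation
xNew <- c(1, 10, 5, 3)   # intercept, x1, x2, x3
zNew <- c(1, 5, 3)       # intercept, x2, x3

# Point forecast comes from the location part only
pointForecast <- sum(locationB * xNew)
# Assumption: log of variance is linear in the scale variables
sigmaNew <- sqrt(exp(sum(scaleB * zNew)))

# Normal prediction interval, ignoring the uncertainty of parameters
level <- 0.95
intervalNew <- c(lower=pointForecast + qnorm((1-level)/2) * sigmaNew,
                 mean=pointForecast,
                 upper=pointForecast + qnorm((1+level)/2) * sigmaNew)
intervalNew</pre>
<p>Changing the values in <code>scaleB</code> or <code>zNew</code> in this sketch only changes the width of the interval, while the point forecast stays the same.</p>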
<p>If you do not want to use the <code>alm()</code> function, you can use <code>lm()</code> instead and then apply <code>sm()</code>:</p>
<pre class="decode">lmModel <- lm(y~x1+x2+x3, as.data.frame(xreg), subset=c(1:90))
smModel <- sm(lmModel, formula=~x2+x3, xreg)</pre>
<p>In this case, <code>sm()</code> will assume that the error term follows the Normal distribution, and we will end up with two models that are not connected with each other (e.g., the <code>predict()</code> method applied to <code>lmModel</code> will not use predictions from the <code>smModel</code>). Nonetheless, we could still use all the R methods discussed above for the analysis of the <code>smModel</code>.</p>
<p>As a final word, the scale model is a new feature. While it already works, there might be bugs in it. If you find any, please let me know by submitting <a href="https://github.com/config-i1/greybox/issues" rel="noopener" target="_blank">an issue on Github</a>.</p>
<h3>P.S.</h3>
<p>There is a danger that the <code>greybox</code> <strong>package will soon be removed from CRAN</strong> together with 88 other packages (including my <code>smooth</code> and <code>legion</code>) because the <code>nloptr</code> package that it relies on has not passed some of the new checks recently introduced by CRAN. This is beyond my control, and I do not have time or power to influence this, but if this happens, you might need to switch to <a href="https://github.com/config-i1/greybox/" rel="noopener" target="_blank">the installation from GitHub</a> via the <code>remotes</code> package, using the command:</p>
<pre class="decode">remotes::install_github("config-i1/greybox")</pre>
<p>My apologies for the inconvenience. I might be able to remove the dependence on <code>nloptr</code> at some point, but it will not happen before March 2022.</p>
<p>Message <a href="https://openforecast.org/2022/01/23/introducing-scale-model-in-greybox/">Introducing scale model in greybox</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2022/01/23/introducing-scale-model-in-greybox/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>An Integrated Method for Estimation and Optimisation</title>
		<link>https://openforecast.org/2021/09/03/an-integrated-method-for-estimation-and-optimisation/</link>
					<comments>https://openforecast.org/2021/09/03/an-integrated-method-for-estimation-and-optimisation/#respond</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Fri, 03 Sep 2021 15:47:15 +0000</pubDate>
				<category><![CDATA[Package greybox for R]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Regression]]></category>
		<category><![CDATA[Univariate models]]></category>
		<category><![CDATA[extrapolation methods]]></category>
		<category><![CDATA[greybox]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[statistics]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=2703</guid>

					<description><![CDATA[<p>My PhD student, Congzheng Liu (co-supervised with Adam Letchford) has written a paper, entitled &#8220;Newsvendor Problems: An Integrated Method for Estimation and Optimisation&#8220;. This paper has recently been published in EJOR. In this paper we build upon the existing Ban &#038; Rudin (2019) approach for newsvendor problem, showing that in case of the linear model, [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2021/09/03/an-integrated-method-for-estimation-and-optimisation/">An Integrated Method for Estimation and Optimisation</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>My PhD student, <a href="https://www.linkedin.com/in/congzheng-liu/" rel="noopener" target="_blank">Congzheng Liu</a> (co-supervised with <a href="https://www.lancaster.ac.uk/staff/letchfoa/default.htm" rel="noopener" target="_blank">Adam Letchford</a>) has written a paper, entitled &#8220;<a href="https://doi.org/10.1016/j.ejor.2021.08.013" rel="noopener" target="_blank">Newsvendor Problems: An Integrated Method for Estimation and Optimisation</a>&#8221;. This paper has recently been published in <a href="https://www.sciencedirect.com/journal/european-journal-of-operational-research" rel="noopener" target="_blank">EJOR</a>. In this paper we build upon the existing <a href="https://doi.org/10.1287/opre.2018.1757" rel="noopener" target="_blank">Ban &#038; Rudin (2019)</a> approach for the newsvendor problem, showing that in the case of the linear model, it becomes equivalent to quantile regression. We then extend it to non-linear newsvendor problems, testing it on simulated and real life data. In order to understand what specifically we propose, we need to discuss the typical process in the case of the newsvendor problem.</p>
<p>Newsvendor is a class of problems in which the product can only be sold for one day, after which it goes to waste. So this is appropriate, for example, for perishable products in retail. Typically, in this situation we would have historical data on the sales of our product \(y_t\) and we would try forecasting it using regression / ETS / ARIMA or any other model. After doing that and obtaining the estimates of parameters, we would produce a quantile of the assumed distribution, which then tells us how much to order (\(q_t\)). If we order more than needed, we will have holding costs. In the opposite case, we will have shortage costs. Based on these costs and the price of the product, we can find the optimal order that will give the maximum profit.</p>
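<p>The conventional two-stage process described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data is artificial, the variable names (<code>temperature</code>, <code>demand</code>, <code>serviceLevel</code> etc.) are hypothetical, the error term is assumed to be Normal, and the quantile level of 0.9091 is simply assumed here (in practice it is derived from the price and costs):</p>
<pre class="decode"># Hypothetical data: demand depends on one explanatory variable
set.seed(41)
temperature <- rnorm(100, 100, 10)
demand <- 10 + 1.5*temperature + rnorm(100, 0, 10)

# Stage 1: forecasting. Estimate the model via OLS
modelConventional <- lm(demand ~ temperature)

# Stage 2: optimisation. Take the quantile of the assumed Normal distribution
# The level is an assumption made for this illustration
serviceLevel <- 0.9091
ordersConventional <- fitted(modelConventional) +
    qnorm(serviceLevel) * sigma(modelConventional)

# Proportion of in-sample cases, where the order covers the demand
coverageConventional <- mean(demand <= ordersConventional)
coverageConventional</pre>
<p>Note that the loss used in the first stage (squared errors) knows nothing about the costs that drive the second stage.</p>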
<p>As you can already spot, the forecasting stage is detached from the optimisation one in this situation. The idea of the proposed integrated approach (IMEO) is simple: instead of optimising the model via MSE or any other conventional loss and then solving the optimisation problem, we could <strong>estimate the model via maximisation of the specific profit function</strong>, thus obtaining the required orders directly. This is not a new idea on its own, but using the profit function rather than the cost (as <a href="https://doi.org/10.1287/opre.2018.1757" rel="noopener" target="_blank">Ban &#038; Rudin, 2019</a> did) allows applying IMEO to a wider set of problems.</p>
<p>For example, if we know the price of the product \(p\), the costs for production \(v\), holding \(c_h\) and shortage costs \(c_s\), we can then calculate profit as (for a linear newsvendor problem):
\begin{equation}
    \pi(q_t,y_t)=
    \begin{cases}
        p y_t -v q_t -c_h (q_t -y_t),&#038; \text{for } q_t \geq y_t\\
        p q_t -v q_t -c_s (y_t -q_t),&#038; \text{for } q_t < y_t,
    \end{cases}
\end{equation}
where \(q_t\) is the order quantity and \(y_t\) is the actual sales. This profit function can be used for the estimation of a model of your choosing. Congzheng has written separate R code for the experiments for the paper. Inspired by his example, I have implemented custom losses in the <code>alm()</code> and <code>adam()</code> functions from the <code>greybox</code> and <code>smooth</code> packages for R, respectively. At the moment, only the regression model works properly with custom losses &#8211; ETS / ARIMA need additional modifications, which we will hopefully resolve in the next paper. So, here is an example with the linear newsvendor problem and <code>alm()</code>:</p>
<pre class="decode"># Generate artificial data
x1 <- rnorm(100,100,10)
x2 <- rbinom(100,2,0.05)
y <- 10 + 1.5*x1 + 5*x2 + rnorm(100,0,10)
ourData <- cbind(y=y,x1=x1,x2=x2)

# Define price and costs
price <- 50
costBasic <- 5
costShort <- 15
costHold <- 1

# Define profit function for the linear case
lossProfit <- function(actual, fitted, B, xreg){
    # Minus sign is needed here, because we need to minimise the loss
    profit <- -ifelse(actual >= fitted,
                     (price - costBasic) * fitted - costShort * (actual - fitted),
                     price * actual - costBasic * fitted - costHold * (fitted - actual));
    return(sum(profit));
}

# Estimate the model
model1 <- alm(y~x1+x2, ourData, loss=lossProfit)

# Print summary of the model
summary(model1, bootstrap=TRUE) </pre>
<pre>Response variable: y
Distribution used in the estimation: Normal
Loss function used in estimation: custom
Bootstrap was used for the estimation of uncertainty of parameters
Coefficients:
            Estimate Std. Error Lower 2.5% Upper 97.5%  
(Intercept)  36.5177    14.2840     2.7783     51.4844 *
x1            1.3622     0.1622     1.1909      1.7528 *
x2            3.3423     2.7810    -6.5997      5.9101  

Error standard deviation: 17.2266
Sample size: 100
Number of estimated parameters: 3
Number of degrees of freedom: 97</pre>
<p>The resulting model is easy to work with: it provides meaningful parameters, showing how on average the order should change if a variable changes by one. For example, we see that if the variable x1 increases by one, the orders should on average increase by 1.36.</p>
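<p>This interpretation can be checked with simple arithmetic. The sketch below takes the estimates from the summary above and calculates the orders for two hypothetical situations, which differ only in the value of x1 (the specific values of 100 and 101 are assumptions made for the illustration):</p>
<pre class="decode"># Estimates of parameters copied from the summary of model1
coefOrders <- c("(Intercept)"=36.5177, x1=1.3622, x2=3.3423)

# Orders for x1=100 and x1=101, with x2=0 in both cases
orderBase <- sum(coefOrders * c(1, 100, 0))
orderPlus <- sum(coefOrders * c(1, 101, 0))

# The difference equals the coefficient for x1
orderPlus - orderBase</pre>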
<p>Note that in this specific case, as shown <a href="https://doi.org/10.1016/j.ejor.2021.08.013" rel="noopener" target="_blank">in our paper</a>, the model would be equivalent to the quantile regression, estimated for the quantile \(\left( \frac{c_u}{c_o+c_u} \right)\), where \(c_u= p-v+c_s\) is the "underage" cost and \(c_o = v+c_h\) is the "overage" cost. In our example it corresponds to approximately 0.9091 quantile. We can compare the output of this model with the one from the quantile regression in <code>alm</code> (which is estimated as an Asymmetric Laplace model):</p>
<pre class="decode">model2 <- alm(y~x1+x2, ourData, distribution="dalaplace", alpha=0.9091)
summary(model2, bootstrap=TRUE)</pre>
<pre>Response variable: y
Distribution used in the estimation: Asymmetric Laplace with alpha=0.9091
Loss function used in estimation: likelihood
Bootstrap was used for the estimation of uncertainty of parameters
Coefficients:
            Estimate Std. Error Lower 2.5% Upper 97.5%  
(Intercept)  36.6688    11.6686     3.8674     51.1987 *
x1            1.3611     0.1338     1.1920      1.7454 *
x2            3.1259     2.5424    -6.2518      5.4703  

Error standard deviation: 17.3379
Sample size: 100
Number of estimated parameters: 4
Number of degrees of freedom: 96
Information criteria:
     AIC     AICc      BIC     BICc 
826.4622 826.8833 836.8829 837.8524</pre>
<p>The differences between the estimates of parameters of the two models are due to the optimisation procedure, which would converge to slightly different points in these two cases. Still, the values of parameters are close to each other and would converge asymptotically, which supports our finding.</p>
<p>And here is how the orders over time look in the case of our custom loss:</p>
<pre class="decode">plot(model1, 7)</pre>
<div id="attachment_2716" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2716" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics-300x175.png&amp;nocache=1" alt="Dynamics of orders from alm model" width="300" height="175" class="size-medium wp-image-2716" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/09/ordersDynamics.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2716" class="wp-caption-text">Dynamics of orders from alm model</p></div>
<p>The purple line in the Figure above corresponds to the orders and would cover roughly 90.91% of cases, so that we would run out of product in approximately 9% of cases, which would still be more profitable than any other option.</p>
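<p>The connection between the profit function and the quantile mentioned earlier can also be checked numerically without any regression. The sketch below is a simplified illustration, not the procedure from the paper: it assumes a constant order for artificial Normal demand (the name <code>demandSample</code> and the parameters of the distribution are hypothetical), maximises the same linear profit function over that single order via <code>optimize()</code> and compares the result with the empirical quantile:</p>
<pre class="decode"># The same price and costs as in the example above
price <- 50
costBasic <- 5
costShort <- 15
costHold <- 1

# Hypothetical demand
set.seed(41)
demandSample <- rnorm(10000, 160, 20)

# Total profit for a constant order q
profitTotal <- function(q, y){
    sum(ifelse(y >= q,
               (price - costBasic) * q - costShort * (y - q),
               price * y - costBasic * q - costHold * (q - y)))
}

# The order that maximises the profit
orderOptimal <- optimize(profitTotal, interval=range(demandSample),
                         y=demandSample, maximum=TRUE)$maximum

# The quantile based on the underage and overage costs
costUnder <- price - costBasic + costShort
costOver <- costBasic + costHold
levelOptimal <- costUnder / (costOver + costUnder)
orderQuantile <- quantile(demandSample, levelOptimal)

# The two values should be close to each other
c(optimal=orderOptimal, quantile=unname(orderQuantile))</pre>
<p>With these costs, <code>levelOptimal</code> is 60/66, i.e. approximately 0.9091, which is the value used for <code>alpha</code> in the quantile regression above.</p>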
<p>Finally, the approach also works well in the case of the non-linear newsvendor problem (see <a href="https://doi.org/10.1016/j.ejor.2021.08.013" rel="noopener" target="_blank">the paper</a> for details), where quantile regression is not suitable and the conventional approach fails. The only thing that would change is the loss function, where the prices and costs would depend non-linearly on the order quantity and sales.</p>
<p>You can read <a href="https://doi.org/10.1016/j.ejor.2021.08.013" rel="noopener" target="_blank">the published paper on EJOR website</a> or the working paper on <a href="http://dx.doi.org/10.13140/RG.2.2.27057.81763" rel="noopener" target="_blank">ResearchGate</a>.</p>
<p>Message <a href="https://openforecast.org/2021/09/03/an-integrated-method-for-estimation-and-optimisation/">An Integrated Method for Estimation and Optimisation</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2021/09/03/an-integrated-method-for-estimation-and-optimisation/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>The creation of ADAM &#8211; next step in statistical forecasting</title>
		<link>https://openforecast.org/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/</link>
					<comments>https://openforecast.org/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/#respond</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Wed, 13 Jan 2021 11:24:18 +0000</pubDate>
				<category><![CDATA[adam()]]></category>
		<category><![CDATA[ARIMA]]></category>
		<category><![CDATA[ETS]]></category>
		<category><![CDATA[Package smooth for R]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Regression]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[smooth]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=2552</guid>

					<description><![CDATA[<p>Good news everyone! The future of statistical forecasting is finally here :). Have you ever struggled with ETS and needed explanatory variables? Have you ever needed to unite ARIMA and ETS? Have you ever needed to deal with all those zeroes in the data? What about the data with multiple seasonalities? All of this and [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/">The creation of ADAM &#8211; next step in statistical forecasting</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Good news, everyone! The future of statistical forecasting is finally here :). Have you ever struggled with ETS and needed explanatory variables? Have you ever needed to unite ARIMA and ETS? Have you ever needed to deal with all those zeroes in the data? What about the data with multiple seasonalities? All of this and more can now be solved by the <code>adam()</code> function from the smooth v3.0.1 package for R (<a href="https://cran.r-project.org/package=smooth">on its way to CRAN now</a>). ADAM stands for &#8220;Augmented Dynamic Adaptive Model&#8221; (I will talk about it in the next <a href="https://cmaf-fft.lp151.com/" rel="noopener" target="_blank">CMAF Friday Forecasting Talk</a> on 15th January). Now, what is ADAM? Well, something like this:</p>
<div id="attachment_2557" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2020/12/Touched_by_His_Noodly_Appendage_HD-smooth.jpg&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2557" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2020/12/Touched_by_His_Noodly_Appendage_HD-smooth-300x142.jpg&amp;nocache=1" alt="ADAM, smooth and His Noodly Appendage Flying Spaghetti Monster" width="300" height="142" class="size-medium wp-image-2557" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2020/12/Touched_by_His_Noodly_Appendage_HD-smooth-300x142.jpg&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2020/12/Touched_by_His_Noodly_Appendage_HD-smooth-768x364.jpg&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2020/12/Touched_by_His_Noodly_Appendage_HD-smooth.jpg&amp;nocache=1 1000w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2557" class="wp-caption-text">The Creation of ADAM by <a href="http://www.androidarts.com/">Arne Niklas Jansson</a> with my adaptation</p></div>
<p>ADAM is the next step in time series analysis and forecasting. Remember <a href="/en/2016/10/14/smooth-package-for-r-es-i/">exponential smoothing</a> and functions like <code>es()</code> and <code>ets()</code>? Remember ARIMA and functions like <code>arima()</code>, <code>ssarima()</code>, <code>msarima()</code> etc? Remember your favourite <a href="/en/2019/01/07/marketing-analytics-with-greybox/">linear regression function</a>, e.g. <code>lm()</code>, <code>glm()</code> or <code>alm()</code>? Well, now these three models are implemented in a unified framework. Now you can have exponential smoothing with ARIMA elements and explanatory variables in one box: <code>adam()</code>. You can do ETS components and ARIMA orders selection, together with explanatory variables selection in one go. You can estimate ETS / ARIMA / regression using either likelihood of a selected distribution or using conventional losses like MSE, or even using your own custom loss. You can tune the parameters of the optimiser and experiment with initialisation and estimation of the model. The function can deal with multiple seasonalities and with intermittent data in one place. In fact, there are so many features that it is just easier to list the major ones:</p>
<ol>
<li>ETS;</li>
<li>ARIMA;</li>
<li>Regression;</li>
<li>TVP regression;</li>
<li>Combination of (1), (2) and either (3), or (4);</li>
<li>Automatic selection / combination of states for ETS;</li>
<li>Automatic orders selection for ARIMA;</li>
<li>Variables selection for regression part;</li>
<li>Normal and non-normal distributions;</li>
<li>Automatic selection of most suitable distributions;</li>
<li>Advanced and custom loss functions;</li>
<li>Multiple seasonality;</li>
<li>Occurrence part of the model to handle zeroes in data (intermittent demand);</li>
<li>Model diagnostics using plot() and other methods;</li>
<li>Confidence intervals for parameters of models;</li>
<li>Automatic outliers detection;</li>
<li>Handling missing data;</li>
<li>Fine tuning of persistence vector (smoothing parameters);</li>
<li>Fine tuning of initial values of the state vector (e.g. level / trend / seasonality / ARIMA components / regression parameters);</li>
<li>Two initialisation options (optimal / backcasting);</li>
<li>Provided ARMA parameters;</li>
<li>Fine tuning of optimiser (select algorithm and convergence criteria);</li>
<li>&#8230;</li>
</ol>
<p>All of this is based on the Single Source of Error state space model, which makes ETS, ARIMA and regression directly comparable via information criteria and opens a variety of modelling and forecasting possibilities. In addition, the code is much more efficient than the code of already existing smooth functions, so hopefully this will be a convenient function to use. I do not promise that everything will work 100% efficiently from scratch, because this is a new function, which implies that inevitably there are bugs and there is room for improvement. But I intend to continue working on it, improving it further, based on the provided feedback (you can submit <a href="https://github.com/config-i1/smooth/issues">an issue on GitHub</a> if you have ideas).</p>
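<p>As a small illustration of that comparability, the sketch below estimates an ETS and an ARIMA model with <code>adam()</code> on the same data and puts their AIC values side by side. It assumes that the <code>smooth</code> package is installed; the <code>BJsales</code> series (from the <code>datasets</code> package) and the specific models, ETS(A,A,N) and ARIMA(1,1,1), are arbitrary choices made for the example rather than recommendations:</p>
<pre class="decode">library(smooth)

# ETS(A,A,N), estimated via adam()
modelETS <- adam(BJsales, "AAN")
# ARIMA(1,1,1), estimated via adam() with the ETS part switched off
modelARIMA <- adam(BJsales, "NNN", orders=list(ar=1, i=1, ma=1))

# The model with the lower value is preferred by the criterion
c(ETS=AIC(modelETS), ARIMA=AIC(modelARIMA))</pre>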
<p>Keep in mind that starting from smooth v3.0.0 I will not be introducing new features in <code>es()</code>, <code>ssarima()</code> and other conventional functions for univariate variables in <code>smooth</code> &#8211; I will only fix bugs in them and possibly optimise some parts of the code, but there will be no innovations in them, given that the main focus from now on will be on <code>adam()</code>. To that extent, I have removed some experimental and not fully developed parameters from those functions (e.g. occurrence, oesmodel, updateX, persistenceX and transitionX).</p>
<p>Now, I realise that ADAM is something completely new and contains just too much information to cover in one post. As a result, I have started the work on an <a href="https://openforecast.org/adam/" rel="noopener" target="_blank">online textbook</a>. This is work in progress, missing some chapters, but it already covers many important elements of ADAM. If you find any mistakes in the text or formulae, please, use the &#8220;Open Review&#8221; functionality in the textbook to give me feedback or send me a message. This will be highly appreciated, because, working on this alone, I am sure that I have made plenty of mistakes and typos.</p>
<h3>Example in R</h3>
<p>Finally, it would be boring just to announce things and leave it like that. So, I&#8217;ve decided to come up with an R experiment on M, M3 and tourism competitions data, similar to how I&#8217;ve <a href="/en/2018/01/01/smooth-functions-in-2017/" rel="noopener" target="_blank">done it in 2017</a>, just to show how the function compares with the other conventional ones, measuring their accuracy and computational time:</p>
<div class="su-spoiler su-spoiler-style-fancy su-spoiler-icon-plus su-spoiler-closed" data-scroll-offset="0" data-anchor-in-url="no"><div class="su-spoiler-title" tabindex="0" role="button"><span class="su-spoiler-icon"></span>Huge chunk of code in R</div><div class="su-spoiler-content su-u-clearfix su-u-trim">
<pre class="decode"># Load the packages. If the packages are not available, install them from CRAN
library(Mcomp)
library(Tcomp)
library(smooth)
library(forecast)

# Load the packages for parallel calculation
# This package is available for Linux and MacOS only
# Comment out this line if you work on Windows
library(doMC)

# Set up the cluster on all cores / threads.
## Note that the code that follows might take around 500Mb per thread,
## so the issue is not in the number of threads, but rather in the RAM availability
## If you do not have enough RAM,
## you might need to reduce the number of threads manually.
## But this should not be greater than the number of threads your processor can do.
registerDoMC(detectCores())

##### Alternatively, if you work on Windows (why?), uncomment and run the following lines
# library(doParallel)
# cl <- makeCluster(detectCores())
# registerDoParallel(cl)
#####

# Create a small but neat function that will return a vector of error measures
errorMeasuresFunction <- function(object, holdout, insample){
    return(c(measures(holdout, object$mean, insample),
             mean(holdout < object$upper &#038; holdout > object$lower),
             mean(object$upper-object$lower)/mean(insample),
             pinball(holdout, object$upper, 0.975)/mean(insample),
             pinball(holdout, object$lower, 0.025)/mean(insample),
             sMIS(holdout, object$lower, object$upper, mean(insample),0.95),
             object$timeElapsed))
}

# Create the list of datasets
datasets <- c(M1,M3,tourism)
datasetLength <- length(datasets)
# Give names to competing forecasting methods
methodsNames <- c("ADAM-ETS(ZZZ)","ADAM-ETS(ZXZ)","ADAM-ARIMA",
                  "ETS(ZXZ)","ETSHyndman","AutoSSARIMA","AutoARIMA");
methodsNumber <- length(methodsNames);
# Run adam on one of time series from the competitions to get names of error measures
test <- adam(datasets[[125]]);
# The array with error measures for each method on each series.
## Here we calculate a lot of error measures, but we will use only few of them
testResults <- array(NA,c(methodsNumber,datasetLength,length(test$accuracy)+6),
                             dimnames=list(methodsNames, NULL,
                                           c(names(test$accuracy),
                                             "Coverage","Range",
                                             "pinballUpper","pinballLower","sMIS",
                                             "Time")));

#### ADAM(ZZZ) ####
j <- 1;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="smooth") %dopar% {
    startTime <- Sys.time()
    test <- adam(datasets[[i]],"ZZZ");
    testForecast <- forecast(test, h=datasets[[i]]$h, interval="pred");
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### ADAM(ZXZ) ####
j <- 2;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="smooth") %dopar% {
    startTime <- Sys.time()
    test <- adam(datasets[[i]],"ZXZ");
    testForecast <- forecast(test, h=datasets[[i]]$h, interval="pred");
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### ADAMARIMA ####
j <- 3;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="smooth") %dopar% {
    startTime <- Sys.time()
    test <- adam(datasets[[i]], "NNN",
                 order=list(ar=c(3,2),i=c(2,1),ma=c(3,2),select=TRUE));
    testForecast <- forecast(test, h=datasets[[i]]$h, interval="pred");
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### ES(ZXZ) ####
j <- 4;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="smooth") %dopar% {
    startTime <- Sys.time()
    test <- es(datasets[[i]],"ZXZ");
    testForecast <- forecast(test, h=datasets[[i]]$h, interval="parametric");
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### ETS from forecast package ####
j <- 5;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="forecast") %dopar% {
    startTime <- Sys.time()
    test <- ets(datasets[[i]]$x);
    testForecast <- forecast(test, h=datasets[[i]]$h, level=95);
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### AUTO SSARIMA ####
j <- 6;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="smooth") %dopar% {
    startTime <- Sys.time()
    test <- auto.ssarima(datasets[[i]]);
    testForecast <- forecast(test, h=datasets[[i]]$h, interval=TRUE);
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

#### AUTOARIMA ####
j <- 7;
result <- foreach(i=1:datasetLength, .combine="cbind", .packages="forecast") %dopar% {
    startTime <- Sys.time()
    test <- auto.arima(datasets[[i]]$x);
    testForecast <- forecast(test, h=datasets[[i]]$h, level=95);
    testForecast$timeElapsed <- Sys.time() - startTime;
    return(errorMeasuresFunction(testForecast, datasets[[i]]$xx, datasets[[i]]$x));
}
testResults[j,,] <- t(result);

# If you work on Windows, don't forget to shut down the cluster via the following command:
# stopCluster(cl)</pre>
</div></div>
<p>After running this code, we get a large array (7x5315x21, i.e. functions by series by measures), which contains many different error measures for <a href="/en/2019/08/25/are-you-sure-youre-precise-measuring-accuracy-of-point-forecasts/">point forecasts</a> and <a href="/en/2019/10/18/how-confident-are-you-assessing-the-uncertainty-in-forecasting/">prediction intervals</a>. We will not use all of them, but instead will extract MASE and RMSSE for point forecasts and Coverage, Range and sMIS for prediction intervals, together with computational time. Although it might be more informative to look at the distributions of those variables, we will calculate mean and median values overall, just to get a feeling for the performance:<br />
<div class="su-spoiler su-spoiler-style-fancy su-spoiler-icon-plus su-spoiler-closed" data-scroll-offset="0" data-anchor-in-url="no"><div class="su-spoiler-title" tabindex="0" role="button"><span class="su-spoiler-icon"></span>A much smaller chunk of code in R</div><div class="su-spoiler-content su-u-clearfix su-u-trim">
<pre class="decode">round(apply(testResults[,,c("MASE","RMSSE","Coverage","Range","sMIS","Time")],
            c(1,3),mean),3)
round(apply(testResults[,,c("MASE","RMSSE","Range","sMIS","Time")],
            c(1,3),median),3)</pre>
</div></div>
This will result in the following two tables (boldface shows the best performing functions):</p>
<pre><strong>Means</strong>:
               MASE RMSSE Coverage Range  sMIS  Time
ADAM-ETS(ZZZ) 2.415 2.098    0.888 1.398 2.437 0.654
ADAM-ETS(ZXZ) <strong>2.250 1.961    0.895</strong> 1.225 <strong>2.092</strong> 0.497
ADAM-ARIMA    2.551 2.203    0.862 0.968 3.098 5.990
ETS(ZXZ)      2.279 1.977    0.862 1.372 2.490 1.128
ETSHyndman    2.263 1.970    0.882 1.200 2.258 <strong>0.404</strong>
AutoSSARIMA   2.482 2.134    0.801 <strong>0.780</strong> 3.335 1.700
AutoARIMA     2.303 1.989    0.834 0.805 3.013 1.385

<strong>Medians</strong>:
               MASE RMSSE Range  sMIS  Time
ADAM-ETS(ZZZ) 1.362 1.215 0.671 0.917 0.396
ADAM-ETS(ZXZ) 1.327 1.184 0.675 0.909 0.310
ADAM-ARIMA    1.476 1.300 0.769 1.006 3.525
ETS(ZXZ)      1.335 1.198 0.616 0.931 0.551
ETSHyndman    1.323 <strong>1.181</strong> 0.653 0.925 <strong>0.164</strong>
AutoSSARIMA   1.419 1.271 <strong>0.577</strong> 0.988 0.909
AutoARIMA     <strong>1.310</strong> 1.182 0.609 <strong>0.881</strong> 0.322</pre>
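<p>To make the aggregation step concrete, here is a minimal sketch of what the <code>apply()</code> calls above do, using a small made-up array in place of <code>testResults</code>. The object <code>toyResults</code>, its dimension names and its values are hypothetical and serve only to show how the second dimension (the series) is averaged out:</p>
<pre class="decode"># Hypothetical toy array: 2 functions x 3 series x 4 measures
toyResults <- array(1:24, dim=c(2,3,4),
                    dimnames=list(c("A","B"), NULL,
                                  c("MASE","RMSSE","sMIS","Time")))
# Keep dimensions 1 (functions) and 3 (measures), averaging over the series
toyMeans <- apply(toyResults[,,c("MASE","Time")], c(1,3), mean)
toyMeans</pre>
<p>The result is a matrix with one row per function and one column per selected measure, which is the layout of the two tables above.</p>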
<p>Some things to note from this:</p>
<ul>
<li>ADAM ETS(ZXZ) is the most accurate model in terms of mean MASE and RMSSE. It also has the coverage closest to 95% (although none of the models achieved the nominal value because of the fundamental underestimation of uncertainty) and the lowest sMIS, implying that it did better than the other functions in terms of prediction intervals;</li>
<li>ETS(ZZZ) did worse than ETS(ZXZ) because the former considers the multiplicative trend, which sometimes becomes unstable, producing exploding trajectories;</li>
<li>ADAM ARIMA is not performing well yet because of the implemented order selection algorithm, and it was the slowest function of all. I plan to improve it in future releases of the function;</li>
<li>While ADAM ETS(ZXZ) did not beat ETS from the forecast package in terms of computational time, it was faster than the other functions;</li>
<li>When it comes to medians, <code>auto.arima()</code>, <code>ets()</code> and <code>auto.ssarima()</code> seem to be doing better than ADAM, but not by a large margin.</li>
</ul>
<p>To see whether the differences in performance between the functions are statistically significant, we run <a href="/en/2020/08/17/accuracy-of-forecasting-methods-can-you-tell-the-difference/">the RMCB test</a> for MASE, RMSSE and sMIS. Note that RMCB compares the median performance of functions. Here is the R code:<br />
<div class="su-spoiler su-spoiler-style-fancy su-spoiler-icon-plus su-spoiler-closed" data-scroll-offset="0" data-anchor-in-url="no"><div class="su-spoiler-title" tabindex="0" role="button"><span class="su-spoiler-icon"></span>A smaller chunk of code in R for the MCB test</div><div class="su-spoiler-content su-u-clearfix su-u-trim">
<pre class="decode"># Load the package with the function
library(greybox)
# Run it for each separate measure, automatically producing plots
rmcbResultMASE <- rmcb(t(testResults[,,"MASE"]))
rmcbResultRMSSE <- rmcb(t(testResults[,,"RMSSE"]))
rmcbResultsMIS <- rmcb(t(testResults[,,"sMIS"]))</pre>
</div></div>
<p>And here are the figures that we get by running that code:</p>
<div id="attachment_2599" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2599" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE-300x175.png&amp;nocache=1" alt="RMCB test for MASE" width="300" height="175" class="size-medium wp-image-2599" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBMASE.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2599" class="wp-caption-text">RMCB test for MASE</p></div>
<div id="attachment_2598" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2598" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE-300x175.png&amp;nocache=1" alt="RMCB test for RMSSE" width="300" height="175" class="size-medium wp-image-2598" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBRMSSE.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2598" class="wp-caption-text">RMCB test for RMSSE</p></div>
<p>As we can see from the two figures above, ADAM-ETS(Z,X,Z) performs better than the other functions, although the difference from ETS implemented in the <code>es()</code> and <code>ets()</code> functions is not statistically significant. ADAM-ARIMA is the worst performing function for the moment, as we have already noticed in the previous analysis. The ranking is similar for both MASE and RMSSE.</p>
<p>And here is the sMIS plot:</p>
<div id="attachment_2597" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2597" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS-300x175.png&amp;nocache=1" alt="RMCB test for sMIS" width="300" height="175" class="size-medium wp-image-2597" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2021/01/adamTestsRMCBsMIS.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2597" class="wp-caption-text">RMCB test for sMIS</p></div>
<p>When it comes to sMIS, the leader in terms of medians is <code>auto.arima()</code>, performing similarly to <code>ets()</code>, but this is mainly because they have lower ranges, which in turn results in lower than needed coverage (as seen from the summary performance above). ADAM-ETS performs similarly to <code>ets()</code> and <code>es()</code> in this respect (the intervals of the three intersect).</p>
<p>Obviously, we could provide a more detailed analysis of the performance of the functions on different types of data and see how they compare in each category. However, the aim of this post is just to demonstrate how the new function works, so I do not intend to investigate this in detail.</p>
<p>Finally, I will present ADAM with several case studies at the <a href="https://cmaf-fft.lp151.com/" rel="noopener" target="_blank">CMAF Friday Forecasting Talk</a> on 15th January. If you are interested in hearing more and have some questions, please <a href="https://www.meetup.com/cmaf-friday-forecasting-talks/" rel="noopener" target="_blank">register on MeetUp</a> or <a href="https://www.linkedin.com/events/cmaffft-toinfinityandbeyond-for6751883043834773504/" rel="noopener" target="_blank">via LinkedIn</a> and join us online.</p>
<p>Message <a href="https://openforecast.org/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/">The creation of ADAM &#8211; next step in statistical forecasting</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2021/01/13/the-creation-of-adam-next-step-in-statistical-forecasting/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Analytics with greybox</title>
		<link>https://openforecast.org/2019/01/07/marketing-analytics-with-greybox/</link>
					<comments>https://openforecast.org/2019/01/07/marketing-analytics-with-greybox/#comments</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Mon, 07 Jan 2019 16:40:17 +0000</pubDate>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Package greybox for R]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[greybox]]></category>
		<category><![CDATA[regression]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=1893</guid>

					<description><![CDATA[<p>One of the reasons why I started the greybox package is to use it for marketing research and marketing analytics. A common problem that I face when working with these courses is analysing data measured in different scales. While R handles numeric scales natively, its handling of categorical scales is not satisfactory. Yes, I [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2019/01/07/marketing-analytics-with-greybox/">Analytics with greybox</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>One of the reasons why I started the <span class="lang:r decode:true crayon-inline">greybox</span> package is to use it for marketing research and marketing analytics. A common problem that I face when working with these courses is analysing data measured in different scales. While R handles numeric scales natively, its handling of categorical scales is not satisfactory. Yes, I know that there are packages that implement some of the functions, but I wanted to have them in one place without the need to install a lot of packages and satisfy the dependencies. After all, what&#8217;s the point in installing a package for Cramer&#8217;s V, when it can be calculated with two lines of code? So, here&#8217;s a brief explanation of the functions for marketing analytics in <span class="lang:r decode:true crayon-inline">greybox</span>.</p>
<p>I will use the `mtcars` dataset for the examples, but we will transform some of the variables into factors:</p>
<pre class="decode">mtcarsData &lt;- as.data.frame(mtcars)
mtcarsData$vs &lt;- factor(mtcarsData$vs, levels=c(0,1), labels=c("v","s"))
mtcarsData$am &lt;- factor(mtcarsData$am, levels=c(0,1), labels=c("a","m"))</pre>
<p><em>All the functions discussed in this post are available in <span class="lang:r decode:true crayon-inline">greybox</span> starting from v0.4.0. However, I&#8217;ve found several bugs since the submission to CRAN, and the most recent version with bugfixes is now <a href="https://github.com/config-i1/greybox" rel="noopener noreferrer" target="_blank">available on github</a>.</em></p>
<h2>Analysing the relation between the two variables in categorical scales</h2>
<h3>Cramer&#8217;s V</h3>
<p>Cramer&#8217;s V measures the relation between two variables in categorical scales. It is implemented in the <span class="lang:r decode:true crayon-inline">cramer()</span> function. It returns the value in the range from 0 to 1 (1 &#8211; when the two categorical variables are perfectly associated with each other, 0 &#8211; when there is no association), the Chi-squared statistic from <span class="lang:r decode:true crayon-inline">chisq.test()</span>, the respective p-value and the number of degrees of freedom. The tested hypothesis in this case is formulated as:<br />
\begin{matrix}<br />
H_0: V = 0 \text{ (the variables don&#8217;t have association);} \\<br />
H_1: V \neq 0 \text{ (there is an association between the variables).}<br />
\end{matrix}</p>
<p>Here&#8217;s what we get when trying to find the association between the engine and transmission in the `mtcars` data:</p>
<pre class="decode">cramer(mtcarsData$vs, mtcarsData$am)</pre>
<pre>Cramer's V: 0.1042
Chi^2 statistics = 0.3475, df: 1, p-value: 0.5555</pre>
<p>Judging by this output, the association between these two variables is very low (close to zero) and is not statistically significant.</p>
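<p>For readers who want to see where the number comes from, here is a minimal sketch in base R. It assumes the textbook formula \(V = \sqrt{\chi^2 / (n (\min(r, c) - 1))}\), where \(n\) is the sample size and \(r\) and \(c\) are the numbers of rows and columns of the contingency table, and it assumes that the default settings of <span class="lang:r decode:true crayon-inline">chisq.test()</span> are used. The helper name <span class="lang:r decode:true crayon-inline">cramerSketch()</span> is hypothetical and is not part of <span class="lang:r decode:true crayon-inline">greybox</span>:</p>
<pre class="decode"># Hypothetical helper (not the greybox implementation)
cramerSketch &lt;- function(x, y){
    # Contingency table of the two variables
    tab &lt;- table(x, y)
    # chisq.test() may warn about small expected counts, so silence it here
    chi2 &lt;- suppressWarnings(chisq.test(tab))$statistic
    # Scale the statistic by the sample size and the smaller table dimension
    V &lt;- sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1)))
    return(unname(V))
}
# Engine shape vs transmission, taken directly from mtcars
round(cramerSketch(mtcars$vs, mtcars$am), 4)</pre>
<p>If these assumptions hold, the result should agree with the value reported by <span class="lang:r decode:true crayon-inline">cramer()</span> above.</p>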
<p>Cramer&#8217;s V can also be used for the data in numerical scales. In general, this might not be the most suitable solution, but it might be useful when you have a small number of distinct values in the data. For example, the variable `gear` in `mtcars` is numerical, but it has only three options (3, 4 and 5). Here&#8217;s what Cramer&#8217;s V tells us in the case of `gear` and `am`:</p>
<pre class="decode">cramer(mtcarsData$am, mtcarsData$gear)</pre>
<pre>Cramer's V: 0.809
Chi^2 statistics = 20.9447, df: 2, p-value: 0</pre>
<p>As we see, the value is high in this case (0.809), and the null hypothesis is rejected at the 5% level. So we can conclude that there is a relation between the two variables. This does not mean that one variable causes the other one: they both might be driven by something else (do more expensive cars have fewer gears but automatic transmission?).</p>
<h3>Plotting categorical variables</h3>
<p>While R allows plotting two categorical variables against each other, the plot is hard to read and is not very helpful (in my opinion):</p>
<pre class="decode">plot(table(mtcarsData$am,mtcarsData$gear))</pre>
<div id="attachment_1912" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsPlot.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1912" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsPlot-300x300.png&amp;nocache=1" alt="" width="300" height="300" class="size-medium wp-image-1912" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsPlot-300x300.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsPlot-150x150.png&amp;nocache=1 150w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsPlot.png&amp;nocache=1 700w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1912" class="wp-caption-text">Default plot of a table</p></div>
<p>So I have created a function that produces a heat map for two categorical variables. It is called <span class="lang:r decode:true crayon-inline">tableplot()</span>:</p>
<pre class="decode">tableplot(mtcarsData$am,mtcarsData$gear)</pre>
<div id="attachment_1915" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsTableplot.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1915" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsTableplot-300x300.png&amp;nocache=1" alt="" width="300" height="300" class="size-medium wp-image-1915" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsTableplot-300x300.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsTableplot-150x150.png&amp;nocache=1 150w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsTableplot.png&amp;nocache=1 700w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1915" class="wp-caption-text">Tableplot for the two categorical variables</p></div>
<p>It is based on the <span class="lang:r decode:true crayon-inline">table()</span> function and uses the frequencies inside the table for the colours:</p>
<pre class="decode">table(mtcarsData$am,mtcarsData$gear) / length(mtcarsData$am)</pre>
<pre>        3       4       5
a 0.46875 0.12500 0.00000
m 0.00000 0.25000 0.15625</pre>
<p>The darker sectors mean that there is a higher concentration of values, while the white ones correspond to zeroes. So, in our example, we see that the largest group of cars (almost 47%) has automatic transmissions with three gears. Furthermore, the plot shows that there is some sort of relation between the two variables: the cars with automatic transmissions have a lower number of gears, while the ones with manual transmissions have a higher number of gears (something we&#8217;ve already noticed in the previous subsection).</p>
<h2>Association between the categorical and numerical variables</h2>
<p>While Cramer&#8217;s V can also be used for the measurement of association between the variables in different scales, there are better instruments. For example, some analysts recommend using the intraclass correlation coefficient when measuring the relation between numerical and categorical variables. But there is a simpler option, which involves calculating the coefficient of multiple correlation between the variables. This is implemented in the <span class="lang:r decode:true crayon-inline">mcor()</span> function of <span class="lang:r decode:true crayon-inline">greybox</span>. The `y` variable should be numerical, while `x` can be of any type. The function expands all the factors into dummy variables and runs a regression via the <span class="lang:r decode:true crayon-inline">.lm.fit()</span> function, returning the square root of the coefficient of determination. If the variables are linearly related, then the returned value will be close to one. Otherwise it will be closer to zero. The function also returns the F statistic from the regression, the associated p-value and the number of degrees of freedom (the hypothesis is formulated similarly to the one in the <span class="lang:r decode:true crayon-inline">cramer()</span> function).</p>
<p>Here&#8217;s how it works:</p>
<pre class="decode">mcor(mtcarsData$am,mtcarsData$mpg)</pre>
<pre>Multiple correlations value: 0.5998
F-statistics = 16.8603, df: 1, df resid: 30, p-value: 3e-04</pre>
<p>In this example, a simple linear regression of mpg on the set of dummy variables is constructed, and we can conclude that there is a linear relation between the variables, and that this relation is statistically significant.</p>
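<p>The same idea can be sketched with the standard <span class="lang:r decode:true crayon-inline">lm()</span> function. This is only an illustration of the principle described above, not the code of <span class="lang:r decode:true crayon-inline">mcor()</span>: it uses <span class="lang:r decode:true crayon-inline">lm()</span> instead of <span class="lang:r decode:true crayon-inline">.lm.fit()</span> for readability, and the object names <span class="lang:r decode:true crayon-inline">fit</span> and <span class="lang:r decode:true crayon-inline">mcorSketch</span> are hypothetical:</p>
<pre class="decode"># Regress the numerical variable on the dummy-coded categorical one
fit &lt;- lm(mpg ~ factor(am), data=mtcars)
# The multiple correlation is the square root of the coefficient of determination
mcorSketch &lt;- sqrt(summary(fit)$r.squared)
round(mcorSketch, 4)
# The F statistic with its degrees of freedom
summary(fit)$fstatistic</pre>
<p>With one categorical variable that has two levels, this reduces to the absolute value of the usual correlation coefficient, so the result should agree with the output of <span class="lang:r decode:true crayon-inline">mcor()</span> above.</p>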
<h2>Association between several variables</h2>
<h3>Measures of association</h3>
<p>When you deal with datasets (i.e. data frames or matrices), you can use the <span class="lang:r decode:true crayon-inline">cor()</span> function in order to calculate the correlation coefficients between the variables in the data. But when you have a mixture of numerical and categorical variables, the situation becomes more difficult, as the correlation does not make sense for the latter. This motivated me to create a function that uses either <span class="lang:r decode:true crayon-inline">cor()</span>, or <span class="lang:r decode:true crayon-inline">cramer()</span>, or <span class="lang:r decode:true crayon-inline">mcor()</span>, depending on the types of data (see the discussions of <span class="lang:r decode:true crayon-inline">cramer()</span> and <span class="lang:r decode:true crayon-inline">mcor()</span> above). The function is called <span class="lang:r decode:true crayon-inline">association()</span> or <span class="lang:r decode:true crayon-inline">assoc()</span> and returns three matrices: the values of the measures of association, their p-values and the types of the functions used between the variables. Here&#8217;s an example:</p>
<pre class="decode">assocValues &lt;- assoc(mtcarsData)
print(assocValues,digits=2)</pre>
<pre> Associations: 
 values:
        mpg  cyl  disp    hp  drat    wt  qsec   vs   am gear carb
 mpg   1.00 0.86 -0.85 -0.78  0.68 -0.87  0.42 0.66 0.60 0.66 0.67
 cyl   0.86 1.00  0.92  0.84  0.70  0.78  0.59 0.82 0.52 0.53 0.62
 disp -0.85 0.92  1.00  0.79 -0.71  0.89 -0.43 0.71 0.59 0.77 0.56
 hp   -0.78 0.84  0.79  1.00 -0.45  0.66 -0.71 0.72 0.24 0.66 0.79
 drat  0.68 0.70 -0.71 -0.45  1.00 -0.71  0.09 0.44 0.71 0.83 0.33
 wt   -0.87 0.78  0.89  0.66 -0.71  1.00 -0.17 0.55 0.69 0.66 0.61
 qsec  0.42 0.59 -0.43 -0.71  0.09 -0.17  1.00 0.74 0.23 0.63 0.67
 vs    0.66 0.82  0.71  0.72  0.44  0.55  0.74 1.00 0.10 0.62 0.69
 am    0.60 0.52  0.59  0.24  0.71  0.69  0.23 0.10 1.00 0.81 0.44
 gear  0.66 0.53  0.77  0.66  0.83  0.66  0.63 0.62 0.81 1.00 0.51
 carb  0.67 0.62  0.56  0.79  0.33  0.61  0.67 0.69 0.44 0.51 1.00
 
 p-values:
       mpg  cyl disp   hp drat   wt qsec   vs   am gear carb
 mpg  1.00 0.00 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.01
 cyl  0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.01
 disp 0.00 0.00 1.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.07
 hp   0.00 0.00 0.00 1.00 0.01 0.00 0.00 0.00 0.18 0.00 0.00
 drat 0.00 0.00 0.00 0.01 1.00 0.00 0.62 0.01 0.00 0.00 0.66
 wt   0.00 0.00 0.00 0.00 0.00 1.00 0.34 0.00 0.00 0.00 0.02
 qsec 0.02 0.00 0.01 0.00 0.62 0.34 1.00 0.00 0.21 0.00 0.01
 vs   0.00 0.00 0.00 0.00 0.01 0.00 0.00 1.00 0.56 0.00 0.01
 am   0.00 0.01 0.00 0.18 0.00 0.00 0.21 0.56 1.00 0.00 0.28
 gear 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.09
 carb 0.01 0.01 0.07 0.00 0.66 0.02 0.01 0.01 0.28 0.09 1.00
 
 types:
      mpg    cyl      disp   hp     drat   wt     qsec   vs       am      
 mpg  "none" "mcor"   "cor"  "cor"  "cor"  "cor"  "cor"  "mcor"   "mcor"  
 cyl  "mcor" "none"   "mcor" "mcor" "mcor" "mcor" "mcor" "cramer" "cramer"
 disp "cor"  "mcor"   "none" "cor"  "cor"  "cor"  "cor"  "mcor"   "mcor"  
 hp   "cor"  "mcor"   "cor"  "none" "cor"  "cor"  "cor"  "mcor"   "mcor"  
 drat "cor"  "mcor"   "cor"  "cor"  "none" "cor"  "cor"  "mcor"   "mcor"  
 wt   "cor"  "mcor"   "cor"  "cor"  "cor"  "none" "cor"  "mcor"   "mcor"  
 qsec "cor"  "mcor"   "cor"  "cor"  "cor"  "cor"  "none" "mcor"   "mcor"  
 vs   "mcor" "cramer" "mcor" "mcor" "mcor" "mcor" "mcor" "none"   "cramer"
 am   "mcor" "cramer" "mcor" "mcor" "mcor" "mcor" "mcor" "cramer" "none"  
 gear "mcor" "cramer" "mcor" "mcor" "mcor" "mcor" "mcor" "cramer" "cramer"
 carb "mcor" "cramer" "mcor" "mcor" "mcor" "mcor" "mcor" "cramer" "cramer"
      gear     carb    
 mpg  "mcor"   "mcor"  
 cyl  "cramer" "cramer"
 disp "mcor"   "mcor"  
 hp   "mcor"   "mcor"  
 drat "mcor"   "mcor"  
 wt   "mcor"   "mcor"  
 qsec "mcor"   "mcor"  
 vs   "cramer" "cramer"
 am   "cramer" "cramer"
 gear "none"   "cramer"
 carb "cramer" "none"</pre>
<p>One thing to note is that the function treats numerical variables as categorical when they have up to 10 unique values. This is useful, for example, in the case of the number of gears (`gear`) in the dataset.</p>
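<p>The selection rule can be sketched as follows. This is a simplified illustration of the logic described above, not the code of <span class="lang:r decode:true crayon-inline">assoc()</span>; the helpers <span class="lang:r decode:true crayon-inline">isCategorical()</span> and <span class="lang:r decode:true crayon-inline">chooseMeasure()</span> are hypothetical names, and the threshold of 10 unique values is taken from the text:</p>
<pre class="decode"># Hypothetical helper: factors and variables with few unique values
# are treated as categorical
isCategorical &lt;- function(x){
    is.factor(x) || length(unique(x)) &lt;= 10
}
# Hypothetical helper: pick the measure based on the number of
# categorical variables in the pair (0, 1 or 2)
chooseMeasure &lt;- function(x, y){
    nCategorical &lt;- isCategorical(x) + isCategorical(y)
    c("cor", "mcor", "cramer")[nCategorical + 1]
}
# Which variables of mtcars would be treated as categorical?
varTypes &lt;- sapply(mtcars, isCategorical)
names(varTypes)[varTypes]
# Numerical vs categorical pair
chooseMeasure(mtcars$mpg, mtcars$gear)</pre>
<p>Under this rule `cyl`, `vs`, `am`, `gear` and `carb` are treated as categorical, which is consistent with the matrix of types printed above.</p>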
<h3>Plots of association between several variables</h3>
<p>Similarly to the problem with <span class="lang:r decode:true crayon-inline">cor()</span>, the scatterplot matrix (produced using <span class="lang:r decode:true crayon-inline">plot()</span>) is not meaningful in the case of a mixture of variables:</p>
<pre class="decode">plot(mtcarsData)</pre>
<div id="attachment_1913" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsScatter.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1913" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsScatter-300x300.png&amp;nocache=1" alt="" width="300" height="300" class="size-medium wp-image-1913" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsScatter-300x300.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsScatter-150x150.png&amp;nocache=1 150w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsScatter.png&amp;nocache=1 700w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1913" class="wp-caption-text">Default scatter plot matrix</p></div>
<p>It makes sense to use a scatterplot in the case of numeric variables, <span class="lang:r decode:true crayon-inline">tableplot()</span> in the case of categorical ones and <span class="lang:r decode:true crayon-inline">boxplot()</span> in the case of a mixture. So, there is the function <span class="lang:r decode:true crayon-inline">spread()</span> in <span class="lang:r decode:true crayon-inline">greybox</span> that creates something more meaningful. It uses the same algorithm as the <span class="lang:r decode:true crayon-inline">assoc()</span> function, but produces plots instead of calculating measures of association. So, `gear` will be considered as categorical and the function will produce either <span class="lang:r decode:true crayon-inline">boxplot()</span> or <span class="lang:r decode:true crayon-inline">tableplot()</span> when plotting it against other variables.</p>
<p>Here&#8217;s an example:</p>
<pre class="decode">spread(mtcarsData)</pre>
<div id="attachment_1914" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpread.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1914" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpread-300x300.png&amp;nocache=1" alt="" width="300" height="300" class="size-medium wp-image-1914" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpread-300x300.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpread-150x150.png&amp;nocache=1 150w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpread.png&amp;nocache=1 700w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1914" class="wp-caption-text">Spread matrix</p></div>
<p>This plot demonstrates, for example, that the number of carburetors influences fuel consumption (something that we could not have spotted in the case of <span class="lang:r decode:true crayon-inline">plot()</span>). Notice also that the number of gears influences fuel consumption in a non-linear fashion as well. So constructing the model with dummy variables for the number of gears might be a reasonable thing to do.</p>
<p>The function also has the parameter `log`, which transforms all the numerical variables using logarithms. This is handy when you suspect a non-linear relation between the variables. Finally, there is the parameter `histograms`, which plots either histograms or barplots on the diagonal.</p>
<pre class="decode">spread(mtcarsData, histograms=TRUE, log=TRUE)</pre>
<div id="attachment_1921" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpreadLogs.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1921" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpreadLogs-300x300.png&amp;nocache=1" alt="" width="300" height="300" class="size-medium wp-image-1921" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpreadLogs-300x300.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpreadLogs-150x150.png&amp;nocache=1 150w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2019/01/mtcarsSpreadLogs.png&amp;nocache=1 700w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1921" class="wp-caption-text">Spread matrix in logs</p></div>
<p>The plot demonstrates that `disp` has a strong non-linear relation with `mpg`, and, similarly, `drat` and `hp` also influence `mpg` in a non-linear fashion.</p>
<h2>Regression diagnostics</h2>
<p>One of the problems of linear regression that can be diagnosed prior to the model construction is multicollinearity. The conventional way of diagnosing it is to calculate the variance inflation factor (VIF) after constructing the model. However, VIF is not easy to interpret, because it lies in \([1,\infty)\). Coefficients of determination from the linear regression models of each explanatory variable on the others are easier to interpret and work with. If such a coefficient is equal to one, then there are some perfectly correlated explanatory variables in the dataset. If it is equal to zero, then the variable is not linearly related to the others.</p>
<p>There is a function <span class="lang:r decode:true crayon-inline">determination()</span> or <span class="lang:r decode:true crayon-inline">determ()</span> in <span class="lang:r decode:true crayon-inline">greybox</span> that returns the set of coefficients of determination for the explanatory variables. The good thing is that this can be done before constructing any model. In our example, the first column, `mpg`, is the response variable, so we can diagnose the multicollinearity the following way:</p>
<pre class="decode">determination(mtcarsData[,-1])</pre>
<pre>       cyl      disp        hp      drat        wt      qsec        vs 
 0.9349544 0.9537470 0.8982917 0.7036703 0.9340582 0.8671619 0.8017720 
        am      gear      carb 
 0.7924392 0.8133441 0.8735577</pre>
<p>As we can see from the output above, `disp` is the variable most linearly related to the other explanatory variables, so including it in the model might cause multicollinearity, which will decrease the efficiency of the estimates of parameters.</p>
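<p>For intuition, similar numbers can be obtained in base R by regressing each explanatory variable on all the others and collecting the coefficients of determination. The sketch below is not the code of <span class="lang:r decode:true crayon-inline">determination()</span>. It assumes that the data is the standard numeric `mtcars` and uses the textbook relation \(\mathrm{VIF}_j = 1/(1-R^2_j)\) to show how the two measures are connected:</p>

```r
# Explanatory variables only: the first column (mpg) is the response
explanatory <- mtcars[, -1]

# R^2 of the auxiliary regression of each variable on all the others
rSquared <- sapply(names(explanatory), function(variable) {
  others <- setdiff(names(explanatory), variable)
  auxiliary <- lm(reformulate(others, response = variable), data = explanatory)
  summary(auxiliary)$r.squared
})

# The corresponding variance inflation factors
vif <- 1 / (1 - rSquared)
round(cbind(rSquared, vif), 4)
```

<p>A coefficient of determination of 0.9 corresponds to a VIF of 10, and 0.95 to a VIF of 20, which is why the bounded scale is easier to read.</p>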
<p>Message <a href="https://openforecast.org/2019/01/07/marketing-analytics-with-greybox/">Analytics with greybox</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2019/01/07/marketing-analytics-with-greybox/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>greybox 0.3.0 &#8211; what&#8217;s new</title>
		<link>https://openforecast.org/2018/08/07/greybox-0-3-0-whats-new/</link>
					<comments>https://openforecast.org/2018/08/07/greybox-0-3-0-whats-new/#respond</comments>
		
		<dc:creator><![CDATA[Ivan Svetunkov]]></dc:creator>
		<pubDate>Tue, 07 Aug 2018 16:06:26 +0000</pubDate>
				<category><![CDATA[Package greybox for R]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Univariate models]]></category>
		<category><![CDATA[greybox]]></category>
		<category><![CDATA[regression]]></category>
		<guid isPermaLink="false">https://openforecast.org/?p=1778</guid>

					<description><![CDATA[<p>Three months have passed since the initial release of greybox on CRAN. I would not say that the package develops like crazy, but there have been some changes since May. Let&#8217;s have a look. We start by loading both greybox and smooth: library(greybox) library(smooth) Rolling Origin First of all, ro() function now has its own [&#8230;]</p>
<p>Message <a href="https://openforecast.org/2018/08/07/greybox-0-3-0-whats-new/">greybox 0.3.0 &#8211; what&#8217;s new</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Three months have passed since the initial release of <span class="lang:r decode:true crayon-inline">greybox</span> on CRAN. I would not say that the package develops like crazy, but there have been some changes since May. Let&#8217;s have a look. We start by loading both <span class="lang:r decode:true crayon-inline">greybox</span> and <span class="lang:r decode:true crayon-inline">smooth</span>:</p>
<pre class="decode">library(greybox)
library(smooth)</pre>
<h3>Rolling Origin</h3>
<p>First of all, the <span class="lang:r decode:true crayon-inline">ro()</span> function now has its own class and works with the <span class="lang:r decode:true crayon-inline">plot()</span> function, so that you can have a visual representation of the results. Here&#8217;s an example:</p>
<pre class="decode">x <- rnorm(100,100,10)
ourCall <- "es(data, h=h, intervals=TRUE)"
ourValue <- c("forecast", "lower", "upper")
ourRO <- ro(x,h=20,origins=5,ourCall,ourValue,co=TRUE)
plot(ourRO)</pre>
<div id="attachment_1781" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1781" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample-300x175.png&amp;nocache=1" alt="" width="300" height="175" class="size-medium wp-image-1781" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExample.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1781" class="wp-caption-text">Example of the plot of rolling origin function</p></div>
<p>Each point on the produced graph corresponds to an origin and straight lines correspond to the forecasts. Given that we asked for point forecasts and for the lower and upper bounds of the prediction interval, we have three respective lines. By plotting the results of the rolling origin experiment, we can see if the model is stable or not. Just compare the previous graph with the one produced from the call to Holt's model:</p>
<pre class="decode">ourCall <- "es(data, model='AAN', h=h, intervals=TRUE)"
ourRO <- ro(x,h=20,origins=5,ourCall,ourValue,co=TRUE)
plot(ourRO)</pre>
<div id="attachment_1782" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1782" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN-300x175.png&amp;nocache=1" alt="" width="300" height="175" class="size-medium wp-image-1782" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/roPlotExampleAAN.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1782" class="wp-caption-text">Example of the plot of rolling origin function with ETS(A,A,N)</p></div>
<p>Holt's model is not suitable for this time series, so its forecasts are less stable than the forecasts of the automatically selected model in the previous case (which is ETS(A,N,N)).</p>
<p>Once again, there is a vignette with examples for the <span class="lang:r decode:true crayon-inline">ro()</span> function, <a href="https://cran.r-project.org/web/packages/greybox/vignettes/ro.html" rel="noopener noreferrer" target="_blank">have a look</a> if you want to know more.</p>
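<p>To make the idea of the rolling origin more tangible, here is a minimal base R sketch of the procedure. It does not reproduce the internals of <span class="lang:r decode:true crayon-inline">ro()</span>: the in-sample mean is used as a stand-in for the forecasting model, and it is assumed that a constant holdout means that every origin keeps a full holdout of h observations:</p>

```r
set.seed(42)
x <- rnorm(100, 100, 10)
h <- 20        # forecast horizon
origins <- 5   # number of forecast origins
obs <- length(x)

# Constant holdout: the last origin leaves exactly h observations for evaluation
originPositions <- obs - h - (origins - 1):0

forecasts <- matrix(NA, h, origins,
                    dimnames = list(paste0("h", 1:h), paste0("origin", originPositions)))
holdouts <- forecasts

for (i in seq_len(origins)) {
  trainEnd <- originPositions[i]
  # Stand-in model: the in-sample mean repeated over the whole horizon
  forecasts[, i] <- mean(x[1:trainEnd])
  # Actual values that follow the origin
  holdouts[, i] <- x[trainEnd + 1:h]
}

# Mean absolute error for each origin
maePerOrigin <- colMeans(abs(holdouts - forecasts))
maePerOrigin
```

<p>Each column of the matrices corresponds to one origin, so the stability of a model can be judged by how much the columns of forecasts differ from each other.</p>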
<h3>ALM - Advanced Linear Model</h3>
<p>Yes, there is "Generalised Linear Model" in R, which implements Poisson, Gamma, Binomial and other regressions. Yes, there are smaller packages, implementing models with more exotic distributions. But I needed several regression models with: Laplace distribution, Folded normal distribution, Chi-squared distribution and one new mysterious distribution, which is currently called "S distribution". I needed them in one place and in one format: properly estimated using likelihoods, returning confidence intervals, information criteria and being able to produce forecasts. I also wanted them to work similarly to <span class="lang:r decode:true crayon-inline">lm()</span>, so that the learning curve would not be too steep. So, here it is, the function <span class="lang:r decode:true crayon-inline">alm()</span>. It works in much the same way as <span class="lang:r decode:true crayon-inline">lm()</span>:</p>
<pre class="decode">xreg <- cbind(rfnorm(100,1,10),rnorm(100,50,5))
xreg <- cbind(100+0.5*xreg[,1]-0.75*xreg[,2]+rlaplace(100,0,3),xreg,rnorm(100,300,10))
colnames(xreg) <- c("y","x1","x2","Noise")
inSample <- xreg[1:80,]
outSample <- xreg[-c(1:80),]

ourModel <- alm(y~x1+x2, inSample, distribution="laplace")
summary(ourModel)</pre>
<p>Here's the output of the summary: </p>
<pre>Distribution used in the estimation: Laplace
Coefficients:
            Estimate Std. Error Lower 2.5% Upper 97.5%
(Intercept) 95.85207    0.36746   95.12022    96.58392
x1           0.59618    0.02479    0.54681     0.64554
x2          -0.67865    0.00622   -0.69103    -0.66626
ICs:
     AIC     AICc      BIC     BICc 
474.2453 474.7786 483.7734 484.9419</pre>
<p>And here's the respective plot of the forecast:</p>
<pre class="decode">plot(forecast(ourModel,outSample))</pre>
<div id="attachment_1787" style="width: 310px" class="wp-caption alignnone"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1787" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample-300x175.png&amp;nocache=1" alt="" width="300" height="175" class="size-medium wp-image-1787" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample-300x175.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample-768x448.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample-1024x597.png&amp;nocache=1 1024w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/almLaplaceExample.png&amp;nocache=1 1200w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1787" class="wp-caption-text">Forecast from lm with Laplace distribution</p></div>
<p>The thing that is currently missing in the function is prediction intervals, but this will be added in the upcoming releases.</p>
<p>The likelihood approach allows comparing models with different distributions using information criteria. Here's, for example, what model we get if we assume S-distribution (which has fatter tails than Laplace):</p>
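<p>To show what estimation via likelihood means for the Laplace case, here is a small base R sketch that maximises the Laplace log-likelihood with <span class="lang:r decode:true crayon-inline">optim()</span>. This is not how <span class="lang:r decode:true crayon-inline">alm()</span> is implemented. The data is simulated with base R functions only, and the names laplaceNegLogLik, olsModel and fit are made up for the illustration:</p>

```r
set.seed(41)
obs <- 80
x1 <- abs(rnorm(obs, 1, 10))   # folded normal regressor
x2 <- rnorm(obs, 50, 5)
# The difference of two exponential variables follows a Laplace distribution (scale 3 here)
errors <- rexp(obs, 1 / 3) - rexp(obs, 1 / 3)
y <- 100 + 0.5 * x1 - 0.75 * x2 + errors
X <- cbind(1, x1, x2)

# Negative log-likelihood of the regression with Laplace errors;
# the scale is kept positive by estimating its logarithm
laplaceNegLogLik <- function(par) {
  scale <- exp(par[4])
  e <- y - X %*% par[1:3]
  obs * log(2 * scale) + sum(abs(e)) / scale
}

# Start from the OLS estimates and the mean absolute residual
olsModel <- lm(y ~ x1 + x2)
start <- c(coef(olsModel), log(mean(abs(residuals(olsModel)))))
fit <- optim(start, laplaceNegLogLik, control = list(maxit = 5000))

# Estimated coefficients and AIC (3 coefficients + scale = 4 parameters)
estimates <- fit$par[1:3]
aicLaplace <- 2 * fit$value + 2 * length(fit$par)
aicNormal <- AIC(olsModel)   # the same data with the normal distribution
c(Laplace = aicLaplace, Normal = aicNormal)
```

<p>For a fixed scale, maximising this likelihood is equivalent to minimising the sum of absolute errors, so the coefficients are those of the least absolute deviations regression.</p>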
<pre class="decode">summary(alm(y~x1+x2, inSample, distribution="s"))</pre>
<pre>Distribution used in the estimation: S
Coefficients:
            Estimate Std. Error Lower 2.5% Upper 97.5%
(Intercept) 95.61244    0.23386   95.14666    96.07821
x1           0.56144    0.00721    0.54708     0.57581
x2          -0.66867    0.00302   -0.67470    -0.66265
ICs:
     AIC     AICc      BIC     BICc 
482.9358 483.4692 492.4639 493.6325</pre>
<p>As you see, the information criteria for S distribution are higher than for Laplace, so we can conclude that the previous model was better than the second in terms of ICs.</p>
<p><strong>Note</strong> that at this moment the AICc and BICc are not correct for non-normal models (at least the derivation of them needs to be double checked, which I haven't done yet), so don't rely on them too much.</p>
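<p>For reference, the criteria reported above appear to be consistent with the standard formulas, including the small sample corrections that are derived for the normal linear model. The sketch below recomputes them from the reported AIC of the Laplace model, assuming n = 80 in-sample observations and k = 4 estimated parameters (three coefficients and the scale). Both values are my assumption, not something printed in the output:</p>

```r
obs <- 80        # in-sample observations (assumed)
k <- 4           # estimated parameters: 3 coefficients + scale (assumed)
aic <- 474.2453  # AIC reported for the Laplace model

# Log-likelihood implied by AIC = 2k - 2 logLik
logLikValue <- (2 * k - aic) / 2

# Corrected AIC, BIC and corrected BIC
aicc <- aic + 2 * k * (k + 1) / (obs - k - 1)
bic <- -2 * logLikValue + k * log(obs)
bicc <- -2 * logLikValue + k * log(obs) * obs / (obs - k - 1)

round(c(AIC = aic, AICc = aicc, BIC = bic, BICc = bicc), 4)
```

<p>The result can be compared with the ICs line of the Laplace summary. The corrections only depend on n and k, which is why they are the same for the Laplace and the S models and why they should be treated with caution for non-normal distributions.</p>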
<p>I intend to add several other distributions that either are not available in R or are implemented unsatisfactorily (from my point of view) - the function is written in a quite flexible way, so this should not be difficult to do. If you have any preferences, please add them on github, <a href="https://github.com/config-i1/greybox/issues/13" rel="noopener noreferrer" target="_blank">here</a>.</p>
<p>I also want to implement the mixture distributions, so that things discussed in <a href="/en/2017/11/07/multiplicative-state-space-models-for-intermittent-time-series/">the paper on intermittent state-space model</a> can also be implemented using pure regression.</p>
<p>Finally, now that I have alm, we can select between the regression models with different distributions (with <span class="lang:r decode:true crayon-inline">stepwise()</span> function) or even combine them using AIC weights (hello, <span class="lang:r decode:true crayon-inline">lmCombine()</span>!). Yes, I know that it sounds crazy (think of the pool of models in this case), but this should be fun!</p>
<p><a name="RMCB"></a></p>
<h3>Regression for Multiple Comparison with the Best</h3>
<p><strong>Please, note that this part of the post has been updated on 02.03.2020 in order to reflect the changes in the v0.5.9 version of the package.</strong><br />
One of the typical tasks in forecasting is to evaluate the performance of different methods on the holdout. In order to do that, it is common to use some statistical tests, the most popular of which is <a href="http://www.jmlr.org/papers/volume7/demsar06a/demsar06a.pdf">Nemenyi</a> / <a href="https://doi.org/10.1016/j.ijforecast.2004.10.003">MCB</a> (Multiple Comparison with the Best method). The test implemented in the greybox package uses similar principles and relies on ranks of methods, but instead of taking averages and then applying studentised distances, it constructs a regression on the ranked data. This way we compare the median performance of different methods (the same way as it is done in the classical MCB) and we produce parametric confidence intervals for parameters. The test is based on the simple linear model with dummy variables for each provided method (1 if the error corresponds to the method and 0 otherwise). Here's an example of how this thing works:</p>
<pre class="decode">ourData <- cbind(rnorm(100,0,10), rnorm(100,-2,5), rnorm(100,2,6), rlaplace(100,1,5))
colnames(ourData) <- c("Method A","Method B","Method C","Method D")

ourTest <- rmcb(ourData, level=0.95)</pre>
<p>By default the function produces a graph in the MCB (Multiple Comparison with the Best) style:</p>
<div id="attachment_2370" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormNew.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2370" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormNew-300x180.png&amp;nocache=1" alt="" width="300" height="180" class="size-medium wp-image-2370" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormNew-300x180.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormNew-768x461.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormNew.png&amp;nocache=1 1000w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2370" class="wp-caption-text">RMCB example, MCB style plot</p></div>
<p>If we compare the results of the test with the mean rank values, we will see that they are the same:</p>
<pre class="decode">apply(t(apply(ourData,1,rank)),2,mean)</pre>
<pre>Method A Method B Method C Method D 
    2.40     2.06     2.75     2.79</pre>
<pre class="decode">ourTest$mean</pre>
<pre>Method B Method A Method C Method D 
    2.06     2.40     2.75     2.79</pre>
<p>This also reflects how the data was generated. Notice that Method D was generated from Laplace distribution with mean 1, but the test managed to give the correct answer in this situation, because Laplace distribution is symmetric and the sample size is large enough. But the main point of the test is that we can get the confidence intervals for each parameter, so we can see if the differences between the methods are significant: if the intervals intersect, then they are not.</p>
<p>The regression model used in the calculation is saved in the element "model" of the returned object and you can request a basic summary from it:</p>
<pre class="decode">summary(ourTest$model)</pre>
<pre>            Estimate Std. Error  Lower 2.5% Upper 97.5%
(Intercept)     2.40  0.1083601  2.18761804  2.61238196
Method B       -0.34  0.1532444 -0.64035346 -0.03964654
Method C        0.35  0.1532444  0.04964654  0.65035346
Method D        0.39  0.1532444  0.08964654  0.69035346</pre>
<p>But, please, keep in mind that this is not a proper "lm" object, so you cannot do much with it.</p>
<p>The function also reports p-value from the F-test of regression, testing the standard hypothesis that all the parameters are equal to zero.</p>
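<p>The principle of the regression on ranks can be sketched in base R with <span class="lang:r decode:true crayon-inline">lm()</span>. This is an illustration of the idea under my own assumptions, not the code of <span class="lang:r decode:true crayon-inline">rmcb()</span>: the data is simulated again (with the Laplace variable generated as a difference of two exponentials), and the names ranks, longData and rankModel are hypothetical:</p>

```r
set.seed(7)
ourData <- cbind(rnorm(100, 0, 10), rnorm(100, -2, 5), rnorm(100, 2, 6),
                 1 + rexp(100, 1 / 5) - rexp(100, 1 / 5))
colnames(ourData) <- c("Method A", "Method B", "Method C", "Method D")

# Rank the methods within each row (observation)
ranks <- t(apply(ourData, 1, rank))
colnames(ranks) <- colnames(ourData)

# Long format: one row per rank, with a factor identifying the method
longData <- data.frame(rank = as.vector(ranks),
                       method = factor(rep(colnames(ranks), each = nrow(ranks)),
                                       levels = colnames(ranks)))

# Regression on dummy variables; the first method is the reference level
rankModel <- lm(rank ~ method, data = longData)
meanRanks <- colMeans(ranks)

coef(rankModel)                    # intercept and differences from Method A
confint(rankModel, level = 0.95)   # parametric intervals for the parameters
```

<p>The intercept equals the mean rank of the first method and the other coefficients are the differences between the mean ranks, which mirrors the structure of the summary above. The ranks within one row are not independent, so the intervals from this plain regression should be treated as an illustration only.</p>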
<p>We can also produce plots with vertical lines, that connect the models that are in the same group (no statistical difference, intersection of respective intervals). Here's the example for the same data:</p>
<pre class="decode">plot(ourTest, outplot="lines")</pre>
<div id="attachment_2371" style="width: 310px" class="wp-caption aligncenter"><a href="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormLinesNew.png&amp;nocache=1"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2371" src="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormLinesNew-300x180.png&amp;nocache=1" alt="" width="300" height="180" class="size-medium wp-image-2371" srcset="https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormLinesNew-300x180.png&amp;nocache=1 300w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormLinesNew-768x461.png&amp;nocache=1 768w, https://openforecast.org/wp-content/webpc-passthru.php?src=https://openforecast.org/wp-content/uploads/2018/08/rmcExampleNormLinesNew.png&amp;nocache=1 1000w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-2371" class="wp-caption-text">RMCB example, lines plot</p></div>
<p>If you want to tune the plot, you can always do this using the standard plot parameters:</p>
<pre class="decode">plot(ourTest, xlab="Models", ylab="Errors")</pre>
<p>Also, given that we work with a flexible plot method, you can tune the parameters of the canvas using "par()" function, as it is usually done in R.</p>
<h3>What else?</h3>
<p>Several methods have been moved from <span class="lang:r decode:true crayon-inline">smooth</span> to <span class="lang:r decode:true crayon-inline">greybox</span>. These include:</p>
<ul>
<li>pointLik() - returns point Likelihoods, discussed in <a href="http://kourentzes.com/forecasting/2018/06/20/isf2018-presentation-beyond-summary-performance-metrics-for-forecast-selection-and-combination/" rel="noopener noreferrer" target="_blank">our research with Nikos</a>;</li>
<li>pAIC, pBIC, pAICc, pBICc - point values of respective information criteria, from <a href="http://kourentzes.com/forecasting/2018/06/20/isf2018-presentation-beyond-summary-performance-metrics-for-forecast-selection-and-combination/" rel="noopener noreferrer" target="_blank">the same research</a>;</li>
<li>nParam() - returns the number of the estimated parameters in the model (+ variance);</li>
<li>errorType() - returns the type of error used in the model (Additive / Multiplicative).</li>
</ul>
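<p>As an illustration of the first item, here is a base R sketch of point likelihoods for a simple linear model. It assumes that "point likelihoods" are the contributions of individual observations to the log-likelihood, and it uses <span class="lang:r decode:true crayon-inline">dnorm()</span> directly instead of calling pointLik(), so it shows the concept rather than the function itself:</p>

```r
# A simple normal linear model on the built-in mtcars data
model <- lm(mpg ~ wt + hp, data = mtcars)
errors <- residuals(model)

# Maximum likelihood estimate of sigma (divides by n, not by degrees of freedom)
sigmaML <- sqrt(mean(errors^2))

# Log-likelihood contribution of each observation
pointLogLik <- dnorm(errors, mean = 0, sd = sigmaML, log = TRUE)

# The contributions sum up to the log-likelihood of the model
c(sumOfPoints = sum(pointLogLik), logLik = as.numeric(logLik(model)))
```

<p>Observations with low values are the ones that the model fits worst, which is what makes the per-observation view useful.</p>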
<p>Furthermore, as you might have already noticed, I've implemented several distribution functions:</p>
<ul>
<li>Folded normal distribution;</li>
<li>Laplace distribution;</li>
<li>S distribution.</li>
</ul>
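<p>For readers not familiar with the first two distributions, their densities can be written in a few lines of base R. The functions laplaceDensity() and foldedNormalDensity() below are hypothetical names used for this sketch only and are not the functions from the package:</p>

```r
# Laplace density with location mu and scale
laplaceDensity <- function(x, mu = 0, scale = 1) {
  exp(-abs(x - mu) / scale) / (2 * scale)
}

# Folded normal density: the density of |z|, where z ~ N(mu, sigma^2)
foldedNormalDensity <- function(x, mu = 0, sigma = 1) {
  ifelse(x >= 0, dnorm(x, mu, sigma) + dnorm(-x, mu, sigma), 0)
}

# Sanity check: both densities should integrate to one
# (the Laplace integral is split at the kink in mu)
laplaceArea <- integrate(laplaceDensity, -Inf, 1, mu = 1, scale = 5)$value +
  integrate(laplaceDensity, 1, Inf, mu = 1, scale = 5)$value
foldedArea <- integrate(foldedNormalDensity, 0, Inf, mu = 1, sigma = 10)$value
c(Laplace = laplaceArea, FoldedNormal = foldedArea)
```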
<p>Finally, there is also a function, called <span class="lang:r decode:true crayon-inline">lmDynamic()</span>, which uses pAIC in order to produce dynamic linear regression models. But this should be discussed in a separate post.</p>
<p>That's it for now. See you in greybox 0.4.0!</p>
<p>Message <a href="https://openforecast.org/2018/08/07/greybox-0-3-0-whats-new/">greybox 0.3.0 &#8211; what&#8217;s new</a> first appeared on <a href="https://openforecast.org">Open Forecasting</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://openforecast.org/2018/08/07/greybox-0-3-0-whats-new/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
