There is no such thing as “the best approach for everything”

If someone tells you that method X solves all problems and is the best one ever, they are either lying intentionally or do not fully understand what they are talking about. There is no such thing as “the best approach for everything”. Let me explain.

Consider two products sold by retailers: ice cream and bread. You would expect the demand for ice cream to exhibit seasonal patterns because people tend to buy it more when it is warm outside. As a result, demand in summer is typically higher on average than in winter (this doesn’t apply to my friend Nikos Kourentzes, who eats ice cream no matter what). This suggests that if we want to forecast demand for ice cream, we should use an approach that correctly captures seasonality in one way or another.

Demand for bread, on the other hand, typically follows a different pattern, as people tend to buy it regularly, and it usually does not have seasonality. Imposing a seasonal structure on such data could harm forecast accuracy.

Even in this simplistic example, it’s clear that the optimal approach depends on the situation. Yes, we could develop a more flexible model capable of distinguishing between these cases, but there are multiple ways to achieve this (cross-validation, information criteria, statistical tests, etc.), and each specific solution would have its own strengths and weaknesses.
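To make the information-criteria route concrete, here is a minimal sketch with simulated data. The series, the toy “seasonal means vs. global mean” model, and the AIC penalty are all illustrative assumptions, not any particular production method:

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(120)
# Hypothetical monthly ice-cream demand: a seasonal wave plus noise
ice_cream = 100 + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, size=120)
# Hypothetical bread demand: a stable level plus noise, no seasonality
bread = 100 + rng.normal(0, 5, size=120)

def aic(series, seasonal):
    """AIC = n*log(SSE/n) + 2k for a toy seasonal-means or global-mean model."""
    n = len(series)
    if seasonal:
        month = np.arange(n) % 12
        # Fitted values: the mean of each calendar month (12 parameters)
        fitted = np.array([series[month == m].mean() for m in range(12)])[month]
        k = 12
    else:
        # Fitted values: one overall mean (1 parameter)
        fitted = np.full(n, series.mean())
        k = 1
    sse = np.sum((series - fitted) ** 2)
    return n * np.log(sse / n) + 2 * k

# Lower AIC wins: the seasonal model for ice cream...
print(aic(ice_cream, seasonal=True) < aic(ice_cream, seasonal=False))  # True
# ...but for bread the penalty on 12 parameters typically favours the plain model
print(aic(bread, seasonal=True) < aic(bread, seasonal=False))
```

The same selection could be done via cross-validation or a statistical test instead, and each route would occasionally pick a different model, which is precisely the point.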

Now, would you expect a single new approach that can distinguish between the cases to outperform all others and be the best for every possible scenario? My answer is no, because one could always devise an alternative model or method that selects features differently and performs better under different conditions. For example, approach A might forecast demand for white bread better than approach B, but the opposite might be true for sourdough bread.

Even if approach A outperforms all others on average across a dataset, there will always be cases where it performs worse than some competitors, because forecasting accuracy is based on the distribution of error measures, not just a single number (see my old post here). This is, for example, confirmed by the M5 competition, where the winning method, LightGBM, produced the most accurate forecasts on average but was outperformed by exponential smoothing in 41.5% of cases.
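The gap between average performance and per-series performance is easy to reproduce with made-up numbers. The error distributions below are purely illustrative (they are not the M5 data), but they show how a method can win on average while losing on a large share of individual series:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-series error measures (e.g. MASE) for two methods on 1000 series
errors_a = rng.gamma(shape=2.0, scale=0.5, size=1000)
# Method B: on each series, A's error scaled by a random factor that is >1 on average
errors_b = errors_a * rng.lognormal(mean=0.1, sigma=0.5, size=1000)

print(errors_a.mean() < errors_b.mean())   # True: A is more accurate on average
print(np.mean(errors_b < errors_a))        # ...yet B beats A on roughly 40% of series
```

A single average hides this whole distribution, which is why a method that “wins” a competition can still lose on a substantial fraction of the series in it.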

The same principle applies beyond point forecasts, across other statistics, fields, and disciplines. For example, if you need to produce prediction intervals or quantiles, you have a variety of tools to choose from, and depending on the specific situation, some will work better than others. There is no single approach that outperforms all alternatives in every context.

So, when someone claims to have a silver bullet that solves all problems, keep in mind: they are either trying to sell you something or do not understand what they are talking about.
