Forecast quality is about more than just accuracy

Published: July 27, 2022 · Reading time: 4 min
Stefan de Kok, Principal Data Science Architect

Forecast quality

There are two dominant camps as to how the quality of a forecast should be judged. The traditional camp, originating from and still favored by academia, looks only at the intrinsic qualities of a forecast, such as accuracy and bias. The second camp, which rose more recently from industry, objects that academic metrics do not fully reflect the impact on the business. This camp promotes metrics such as Forecast Value Add (FVA) and more exotic correlations to impacts on downstream processes. We forecast not for the purpose of forecasting but to improve the business; these metrics aim to measure that improvement, such as the effect of the forecast on inventory levels or expediting.

Here we make the case that while the concerns of both camps may be necessary, they are not sufficient in practice. This blog highlights two additional aspects of a forecast that are often ignored but are critical to the business: stability and desirability.

Forecast stability

None of the metrics used to judge the quality of a forecast of either school consider change over time. If we generate a forecast this month, it may be accurate. If we generate another forecast next month, it may also be accurate. But how much did it change? Maybe both had an error of around 20%, but one was positively biased and the other was negatively biased. The change in quantities forecasted could be as significant as 40% in this case. Figure 1 shows an example of two such forecasts with equal error but opposite bias.
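The arithmetic behind this example can be checked in a few lines. This is not code from the post, just a minimal sketch with hypothetical numbers (an actual of 100 units and two forecasts each off by 20% in opposite directions):

```python
# Two consecutive forecasts for the same period, each off by 20% from the
# eventual actual of 100 units, but in opposite directions. Both score the
# same absolute percentage error, yet the forecasted quantity itself swings
# by 40% of the actual between cycles.
actual = 100.0
forecast_cycle_1 = 120.0  # 20% over (positive bias)
forecast_cycle_2 = 80.0   # 20% under (negative bias)

ape_1 = abs(forecast_cycle_1 - actual) / actual  # absolute percentage error
ape_2 = abs(forecast_cycle_2 - actual) / actual

# Change between consecutive forecasts, relative to the actual:
instability = abs(forecast_cycle_2 - forecast_cycle_1) / actual

print(f"errors: {ape_1:.0%} and {ape_2:.0%}, change between cycles: {instability:.0%}")
```

Both forecasts look equally "accurate" to a standard error metric, yet anyone planning supply against them would see a 40-point swing.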

Figure 1: Forecast instability

Forecast instability occurs when consecutive forecasts for the same period have dramatically different values. It is not uncommon for forecasts to change upwards of 10% when measured across entire years and entire portfolios from one forecast cycle to the next. Obviously, for individual time series, the changes are even greater. This is a significant contributing factor to the bullwhip effect that businesses experience.
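The post does not prescribe a formula for instability, but one simple way to quantify it is the average absolute change between consecutive forecast snapshots of the same periods, relative to the overall forecast level. A sketch, with a hypothetical function name and made-up snapshot data:

```python
from statistics import mean

def forecast_instability(cycles):
    """Average absolute change between consecutive forecast snapshots
    for the same future periods, relative to the mean forecast level.

    `cycles` is a list of forecast vectors, one per forecast cycle,
    all covering the same future periods.
    """
    changes = [
        abs(curr_val - prev_val)
        for prev, curr in zip(cycles, cycles[1:])
        for prev_val, curr_val in zip(prev, curr)
    ]
    level = mean(v for cycle in cycles for v in cycle)
    return mean(changes) / level

# Three monthly snapshots of the same three-period forecast:
snapshots = [
    [100, 110, 120],
    [ 90, 120, 115],
    [105, 105, 125],
]
print(f"instability: {forecast_instability(snapshots):.1%}")
```

On these made-up snapshots the metric lands near 10%, the order of magnitude the post cites for portfolio-level churn from one cycle to the next; individual time series would typically score far higher.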

Forecast desirability

While stability can be measured objectively, a subjective aspect of forecast quality is its desirability. When measured at agreed-upon lags, a generated forecast may have all the right metric values but not satisfy the customer’s expectations. Figure 2 below shows a few examples.

Figure 2: The expected forecast by a customer and three common undesirable forecasts

The examples in this case are clearly undesirable given the smooth historical pattern, even over a short time range. Most automated forecasts will not pick up on seasonality from such a brief history and could produce any of the output examples. While understandable, even justifiable, to an expert, the customer is unlikely to agree. Automated forecasting is an area where the risk of overfitting (finding patterns where there are none) and the risk of underfitting (failing to find existing patterns) clash.


The examples included are anecdotal. In practice, the situation is typically not so clear. But the examples highlight that the quality of a forecast should not just be taken at face value. In business, each forecast is but one iteration in an endless cycle of iterations. What came before and what comes after matters, and ignoring that turns most quality assessments into academic exercises, even those that measure impact on the business.

Similarly, not all aspects of quality can be easily measured numerically. Often a visual inspection will expose issues hidden by the numbers. That is not to say these issues cannot be mitigated by better algorithms. There will always be cases where the customer expects more, and often they will be right. We humans perceive patterns even when they are pure coincidence. Forecast algorithms tend to err on the safe side. For desirability, maybe they should err on the daring side?


About the author

Stefan de Kok, Principal Data Science Architect

Stefan de Kok consults with o9 Solutions as a Principal Data Science Architect. He has 25 years of experience in supply chain planning, optimization, and forecasting across a wide range of industries. As an applied mathematician, he works mainly on model design and remediation, algorithm development, and metric design for quality measurement. He is recognized as an innovator on these topics, especially in the area of probabilistic planning and forecasting. He is an active blogger and is currently writing a textbook on these topics.

