How To Apply Machine Learning To Demand Forecasting

Businesses during COVID-19 are operating in uncharted territory, where rapid shifts in human behavior drive demand. That behavior can depend on satisfaction with new experiences as well as on pandemic restrictions. In this article, I want to show how machine learning approaches can help with demand forecasting and future sales predictions. I’ll also describe how to enhance forecasting accuracy in the face of COVID-19 pandemic uncertainties.

Since I have experience in building forecasting models for retail field products, I’ll use a retail business as an example. Don’t worry if your business’ focus isn’t on retail. The main goal of this article is to describe the logic of how machine learning can be applied in demand forecasting both in a stable environment and in crisis.

Demand Forecasting Methods

Demand forecasting is the process of predicting what the demand for certain products will be in the future. This helps manufacturers to decide what they should produce and guides retailers toward what they should stock.

Demand forecasting is aimed at improving the following processes:

  • Supplier relationship management. By having the prediction of customer demand in numbers, it’s possible to calculate how many products to order, making it easy for you to decide whether you need new supply chains or to reduce the number of suppliers.
  • Customer relationship management. Customers planning to buy something expect the products they want to be available immediately. Demand forecasting allows you to predict which categories of products need to be purchased in the next period from a specific store location. This improves customer satisfaction and commitment to your brand.
  • Order fulfillment and logistics. Demand forecasting helps optimize supply chains. This means that at the time of order, the product will be more likely to be in stock, and unsold goods won’t occupy prime retail space.
  • Marketing campaigns. Forecasting is often used to adjust ads and marketing campaigns and can influence the number of sales. This is one of the use cases of machine learning in marketing. Sophisticated machine learning forecasting models can take marketing data into account as well.
  • Manufacturing flow management. Being part of the ERP, the time series-based demand forecasting predicts production needs based on how many goods will eventually be sold.

There are both qualitative and quantitative methods of demand assessment.

Demand forecasting methods

Machine Learning Approach to Demand Forecasting Methods

The above-listed traditional sales forecasting methods have been tried and tested for decades. With Artificial Intelligence development, they are now upgraded by modern forecasting methods using Machine Learning (ML).

Machine learning techniques allow you to predict the quantity of products or services that will be purchased during a defined future period. In this case, a software system can learn from data for improved analysis. Compared to traditional demand forecasting methods, a machine learning approach allows you to:

  • Accelerate data processing speed
  • Provide a more accurate forecast
  • Automate forecast updates based on the recent data
  • Analyze more data
  • Identify hidden patterns in data
  • Create a robust system
  • Increase adaptability to changes

How to Develop an ML-Based Demand Forecasting Software

When initiating the demand forecasting feature development, it’s recommended that you understand the workflow of ML modeling. This offers a data-driven roadmap on how to optimize the development process.

Let’s review the process of how AI engineers at MobiDev approach ML demand forecasting tasks.

Step 1. Brief Data Review

The first task when initiating the demand forecasting project is to provide the client with meaningful insights. The process includes the following steps:

  1. Gather available data
  2. Briefly review the data structure, accuracy, and consistency
  3. Run a few data tests and pilots
  4. Look through a statistical summary

In my experience, a few days is enough to understand the current situation and outline possible solutions.

Step 2. Setting Business Goals and Success Metrics

This stage establishes the client’s key business aims and any additional conditions to be taken into account. Our team provides data science consulting and combines it with the client’s business vision. The goal is to arrive at a statement similar to:

“I want to integrate a demand forecasting feature in order to forecast sales and plan marketing campaigns.”

Success metrics offer a clear definition of what is “valuable” within demand forecasting. A typical message might state:

“I need a machine learning solution that predicts demand for […] products, for the next [week/month/half-year/year], with […]% accuracy.”

These points will help you to identify what your success metrics look like. You will want to consider the following:

Product Type/Categories

What types of products/product categories will you forecast? Different products and services call for different demand forecasts. For example, the demand forecast for perishable products will likely differ from the forecast for subscription services that renew at the same time each month.

Time Frame

What is the length of time for the demand forecast?

Short-term forecasts commonly cover periods of less than 12 months: 1 week, 1 month, or 6 months. These forecasts may have the following purposes:

  • Uninterrupted supply of products/services
  • Sales target setting and evaluating sales performance
  • Optimization of prices according to the market fluctuations and inflation
  • Finance maintenance
  • Hiring of required specialists

Long-term forecasts are completed for periods longer than a year. The purpose of long-term forecasts may include the following:

  • Long-term financial planning and funds acquisition
  • Decision making regarding the expansion of business
  • Annual strategic planning


Examples of metrics used to measure forecast accuracy are MAPE (Mean Absolute Percentage Error), MAE (Mean Absolute Error), and custom metrics.
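
As a minimal sketch of how these two metrics are computed, the Python below uses hypothetical weekly sales figures. The function names and the choice to skip zero-sales periods in MAPE are assumptions made for illustration, not part of any particular library.

```python
def mean_absolute_error(actual, forecast):
    """Average of |actual - forecast|, in the same units as the sales data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual|, expressed as a percentage.

    Assumption: periods with zero actual sales are skipped, because the
    ratio is undefined there.
    """
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical weekly unit sales and the corresponding forecasts.
actual_sales = [100, 120, 80, 90]
forecast_sales = [110, 115, 70, 99]

mae = mean_absolute_error(actual_sales, forecast_sales)
mape = mean_absolute_percentage_error(actual_sales, forecast_sales)
print(f"MAE: {mae:.2f} units, MAPE: {mape:.2f}%")
```

MAE is easier to read when all products are measured in the same units, while MAPE makes errors comparable across products with very different sales volumes.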

Step 3. Data Understanding & Preparation 

Regardless of what we’d like to predict, data quality is a critical component of an accurate demand forecast. The following data could be used for building forecasting models:

Data sources for building demand forecasting models

Data Quality Parameters

When building a forecasting model, the data is evaluated according to the following parameters:

  • Consistency
  • Accuracy
  • Validity
  • Relevance
  • Accessibility
  • Completeness
  • Level of detail

In reality, the data collected by companies often isn’t ideal. This data usually needs to be cleaned, analyzed for gaps and anomalies, checked for relevance, and restored. When developing POS applications for our retail clients, we use data preparation techniques that allow us to achieve higher data quality.

Once the data has been cleaned, generated, and checked for relevance, we structure it into a comprehensive form. Below, you can see an example of the minimum required processed data set for demand forecasting:

Transactions Forecasting

Data understanding is the next task once preparation and structuring are completed. It’s not modeling yet but an excellent way to understand data by visualization. Below you can see how we visualized the data understanding process:

Dynamic pricing and demand forecasting

Step 4. Machine Learning Models Development

There are no “one-size-fits-all” forecasting algorithms. Often, demand forecasting features consist of several machine learning approaches. The choice of machine learning models depends on several factors, such as business goal, data type, data amount and quality, forecasting period, etc.

Here I describe the machine learning approaches we have applied for our retail clients. If you have already read other articles about demand forecasting, you may find that these approaches work for most demand forecasting cases.

  • Linear Regression
  • XGBoost
  • K-Nearest Neighbors Regression
  • Random Forest
  • Long Short-Term Memory (LSTM)

Time Series Approach

This approach uses processed data points observed over a specific time span to predict the future. A time series is a sequence of data points taken at successive, equally spaced points in time. The major components to analyze are trend, seasonality, irregularity, and cyclicity.

The analysis algorithm involves the use of historical data to forecast future demand. That historical data includes trends, cyclical fluctuations, seasonality, and behavior patterns.

In the retail field, the most applicable time series models are the following:

1. ARIMA (auto-regressive integrated moving average) models aim to describe the auto-correlations in the time series data. When planning short-term forecasts, ARIMA can make accurate predictions. By providing forecasted values for user-specified periods, it clearly shows results for demand, sales, planning, and production.

2. SARIMA (Seasonal Autoregressive Integrated Moving Average) models extend ARIMA to univariate time series with a seasonal component by adding terms that involve backshifts of the seasonal period.

3. Exponential Smoothing models generate forecasts by using weighted averages of past observations to predict new values. The essence of these models is in combining Error, Trend, and Seasonal components into a smooth calculation.
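
To make the "weighted averages of past observations" idea concrete, here is a minimal sketch of simple exponential smoothing in plain Python. It is the most basic member of the family and has no trend or seasonal component; the function name, the sample data, and the smoothing factor of 0.5 are assumptions chosen for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing `series`.

    Update rule: level = alpha * observation + (1 - alpha) * previous level,
    so recent observations receive exponentially larger weights.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialize the level with the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical weekly unit sales.
weekly_sales = [10, 12, 14, 16]
next_week = simple_exponential_smoothing(weekly_sales, alpha=0.5)
print(next_week)
```

A larger alpha reacts faster to recent changes in demand; a smaller alpha produces a smoother, slower-moving forecast.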

Case study: POS system with ML-based demand forecasting feature

Let’s say you want to forecast demand for vegetables in the next month. For a time series approach, you require historical sale transaction data for at least the previous three months. If you have historical data about seasonal products – vegetables in our case – the best choice will be the SARIMA model. The forecast error, in that case, may be around 10-15%.

Linear Regression Approach

Linear regression is a statistical method for predicting future values from past values. It can help determine underlying trends and deal with cases involving overstated prices.

Linear regression method in demand forecasting

This regression type allows you to:

  • Predict trends and future values through data point estimates.
  • Forecast impacts of changes and identify the strength of the effects by analyzing dependent and independent variables.

Let’s say you want to calculate the demand for tomatoes based on their cost. Assuming that tomatoes grow in the summer and the price is lower because of high tomato quantity, the demand indicator will increase by July and decrease by December.

The information required for this type of forecasting is historical transaction data, additional information about specific products (tomatoes in our case), discounts, average market cost, the amount in stock, etc. The forecast error may be 5-15%.
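
The tomato example can be sketched as a one-variable least-squares fit of demand against price. The prices and unit sales below are made-up numbers, and the helper name is hypothetical; a real model would include the additional predictors listed above.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: tomato price per kg and units sold at that price.
prices = [1.0, 2.0, 3.0, 4.0]
units_sold = [90.0, 80.0, 70.0, 60.0]

slope, intercept = fit_simple_linear_regression(prices, units_sold)
predicted_units = intercept + slope * 2.5  # expected demand at a price of 2.5
print(slope, intercept, predicted_units)
```

The negative slope captures the relationship described above: as the summer price drops, the predicted demand rises.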

Feature Engineering

Feature engineering is the use of domain knowledge to create features that help machine learning models predict more accurately. It enables a deeper understanding of data and more valuable insights.

Feature engineering method in demand forecasting

Since feature engineering is creating new features according to business goals, this approach is applicable in any situation where standard methods fail to add value. In custom ML modeling, a data scientist builds new features from existing ones to achieve higher forecast accuracy or to get new data.
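
As one possible illustration, the sketch below derives lag, rolling-average, and calendar features from a plain list of daily sales. The feature names, the chosen lags, and the sample data are assumptions; which features actually help depends on the business and the data.

```python
from datetime import date, timedelta


def build_features(daily_sales, start_date, lags=(1, 7), window=7):
    """Turn a list of daily sales into feature rows for a forecasting model.

    Each row describes one day using only information available before
    that day (lagged sales and a rolling mean) plus calendar attributes.
    """
    first = max(max(lags), window)  # earliest day with complete history
    rows = []
    for i in range(first, len(daily_sales)):
        day = start_date + timedelta(days=i)
        row = {
            "date": day,
            "day_of_week": day.weekday(),  # 0 = Monday
            "is_weekend": day.weekday() >= 5,
            "target": daily_sales[i],
        }
        for lag in lags:
            row[f"sales_lag_{lag}"] = daily_sales[i - lag]
        row[f"rolling_mean_{window}"] = sum(daily_sales[i - window:i]) / window
        rows.append(row)
    return rows


# Hypothetical ten days of sales starting on Monday, 1 March 2021.
daily_sales = list(range(10, 20))
rows = build_features(daily_sales, start_date=date(2021, 3, 1))
print(rows[0])
```

Rows like these can then be passed to any of the regression models discussed in this article.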

Random Forest

The basic idea behind the random forest model is a decision tree. The decision tree approach is a data mining technique used for data forecasting and classification. The decision tree method itself does not have any conceptual understanding of the problem. It learns from the data we provide it.

Random forest is the more advanced approach that makes multiple decision trees and merges them together. By taking an average of all individual decision tree estimates, the random forest model results in more reliable forecasts.

Random Forest method in demand forecasting

Random forest can be used for both classification and regression tasks, but it also has limitations. The model may be too slow for real-time predictions when it contains a large number of trees.
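
The averaging idea can be shown with a deliberately simplified toy: many one-split regression trees ("stumps"), each trained on a bootstrap sample of the data, with their predictions averaged. This is bagging on a single feature, not a full random forest (which also samples features at each split and grows deeper trees). All names and data here are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree: choose the threshold with lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x in the sample is identical: predict the mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the individual tree estimates."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)


# Hypothetical data: price versus units sold.
prices = [1, 2, 3, 4, 5, 6]
demand = [100, 95, 90, 40, 35, 30]
forest = fit_forest(prices, demand)
low_price_pred = predict_forest(forest, 1)
high_price_pred = predict_forest(forest, 6)
print(low_price_pred, high_price_pred)
```

Because every prediction requires evaluating all trees, prediction time grows with the number of trees, which is the speed limitation mentioned above.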


Demand forecasting in retail

Step 5. Training & Deployment


Once the forecasting models are developed, it’s time to start the training process. When training forecasting models, data scientists usually use historical data. By processing this data, algorithms provide ready-to-use trained model(s).


This step requires the optimization of the forecasting model parameters to achieve high performance. Using a cross-validation tuning method in which the training dataset is split into ten equal parts, data scientists train forecasting models with different sets of hyper-parameters. The goal of this method is to figure out which model produces the most accurate forecast.
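
A minimal sketch of ten-fold cross-validation follows, using a tiny K-Nearest Neighbors regressor and tuning its number of neighbors. The data, the candidate values, and all function names are assumptions for illustration. For strictly time-ordered data, practitioners often prefer splits that never train on the future; that variation is not shown here.

```python
def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous folds of near-equal size."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_mae(xs, ys, k, n_folds=10):
    """Hold out each fold in turn, train on the rest, and average the errors."""
    errors = []
    for fold in k_fold_indices(len(xs), n_folds):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        for i in fold:
            prediction = knn_predict(train_x, train_y, xs[i], k)
            errors.append(abs(ys[i] - prediction))
    return sum(errors) / len(errors)


# Hypothetical data: price (x) versus units sold (y).
xs = [float(i) for i in range(20)]
ys = [100.0 - 3.0 * x for x in xs]

candidate_ks = [1, 3, 5]  # hyper-parameter values to compare
scores = {k: cross_validated_mae(xs, ys, k) for k in candidate_ks}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

The hyper-parameter value with the lowest cross-validated error is the one carried forward to the final model.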


When researching the best business solutions, data scientists usually develop several machine learning models. Since models show different levels of accuracy, the scientists choose the ones that cover their business needs the best. The improvement step involves the optimization of analytic results. For example, using model ensemble techniques, it’s possible to reach a more accurate forecast. In that case, the forecast is calculated by combining the results of multiple forecasting models.
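
One simple way to combine models, shown here only as a sketch, is a weighted average in which each model’s weight is inversely proportional to its validation error. The weighting scheme, the function name, and the numbers are assumptions; other ensemble techniques such as stacking are also common.

```python
def ensemble_forecast(forecasts, validation_errors):
    """Weighted average of model forecasts.

    Assumption: each weight is 1 / validation error, so models that were
    more accurate on held-out data contribute more. Errors must be positive.
    """
    weights = [1.0 / error for error in validation_errors]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total


# Hypothetical forecasts from two models and their validation MAE values.
model_forecasts = [100.0, 120.0]
model_errors = [10.0, 30.0]

combined = ensemble_forecast(model_forecasts, model_errors)
print(combined)
```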



Sales Forecasting For Retail During Uncertainty

When integrating demand forecasting systems, it’s essential to understand that they are vulnerable to anomalies like the COVID-19 pandemic. It means that machine learning models should be upgraded according to current reality.

As the demand forecasting model processes historical data, it can’t know that the demand has radically changed. For example, if last year, we had one demand indicator for medical face masks and antiviral drugs, this year, it would be completely different.

In that case, there are several ways to get an accurate forecast. Here are the six main ones:

  1. Collect data about new market behavior. Once the situation becomes more or less stable, develop a demand forecasting model from scratch.
  2. Apply a feature engineering approach. By processing external data, news, a current market state, price index, exchange rates, and other economic factors, machine learning models are capable of making more up-to-date forecasts.
  3. Upload the most recent POS data. The period of a loadable dataset might vary from one to two months, depending on the products’ category. In this way, we can detect shifts in demand patterns and enhance forecast accuracy in a timely manner.
  4. Apply the transfer learning approach. If there is any gathered historical data about past pandemics or similar behavior shifts, we can use them to predict demand in the context of the current crisis.
  5. Apply information cascade modeling approach. Combining the most recent POS data with the cascade modeling, the demand forecasting system can identify herd patterns of human behavior. In other words, we can forecast how people will make buying decisions according to the behavior patterns of most people.
  6. Apply natural language processing (NLP) approach. NLP technology enables the processing of real comments from social networks, media platforms, and other available social sources. By utilizing text mining and sentiment analysis approaches, NLP models gather samples of customer’s conversations. This method allows the detection of people’s preferences, choices, sentiment, and behavior shifts.

During AI app development, AI engineers analyze historical data for forecasting. This forecasting cannot predict the disruption caused by a global pandemic. Such an event requires the recalibration of the machine learning models. We met this challenge using machine learning models developed for a restaurant business prior to the pandemic. Here are the historical revenues, before the COVID-19 pandemic.

Revenue before the COVID-19 pandemic

The total lockdown caused a dramatic change.

Zero revenue during the lockdown


A methodology used for a small business does not work for a larger business. Because the data set is much larger for a larger business, the acceptance criteria are more lenient than for a small business.

Calculate Forecast Error (E) by comparing actual sales to forecasted sales.

The choices of error measurement for E include:

  • Absolute Error (AE)
  • Percentage Error (PE)
  • Absolute Percentage Error (APE)

If the product inventory is complex in terms of weights, units sold, packaging, etc., using PE makes it easier to compare errors across various product lines.

Calculate Forecast Accuracy (A) by deducting the APE from 100%.
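
The calculations above can be sketched for a single period as follows. The sign convention for PE (actual minus forecast, divided by actual) is an assumption, since conventions differ; the function name and figures are hypothetical.

```python
def forecast_error_report(actual, forecast):
    """Return AE, PE, APE and accuracy (A = 100% - APE) for one period."""
    if actual == 0:
        raise ValueError("percentage errors are undefined when actual sales are zero")
    absolute_error = abs(actual - forecast)
    # Assumed convention: positive PE means the forecast was too low.
    percentage_error = 100.0 * (actual - forecast) / actual
    absolute_percentage_error = abs(percentage_error)
    accuracy = 100.0 - absolute_percentage_error
    return {
        "AE": absolute_error,
        "PE": percentage_error,
        "APE": absolute_percentage_error,
        "A": accuracy,
    }


# Hypothetical month: 200 units actually sold, 180 units forecasted.
report = forecast_error_report(actual=200, forecast=180)
print(report)
```

Note that accuracy defined this way becomes negative when the forecast misses by more than 100% of actual sales.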

A higher sales volume increases tolerance for prediction errors. A large business collects more historical data, making it less challenging to identify the behavioral patterns of customers. The approach used for a small business is hypothesis testing to uncover a correlation between a predictor and the volume of sales.

Clear patterns in demand

Using a time-series approach, we can see the definitive patterns that have four key components, which are:

  • Trend: Steady, long-term, moving gradually in one direction.
  • Cycle: A variation not caused by seasonal factors is recognizable as a cycle.
  • Seasons: Regular variations within a period of less than a full year are seasons.
  • Random Components: Removal of trends and cyclical variations from time-series data uncovers a variation that is irregular.
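
A simplified additive decomposition illustrates how these components can be separated. In this sketch, a centered moving average estimates the trend (any slow cycle is absorbed into it), the average detrended value per position in the period estimates the seasonal pattern, and what remains is the random component. The function name, the odd-period restriction, and the data are assumptions for illustration.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (additive model).

    Simplifications: `period` must be odd, and the series must cover at
    least two full periods. Trend is None at the edges, where the centered
    window does not fit.
    """
    if period % 2 == 0 or len(series) < 2 * period:
        raise ValueError("need an odd period and at least two full periods")
    half = period // 2
    n = len(series)
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period

    # Average detrended value for each position within the period.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    raw = [sum(bucket) / len(bucket) for bucket in buckets]
    offset = sum(raw) / period  # center the pattern so it sums to zero
    pattern = [value - offset for value in raw]

    seasonal = [pattern[i % period] for i in range(n)]
    residual = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                for i in range(n)]
    return trend, seasonal, residual


# Hypothetical three weeks of daily sales: rising trend plus weekend peaks.
weekly_pattern = [-2, -2, -2, -2, -2, 4, 6]
series = [50 + i + weekly_pattern[i % 7] for i in range(21)]
trend, seasonal, residual = decompose_additive(series, period=7)
print(trend[3], seasonal[:7])
```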

Random sales

Time-series analysis does not apply to random sales when no clear patterns exist. In such a case, external data may be added to the forecasting model. Examples of external data may be the rates for currency exchange, stock market performance, consumer price index, or another factor influencing the economy.


E-commerce has more predictors and external factors that can be incorporated in sales forecasting models. Data from a point-of-sale system is not needed when a website collects the pertinent data for e-commerce sales. More information is known about online customers and their purchase history. Having more predictors permits the use of more machine learning (ML) models, such as Gradient Boosting, KNN, Multiple Regression, Random Forest, SVR, and others.

There is no specific model that is considered the best one or a particular way to apply a model for all circumstances. The data, such as product type or geo-location of sales, influences the choice of which model to use.

E-commerce sales are enhanced by building a recommendation model based on a customer’s past purchases. Large online retailers use the ML technique of Market Basket Analysis. This helps discover insights about associated items to use for marketing purposes.
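
At its core, Market Basket Analysis counts how often items are bought together. The sketch below computes support, confidence, and lift for item pairs from a handful of made-up baskets; the function name and the support threshold are assumptions, and production systems typically use algorithms such as Apriori or FP-Growth to handle larger item sets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.4):
    """Return 'antecedent -> consequent' rules for item pairs.

    support    = share of baskets containing both items
    confidence = share of antecedent baskets that also contain the consequent
    lift       = confidence / share of baskets containing the consequent
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            lift = confidence / (item_counts[consequent] / n)
            rules.append({
                "antecedent": antecedent,
                "consequent": consequent,
                "support": support,
                "confidence": confidence,
                "lift": lift,
            })
    return sorted(rules, key=lambda rule: rule["lift"], reverse=True)


# Hypothetical purchase history.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
    ["bread", "butter"],
]
rules = pair_rules(baskets)
for rule in rules:
    print(rule)
```

A lift above 1 suggests the two items are bought together more often than chance would predict, which makes the pair a candidate for cross-promotion.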

Offline sales analysis may be limited to data about historical purchases, and an appropriate model is a time-series analysis.


Predictive models are strongly influenced by regional factors that include customer behavior and cultural determinants. Marketing campaigns may be regionally specific and have a different impact that depends on where a customer is located. Holidays may vary between regions, which might be a consideration for adjusting the model. Legal issues may limit the use of certain data in different regions.


The product type is an important factor to consider for the demand model. For example, a perishable product demand model should not overestimate demand, since the excess product will be lost to waste. Instead, the model should be tuned so that its errors fall on the side of stocking slightly less than actual demand.

Suppose a perishable item has an actual demand of 100 cases. A prediction of 90 cases is preferred over a prediction of 110 cases. Missing the sales of 10 cases is a better result than wasting 10 cases, even though the error is the same percentage. Extra care needs to be taken in preparing data when working with perishable items.
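
One way to encode this preference, sketched here under assumed cost weights, is an asymmetric error measure that penalizes each over-forecasted unit more heavily than each under-forecasted unit. The weights of 2.0 and 1.0 are hypothetical and would in practice come from the real cost of waste versus a lost sale.

```python
def asymmetric_cost(actual, forecast, waste_cost=2.0, lost_sale_cost=1.0):
    """Cost of a forecast error when over-forecasting is more expensive.

    waste_cost and lost_sale_cost are hypothetical per-unit penalties.
    """
    if forecast > actual:
        return (forecast - actual) * waste_cost  # excess stock spoils
    return (actual - forecast) * lost_sale_cost  # demand left unserved


# The example from the text: actual demand of 100 cases.
cost_under = asymmetric_cost(actual=100, forecast=90)
cost_over = asymmetric_cost(actual=100, forecast=110)
print(cost_under, cost_over)
```

Evaluating or training a model against a measure like this nudges it toward the cautious forecasts described above, even though both errors are 10% in absolute terms.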


The frequency used in a time-series analysis depends on the patterns of demand for a specific business. Some businesses have patterns of demand that are based on intervals of hours, days, or weeks. Restaurants have hourly demand patterns. Shopping malls have daily demand patterns with weekend peaks. Highly seasonal goods, such as holiday decorations, have specific weekly patterns each year.
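
Choosing a frequency in practice means aggregating raw transactions into hourly, daily, or weekly totals before modeling. The following sketch shows one way to do that with the standard library; the function name, the transaction format, and the sample data are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_sales(transactions, frequency):
    """Sum transaction amounts per hour, day, or ISO week.

    `transactions` is a list of (timestamp, amount) pairs.
    """
    totals = defaultdict(float)
    for timestamp, amount in transactions:
        if frequency == "hour":
            key = timestamp.replace(minute=0, second=0, microsecond=0)
        elif frequency == "day":
            key = timestamp.date()
        elif frequency == "week":
            iso = timestamp.isocalendar()
            key = (iso[0], iso[1])  # (ISO year, ISO week number)
        else:
            raise ValueError("frequency must be 'hour', 'day' or 'week'")
        totals[key] += amount
    return dict(sorted(totals.items()))


# Hypothetical point-of-sale transactions.
transactions = [
    (datetime(2021, 3, 1, 12, 5), 10.0),
    (datetime(2021, 3, 1, 12, 40), 5.0),
    (datetime(2021, 3, 1, 18, 10), 20.0),
    (datetime(2021, 3, 7, 13, 0), 7.0),
    (datetime(2021, 3, 8, 9, 0), 3.0),
]
hourly = aggregate_sales(transactions, "hour")
daily = aggregate_sales(transactions, "day")
weekly = aggregate_sales(transactions, "week")
print(weekly)
```

A restaurant would model the hourly totals, while a seller of seasonal goods would more likely work with the weekly ones.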

Future of ML Forecasting

Machine learning is not limited to demand and sales forecasting. The future potential of this technology depends on how well we take advantage of it. Today, we work on demand forecasting technology and understand what added value it can deliver to modern businesses by solving tasks such as forecasting customer engagement, future trends, brand development, marketing campaigns, resource usage, and financial risks. With new ML trends emerging, we never know what opportunities AI technology will open for us tomorrow.
