All supply chain management problems have their roots in the search for equilibrium between supply and demand. Someone demands (offers) a good at a certain price, and for this demand (offer) to be satisfied (sold), the price or relevance of the product should be attractive enough. This oversimplified dynamic, which you learn in any introductory economics course, develops into one of the most complex problems to solve. As it is quite easy to get lost in the breadth of the issue, we’ll start from somewhere close to the beginning.
1. The Inventory Management Problem
Efficient inventory management remains an essential part of supply chain operations. Excess inventory can tie up substantial capital, increase carrying costs, and risk obsolescence. Conversely, inadequate inventory can lead to missed sales opportunities, loss of customer trust and even loss of life (when medical supplies run short). Striking the optimal balance thus becomes a significant challenge.
To tackle this challenge, we must first anticipate demand and then act accordingly.
1.1. Demand forecasting
Precise demand forecasting is vital to ensuring that the right products are available at the right place and at the right time. However, demand is influenced by a multitude of factors including seasonal trends, market shifts, and unforeseen events, complicating the task of accurate forecasting. Fortunately, machine learning algorithms can be trained on historical sales data, market trends, and other pertinent factors to predict future demand. For example, Amazon uses complex demand forecasting models to optimise its inventory. They leverage vast amounts of data collected from their sales, including historical sales data, special events, holidays, and even weather forecasts. For instance, they’ve used machine learning models to predict the demand for different products during events like Black Friday, allowing them to stock up on high-demand items and avoid stock-outs or overstocking. Furthermore, Amazon uses machine learning to optimise its warehouses, positioning products based on demand to reduce the time taken to pick items for an order. The good news? You don’t have to be Amazon-sized to achieve this. Anyone with a transactional database and willingness to invest in cloud services can do the same.
But what model should we choose? Let’s check on what the industry is doing to solve the forecasting problem.
Here, most businesses operate by using one of the following:
- Simple statistics: any form of mean or median predictions. Could be used as benchmarks to compare the performance of machine learning models, but these are not smart forecasts as we are assuming that the time window chosen to calculate the mean is a good estimate of the mean of future values. This could work for low variance and static series, but when we have seasonality problems and special events we’ll end up messing things up.
- Classic time series models: autoregressive models (ARIMA, SARIMA, SARIMAX, VAR, etc.) or exponential smoothing (Holt-Winters), which may deliver good results when the series are stable, i.e. they have low variance and clear trends. Of course, these are better than simple statistics, yet they do not leverage the full potential of the features that characterise each time series, nor the additional information that could be extracted from similar series (for example, products with similar sales and characteristics). Additionally, if you have thousands of series made up of store-product combinations, fitting these models becomes a gruelling process.
- Ensemble/Boosting algorithms: models such as Random Forest, CatBoost, XGBoost and LightGBM. It is important to highlight that these models aren’t specifically designed to handle time-series data, especially when it comes to multistep or multi-horizon forecasting, where each prediction should take into account previous predictions. In short, time-series data often contain temporal dependencies, which these algorithms do not explicitly model. Using these methods for multistep forecasting typically involves predicting one step at a time and using each prediction as an input for the next one (i.e. recursive forecasting). This approach, however, can accumulate errors as predictions are propagated forward. So even if this is accounted for, the procedure remains conceptually flawed and suboptimal. Moreover, as with classic time series models, we’ll face a real problem when trying to train on thousands of series made up of store-product combinations, as these models struggle to learn the specifics within such a wide variety of classes.
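To make two of the points above concrete, here is a minimal sketch in pure Python. The sales figures, the weekly pattern and every function name are hypothetical, made up for illustration. It compares a moving-average baseline with a seasonal-naive forecast on a series with weekend peaks, and it shows what recursive forecasting means mechanically: each prediction is fed back as an input for the next step, so any error travels forward with it.

```python
from statistics import mean


def moving_average_forecast(history, window, horizon):
    """Flat forecast: repeat the mean of the last `window` observations."""
    level = mean(history[-window:])
    return [level] * horizon


def seasonal_naive_forecast(history, season_length, horizon):
    """Repeat the value observed exactly one season earlier."""
    return [history[-season_length + (h % season_length)] for h in range(horizon)]


def recursive_forecast(history, one_step_model, n_lags, horizon):
    """Multistep forecast built from a one-step model.

    Each prediction is appended to the input window used for the next
    step, which is why errors can compound along the horizon.
    """
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        predictions.append(next_value)
        window = window[1:] + [next_value]
    return predictions


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical daily sales with weekend peaks, repeated for four weeks.
week = [20, 22, 21, 23, 30, 45, 40]
sales = week * 4
train, holdout = sales[:21], sales[21:]

ma = moving_average_forecast(train, window=7, horizon=7)
sn = seasonal_naive_forecast(train, season_length=7, horizon=7)
# Toy one-step model (an assumption): predict the mean of the last 7 values.
rec = recursive_forecast(train, one_step_model=mean, n_lags=7, horizon=7)

print("moving average MAE:", round(mae(holdout, ma), 2))
print("seasonal naive MAE:", mae(holdout, sn))
print("recursive forecast:", [round(value, 1) for value in rec])
```

On this perfectly periodic toy series the seasonal-naive forecast is exact while the flat mean misses every weekend peak, which is the failure mode described for simple statistics. Real series are noisier, so treat this only as a benchmark harness.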
As you can guess we do not believe that these methods could work in complex scenarios where we have millions of observations, lots of series, and high volatility. This is where Deep Learning Models can offer a significant advantage provided you have enough quality data and computational power.
In this regard, we can implement neural networks that are designed to handle sequences of data, making them better suited for tasks like multi-horizon forecasting where the prediction of future values depends on past and present inputs. The most famous ones are:
- Recurrent Neural Networks (RNNs): designed to recognise patterns in sequences of data, such as text, genomes, handwriting, or the time series data used in forecasting. They are called “recurrent” because they perform the same operation for every element of a sequence, with the output being dependent on previous computations. This allows them to maintain information in “memory” over time. However, they can struggle with maintaining this “memory” over long sequences.
- Long Short-Term Memory (LSTM) networks: a special kind of RNN, capable of learning long-term dependencies. LSTMs were designed to combat the “vanishing gradient” problem in traditional RNNs, which made it hard for the networks to maintain their “memory” over long sequences. The key to LSTMs is the cell state, a horizontal line running through the top of the diagram (if you’ve ever seen a schematic of an LSTM). The cell state can carry information throughout the processing chain with minimal changes, which is what allows LSTMs to transport “memory” down the sequence chain. This makes them particularly effective for tasks where the prediction of future values depends on past and present inputs.
In short, RNNs and LSTMs are suited for time series forecasting due to their ability to process sequence data and maintain “memory” over time. However, LSTMs, with their unique architecture, have a better capability to handle long-term dependencies in the data, making them a preferred choice for complex forecasting tasks.
Beyond these, I’ve found recent success with the Temporal Fusion Transformer (TFT) model to forecast the daily demand of a multitude of products (400+) across various stores (50+) using only the information that describes each time series (sales, price, product category/subcategory, month, weekday, weekend, discount percentage, store_id, store location, product_id, time_idx, special days, relative_time_idx and rolling averages). If there is enough interest I may write an article on the full implementation; until then, see the paper (Lim, B., et al., 2021). In any case, I believe that at the moment this is the most powerful predictive tool for multi-horizon forecasting.
Here are some of its main advantages that make it clear how deficient the other approaches are.
- Handling Multiple Time Series: TFT can work with multiple time series data, which is a critical factor when you need to forecast demand for a large number of products across different stores. Each product and store may have their own unique characteristics and time-series patterns, such as seasonality, trends, and other factors. TFT is capable of capturing these patterns for each individual time series, enabling more accurate and personalised predictions. Note that this is one of the problems that boosting algorithms do not manage to deal with correctly.
- Flexible Incorporation of Static and Dynamic Covariates: TFT can incorporate different types of covariates in the forecasting model. For instance, information about each store (like location, size, and customer demographic) and each product (like category, subcategory, price, and promotional activities) can be fed into the model as static covariates. Similarly, dynamic covariates like special events, weather, or changing economic conditions can also be incorporated.
- Non-linear and Complex Temporal Dependencies: TFT is particularly adept at capturing complex temporal dependencies and non-linear relationships in the data. It uses attention mechanisms to identify and weigh important historical information. It allows the model to learn what information from the past is most relevant for forecasting future values. Note that this comes as a solution to one of the most common issues with forecasting i.e. choosing which time window is relevant to predict the next sequence of values.
- Probabilistic Predictions: One of the key advantages of TFT is that it can provide probabilistic forecasts. This is particularly important in demand prediction, where understanding uncertainty can guide decision making. For example, knowing the range within which the demand is expected to lie (with a certain probability) is very useful in determining stock policies, such as safety stock levels, and avoiding stockouts or overstocking. This is done by training with a QuantileLoss function, which yields a forecast per quantile, instead of optimising a single point-error metric such as RMSE, MAE, MAPE or SMAPE.
- Interpretability: Despite being a “black-box” model, TFT has mechanisms for interpretability. The attention mechanisms in the model can provide insights about what historical inputs are important for the prediction and how different covariates contribute to the forecast. This can be important in business scenarios where understanding the reasons behind a prediction is as important as the prediction itself. In the example mentioned, we use this to debug the model manually, checking for inconsistencies or “badly” trained models. For instance, we found that the special days needed some tuning, as the days directly before the special days were more relevant than the special days per se (movement from the city centers began the day before).
- Scalability: Lastly, TFT is also designed to be computationally efficient. It uses variable selection networks to reduce the dimensionality of the data, making it feasible to scale to large numbers of time series. This makes it quite suitable for a scenario like the one we mentioned, where we have to predict the demand for hundreds of products across dozens of stores.
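The quantile loss mentioned under probabilistic predictions is easy to write down by hand. Below is a minimal pure-Python sketch of the standard pinball loss for a single quantile. It is not the pytorch-forecasting implementation, and the demand figures and the safety stock rule at the end are assumptions for illustration only.

```python
def quantile_loss(actual, predicted, quantile):
    """Pinball loss for one quantile, averaged over all observations.

    Under-forecasts are weighted by `quantile` and over-forecasts by
    `1 - quantile`, so a high quantile punishes under-forecasting more.
    """
    if not 0 < quantile < 1:
        raise ValueError("quantile must be strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        error = y - y_hat
        total += max(quantile * error, (quantile - 1) * error)
    return total / len(actual)


# Hypothetical day with a true demand of 100 units, scored at the 0.9 quantile.
under = quantile_loss([100], [80], quantile=0.9)   # forecast 20 units too low
over = quantile_loss([100], [120], quantile=0.9)   # forecast 20 units too high
print("loss when under-forecasting:", round(under, 2))
print("loss when over-forecasting:", round(over, 2))

# One possible stock policy (an assumption): hold the gap between the
# 90th percentile forecast and the median forecast as safety stock.
p50_forecast, p90_forecast = 100, 130
safety_stock = p90_forecast - p50_forecast
print("safety stock:", safety_stock)
```

At the 0.9 quantile the same 20-unit miss costs nine times more when it is an under-forecast, which is what pushes the 0.9 prediction above the median and gives a usable upper band for stock decisions.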
I advise you to check the implementation of TFT and more advanced models in the pytorch-forecasting documentation.
1.2. Inventory optimisation
So, once we forecast the demand for the next operating period, we must determine the best way to acquire the goods/components required to meet that demand. But where do we start? We can begin by identifying everything that needs to be analysed to achieve the best outcome. For instance, a list of variables could include:
- Supplier alternatives and costs: How many suppliers do I have, and how much do they charge me for each unit provisioned?
- The presentation of the product: you may have predicted that you are going to sell 120 candies but these come in boxes of 60, so you need to convert all units into their deliverable version.
- Warehouse capacity: How much stock can my warehouse accommodate?
- Delivery times: By what date and time do I need to receive the stock to avoid a goods shortage?
- Lead times: How long does each supplier take to deliver the order after it has been placed?
- Order times: What dates does the supplier accept and deliver orders? A supplier may only deliver on specific days of the week.
- Safety stock: What’s the policy regarding the minimum stock of a product at each warehouse?
- Store locations: Where do the goods/components need to be delivered?
- Load transportation method: How will the orders be delivered and what characteristics are associated with that decision?
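Some of the items above translate directly into small helper functions. The sketch below, in pure Python, covers two of them: converting forecast units into deliverable boxes, and working out the latest date an order can be placed given a lead time and the weekdays on which a supplier delivers. The function names, the dates and the rule that an order must ship exactly one lead time before its delivery date are assumptions.

```python
from datetime import date, timedelta


def units_to_boxes(forecast_units, units_per_box):
    """Round forecast demand up to whole boxes (integer ceiling division)."""
    return (forecast_units + units_per_box - 1) // units_per_box


def latest_order_date(needed_by, lead_time_days, delivery_weekdays):
    """Latest order date so that goods arrive on or before `needed_by`.

    `delivery_weekdays` uses Python's convention: Monday is 0, Sunday is 6.
    """
    if not delivery_weekdays:
        raise ValueError("the supplier must deliver on at least one weekday")
    delivery = needed_by
    # Walk back to the closest allowed delivery day at or before the deadline.
    while delivery.weekday() not in delivery_weekdays:
        delivery -= timedelta(days=1)
    return delivery - timedelta(days=lead_time_days)


# 120 candies forecast, sold in boxes of 60.
print(units_to_boxes(120, 60))

# Hypothetical supplier: delivers on Mondays and Wednesdays, 2-day lead time.
# Stock is needed by Friday 5 January 2024.
print(latest_order_date(date(2024, 1, 5), lead_time_days=2, delivery_weekdays={0, 2}))
```

Checks like these are usually applied before any optimisation model is built, since they turn the raw forecast into quantities and deadlines a supplier can actually fulfil.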
After considering these factors, it’s time to build a model. This is where data science extends beyond merely running code implementations. We’ll need to combine domain knowledge, technical skills, and creativity to work within the unique framework of Linear Programming and Mixed Integer Programming (or alternatively heuristics and meta-heuristics).
I’ve already written an article on this topic where a real life problem is presented, so I won’t go into the details. However, the point is that these methods can help us to maximise or minimise an objective (for example, a utility function, or a cost function made up of variable and fixed costs per transportation method, cost per driver, cost per supplier, etc.) subject to several types of constraints (for example, warehouse capacity, number of trucks, workers). The main strength and weakness of this framework is that you must define the problem you want to solve:
- If you lack creativity, domain knowledge, and technical skills, you might fail to specify the problem;
- But, the more knowledgeable you are, the more likely it is that you will end up defining a highly specific problem that becomes computationally unsolvable.
In any case, it’s always better to be in the latter scenario, as you can always reduce the problem specification. Building on this, the concept of inventory optimisation leads us into the broader problem of supply chain network optimisation.
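To give a feel for what defining the problem looks like, here is a deliberately tiny integer programme solved by brute-force enumeration in pure Python. A real model would be handed to an LP/MIP solver. The supplier data, costs and limits are hypothetical. Decision variables are the number of boxes bought from each supplier, the objective is total cost (per-box cost plus a fixed cost per supplier used), and the constraints are meeting demand without exceeding warehouse capacity.

```python
from itertools import product


def cheapest_order_plan(demand_units, capacity_units, suppliers):
    """Return (boxes per supplier, total cost) for the cheapest feasible plan.

    Every integer combination of box counts is enumerated, which is only
    viable for toy instances. Returns (None, None) if nothing is feasible.
    """
    best_plan, best_cost = None, None
    box_ranges = [range(s["max_boxes"] + 1) for s in suppliers]
    for plan in product(*box_ranges):
        units = sum(n * s["units_per_box"] for n, s in zip(plan, suppliers))
        # Constraints: cover demand and fit in the warehouse.
        if units < demand_units or units > capacity_units:
            continue
        # Objective: variable cost per box plus a fixed cost per supplier used.
        cost = sum(
            n * s["cost_per_box"] + (s["fixed_cost"] if n > 0 else 0)
            for n, s in zip(plan, suppliers)
        )
        if best_cost is None or cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost


# Hypothetical suppliers for the candy example.
suppliers = [
    {"name": "A", "units_per_box": 60, "cost_per_box": 30, "fixed_cost": 20, "max_boxes": 5},
    {"name": "B", "units_per_box": 40, "cost_per_box": 18, "fixed_cost": 35, "max_boxes": 5},
]
plan, cost = cheapest_order_plan(demand_units=120, capacity_units=200, suppliers=suppliers)
print(plan, cost)  # -> (2, 0) 80
```

The search space here is 6 x 6 combinations. Each extra supplier, product or period multiplies it, which is the mechanism behind a highly specific problem becoming computationally intractable.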
2. The Supply Chain Network Optimisation Problem
Optimising the supply chain network is a multi-faceted challenge where we start by assuming/forecasting a certain demand and then determine the most efficient routes, modes of transportation, and storage facilities for different goods. In terms of transportation, these issues can be further classified based on whether they involve unimodal or intermodal logistics.
- Unimodal logistics: involves the use of a single mode of transportation, such as land, sea, or air. Each mode presents unique challenges — land transportation can face traffic congestion or road infrastructure issues, sea freight is subject to weather conditions and port capacity, and air freight involves considerations like airspace restrictions and fuel costs. Each mode requires specific types of variables and a unique problem definition. For example, an interesting problem in land transportation would be optimising the transportation of loads across the country based on the type of equipment (e.g. van, reefer or flatbed), driver preferences (perhaps some states are to be avoided), time windows for pickup and delivery, and work restrictions (like mandatory rest periods for drivers);
- Intermodal logistics: involves coordinating multiple modes of transportation, such as using trucks and ships for the same journey. This adds an extra layer of complexity as it requires seamless coordination between different modes of transportation and handling goods during the transfer from one mode to another. Given these complications, we could either tackle each problem separately or strive to provide a global optimum, which would result in a large model with several assumptions that may or may not be sufficiently accurate. In this case, the best approach might be to bring all the experts or actors involved in each stage, draft the problem as accurately as possible, and then prune non-essential elements.
Once again, algorithms such as Linear Programming and Mixed Integer Programming can be employed to find partial solutions to the problems that arise when analysing unimodal and intermodal businesses. However, given the potential size and complexity of the problem, heuristic and meta-heuristic algorithms like Genetic Algorithms may be more suitable.
From my experience (just in unimodal logistics), the best approach to complex problems, like the Capacitated Vehicle Routing Problem with work and time window constraints, is to use a combination of heuristics, traditional graph theory algorithms, and a well-designed data processing procedure (smart filtering and vectorised operations). This approach narrows down the solution search space, enabling us to sidestep the computation of infeasible or suboptimal combinations of outcomes and, most importantly, to obtain a solution within a reasonable amount of time.
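As an illustration of the heuristic side of that combination, the sketch below implements a nearest-neighbour construction heuristic for a capacitated vehicle routing problem in pure Python. It is a textbook greedy method, not the approach used in any production system, and it ignores time windows and work restrictions. The coordinates, demands and capacity are hypothetical.

```python
import math


def nearest_neighbour_routes(depot, stops, vehicle_capacity):
    """Greedy CVRP construction: always drive to the closest stop that fits.

    `stops` maps a stop name to ((x, y), demand). Each route starts at the
    depot and ends when no remaining stop fits in the vehicle.
    """
    unvisited = dict(stops)
    routes = []
    while unvisited:
        position, load, route = depot, 0, []
        while True:
            # Filter first: only stops that still fit are candidates.
            feasible = {
                name: xy
                for name, (xy, demand) in unvisited.items()
                if load + demand <= vehicle_capacity
            }
            if not feasible:
                break
            name = min(feasible, key=lambda n: math.dist(position, feasible[n]))
            position = feasible[name]
            load += unvisited.pop(name)[1]
            route.append(name)
        if not route:
            raise ValueError("a stop's demand exceeds the vehicle capacity")
        routes.append(route)
    return routes


# Hypothetical instance: depot at the origin, four stops, capacity of 8 units.
stops = {
    "A": ((1, 0), 4),
    "B": ((2, 0), 4),
    "C": ((0, 5), 4),
    "D": ((0, 6), 4),
}
routes = nearest_neighbour_routes(depot=(0, 0), stops=stops, vehicle_capacity=8)
print(routes)  # -> [['A', 'B'], ['C', 'D']]
```

The feasibility filter inside the loop is a small instance of narrowing the search space: stops that cannot fit are never evaluated. A greedy route like this is rarely optimal, but it is fast and makes a reasonable starting point for local search or a meta-heuristic.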
One important caveat though: these specific solutions often remain unpublished, forming part of the strategic algorithms that offer a competitive edge to a company. So, the main takeaway here is that before rushing to implement a solver from a package like Google OR-Tools, consider whether you could achieve sufficient results using heuristics/meta-heuristics, graph algorithms, and intelligent data processing. This approach could offer a more manageable and equally effective solution to your supply chain network optimisation challenges.
3. The Risk Management Problem
Supply chains are susceptible to a wide array of risks, from natural disasters to geopolitical unrest and market volatility. Identifying, assessing, and mitigating these risks are significant challenges within supply chain management.
3.1. Potential solutions
As recent studies suggest, there’s still much to be done regarding Supply Chain Risk Management (Ganesh, A. D., & Kalpana, P., 2022). With this in mind, I summarise some of the development possibilities that are being discussed:
- Harnessing NLP for Proactive Risk Monitoring: Recent Natural Language Processing (NLP) models like GPT, among others, can be used to scrutinise news articles and social media posts for potential supply chain threats. This information can generate personalised alerts and devise corresponding mitigation strategies. Coupled with advanced analytics, this could offer predictive insights into impending market turbulence, enabling proactive risk management. To illustrate, consider creating a bot using the OpenAI GPT API to monitor specific media channels. It can extract pertinent data from news articles and social media posts and use the GPT model to determine if an event may impact your operations. For instance, during the Suez Canal blockage in 2021, a well-configured NLP system would have identified the event and alerted relevant stakeholders to begin considering alternative routes or modes of transportation.
- Evaluating Performance with NLP & Forecasting Tools: The same NLP tools, combined with graph theory, forecasting, and classification models, can evaluate, categorise, and recommend suppliers, shippers and drivers based on their historical performance. This contributes to supply chain efficiency and transparency, enabling underperforming entities to be identified and potentially replaced. Consider a model that assesses a supplier’s history of delivery timeliness, product quality, and responsiveness. This can help identify whether they are a reliable partner or if sourcing alternatives should be considered.
- Predictive Maintenance Models: Predictive models allow us to forecast machinery and equipment issues, enhancing preventative maintenance measures. This could reduce downtime, cut costs, and extend equipment lifespan. For example, a predictive model might analyse historical data on a fleet of delivery trucks to anticipate when certain components are likely to fail, helping to schedule preventative maintenance and avoid unexpected breakdowns.
- Geospatial Analysis for Geographical Risks: Geospatial analysis can elucidate and evaluate geographical risks in the supply chain, such as a warehouse located in a flood-prone area, allowing for better contingency planning.
- Leveraging IoT and Blockchain for Real-time Monitoring and Predictions: Pairing IoT and Blockchain technology data with predictive models can boost real-time monitoring and tracking capabilities. Moreover, it can enable real-time deviation predictions, accompanied by alerts and mitigation strategies. This improves transparency and traceability, diminishing risks tied to fraud, counterfeiting, and non-compliance (Kaur, A., et al., 2022).
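Of the ideas above, supplier evaluation is the easiest to prototype without any external service. The sketch below is a deliberately simple scoring rule in pure Python rather than a trained model: it combines on-time rate and quality into one score per supplier. The weights, the field names and the delivery records are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    supplier: str
    days_late: int      # 0 or negative means the delivery was on time
    defect_rate: float  # share of defective units in the delivery, 0 to 1


def score_suppliers(deliveries, on_time_weight=0.7, quality_weight=0.3):
    """Weighted score per supplier in [0, 1]; higher means more reliable."""
    grouped = {}
    for delivery in deliveries:
        grouped.setdefault(delivery.supplier, []).append(delivery)
    scores = {}
    for supplier, items in grouped.items():
        on_time_rate = sum(d.days_late <= 0 for d in items) / len(items)
        quality = 1 - sum(d.defect_rate for d in items) / len(items)
        scores[supplier] = on_time_weight * on_time_rate + quality_weight * quality
    return scores


# Hypothetical delivery history.
history = [
    Delivery("Acme", 0, 0.0),
    Delivery("Acme", 0, 0.0),
    Delivery("Acme", 2, 0.0),
    Delivery("Acme", 0, 0.0),
    Delivery("Bolt", 3, 0.1),
    Delivery("Bolt", 5, 0.1),
]
scores = score_suppliers(history)
for supplier, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(supplier, round(score, 3))
```

A rule like this is mainly useful as a transparent baseline. Once enough history exists, the same features could feed the classification or forecasting models mentioned in the list.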
3.2. What can be done while studying potential solutions
Based on what I’ve seen, I cannot emphasise enough that organisations should invest more resources in developing simulations to understand the potential impacts of various strategic decisions. The best thing? You don’t need more than pure Python and domain knowledge.
Consider a hypothetical retail company “TheRetail”. TheRetail is considering whether to centralise their supply management through a single logistics operator, or maintain direct relationships with multiple suppliers. To evaluate this decision, they could run simulations to assess the potential impacts of each option. These should, at the minimum, consider factors like cost efficiency, delivery speed, flexibility, and vulnerability to disruptions. Let’s imagine some scenarios:
a) simulate a situation where the logistics operator faces a significant delay or shutdown. How long could TheRetail maintain operations using their current inventory? How would this impact their sales and customer satisfaction?
b) consider the impact of a supplier shutting down or raising prices. Could the logistics operator quickly find a replacement? How would this compare to TheRetail’s ability to switch suppliers in their current system?
These are just simple examples to showcase that by running thousands or even millions of scenarios any company can gain a better understanding of the potential risks and benefits associated with each strategy, while allowing for better informed decisions.
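Scenario (a) can be sketched as a small Monte Carlo simulation in pure Python. Everything numeric here is an assumption: daily demand is drawn from a normal distribution truncated at zero, disruption length is uniform over a range of days, and no replenishment arrives during the disruption. The function name and parameters are hypothetical.

```python
import random


def simulate_stockout_probability(
    inventory_units,
    daily_demand_mean,
    daily_demand_sd,
    disruption_days_range,
    n_runs=10_000,
    seed=42,
):
    """Share of simulated disruptions in which inventory runs out."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    stockouts = 0
    for _ in range(n_runs):
        disruption_days = rng.randint(*disruption_days_range)
        stock = inventory_units
        for _ in range(disruption_days):
            # Demand cannot be negative, so the draw is truncated at zero.
            stock -= max(0.0, rng.gauss(daily_demand_mean, daily_demand_sd))
            if stock < 0:
                stockouts += 1
                break
    return stockouts / n_runs


# Hypothetical comparison for TheRetail: a single logistics operator whose
# outages last 3 to 14 days versus direct suppliers with 1 to 5 day outages.
centralised = simulate_stockout_probability(1000, 100, 10, (3, 14))
direct = simulate_stockout_probability(1000, 100, 10, (1, 5))
print("stockout probability, single operator:", centralised)
print("stockout probability, direct suppliers:", direct)
```

Extending this to scenario (b) mostly means adding more random inputs, such as the time needed to find a replacement supplier or a price increase, and recording cost and lost sales alongside the stockout flag.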
Up to now we have discussed solutions, but we have not mentioned why there is still so much to do. Don’t worry, we’ll go over that next.
4. The Data Problem and the Future of Supply Chain
Thus far, we have outlined the three main challenges in supply chain management: inventory management, supply chain network optimisation, and risk management. These issues coexist and interplay, creating a complex landscape. The key to navigating this landscape and addressing these challenges lies in the realm of data. Specifically, we require systematic, clean data adhering to a unified taxonomy. As probably more than one angry logistics manager has stated, “The era of managing supply chains via Excel spreadsheets must end…”
4.1. Deciphering the Data Puzzle
It is no surprise then that a data-centric approach is indispensable for tackling these challenges. What this implies is that we need to account for a spectrum of data sources, encompassing but not restricted to:
- Historical sales data for demand forecasting
- Current inventory levels and their locations
- Supplier performance data
- Transportation costs and delivery times
- Market trends and customer behaviour data
- Geopolitical data and news updates for risk assessment
- Equipment usage data
Overcoming data acquisition and integration challenges can be done by implementing a robust data infrastructure that can handle large volumes of diverse data types. Moreover, partnering with data providers or enhancing internal data collection processes can address data scarcity issues.
Beyond mere data acquisition, we must also ensure our ability to interconnect this data in a coherent and logical manner. In other words, we need a clear database schema, outlining distinct entities and their relationships. Essentially, we need to realise End-to-End (E2E) Supply Chain Visibility and consolidate this wealth of information into a consistent database — a monumental challenge. However, there are individuals brave (or audacious) enough to confront this task and catalyse a new era in supply chain management: The era of End-to-End Supply Chain Visibility. But what does this concept refer to?
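As a small illustration of distinct entities and their relationships, the sketch below builds a toy schema with Python’s built-in sqlite3 module and answers one cross-entity question: how many days of cover each store holds per product. The table names, columns and sample rows are hypothetical, and a real E2E schema would be far larger.

```python
import sqlite3

SCHEMA = """
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, category TEXT);
CREATE TABLE store    (store_id INTEGER PRIMARY KEY, location TEXT NOT NULL);
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL, lead_time_days INTEGER);
CREATE TABLE sale (
    sale_date  TEXT NOT NULL,
    store_id   INTEGER NOT NULL REFERENCES store(store_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    units      INTEGER NOT NULL,
    PRIMARY KEY (sale_date, store_id, product_id)
);
CREATE TABLE inventory (
    store_id      INTEGER NOT NULL REFERENCES store(store_id),
    product_id    INTEGER NOT NULL REFERENCES product(product_id),
    supplier_id   INTEGER REFERENCES supplier(supplier_id),
    units_on_hand INTEGER NOT NULL,
    PRIMARY KEY (store_id, product_id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO product VALUES (1, 'candy', 'confectionery')")
    conn.execute("INSERT INTO store VALUES (1, 'city centre')")
    conn.execute("INSERT INTO supplier VALUES (1, 'Acme', 2)")
    conn.executemany(
        "INSERT INTO sale VALUES (?, 1, 1, ?)",
        [("2024-01-01", 20), ("2024-01-02", 30)],
    )
    conn.execute("INSERT INTO inventory VALUES (1, 1, 1, 100)")
    return conn


def days_of_cover(conn):
    """Units on hand divided by average daily sales, per store and product."""
    query = """
        SELECT i.store_id, i.product_id,
               i.units_on_hand * 1.0 / AVG(s.units) AS days_of_cover
        FROM inventory AS i
        JOIN sale AS s
          ON s.store_id = i.store_id AND s.product_id = i.product_id
        GROUP BY i.store_id, i.product_id, i.units_on_hand
    """
    return conn.execute(query).fetchall()


print(days_of_cover(build_demo_db()))  # -> [(1, 1, 4.0)]
```

The query only works because sales and inventory share the same store and product keys. That shared taxonomy is precisely what spreadsheet-based processes tend to lack.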
4.2. The Future — End-to-End Supply Chain Visibility
End-to-End (E2E) supply chain visibility is a transformative concept in the industry. It mandates a comprehensive, transparent view of all processes and data from the inception to the conclusion of the supply chain. Accomplishing this comprehensive visibility involves navigating multiple actors, intricate processes, and vast data points (Dolgui, A., & Ivanov, D., 2022).
Yet, by integrating Artificial Intelligence (AI) throughout the supply chain, we can exploit advanced analytics and machine learning capabilities. These AI-powered technologies interpret and contextualise the vast data generated by supply chain activities, offering invaluable insights and enhancing visibility across the entire chain. In doing so, they establish the groundwork for complete transparency.
As you can guess, AI not only enables end-to-end transparency but also flourishes in such an environment. With transparency in place, and access to all the relevant data, models operate more efficiently, delivering more accurate and valuable outputs. This synergistic dynamic ultimately fosters a more proactive approach to supply chain management. Once this full transparency is achieved, everything we talked about in the previous section becomes much easier to make a reality.
This overview provides a glimpse into the transformative potential of AI and data science technologies in the supply chain industry. By tackling long-standing challenges, these technologies promise to shape a future of enhanced efficiency, resilience, and responsiveness in supply chains. As we stand on the brink of this exciting frontier, we invite you, our readers and fellow pioneers, to continue exploring and embracing the convergence of data science and supply chain management.
Stay tuned for more insightful articles in this series as we endeavour together to improve the world through the democratisation of innovative solutions.
 Dolgui, A., & Ivanov, D. (2022). 5G in digital supply chain and operations management: Fostering flexibility, end-to-end connectivity and real-time visibility through internet-of-everything. International Journal of Production Research, 60(2), 442–451.
 Ganesh, A. D., & Kalpana, P. (2022). Future of artificial intelligence and its influence on supply chain risk management–A systematic review. Computers & Industrial Engineering, 108206.
 Kaur, A., Singh, G., Kukreja, V., Sharma, S., Singh, S., & Yoon, B. (2022). Adaptation of IoT with blockchain in Food Supply Chain Management: An analysis-based review in development, benefits and potential applications. Sensors, 22(21), 8174.
 Lim, B., Arık, S. Ö., Loeff, N., & Pfister, T. (2021). Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37(4), 1748–1764.
 Sharma, R., Shishodia, A., Gunasekaran, A., Min, H., & Munim, Z. H. (2022). The role of artificial intelligence in supply chain management: mapping the territory. International Journal of Production Research, 60(24), 7527–7550.