US-20260127544-A1 - SYSTEMS AND METHODS FOR PROBABILISTIC ESTIMATION IN TREE-BASED FORECAST MODELS
Abstract
Systems and methods in which a historical data set is pre-processed once per trained machine-learning model; a value of an unknown sample is forecast while tracking a leaf path of the unknown sample; the leaf path of the unknown sample is limited to a subset of trees in each trained machine-learning model; a set of related historical samples is determined based on the leaf path of the unknown sample; and a set of quantiles is determined from the leaf path of the unknown sample. Inventory is loaded according to the set of quantiles.
Inventors
- Sebastien OUELLET
- Leila MOUSAPOUR
- Andrii STEPURA
Assignees
- KINAXIS INC.
Dates
- Publication Date: 2026-05-07
- Application Date: 2026-01-05
Claims (12)
- 1. A system comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the system to: pre-process, by the processor, a historical data set, the pre-processing executed once per trained machine-learning model; forecast, by the processor, a value of an unknown sample while tracking a leaf path of the unknown sample; limit, by the processor, the leaf path of the unknown sample to a subset of trees in each trained machine-learning model; and determine, by the processor, a set of related historical samples based on the leaf path of the unknown sample.
- 2. The system of claim 1, wherein when pre-processing the historical data set, the system is further configured to: train, by the processor, a tree-based machine learning model; make, by the processor, a forecast for the historical data set; track, by the processor, one or more leaf paths associated with the forecast for the historical data set; and create, by the processor, a group of similar samples in the historical data set at each branch of the tree-based machine learning model.
- 3. The system of claim 2, wherein when training the tree-based machine learning model, the system is further configured to: prepare, by the processor, the historical data set for training; define, by the processor, a target variable; select, by the processor, a set of features; and build, by the processor, a decision tree.
- 4. The system of claim 1, wherein when determining the set of related historical samples, the system is further configured to: determine, by the processor, a subset of frequently-repeating training samples from the historical data set on the leaf path of the unknown sample; estimate, by the processor, a probability density function for the leaf path of the unknown sample; generate, by the processor, a set of synthetic values from the probability density function; and compute, by the processor, a set of quantiles from the leaf path of the unknown sample, based on specified service levels and the set of synthetic values.
- 5. A non-transitory computer-readable storage medium, the computer-readable storage medium including instructions that, when executed by a computer, cause the computer to: pre-process, by a processor, a historical data set, the pre-processing executed once per trained machine-learning model; forecast, by the processor, a value of an unknown sample while tracking a leaf path of the unknown sample; limit, by the processor, the leaf path of the unknown sample to a subset of trees in each trained machine-learning model; and determine, by the processor, a set of related historical samples based on the leaf path of the unknown sample.
- 6. The computer-readable storage medium of claim 5, wherein when pre-processing the historical data set, the instructions, when executed by the computer, further cause the computer to: train, by the processor, a tree-based machine learning model; make, by the processor, a forecast for the historical data set; track, by the processor, one or more leaf paths associated with the forecast for the historical data set; and create, by the processor, a group of similar samples in the historical data set at each branch of the tree-based machine learning model.
- 7. The computer-readable storage medium of claim 6, wherein when training the tree-based machine learning model, the instructions, when executed by the computer, further cause the computer to: prepare, by the processor, the historical data set for training; define, by the processor, a target variable; select, by the processor, a set of features; and build, by the processor, a decision tree.
- 8. The computer-readable storage medium of claim 5, wherein when determining the set of related historical samples, the instructions, when executed by the computer, further cause the computer to: determine, by the processor, a subset of frequently-repeating training samples from the historical data set on the leaf path of the unknown sample; estimate, by the processor, a probability density function for the leaf path of the unknown sample; generate, by the processor, a set of synthetic values from the probability density function; and compute, by the processor, a set of quantiles from the leaf path of the unknown sample, based on specified service levels and the set of synthetic values.
- 9. A computer-implemented method, comprising: pre-processing, by a processor, a historical data set, the pre-processing executed once per trained machine-learning model; forecasting, by the processor, a value of an unknown sample while tracking a leaf path of the unknown sample; limiting, by the processor, the leaf path of the unknown sample to a subset of trees in each trained machine-learning model; and determining, by the processor, a set of related historical samples based on the leaf path of the unknown sample.
- 10. The computer-implemented method of claim 9, wherein pre-processing the historical data set comprises: training, by the processor, a tree-based machine learning model; making, by the processor, a forecast for the historical data set; tracking, by the processor, one or more leaf paths associated with the forecast for the historical data set; and creating, by the processor, a group of similar samples in the historical data set at each branch of the tree-based machine learning model.
- 11. The computer-implemented method of claim 10, wherein training the tree-based machine learning model comprises: preparing, by the processor, the historical data set for training; defining, by the processor, a target variable; selecting, by the processor, a set of features; and building, by the processor, a decision tree.
- 12. The computer-implemented method of claim 9, wherein determining the set of related historical samples comprises: determining, by the processor, a subset of frequently-repeating training samples from the historical data set on the leaf path of the unknown sample; estimating, by the processor, a probability density function for the leaf path of the unknown sample; generating, by the processor, a set of synthetic values from the probability density function; and computing, by the processor, a set of quantiles from the leaf path of the unknown sample, based on specified service levels and the set of synthetic values.
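The pre-processing recited in claims 2, 3, 6, 7, 10, and 11 (train a tree-based model, forecast over the historical data set, track leaf paths, and group similar historical samples) can be illustrated with a short sketch. The following is a minimal, hypothetical rendering, not the claimed implementation: it assumes scikit-learn's RandomForestRegressor, whose apply() method reports the leaf reached in each tree, and all names (preprocess, history_X, history_y, leaf_groups) are illustrative.

```python
# Minimal sketch (not the claimed implementation) of the one-time
# pre-processing step: train a tree-based model on historical data and
# record which historical target values fall into each leaf of each tree.
# Assumes scikit-learn; history_X, history_y, and leaf_groups are
# illustrative names.
from collections import defaultdict

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def preprocess(history_X: np.ndarray, history_y: np.ndarray):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(history_X, history_y)

    # apply() returns, for each sample, the index of the leaf it reaches
    # in each tree: shape (n_samples, n_trees).
    leaves = model.apply(history_X)

    # Group historical target values by (tree, leaf), so that an unknown
    # sample's leaf path can later be mapped back to the historical
    # samples that share its leaves.
    leaf_groups: dict[tuple[int, int], list[float]] = defaultdict(list)
    for sample_idx, sample_leaves in enumerate(leaves):
        for tree_idx, leaf_idx in enumerate(sample_leaves):
            leaf_groups[(tree_idx, leaf_idx)].append(float(history_y[sample_idx]))
    return model, leaf_groups
```

Because every leaf of a trained tree contains at least one training sample, this mapping covers any leaf path a future sample can reach; it is built once per trained model and reused for every subsequent forecast.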
Description
This application is a continuation of U.S. Ser. No. 18/180,418, filed on Mar. 8, 2023, which claims priority to both U.S. Ser. No. 63/317,600, filed Mar. 8, 2022, and U.S. Ser. No. 63/325,678, filed Mar. 31, 2022, the disclosure of each of which is hereby incorporated by reference in its respective entirety.

BACKGROUND

From a technical perspective, the conventional approach to forecasting a range (rather than a single point) in a machine learning model is to use traditional time series models, such as Autoregressive Integrated Moving Average (ARIMA), since these inherently produce prediction intervals. However, ARIMA models are not always accurate and can be computationally unstable. Other conventional approaches for forecasting a range of values include neural networks and Bayesian models; these have the drawback that they are expensive and harder to tune. In fact, all of the conventional approaches for forecasting a range of values require more data and more run-time, and are, overall, computationally resource-intensive.

This range of forecasts (also called a "probabilistic forecast") has practical applications in a variety of fields. One such field is supply chain planning. Supply chain planners need more information than point estimate forecasts to effectively plan for demand surges or declines, as well as information about the trade-off between increasing the service level at specific locations and the quantity required to achieve specific service levels. For example, the amount of inventory to hold can be calculated based on the variation of historical demand. Such variation can depend on many factors, which makes the variation, or range, both difficult and computationally intensive to forecast.

This extra information can be computationally intensive to produce. It should also be statistically consistent across all forecasts visible to the users: the point estimate forecast that is the best estimate of the demand should consistently be close to the 50th quantile of demand, and, for example, the 55th quantile should consistently be lower than the 60th quantile of demand. Greater accuracy in forecasting the variation of demand leads to more cost-efficient loading of inventory, which in turn reduces inventory costs. There is thus a need to provide a range of forecasts in machine learning in a manner that is computationally efficient.

BRIEF SUMMARY

While tree-based models are computationally efficient and fast, conventional uses of tree-based models do not forecast a range of values. That is, conventional use of tree-based models does not provide a probability of forecasts. The systems and methods disclosed herein, however, use tree-based models in a non-conventional approach that results in probabilistic forecasts. Furthermore, this approach is less data-intensive, requires less run-time, and is more computationally efficient than the Bayesian models, neural networks, and traditional time series models used for probabilistic forecasts.

The systems and methods disclosed herein relate to estimation of the probability of a forecasted quantity, and may be presented in two parts. One part relates to systems and methods of estimating the probability of a forecasted value. A second, optional use case afforded by the first part discloses systems and methods in which values used by a process are optimized. In both parts, there is no restriction as to the context of the values.
While there is no restriction as to the context of the quantities, non-limiting examples in the field of supply chain management are provided.

In one aspect, a computer-implemented method includes: pre-processing, by a processor, a historical data set, the pre-processing executed once per trained machine-learning model; forecasting, by the processor, a value of an unknown sample while tracking a leaf path of the unknown sample; limiting, by the processor, the leaf path of the unknown sample to a subset of trees in each trained machine-learning model; determining, by the processor, a set of related historical samples based on the leaf path of the unknown sample; and determining, by the processor, a set of quantiles from the leaf path of the unknown sample.

In the computer-implemented method, pre-processing the historical data set may also include: training, by the processor, a tree-based machine learning model; making, by the processor, a forecast for the historical data set; tracking, by the processor, one or more leaf paths associated with the forecast for the historical data set; and creating, by the processor, a group of similar samples in the historical data set at each branch of the tree-based machine learning model.

In the computer-implemented method, determining the set of related historical samples may also include: determining, by the processor, a subset of frequently-repeating training samples from the historical data set on the leaf path of the unknown sample; estimating, by the processor, a probability density function for the leaf path of the unknown sample; generating, by the processor, a set of synthetic values from the probability density function; and computing, by the processor, a set of quantiles from the leaf path of the unknown sample, based on specified service levels and the set of synthetic values.
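To make the remaining steps concrete, here is a companion sketch of forecasting for an unknown sample: track its leaf path, limit the path to a subset of trees, pool the related historical samples recorded by the pre-processing sketch above, estimate a probability density function, generate synthetic values, and compute quantiles at specified service levels. This is a hedged illustration assuming SciPy's gaussian_kde; the fixed tree subset, the Gaussian kernel density estimate, and all names (forecast_quantiles, tree_subset, n_synthetic) are assumptions, not details from the specification.

```python
# Hedged sketch of the forecasting and quantile steps, building on the
# preprocess() sketch above. The subset-of-trees size and the Gaussian
# KDE are illustrative choices, not the claimed implementation.
import numpy as np
from scipy.stats import gaussian_kde


def forecast_quantiles(model, leaf_groups, x_new: np.ndarray,
                       service_levels=(0.50, 0.90, 0.95),
                       tree_subset: int = 20, n_synthetic: int = 10_000):
    # Track the unknown sample's leaf path, limited to a subset of trees.
    leaf_path = model.apply(x_new.reshape(1, -1))[0][:tree_subset]

    # Related historical samples: target values that reached the same
    # leaves during pre-processing.
    related = np.concatenate([np.asarray(leaf_groups[(tree_idx, leaf_idx)])
                              for tree_idx, leaf_idx in enumerate(leaf_path)])

    # Estimate a probability density function over the related samples,
    # generate synthetic values from it, and read off the quantiles that
    # correspond to the specified service levels.
    pdf = gaussian_kde(related)
    synthetic = pdf.resample(n_synthetic).ravel()
    return {level: float(np.quantile(synthetic, level))
            for level in service_levels}
```

Because all service levels are read from one shared synthetic sample, the returned quantiles are monotone in the service level by construction, consistent with the requirement above that, for example, the 55th quantile of demand never exceed the 60th.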