CN-116975552-B - Single-period end-to-end inventory control method considering alternatives and product


Abstract

The invention provides a single-period end-to-end inventory control method considering substitutability, and a related product, in the technical field of intelligent decision making. The method determines the observed features of the current period and inputs them into an end-to-end inventory decision model to obtain an inventory control strategy, which is expressed as a multidimensional vector giving the purchase quantity of each grade of resource. In the embodiments of the invention, the input of the end-to-end inventory decision model is the observable features and the output is the inventory control strategy. Compared with the traditional predict-then-optimize framework, the end-to-end inventory decision model is easier to deploy in an actual production environment, simpler to use, and more efficient in decision making; the model provided by the embodiments is also efficient to train and can learn the relevant inventory control strategies.
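The input/output interface described in the abstract can be illustrated with a minimal sketch: a small fully connected network that maps a feature vector to one nonnegative purchase quantity per grade. The layer sizes, the random initialization, the ReLU on the output layer, and the names `init_params` and `order_quantities` are illustrative assumptions and are not taken from the patent.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.5, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def order_quantities(params, x):
    """Map observed features x to a nonnegative purchase quantity per grade."""
    h = np.asarray(x, dtype=float)
    for W, b in params:
        # ReLU on every layer, including the output (an assumption),
        # so that purchase quantities can never be negative.
        h = np.maximum(W @ h + b, 0.0)
    return h

# Hypothetical sizes: 4 observed features, one hidden layer of 8 units, 3 grades.
params = init_params([4, 8, 3])
q = order_quantities(params, [0.2, 1.0, 0.5, 0.1])
print(q.shape)  # (3,): one quantity per resource grade
```

The point of the sketch is only the shape of the mapping: features go in and the purchase vector comes out directly, with no intermediate demand forecast.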

Inventors

ZHANG ZHIHAI
GONG HAILEI

Assignees

Tsinghua University (清华大学)

Dates

Publication Date: 20260505
Application Date: 20230518

Claims (9)

1. A single-period end-to-end inventory control method considering substitutability, the method comprising: determining the observed features of the current period and inputting the features into an end-to-end inventory decision model to obtain an inventory control strategy, wherein the inventory control strategy is expressed as a multidimensional vector representing the purchase quantity of each grade of resource; wherein the training samples of the end-to-end inventory decision model comprise sample features and the corresponding historical real demands, the optimal inventory control strategy is unknown during training, the training of the end-to-end inventory decision model is label-free learning, and the output of the end-to-end inventory decision model is an inventory control strategy that minimizes the empirical cost; the minimized empirical cost is expressed as: […]; wherein […] is the empirical cost of a neural network model […] on a given data set […], […] is the set of all neural network models, […] represents the features of the […]-th sample, and […] represents the historical real demand of the […]-th sample; the gradient is calculated by the chain rule: […]; wherein […], in which […] is the purchase cost per resource unit and […] is the optimal solution of the following optimization problem: […]; […]; […]; […]; […]; wherein […] represents the purchase quantity of the resource of grade […]; […] represents the selling price per unit of product; […] represents the service cost per resource unit of grade […]; […] represents the penalty cost per unit of unmet demand; and […] represents the processing cost per resource unit of grade […]; and the gradient is calculated according to the back-propagation formula of the neural network, […] = […], wherein […] is the output of the k-th neuron of hidden layer […] of the neural network.
2. The single-period end-to-end inventory control method considering substitutability of claim 1, wherein the input of the neural network model is […]; the input of the first hidden layer is calculated by […], and the output of the first hidden layer is […], wherein the function […](·) is the activation function; for hidden layer […], the input is […] and the output is […]; and the output of the neural network model is calculated by […].
3. The method of claim 2, wherein the training objective of the neural network model is to find a set of parameters […] that minimizes the empirical cost, wherein […] is set according to the number of neurons in the different layers; and the neural network model is trained by repeatedly using the training data set […] to adjust the weights.
4. The single-period end-to-end inventory control method considering substitutability of claim 3, wherein the gradient […] is calculated and the weights are updated by: […]; wherein […] is the learning rate.
5. The single-period end-to-end inventory control method considering substitutability of claim 2, wherein the neural network model adopts a five-layer network architecture, the numbers of neurons per layer being 11, 5, 3, and uses the ReLU activation function.
6. The method of claim 1, wherein determining the observed features of the current period comprises using the observed features of the current period together with those of a predetermined number of preceding historical periods as the observed features of the current period.
7. A single-period end-to-end inventory control device considering substitutability, the device comprising: a strategy generation module configured to determine the observed features of the current period and input the features into an end-to-end inventory decision model to obtain an inventory control strategy, wherein the inventory control strategy is expressed as a multidimensional vector representing the purchase quantity of each grade of resource; wherein the training samples of the end-to-end inventory decision model comprise sample features and the corresponding historical real demands, the optimal inventory control strategy is unknown during training, the training of the end-to-end inventory decision model is label-free learning, and the output of the end-to-end inventory decision model is an inventory control strategy that minimizes the empirical cost; the minimized empirical cost is expressed as: […]; wherein […] is the empirical cost of a neural network model […] on a given data set […], […] is the set of all neural network models, […] represents the features of the […]-th sample, and […] represents the historical real demand of the […]-th sample; the gradient is calculated by the chain rule: […]; wherein […], in which […] is the purchase cost per resource unit and […] is the optimal solution of the following optimization problem: […]; […]; […]; […]; […]; wherein […] represents the purchase quantity of the resource of grade […]; […] represents the selling price per unit of product; […] represents the service cost per resource unit of grade […]; […] represents the penalty cost per unit of unmet demand; and […] represents the processing cost per resource unit of grade […]; and the gradient is calculated according to the back-propagation formula of the neural network, […] = […], wherein […] is the output of the k-th neuron of hidden layer […] of the neural network.
8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the single-period end-to-end inventory control method considering substitutability of any one of claims 1-6.
9. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the single-period end-to-end inventory control method considering substitutability of any one of claims 1-6.
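The label-free training idea in claims 1 to 4 can be sketched under simplifying assumptions. The formulas of the claims are not reproduced in this text, so the cost model below is a hypothetical stand-in: a single scalar demand is served greedily from the grade with the lowest service cost upward, unmet demand pays a per-unit penalty, and the selling price and processing cost named in claim 1 are omitted. The function `cost_and_grad`, the linear policy `q = W @ x` used in place of the neural network, and all numbers are illustrative only.

```python
import numpy as np

def cost_and_grad(q, d, c, s, b):
    """Total cost of purchase vector q under realized demand d, with a subgradient in q.

    c: purchase cost per unit of each grade, s: service cost per unit of each grade,
    b: penalty per unit of unmet demand (simplified, assumed cost model).
    """
    q, c, s = (np.asarray(v, dtype=float) for v in (q, c, s))
    y = np.zeros_like(q)      # units of each grade used to serve demand
    remaining = float(d)
    mu = 0.0                  # marginal cost of serving one more unit of demand
    for j in np.argsort(s):   # cheapest service cost first
        if remaining <= 0.0 or s[j] >= b:
            break             # demand met, or serving costs at least the penalty
        y[j] = min(q[j], remaining)
        remaining -= y[j]
        mu = s[j]
    if remaining > 0.0:
        mu = b                # unmet demand: the marginal unit pays the penalty
    cost = c @ q + s @ y + b * remaining
    # One more unit of grade j costs c[j] and saves (mu - s[j]) when that is positive.
    grad = c - np.maximum(mu - s, 0.0)
    return cost, grad

c, s, b = np.array([4.0, 1.0]), np.array([1.0, 3.0]), 5.0
x, d = np.array([1.0, 2.0]), 7.0           # one sample: features and real demand
W = np.array([[1.0, 0.5], [1.0, 1.0]])     # linear policy, gives q = [2, 3]

cost, grad_q = cost_and_grad(W @ x, d, c, s, b)
grad_W = np.outer(grad_q, x)               # chain rule: dC/dW = (dC/dq)(dq/dW)
W_new = W - 0.1 * grad_W                   # gradient step, learning rate 0.1 (arbitrary)
new_cost, _ = cost_and_grad(W_new @ x, d, c, s, b)
print(cost, new_cost)
```

No optimal purchase vector is ever supplied as a label: the only training signal is the cost of the model's own output under the realized demand, which is what makes the training label-free in the sense of claim 1.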

Description

Single-period end-to-end inventory control method considering alternatives and product

Technical Field

The embodiments of the invention relate to the technical field of intelligent decision making, and in particular to a single-period end-to-end inventory control method considering substitutability, and a related product.

Background

In an inventory system with substitution (hereinafter "substitutable inventory"), demand can be satisfied by resources of different grades (types). The goal of the substitutable inventory control problem (hereinafter the "substitutable inventory problem") is to control the inventory of the different grades of resources so that it better matches demand, thereby minimizing the total cost of the system.

Substitutable inventory is widely used in production and service systems. Companies can flexibly use their various resources to meet random customer demand at different costs, thereby reducing costs and improving profits. For example, a remanufacturer may remanufacture used products of different grades to meet customer demand. Because delivery times are long and order time windows are short, the remanufacturer must purchase recycled products from recyclers before the actual demand is revealed. After receiving a customer's order, the remanufacturer remanufactures the recycled products. Recycled products are typically classified into grades that differ in both purchase price and remanufacturing cost: high-grade recycled products cost more to purchase but less to remanufacture, and the reverse holds for low-grade recycled products. How to trade off these two costs is critical to remanufacturing enterprises.

Cloud computing companies provide remote computing services to their customers. They must configure computing environments, such as high-performance computing platforms and specific software services, before customer demand is revealed.
At the same time, a customer's demand can be satisfied by different configurations; for example, a computing demand may be met by servers with different CPU models. Similar settings arise in airline cabin allocation, electric vehicle charging services, and so on.

Setting the inventory levels of the various resources appropriately is a difficult problem. Different resources have different procurement costs (incurred when the resources are purchased) and service costs (incurred when customer demand is met). When too many resources are configured, the purchase cost is high; when too few are configured, the unmet demand incurs a high backorder cost. When demand is relatively stable, it is appropriate to configure the resources with the lowest total cost (purchase cost plus service cost), because this minimizes cost. When demand is less stable, however, it is necessary to also configure some resources that are cheaper to purchase (and possibly more expensive to serve with), because such resources reduce the probability of stockouts when demand is high while keeping sunk costs low when demand is low.

Because demand is uncertain, most inventory management methods address this challenge with a framework called "predict-then-optimize". In this framework, the decision maker first trains a predictive model to estimate the random demand, and then solves the corresponding inventory optimization problem under the estimated demand to obtain the configuration decision. In the first step, estimating the parameters of the predictive model inevitably requires assuming that the unknown demand follows a particular distribution (e.g., a normal distribution) or functional form (e.g., a linear function). If the distribution learned by the predictive model matches the underlying true distribution, an optimal inventory control scheme can be obtained by solving the optimization problem.
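The two steps of the predict-then-optimize framework can be illustrated with a minimal single-resource sketch. It assumes normally distributed demand, as in the example in the text, and uses the classical newsvendor critical-ratio rule for the optimization step. The function name, the cost parameters, and the demand history are hypothetical, and the multi-grade substitution structure is deliberately left out.

```python
from statistics import NormalDist

def predict_then_optimize(demand_history, underage_cost, overage_cost):
    """Two-step baseline: fit a demand distribution, then optimize against it."""
    # Step 1 (predict): assume demand is normal and estimate its parameters.
    demand_model = NormalDist.from_samples(demand_history)
    # Step 2 (optimize): the newsvendor order quantity is the critical-ratio
    # quantile of the fitted distribution.
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    return demand_model.inv_cdf(critical_ratio)

history = [90, 110, 100, 95, 105]   # hypothetical past demand
q = predict_then_optimize(history, underage_cost=4.0, overage_cost=4.0)
print(q)  # with symmetric costs the order equals the fitted mean
```

The order quantity is only as good as the distributional assumption made in step 1: if real demand is not close to normal, the quantile computed in step 2 is optimal for the wrong distribution.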
However, in many application fields the distribution of demand is not known, and presupposing a form for the distribution raises two problems. (1) It requires strong expert knowledge. The demand for different products is affected by different factors: for example, the demand for second-hand mobile phones is affected by customer reviews, new phone releases, price, and similar factors, while cloud computing demand shows strong periodicity. Determining the form or distribution that the demand for a given product follows therefore requires rich industry experience. (2) It introduces model selection bias. Because demand is highly uncertain and unknowable, even an experienced expert cannot fully determine the correct form of demand. An improper demand form introduces model selection bias, which can be further amplified by the optimization step, resulting in sub-o