CN-122019145-A - Dynamic load awareness-based resource scheduling optimization method and system


Abstract

The application discloses a resource scheduling optimization method and system based on dynamic load perception, belonging to the technical field of resource scheduling. The method comprises: when a consumer group processes real-time service data, collecting indexes from the consumer group and an AI model through a load perception agent to obtain a plurality of operation indexes; if the operation indexes meet a preset resource scheduling condition, allocating resources according to the operation indexes and the service priority of the real-time service data to construct a resource scheduling rule; calculating an optimal scheduling action through a dynamic scheduling engine based on the operation indexes and the resource scheduling rule; and updating the concurrency number of the consumer group and the configuration parameters of the AI model by using the optimal scheduling action. The application thereby addresses the low service processing efficiency and resource waste that arise in the prior art when resources cannot be dynamically allocated, so that the consumption capacity of the data processing cluster and the reasoning capacity of the AI model fail to meet actual requirements.

Inventors

WANG CHEN

Assignees

广州市申迪计算机系统有限公司

Dates

Publication Date: 20260512
Application Date: 20260105

Claims (10)

1. A resource scheduling optimization method based on dynamic load perception, characterized by comprising the following steps: when a consumer group processes real-time service data, collecting indexes from the consumer group and an AI model respectively through a load perception agent to obtain a plurality of operation indexes; if the operation indexes meet a preset resource scheduling condition, performing resource allocation according to the operation indexes and the service priority of the real-time service data, and constructing a resource scheduling rule; calculating an optimal scheduling action through a dynamic scheduling engine based on the operation indexes and the resource scheduling rule; and updating the concurrency number of the consumer group and the configuration parameters of the AI model by using the optimal scheduling action.
2. The resource scheduling optimization method based on dynamic load sensing according to claim 1, wherein collecting indexes from the consumer group and the AI model respectively through the load sensing agent to obtain a plurality of operation indexes is specifically: after each time the consumer group reads or writes real-time service data, calling the load sensing agent deployed on the consumer group to acquire the running state of the consumer group in real time, obtaining the partition Lag value, memory utilization rate and CPU utilization rate of the consumer group, wherein the partition Lag value measures the delay of the consumer group in processing data; acquiring the running state of the AI model in real time by using the load sensing agent to obtain the reasoning queue length of the AI model; and obtaining the operation indexes from the partition Lag value, the memory utilization rate, the CPU utilization rate and the reasoning queue length.
3. The resource scheduling optimization method based on dynamic load sensing according to claim 1, wherein the resource scheduling condition is specifically: among the operation indexes, setting a corresponding scheduling threshold for each of the partition Lag value of the consumer group, the CPU utilization rate, and the reasoning queue length of the AI model; when any operation index exceeds its corresponding scheduling threshold, the resource scheduling condition is met; and when no operation index exceeds its corresponding scheduling threshold, the current concurrency number of the consumer group and the configuration parameters of the AI model are maintained.
4. The resource scheduling optimization method based on dynamic load sensing according to claim 3, wherein performing resource allocation according to the operation indexes and the service priority of the real-time service data and constructing a resource scheduling rule is specifically: setting each operation index that exceeds its scheduling threshold as an abnormal index, and constructing a mapping relation between the index type of the abnormal index and the resource adjustment direction; determining the service priority of the real-time service data according to the service type of the real-time service data; and determining the allocation type of the resources according to the mapping relation, determining the allocation proportion of the resources according to the service priority, allocating resources to the consumer group according to the allocation type and the allocation proportion, and constructing the resource scheduling rule.
5. The resource scheduling optimization method based on dynamic load sensing according to claim 4, wherein constructing the mapping relation between the index type of the abnormal index and the resource adjustment direction is specifically: when the partition Lag value exceeds its corresponding scheduling threshold, constructing a mapping relation between the partition Lag value and the consumption capacity of the consumer group; when the CPU utilization rate exceeds its corresponding scheduling threshold, constructing a mapping relation between the CPU utilization rate and the consumer group load; and when the reasoning queue length exceeds its corresponding scheduling threshold, constructing a mapping relation between the reasoning queue length and the reasoning capacity of the AI model.
6. The resource scheduling optimization method based on dynamic load sensing according to claim 1, wherein calculating the optimal scheduling action through the dynamic scheduling engine based on the operation indexes and the resource scheduling rule is specifically: integrating all the operation indexes into a state vector, wherein the state vector describes the current load conditions of the consumer group and the AI model; constructing an initial action vector according to the current concurrency number of the consumer group and the configuration parameters of the AI model; and quantifying each optimization target of the resource scheduling rule into a composite reward function, taking optimization of the state vector as the target based on the composite reward function, performing interactive learning with the external environment through the dynamic scheduling engine to update the initial action vector, and outputting the optimal scheduling action after a preset optimization termination condition is reached.
7. The resource scheduling optimization method based on dynamic load sensing according to claim 1, wherein updating the concurrency number of the consumer group and the configuration parameters of the AI model by using the optimal scheduling action is specifically: according to the optimal scheduling action, adjusting the concurrency number of the consumer group through an interface of the data processing cluster, and updating the configuration parameters of the AI model through the model configuration center; according to the optimal scheduling action, increasing or decreasing the instance capacity of the consumer group and the AI model by modifying the CPU threshold of the HPA; and verifying the scheduling effect based on the update result obtained after the optimal scheduling action is used.
8. The resource scheduling optimization method based on dynamic load sensing according to claim 7, wherein verifying the scheduling effect based on the update result obtained after the optimal scheduling action is used specifically comprises: based on the update result, checking the index changes of the updated consumer group and AI model, and triggering the resource allocation update of the consumer group and the AI model again if the index changes do not meet the preset optimization requirement.
9. A resource scheduling optimization system based on dynamic load perception, characterized by comprising an operation index acquisition module, a scheduling rule construction module, a scheduling action calculation module and a resource configuration updating module; the operation index acquisition module is used for collecting indexes from the consumer group and the AI model respectively through the load sensing agent when the consumer group processes real-time service data, to obtain a plurality of operation indexes; the scheduling rule construction module is used for performing resource allocation according to the operation indexes and the service priority of the real-time service data if the operation indexes meet a preset resource scheduling condition, and constructing a resource scheduling rule; the scheduling action calculation module is used for calculating the optimal scheduling action through the dynamic scheduling engine based on the operation indexes and the resource scheduling rule; and the resource configuration updating module is used for updating the concurrency number of the consumer group and the configuration parameters of the AI model by using the optimal scheduling action.
10. A terminal device comprising a processor and a memory, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the resource scheduling optimization method based on dynamic load sensing according to any one of claims 1 to 8.
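
The threshold check of claim 3 and the rule construction of claims 4 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the threshold values, the priority-to-proportion table, and all identifiers (`OperationIndexes`, `abnormal_indexes`, `build_scheduling_rule`) are assumptions introduced only for illustration, since the claims give no numeric values or names.

```python
from dataclasses import dataclass

# Hypothetical scheduling thresholds; the claims do not state numeric values.
THRESHOLDS = {
    "partition_lag": 10_000,
    "cpu_utilization": 0.80,
    "reasoning_queue_length": 200,
}

# Index type -> resource adjustment direction, following the pairs in claim 5.
ADJUSTMENT_DIRECTION = {
    "partition_lag": "consumer_group_consumption_capacity",
    "cpu_utilization": "consumer_group_load",
    "reasoning_queue_length": "ai_model_reasoning_capacity",
}

# Hypothetical allocation proportion for each service priority.
PRIORITY_PROPORTION = {"high": 0.5, "medium": 0.3, "low": 0.2}


@dataclass
class OperationIndexes:
    """Operation indexes listed in claim 2."""
    partition_lag: int
    memory_utilization: float
    cpu_utilization: float
    reasoning_queue_length: int


def abnormal_indexes(indexes: OperationIndexes) -> list[str]:
    """Return the names of indexes that exceed their scheduling threshold."""
    return [
        name for name, limit in THRESHOLDS.items()
        if getattr(indexes, name) > limit
    ]


def build_scheduling_rule(indexes: OperationIndexes, service_priority: str):
    """Return a rule dict, or None when no threshold is exceeded."""
    abnormal = abnormal_indexes(indexes)
    if not abnormal:
        # Scheduling condition not met: keep the current concurrency number
        # and AI model configuration parameters.
        return None
    return {
        "allocation_types": [ADJUSTMENT_DIRECTION[name] for name in abnormal],
        "allocation_proportion": PRIORITY_PROPORTION[service_priority],
    }
```

For example, `build_scheduling_rule(OperationIndexes(50_000, 0.4, 0.5, 500), "high")` flags the partition Lag value and the reasoning queue length as abnormal, while a lightly loaded system yields `None` and no adjustment.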

Description

Dynamic load awareness-based resource scheduling optimization method and system

Technical Field

The application belongs to the technical field of resource scheduling, and particularly relates to a resource scheduling optimization method and system based on dynamic load sensing.

Background

With the deepening application of artificial intelligence technology, enterprises need to seamlessly integrate AI model reasoning services with real-time service data stream processing systems (such as Kafka-based data processing clusters) in order to process services in real time. In the typical workflow, the data processing cluster writes the real-time service data provided by the production end into a message queue, a consumer group of the data processing cluster extracts the data from the message queue and submits it to an AI model for reasoning, and the reasoning result is finally returned to a downstream system. The storage and data transmission capabilities of the data processing cluster thus allow the AI model to process various services in time.

However, this typical workflow uses a fixed consumer group concurrency number and a fixed AI reasoning batch size, neither of which can be adjusted dynamically as real-time traffic changes. When the consumer group submits a large amount of data to the AI model at once, the system cannot automatically increase the AI model's capacity to consume it, which leads to service processing delays and service unavailability; when the volume of submitted data falls, reasoning resources are wasted. In other words, the consumption capacity of the consumer group and the reasoning capacity of the AI model are mismatched with the actual load, so service processing efficiency is low and reasoning resources are wasted.
In addition, because the data processing cluster uses a fixed consumer group concurrency number, the consumer group cannot extract data from the message queue in time when the volume of data written to the queue grows rapidly. Data backlog and high data acquisition delay then appear in the message queue, and the real-time requirements of key services are difficult to guarantee.

Disclosure of Invention

The application provides a resource scheduling optimization method and system based on dynamic load perception, which can solve the problems of low service processing efficiency and resource waste caused in the prior art by the fact that resources cannot be dynamically allocated, so that the consumption capacity of the data processing cluster and the reasoning capacity of the AI model cannot meet actual requirements.

The first aspect of the application provides a resource scheduling optimization method based on dynamic load sensing, comprising the following steps: when a consumer group processes real-time service data, collecting indexes from the consumer group and an AI model respectively through a load perception agent to obtain a plurality of operation indexes; if the operation indexes meet a preset resource scheduling condition, performing resource allocation according to the operation indexes and the service priority of the real-time service data, and constructing a resource scheduling rule; calculating an optimal scheduling action through a dynamic scheduling engine based on the operation indexes and the resource scheduling rule; and updating the concurrency number of the consumer group and the configuration parameters of the AI model by using the optimal scheduling action.
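
The step of calculating an optimal scheduling action can be sketched in Python under strong simplifying assumptions. Claim 6 describes a composite reward function optimized through interactive learning with the external environment; the sketch below keeps the composite reward but replaces the learning procedure with an exhaustive search over a small candidate set, evaluated against a toy environment model. The weights, the toy dynamics in `simulate`, and all identifiers are hypothetical and do not come from the application.

```python
from itertools import product

# Hypothetical weights of the composite reward; not taken from the source.
W_LAG, W_QUEUE, W_COST = 1.0, 1.0, 0.1


def simulate(state, action):
    """Toy environment model (an assumption): predict the next state.

    state  = (partition_lag, reasoning_queue_length)
    action = (consumer_concurrency, reasoning_batch_size)
    """
    lag, queue = state
    concurrency, batch_size = action
    consumed = 100 * concurrency   # messages one step can pull from the queue
    inferred = 10 * batch_size     # requests the AI model finishes per step
    next_lag = max(0, lag - consumed)
    next_queue = max(0, queue + min(lag, consumed) - inferred)
    return next_lag, next_queue


def composite_reward(next_state, action):
    """Penalize backlog, reasoning queueing and resource cost together."""
    next_lag, next_queue = next_state
    concurrency, batch_size = action
    cost = concurrency + batch_size
    return -(W_LAG * next_lag + W_QUEUE * next_queue + W_COST * cost)


def best_action(state, concurrency_options, batch_options):
    """Exhaustive search used here as a stand-in for the learning engine."""
    candidates = product(concurrency_options, batch_options)
    return max(candidates, key=lambda a: composite_reward(simulate(state, a), a))
```

With a backlog of 300 messages, `best_action((300, 0), [1, 2, 4], [8, 16, 32])` selects the largest concurrency and batch size, whereas an idle state `(0, 0)` selects the smallest ones because only the resource cost term remains. This mirrors the intent of scaling up under load and releasing resources when load falls.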
This scheme addresses the prior-art situation in which both the concurrency number of the consumer group and the reasoning batch of the AI model are fixed, so that data backlog or resource waste readily occurs when service data suddenly increases or decreases. To this end, when the consumer group transmits service data, indexes are collected from the consumer group and the AI model to measure their consumption capacity and reasoning capacity, which preliminarily determines the current load condition and provides the basis for subsequent resource adjustment. The type of resource to be adjusted is then determined according to the resource scheduling condition, and the constructed resource scheduling rule gives the subsequent adjustment an accurate direction and degree, so that resource scheduling matches the actual load condition. Next, the optimal scheduling action is obtained through the dynamic scheduling engine, which keeps consumption resources and reasoning resources balanced, maintains the coordinated operation of the consumer group and the AI model, and avoids defects such as uneven resource allocation. And updating the resource allocation of the cons