CN-121722571-B - Multi-agent optimal distribution method of dynamic routing algorithm

CN121722571BCN 121722571 BCN121722571 BCN 121722571BCN-121722571-B

Abstract

The invention provides a multi-agent optimal distribution method of a dynamic routing algorithm, which relates to the technical field of artificial intelligence and comprises the steps of obtaining a task feature matrix, constructing a multi-agent capacity matrix, constructing a multi-objective optimization function comprising a specific matching degree, cost and performance achievement scale based on the task feature matrix and the multi-objective optimization function, solving by adopting an improved genetic algorithm to obtain an optimal distribution scheme, supporting user intervention and automatic verification and updating, monitoring a load in real time and dynamically migrating a task when overload is detected. The invention improves the resource allocation efficiency, enhances the robustness of the system and realizes the intelligent load balancing.

Inventors

SUN SHUMENG
Jia Songrui
ZHAO LEIZHEN

Assignees

北京亦庄智能城市研究院集团有限公司

Dates

Publication Date: 20260508
Application Date: 20260211

Claims (7)

1. The multi-agent optimal distribution method of the dynamic routing algorithm is characterized by comprising the following steps of: The method comprises the steps of obtaining a task, extracting a task feature matrix, constructing a multi-agent capability matrix, analyzing task description, obtaining task type identification, extracting a special dimension and a requirement intensity from a preset knowledge base based on the task type identification to serve as special requirements, extracting resource quota and execution time limit of the task to serve as cost constraints, obtaining service quality indexes corresponding to the task to serve as performance requirements, obtaining offline capability evaluation data and online task execution feedback data aiming at each multi-agent in the multi-agent cluster, performing time sequence weighted fusion on the offline capability evaluation data and the online task execution feedback data to generate dynamic capability scores to serve as capability attributes, obtaining historical resource consumption data and task completion time length data to calculate unit task overhead to serve as cost components of the capability attributes, collecting calculation resource occupation data and task queue data in real time to calculate current load values to serve as load states; The method comprises the steps of constructing a multi-objective optimization function based on a task feature matrix and a multi-agent physical energy moment matrix, solving the multi-objective optimization function by adopting an improved genetic algorithm which introduces a load state as an auxiliary judgment condition in a selection operation to obtain an optimal multi-agent, generating an allocation instruction, calculating a comprehensive fitness sequencing value of each individual when the selection operation is executed, extracting a load state of the allocated multi-agent to judge whether the load threshold is exceeded or not, reducing the selected probability for the individual exceeding the load threshold, calculating the selected probability for the individual not exceeding the load threshold based on the comprehensive fitness sequencing value, selecting the individual to enter a crossover operation according to the selected probability, executing the crossover operation for the selected individual to generate a child individual, and executing a mutation operation for the child individual; The method comprises the steps of obtaining a current iteration algebra calculation evolution stage coefficient, extracting the standard deviation of population load distribution of load states of multiple intelligent agents in a multiple intelligent agent cluster, dynamically calculating a load threshold value based on the evolution stage coefficient and the standard deviation of the population load distribution, extracting the load deviation degree of the current load state and the load threshold value of the load states of the distributed multiple intelligent agents in a multiple intelligent agent distribution scheme corresponding to each individual, calculating a gradient penalty coefficient based on the load deviation degree when the load deviation degree is larger than zero through a nonlinear mapping function, obtaining historical load data of each individual, calculating the load change rate of the current load state relative to the historical load data, calculating a trend prediction coefficient when the load change rate is larger than zero, and taking the product of the gradient penalty coefficient and the trend prediction coefficient as a comprehensive adjustment coefficient to reduce the selected probability of the individual; receiving intervention operation of a user aiming at the optimal multi-agent, checking the intervention operation, updating corresponding parameters of the multi-objective optimization function, and re-solving; and when the overload level is detected, transferring the task of the overload multi-agent to the candidate multi-agent with normal load.
2. The method of claim 1, wherein constructing a multi-objective optimization function based on the task feature matrix and the multi-agent capability torque matrix comprises: calculating the similarity of the capability attribute and the expertise requirement in each expertise dimension, and generating an expertise matching degree term by normalization weighting summation; calculating the deviation value of the cost component and the cost constraint and normalizing to generate a cost item; judging whether the capability attribute meets the performance requirement, assigning a performance reaching scale value to the met multi-agent, and calculating a penalty value based on a performance gap to the unsatisfied multi-agent to generate a performance reaching scale item; The method comprises the steps of calculating load balancing degree of a multi-agent cluster based on load states of multiple agents in the multi-agent cluster, extracting a task feature matrix to generate a task feature vector, calculating similarity of the task feature vector and a historical task feature vector, selecting weight coefficients corresponding to a preset number of historical tasks with highest similarity, carrying out weighted average to obtain initial weight coefficients, carrying out dynamic adjustment on the initial weight coefficients based on the load balancing degree, lifting the weight coefficients of cost items and reducing the weight coefficients of special length matching degree items when the load balancing degree is lower than a balancing threshold value, and applying the adjusted weight coefficients to the special length matching degree items, and carrying out weighted summation on the cost items and the performance reaching scale items to construct the multi-objective optimization function.
3. The method of claim 1, wherein the step of solving the multi-objective optimization function to obtain an optimal multi-agent using a modified genetic algorithm that introduces a load state as an auxiliary decision condition in the selection operation, the step of generating the allocation instructions comprises: Initializing a population, and generating a multi-agent distribution scheme aiming at each individual, wherein the multi-agent distribution scheme represents the mapping relation between tasks and multi-agents; calculating a multi-objective optimization function value of each individual as a basic fitness, extracting a load state of the distributed multi-agent to calculate a load balance degree, normalizing the load balance degree into a load fitness component, and carrying out weighted fusion on the basic fitness and the load fitness component to generate a comprehensive fitness; When the selection operation is executed, calculating the comprehensive fitness sequencing value of each individual, extracting the load state of the distributed multi-agent to judge whether the load state exceeds a load threshold, reducing the selected probability for the individual exceeding the load threshold, calculating the selected probability for the individual not exceeding the load threshold based on the comprehensive fitness sequencing value, and selecting the individual to enter the cross operation according to the selected probability; performing crossover operation on the selected individuals to generate offspring individuals, and performing mutation operation on the offspring individuals; And iteratively executing until the termination condition is reached, selecting a multi-agent distribution scheme corresponding to the individual with the highest comprehensive fitness as an optimal multi-agent distribution scheme, and generating a distribution instruction, wherein the distribution instruction comprises a target multi-agent identifier and task execution parameters.
4. The method of claim 1, wherein the step of receiving user intervention operations for the optimal multi-agent, verifying the intervention operations, updating corresponding parameters of the multi-objective optimization function, and re-solving comprises: Capturing the intervention operation of a user on the optimal multi-agent to generate an intervention instruction, extracting the appointed replacement multi-agent identification in the intervention instruction, judging whether the capability attribute of the multi-agent meets the expertise requirement and the performance requirement in the task feature matrix, and recording effective intervention when the capability attribute meets the expertise requirement and the performance requirement; Acquiring capability attributes of the multi-agent corresponding to the effective intervention, calculating the matching degree of the capability attributes and the specialized requirements as intervention matching degree, calculating weight adjustment quantity based on the difference value between the intervention matching degree and the specialized matching degree item in the multi-objective optimization function, and applying the weight adjustment quantity to weight coefficient updating of the multi-objective optimization function; and re-executing a solving process based on the updated multi-objective optimization function to obtain a new optimal multi-agent distribution scheme, calculating the difference degree of the new and old distribution schemes, and generating a scheme change prompt when the difference degree exceeds a change threshold.
5. The method of claim 1, wherein the steps of collecting load data for each multi-agent in real time, determining a load level based on the load data, and when an overload level is detected, migrating the task of the overloaded multi-agent to a candidate multi-agent with a normal load include: Load data of each multi-agent in the multi-agent cluster are collected in real time, wherein the load data comprise a calculation resource occupancy rate and a task execution queue length, a comprehensive load value is calculated based on the calculation resource occupancy rate and the task execution queue length, and the comprehensive load value is compared with a preset multi-stage load threshold to judge a load level; When the load level reaches the overload level, extracting a task currently executed by the overload multi-agent, acquiring a task feature matrix, screening multi-agent with normal load level from a multi-agent cluster as a candidate multi-agent set, calculating the matching degree of the capability attribute of each candidate multi-agent in the candidate multi-agent set and the special requirement in the task feature matrix, and selecting the candidate multi-agent with the highest matching degree as a target migration multi-agent; And generating a task migration instruction to migrate the task of the overload multi-agent to the target migration multi-agent, and updating the load states of the overload multi-agent and the target migration multi-agent and synchronizing the load states to the multi-agent capacity matrix.
6. An electronic device, comprising: A processor; A memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any of claims 1 to 5.
7. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1 to 5.

Description

Multi-agent optimal distribution method of dynamic routing algorithm Technical Field The invention relates to an artificial intelligence technology, in particular to a multi-agent optimal distribution method of a dynamic routing algorithm. Background In the traditional multi-agent task allocation research, a static allocation strategy is mostly adopted, and one-time matching is carried out according to the basic capability attribute of the agent and the task requirement. In the prior art, when the intelligent agent is selected, only the capability matching degree is often concerned, the real-time load state of the intelligent agent is ignored, so that certain intelligent agents with excellent performance in the system are in a high-load running state for a long time, and other intelligent agent resources are idle, so that the system resource allocation is unbalanced, and the overall efficiency is reduced. Disclosure of Invention The embodiment of the invention provides a multi-agent optimal distribution method of a dynamic routing algorithm, which can solve the problems in the prior art. In a first aspect of the embodiment of the present invention, a multi-agent optimal allocation method for a dynamic routing algorithm is provided, including: the method comprises the steps of acquiring a task, extracting a task feature matrix, constructing a multi-agent capacity matrix, wherein the task feature matrix comprises special requirements, cost constraint and performance requirements; Constructing a multi-objective optimization function based on the task feature matrix and the multi-agent energy moment matrix, wherein the multi-objective optimization function comprises a specific length matching degree item, a cost item and a performance reaching scale item; receiving the intervention operation of a user aiming at the optimal multi-agent, checking the intervention operation, and then updating the corresponding parameters of the multi-objective optimization function and re-solving; And when the overload level is detected, transferring the task of the overload multi-agent to the candidate multi-agent with normal load. The method comprises the steps of obtaining a task, extracting a task feature matrix, constructing a multi-agent capacity matrix, wherein the task feature matrix comprises special requirements, cost constraint and performance requirements, and the multi-agent capacity moment matrix comprises capacity attributes and load states and comprises the following steps: Analyzing task description to obtain task type identification, extracting expertise dimension and demand intensity from a preset knowledge base based on the task type identification as expertise demand, extracting resource quota and execution time limit of the task as cost constraint, and obtaining service quality index corresponding to the task as performance requirement; For each multi-agent in the multi-agent cluster, acquiring offline capability assessment data and online task execution feedback data, performing time sequence weighted fusion on the offline capability assessment data and the online task execution feedback data to generate dynamic capability scores as capability attributes, acquiring historical resource consumption data and task completion time length data calculation unit task overhead as cost components of the capability attributes, and acquiring calculation resource occupation data and task queue data in real time to calculate a current load value as a load state. The step of constructing a multi-objective optimization function based on the task feature matrix and the multi-body energy moment matrix comprises the following steps: calculating the similarity of the capability attribute and the expertise requirement in each expertise dimension, and generating an expertise matching degree term by normalization weighting summation; calculating the deviation value of the cost component and the cost constraint and normalizing to generate a cost item; judging whether the capability attribute meets the performance requirement, assigning a performance reaching scale value to the met multi-agent, and calculating a penalty value based on a performance gap to the unsatisfied multi-agent to generate a performance reaching scale item; The method comprises the steps of calculating load balancing degree of a multi-agent cluster based on load states of multiple agents in the multi-agent cluster, extracting a task feature matrix to generate a task feature vector, calculating similarity of the task feature vector and a historical task feature vector, selecting weight coefficients corresponding to a preset number of historical tasks with highest similarity, carrying out weighted average to obtain initial weight coefficients, carrying out dynamic adjustment on the initial weight coefficients based on the load balancing degree, lifting the weight coefficients of cost items and reducing the weight coefficients of special length matching degree ite