CN-122019388-A - Multi-agent collaborative industrial artificial intelligence algorithm evaluation system and method
Abstract
The invention discloses a multi-agent collaborative industrial artificial intelligence algorithm evaluation system and method, aiming to solve the problems of disordered evaluation flows, insufficient compliance assurance, and poor system extensibility in the prior art. The system adopts a six-level hierarchical architecture whose core is a guiding agent: the guiding agent performs a double check by monitoring the task state in the global data layer against national-standard rules, and only when the state is compliant does it schedule the specialized agents, such as data partitioning, script writing, algorithm parameter tuning, and report generation, in sequence. Each agent invokes MCP-wrapped tools through standardized JSON interactions. The invention makes the evaluation flow automated, ordered, and compliant with national standards, supports flexible extension through a loosely coupled design, and ensures the comparability and traceability of evaluation results.
Inventors
- WU ZHENYU
- WEI ZIXUAN
Assignees
- Beijing University of Posts and Telecommunications (北京邮电大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260203
Claims (10)
- 1. A multi-agent collaborative industrial artificial intelligence algorithm evaluation system, characterized in that it adopts a six-level hierarchical architecture comprising: a user layer for providing various operation interfaces and multi-role login functions; an interaction layer for parsing user instructions into system-executable instructions; a multi-agent core layer comprising a dispatching center and at least four functional agents; a tool set layer for packaging industrial-evaluation-specific tools based on the MCP framework; a data layer for constructing a global data sharing layer using an asynchronous database; and an infrastructure layer for implementing containerized deployment and resource scheduling; wherein the dispatching center of the multi-agent core layer is a guiding agent, and the at least four functional agents comprise a data partitioning agent, an evaluation script writing agent, an algorithm parameter tuning agent, and a report generating agent; the data layer is used for storing task-state and execution-result data written by each agent; and the guiding agent is configured to monitor the task state in the data layer in real time, check the current task state against preset industrial algorithm evaluation flow rules and national-standard compliance rules, recommend the matched next functional agent to the user when the state meets the flow rules and complies with the national standard, and, after the functional agent executes its task and updates the data layer, repeat the monitoring, checking, and recommending flow until the evaluation flow is finished.
- 2. The multi-agent collaborative industrial artificial intelligence algorithm evaluation system according to claim 1, wherein the preset industrial algorithm evaluation flow rules comprise sequentially executing data partitioning, evaluation script writing, algorithm parameter tuning, and report generation, and the national-standard compliance rules comprise at least one of a leak-free data partitioning rule, a national-standard evaluation index compliance rule, and a national-standard report format compliance rule.
- 3. The multi-agent collaborative industrial artificial intelligence algorithm evaluation system of claim 1, wherein the data partitioning agent is configured to perform leak-free partitioning of industrial data sets according to a dual strategy of "physical part" and "time series" partitioning, and to preprocess the data by missing-value padding and outlier rejection.
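The dual-strategy leak-free split of claim 3 can be sketched as follows. This is an illustrative assumption: the record schema (`part_id`, `timestamp` keys), function names, and default-fill preprocessing are not specified by the patent.

```python
# Illustrative sketch of claim 3's dual-strategy leak-free split.
# Schema and interface are assumptions for illustration only.

def split_by_part(records, test_ratio=0.3):
    """Strategy 1 ("physical part"): hold out whole parts so that no
    part contributes records to both the train and the test split."""
    parts = sorted({r["part_id"] for r in records})
    n_test = max(1, int(len(parts) * test_ratio))
    test_parts = set(parts[len(parts) - n_test:])
    train = [r for r in records if r["part_id"] not in test_parts]
    test = [r for r in records if r["part_id"] in test_parts]
    return train, test

def split_by_time(records, test_ratio=0.3):
    """Strategy 2 ("time series"): chronological cut, so every test
    record is no earlier than every train record (no future leakage)."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * (1 - test_ratio))
    return ordered[:cut], ordered[cut:]

def fill_missing(records, feature_keys, default=0.0):
    """Simple missing-value padding: replace None with a default value."""
    for r in records:
        for k in feature_keys:
            if r.get(k) is None:
                r[k] = default
    return records
```

Part-wise splitting prevents identity leakage (the model never sees the test part), while the chronological cut prevents temporal leakage; the claim applies both.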
- 4. The multi-agent collaborative industrial artificial intelligence algorithm evaluation system of claim 1, wherein the evaluation script writing agent is configured to generate, based on a national-standard index library, an executable evaluation script comprising preset national-standard index calculation logic; the algorithm parameter tuning agent is configured to invoke a hyper-parameter optimization tool in the tool set layer to iteratively optimize the hyper-parameters of a target industrial algorithm; and the report generating agent is configured to integrate full-flow data and generate an evaluation report conforming to the national-standard specification.
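The script-writing step of claim 4 can be sketched as assembling an executable script from an index library. The two example indices (accuracy, mean absolute error) and the template are illustrative stand-ins; the actual GB/T index definitions are not reproduced here.

```python
# Hypothetical sketch of claim 4's script-writing agent: concatenate
# preset index-calculation functions into one directly executable script.
# The index library contents are illustrative, not the GB/T definitions.

INDEX_LIBRARY = {
    "accuracy": (
        "def accuracy(y_true, y_pred):\n"
        "    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)\n"
    ),
    "mae": (
        "def mae(y_true, y_pred):\n"
        "    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)\n"
    ),
}

def write_evaluation_script(index_names):
    """Assemble the requested index functions into one script string."""
    return "\n".join(INDEX_LIBRARY[name] for name in index_names)

# The generated script is executable as-is:
script = write_evaluation_script(["accuracy", "mae"])
namespace = {}
exec(script, namespace)
```

Generating plain executable code (rather than calling fixed functions) is what lets the script agent stay decoupled from the index library, matching the loose-coupling goal of the abstract.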
- 5. The multi-agent collaborative industrial artificial intelligence algorithm evaluation system according to claim 1, wherein the industrial-evaluation-specific tools packaged in the tool set layer comprise at least an Optuna-based hyper-parameter optimization tool, a national-standard index calculation tool, and a leak-free data partitioning tool.
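As a dependency-free stand-in for the claim's Optuna-based tool, the hyper-parameter optimization step can be sketched as random search over declared parameter ranges; the range format and toy objective are illustrative assumptions, not the patent's interface.

```python
import random

# Minimal, dependency-free stand-in for claim 5's hyper-parameter
# optimization tool: random search that minimizes an objective over
# uniform parameter ranges. Interface is an illustrative assumption.

def optimize(objective, param_ranges, n_trials=50, seed=0):
    """Return the best (params, value) found over n_trials samples."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi)
                  for name, (lo, hi) in param_ranges.items()}
        value = objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value

# Toy objective with its minimum at lr = 0.1, reg = 0.01.
best, score = optimize(
    lambda p: (p["lr"] - 0.1) ** 2 + (p["reg"] - 0.01) ** 2,
    {"lr": (0.0, 1.0), "reg": (0.0, 0.1)},
    n_trials=200,
)
```

With Optuna itself, the same search would instead create a study via `optuna.create_study(direction="minimize")` and sample parameters with `trial.suggest_float` inside the objective.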
- 6. The multi-agent collaborative industrial artificial intelligence algorithm evaluation system according to claim 1, wherein the data stored in the data layer adopts the JSON format, and the data structure defines special fields required for industrial algorithm evaluation, including at least one of a data set type field, a national-standard index identification field, a leak-free partitioning identification field, and a parameter range field.
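A possible shape of such a claim-6 record, with the four special fields, is sketched below. The concrete field names (`dataset_type`, `gb_index_id`, `leak_free`, `param_range`) are assumptions for illustration; the claim only requires that fields of these kinds exist.

```python
import json

# Illustrative task-state record for the global data layer of claim 6.
# Field names are assumed; only the four field *kinds* come from the claim.

REQUIRED_FIELDS = {
    "dataset_type",   # data set type field
    "gb_index_id",    # national-standard index identification field
    "leak_free",      # leak-free partitioning identification field
    "param_range",    # parameter range field
}

record = {
    "task_id": "eval-001",
    "state": "data_partition_done",
    "dataset_type": "vibration_timeseries",
    "gb_index_id": "GB/T 43555-2023",
    "leak_free": True,
    "param_range": {"lr": [1e-4, 1e-1]},
}

def validate(rec):
    """Reject records missing any evaluation-specific field."""
    missing = REQUIRED_FIELDS - rec.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return rec

# What an agent would write back to the shared data layer:
payload = json.dumps(validate(record))
```

Validating before the JSON write keeps every agent's output uniform, which is what makes the guiding agent's rule checks against the shared data layer possible.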
- 7. A multi-agent collaborative industrial artificial intelligence algorithm evaluation method, applied to the system of any one of claims 1-6 and executed by the guiding agent, comprising the steps of: S1, monitoring the current evaluation task state in the global data layer; S2, checking the current task state against the preset industrial algorithm evaluation flow rules and national-standard compliance rules; S3, if the check passes, matching and recommending the next functional agent to be executed according to the current state; and S4, returning to step S1 after the task state in the data layer is updated.
- 8. The multi-agent collaborative industrial artificial intelligence algorithm evaluation method according to claim 7, wherein in step S3, matching the next functional agent to be executed according to the current state specifically comprises: if the current state is that the data has been uploaded and its format is national-standard compliant, matching the data partitioning agent; if the current state is that data partitioning is completed and the leak-free national-standard requirement is met, matching the evaluation script writing agent; if the current state is that the evaluation script has been generated and its index logic is national-standard compliant, matching the algorithm parameter tuning agent; and if the current state is that algorithm parameter tuning is completed and performance meets the standard, matching the report generating agent.
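The S1-S4 loop of claim 7 together with the state-to-agent matching table of claim 8 can be sketched as a small dispatch loop. The state names and the data-layer/agent interfaces are illustrative assumptions.

```python
# Sketch of the guiding agent: claim 8's matching table drives claim 7's
# S1-S4 loop. State names and interfaces are illustrative assumptions.

FLOW_RULES = {
    "data_uploaded_gb_compliant":      "data_partition_agent",
    "partition_done_leak_free":        "script_writing_agent",
    "script_generated_index_compliant": "param_tuning_agent",
    "tuning_done_performance_ok":      "report_generation_agent",
}

def recommend_next(task_state):
    """S2 + S3: check the state against the rules; None means the
    check failed or the evaluation flow has finished."""
    return FLOW_RULES.get(task_state)

def run_evaluation(data_layer, agents):
    """S1-S4: monitor the shared data layer and schedule agents in
    sequence until no flow rule matches the current state."""
    while True:
        state = data_layer["state"]          # S1: monitor current state
        agent_name = recommend_next(state)   # S2/S3: check and match
        if agent_name is None:
            break                            # flow finished or non-compliant
        # S4: the matched agent runs and writes the new state back
        data_layer["state"] = agents[agent_name](data_layer)
```

Here `agents` maps agent names to callables that perform the task and return the updated state, mirroring how each functional agent writes its result back to the global data layer.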
- 9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the multi-agent collaborative industrial artificial intelligence algorithm evaluation method of claim 7 or 8 when executing the program.
- 10. A computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the multi-agent collaborative industrial artificial intelligence algorithm evaluation method of claim 7 or 8.
Description
Multi-agent collaborative industrial artificial intelligence algorithm evaluation system and method

Technical Field

The invention relates to the technical field of industrial artificial intelligence, and in particular to a multi-agent collaborative industrial artificial intelligence algorithm evaluation system and method, which are suitable for standardized evaluation scenarios of industrial artificial intelligence algorithms such as state monitoring, fault diagnosis, and life prediction.

Background

With the evolution from Industry 4.0 to Intelligent Manufacturing 2.0, predictive maintenance (PdM) has become a core support for cost reduction and efficiency improvement in manufacturing enterprises, and algorithm evaluation is a key link in the large-scale deployment of PdM technology. The release of the national GB/T 43555-2023 standard and the international IEC 63270-1:2025 standard explicitly requires that algorithm evaluation cover the whole flow of data acquisition, feature extraction, model construction, and decision optimization, and satisfy quantitative indexes such as fault-prediction confidence level and maintenance response time. In industrial practice, however, the traditional evaluation mode still suffers from pain points such as complicated flows, inconsistent environments, and absent standardization, so an automated, standardized, low-threshold evaluation solution is needed. In introducing, iteratively optimizing, and applying predictive maintenance algorithms at scale, industrial enterprises face common pain points such as fragmented evaluation flows, high technical thresholds, incomparable results, and large cost investment, and traditional black-box blind evaluation and tool-splicing evaluation can hardly meet engineering deployment requirements.
Four core technical problems currently exist in the field of industrial predictive maintenance algorithm evaluation: 1) The traditional evaluation flow has a low degree of automation and depends on manually chaining together a plurality of independent tools (such as a data preprocessing tool, a script writing tool, and a parameter tuning tool); multiple links such as data partitioning and script writing must be completed manually, a single algorithm evaluation cycle lasts 4-8 days, and efficiency is extremely low. 2) The consistency of the evaluation environment is poor: tool version conflicts cause 18%-23% deviation in results; a unified standard is lacking, so results from different platforms cannot be compared horizontally; and the intermediate states, decision bases, and generated data of the evaluation process are scattered across different tools or files, with no unified global data management and traceability mechanism, which hinders problem investigation and evaluation reproduction. 3) The technical threshold is high: enterprises invest more than 800,000 yuan per year, which small and medium-sized enterprises can hardly bear; there is no quantitative evaluation mechanism for agent-generated results; and the traditional evaluation system architecture is tightly coupled, so core code often has to be modified when evaluation scenarios, indexes, or tools are added, making development and maintenance costs high and rapid response to the diversified evaluation demands of industrial sites difficult.
4) Existing systems do not implement national-standard adaptation and their reports are insufficiently compliant, failing to meet the requirements of specific national standards (such as GB/T 43555-2023) or international standards (such as IEC 63270-1:2025). Existing general evaluation systems and standalone tools lack a built-in national-standard rule base and compliance checking mechanism, so the compliance of the evaluation process and result reports depends heavily on manual expert auditing, which is inefficient and error-prone. Algorithm test indexes are an important basis for checking the performance of each algorithm: according to whether the algorithm passes the test indexes, an enterprise can immediately obtain the algorithm's performance indexes, stability, and generalization capability, and continuously optimize the algorithm model and adjust its internal parameters to improve reliability and stability. In predictive maintenance algorithm testing tasks, state monitoring algorithms, fault diagnosis algorithms, and prediction algorithms are mainly tested, and different algorithms prescribe various test indexes so as to enrich the test di