CN-121998086-A - Multi-agent code generation method, system, storage medium and equipment


Abstract

The invention discloses a multi-agent code generation method, system, storage medium and equipment, belonging to the field of automated code generation. The method comprises: an initialization agent generating an initial meta-strategy through deep problem understanding, and constructing a meta-experience pool and a candidate strategy pool; iteratively optimizing the meta-strategy through multi-agent collaboration to obtain a final meta-strategy; converting the final meta-strategy into executable code and test cases and verifying them, outputting the final code if verification succeeds and entering the causal reflection stage if verification fails; and realizing closed-loop optimization of the meta-strategy through causal analysis, updating experience into the meta-experience pool and updating the strategy into the candidate strategy pool. The invention constructs a four-stage closed-loop evolution framework comprising meta-strategy initialization, meta-strategy evolution, meta-strategy implementation and meta-strategy causal reflection, enables strategies to evolve autonomously from elementary to advanced, and is particularly suited to automatically solving complex algorithmic problems and competition-level programming challenges.

Inventors

KUANG PING
LIU YUHANG
MA YANG

Assignees

University of Electronic Science and Technology of China (电子科技大学)

Dates

Publication Date: 20260508
Application Date: 20260119

Claims (10)

1. A multi-agent code generation method, comprising the steps of: S1, a meta-strategy initialization stage, in which an initialization agent generates an initial meta-strategy through deep problem understanding and constructs a meta-experience pool and a candidate strategy pool; S2, a meta-strategy evolution stage, in which the meta-strategy is iteratively optimized through multi-agent collaboration to obtain a final meta-strategy; S3, a meta-strategy implementation stage, in which the final meta-strategy is converted into executable code and test cases and verified, the final code being output if verification succeeds and step S4 being entered if verification fails; and S4, a meta-strategy causal reflection stage, in which closed-loop optimization of the meta-strategy is realized through causal analysis, experience is updated into the meta-experience pool, and the strategy is updated into the candidate strategy pool.
2. The multi-agent code generation method of claim 1, wherein the initial meta-strategy is represented by a triplet structure: [formula missing], whose components respectively represent a quality attribute weight vector, a set of defensive focus points, and an implementation framework; and the meta-experience pool and the candidate strategy pool both use a structured storage format.
3. The multi-agent code generation method according to claim 2, wherein the candidate strategy pool uses a dynamic ranking mechanism in which replacement is triggered when the pass rate of a new strategy is higher than the minimum pass rate in the pool, and the meta-strategy initialization stage uses a weighted BM25 algorithm to retrieve from the meta-experience pool.
4. The multi-agent code generation method according to claim 1, wherein the iterative optimization of the meta-strategy through multi-agent collaboration comprises: defining the strategy with the highest pass rate in the candidate strategy pool as the historical optimal meta-strategy; an evolution agent generating a base evolved meta-strategy from the historical optimal meta-strategy and the current meta-strategy; a modularization agent decomposing the base evolved meta-strategy into a plurality of independently optimizable sub-strategies; and a fusion agent merging the sub-strategies to generate the final meta-strategy.
5. The multi-agent code generation method of claim 4, wherein merging the plurality of sub-strategies to generate the final meta-strategy comprises: the fusion agent sequentially performing consistency alignment, conflict resolution, and verification of module independence and completeness on the plurality of sub-strategies.
6. The multi-agent code generation method of claim 1, wherein the test cases cover basic cases and strategy-specific cases, and are used to verify the correctness of the generated executable code.
7. The multi-agent code generation method of claim 1, wherein realizing closed-loop optimization of the meta-strategy through causal analysis comprises: an analysis agent diagnosing the causes of failure along three dimensions, namely missing defensive focus, sub-strategy failure and quality weight imbalance; and a reflection agent summarizing the causal relationship between strategy and performance along three dimensions, namely strategy performance evaluation, exposed weaknesses and improvement directions.
8. A multi-agent code generation system, comprising: a meta-strategy initialization module, configured to have an initialization agent generate an initial meta-strategy through deep problem understanding and to construct a meta-experience pool and a candidate strategy pool; a meta-strategy evolution module, configured to iteratively optimize the meta-strategy through multi-agent collaboration to obtain a final meta-strategy; a meta-strategy implementation module, configured to convert the final meta-strategy into executable code and test cases and verify them, outputting the final code if verification succeeds and passing control to the meta-strategy causal reflection module if verification fails; and a meta-strategy causal reflection module, configured to realize closed-loop optimization of the meta-strategy through causal analysis, update experience into the meta-experience pool and update the strategy into the candidate strategy pool.
9. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the multi-agent code generation method of any one of claims 1-7.
10. An electronic device comprising a memory and a processor, the memory having stored thereon computer instructions executable on the processor, wherein the processor, when executing the computer instructions, performs the multi-agent code generation method of any of claims 1-7.
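
The data structures and control flow recited in claims 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `MetaStrategy`, `CandidatePool`, `offer`, `best` and `solve` are hypothetical, the pool capacity and round limit are arbitrary, a pass rate of 1.0 is assumed to mean that verification succeeded, the loop is assumed to return to S2 after S4, and the four callables passed to `solve` stand in for the agents, whose internals the claims do not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class MetaStrategy:
    """Triplet of claim 2 (field names are hypothetical)."""
    quality_weights: dict      # quality attribute -> priority weight
    defense_focus: List[str]   # key risk areas, e.g. "boundary handling"
    framework: List[str]       # ordered implementation steps


@dataclass
class CandidatePool:
    """Pool ranked by pass rate, with the replacement rule of claim 3."""
    capacity: int = 3          # assumed size; the source does not give one
    entries: List[Tuple[float, MetaStrategy]] = field(default_factory=list)

    def offer(self, pass_rate: float, strategy: MetaStrategy) -> bool:
        """Add a strategy; when full, replace the weakest entry only if
        the new pass rate is strictly higher than the pool minimum."""
        if len(self.entries) >= self.capacity:
            worst = min(range(len(self.entries)),
                        key=lambda i: self.entries[i][0])
            if pass_rate <= self.entries[worst][0]:
                return False
            del self.entries[worst]
        self.entries.append((pass_rate, strategy))
        self.entries.sort(key=lambda e: e[0], reverse=True)  # best first
        return True

    def best(self) -> Optional[MetaStrategy]:
        """Historical optimal strategy, or None when the pool is empty."""
        return self.entries[0][1] if self.entries else None


def solve(problem: str, init: Callable, evolve: Callable,
          implement: Callable, reflect: Callable,
          max_rounds: int = 3) -> Optional[str]:
    """S1-S4 closed loop; the callables are placeholders for the agents."""
    pool, experience = CandidatePool(), []
    strategy = init(problem, experience)                      # S1: initialize
    for _ in range(max_rounds):
        strategy = evolve(pool.best() or strategy, strategy)  # S2: evolve
        code, pass_rate = implement(problem, strategy)        # S3: implement and verify
        if pass_rate == 1.0:
            return code                                       # verification succeeded
        experience.append(reflect(strategy, pass_rate))       # S4: reflect
        pool.offer(pass_rate, strategy)
    return None                                               # round limit reached
```

In this sketch a failed strategy still enters the candidate pool with its partial pass rate, so a later S2 round can start from the best strategy seen so far rather than from the most recent one.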

Description

Multi-agent code generation method, system, storage medium and equipment

Technical Field

The present invention relates to the field of automated code generation, and in particular to a multi-agent code generation method, system, storage medium and equipment.

Background

In the field of automated code generation, code generation systems based on large language models have shown great potential as large language model technology has developed rapidly. However, existing methods generally adopt a static, linear reasoning mode that focuses on solving a single task and neglects continuous optimization and iterative improvement at the strategy level. As a result, when faced with complex algorithmic problems, such systems often fall into repeated failure and cannot evolve their strategies through accumulated experience as human programmers do. The performance bottleneck is especially evident on competition-level programming problems, where self-improvement and adaptation at the strategy level are difficult to achieve.

Existing solutions have clear shortcomings in their strategy optimization mechanisms, chiefly underuse of failure experience and shallow strategy improvement. Conventional approaches typically treat each code generation as an independent event and lack a systematic mechanism for accumulating and transferring experience. When the generated code contains errors, the system can only retry or make local adjustments; it cannot reflect on and restructure its strategy in depth. This strategy rigidity severely limits the system's adaptability to new and complex problems, and is the root cause of the performance bottleneck in current technology.
In multi-agent collaboration scenarios in particular, interaction among agents often stays at the execution level and lacks deep fusion and optimization at the strategy level, so the generated code easily falls into local optima. A code generation method is therefore needed that supports continuous strategy evolution, deep use of failure experience, and multi-agent collaborative optimization, so as to break through the performance bottleneck and adaptability limits of the prior art.

Disclosure of Invention

The invention aims to solve the problems of strategy rigidity, insufficient use of experience and limited evolution capability in the prior art, and provides a multi-agent code generation method, system, storage medium and equipment. By treating the code generation strategy as a learnable object, it constructs a four-stage closed-loop evolution framework comprising meta-strategy initialization, meta-strategy evolution, meta-strategy implementation and meta-strategy causal reflection, enabling strategies to evolve autonomously from elementary to advanced.

The aim of the invention is achieved by the following technical solution:

In a first aspect, a multi-agent code generation method is provided which, as shown in Fig. 1, includes the following steps:

S1, a meta-strategy initialization stage, in which an initialization agent generates an initial meta-strategy through deep problem understanding and constructs a meta-experience pool and a candidate strategy pool;

S2, a meta-strategy evolution stage, in which the meta-strategy is iteratively optimized through multi-agent collaboration to obtain a final meta-strategy;

S3, a meta-strategy implementation stage, in which the final meta-strategy is converted into executable code and test cases and verified, the final code being output if verification succeeds and step S4 being entered if verification fails;

S4, a meta-strategy causal reflection stage, in which closed-loop optimization of the meta-strategy is realized through causal analysis, experience is updated into the meta-experience pool, and the strategy is updated into the candidate strategy pool.

As a preferred technical solution, the initial meta-strategy that the initialization agent generates through deep problem understanding is represented by a triplet structure: [formula missing]. The defensive focus set identifies key risk areas in the algorithm implementation (such as functional correctness, boundary handling and input validation) and satisfies [formula missing]. The quality attribute weight vector satisfies [formula missing] and [formula missing], and quantifies the priority of each quality attribute (e.g., performance, readability, robustness). The implementation framework is defined as an ordered sequence of steps [formula missing] and guides the code generation flow.

As a preferred technical solution, the meta-experience pool [symbol missing] constructed in step S1 uses a structured storage format for storing historical failure experience and improvement information, in the following format: [formula missing] wherein eac