CN-122018968-A - Code change generation method based on multi-agent

Abstract

The invention discloses a multi-agent code change generation method for library migration scenarios. By structurally analyzing a source code file and combining source library import information with code feature identification, the method accurately locates, within the file, all code scopes related to the library migration, avoiding the omissions and misjudgments caused by manual analysis or simple rule matching and improving the completeness and accuracy of migration site identification. A migration plan is generated from the identified scopes to be migrated and the key information of the source and target libraries, so that the code change process is not limited to simple interface replacement: targeted adjustments can be made for each specific usage scenario, keeping the migrated code consistent in semantics and behavior. Through a collaborative code change generation and verification mechanism, code changes related to library migration are generated and evaluated automatically, which reduces manual intervention while improving the reliability and usability of the generated result.

Inventors

  • DENG SHUIGUANG
  • WANG YARONG
  • HAN JUNXIAO
  • QIN ZHEN

Assignees

  • Zhejiang University

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-05

Claims (10)

  1. A multi-agent code change generation method for library migration scenarios, characterized by comprising the following steps: (1) collecting a source code file to be migrated, together with the source library name and target library name corresponding to the source code file; (2) determining key identification information of the source library through a search tool, constructing an abstract syntax tree and a control flow structure for the source code file, and combining dependency propagation with text feature matching to obtain migration-related nodes; (3) inputting the scope information of the migration-related nodes and the source library name as a prompt to a positioning agent, which outputs structured sites to be migrated; (4) inputting the information of the sites to be migrated, the source library name and the source code file as a prompt to a planning agent, which calls the search tool to determine key API knowledge and generates a migration plan for every site to be migrated; (5) inputting the source code file, the source library name, the target library name, the sites to be migrated, the key API knowledge and the migration plan as a prompt to a change generation agent, which outputs the migration-related code changes; (6) inputting the source code file, the code changes and the scoring criteria as a prompt to a verification agent, which scores the code changes and determines the final code change.
  2. The multi-agent code change generation method for library migration scenarios of claim 1, wherein step (1) is specifically implemented as follows: 1.1 acquiring an existing benchmark dataset, in which each migration record comprises a code repository, commit information, a source library name and a target library name; 1.2 deleting the records whose file directory changed before and after migration; filtering the library migration data with long-tail characteristics, namely using the data programming paradigm to identify and filter the library migration data related to Django libraries; and processing the source code files before and after migration with the difflib tool to generate the corresponding code change file pairs.
  3. The multi-agent code change generation method for library migration scenarios of claim 1, wherein step (2) is specifically implemented as follows: 2.1 constructing a search request from the source library name and calling the API of the search tool to obtain search results, then calling the API of the DeepSeek model to summarize the search results on the basis of the search request and the search results, thereby generating the import name of the source library; 2.2 calling an AST tool to parse the pre-migration source code and construct an abstract syntax tree, dividing the source code into scopes by module, class and function on that basis, and identifying the call relationships among the scopes so as to construct a call dependency structure, wherein each node in the structure corresponds to one code scope in the source code, a code scope being a module scope, a function scope or a class scope; the information stored by each node comprises the identification information, start line number and end line number of the scope; the edges between nodes represent the call dependencies between code scopes; and each directed edge points from the caller to the callee; 2.3 based on the import name of the source library, traversing all variable access, attribute access and function call nodes in the abstract syntax tree and identifying the syntax nodes related to the source library; when a node contains a use of the source library, marking it as a migration-related node and propagating the migration-related mark along the call direction to the other nodes that have a call dependency with it; for the module scope, merging consecutive or adjacent code lines related to the source library, according to their line numbers, into module-level code fragments and adding these fragments to the call dependency structure as independent nodes; and finally outputting the start position and end position of each migration-related node together with the corresponding code scope.
  4. The multi-agent code change generation method for library migration scenarios of claim 1, wherein in step (3) the scope information of the migration-related nodes and the source library name are input as a prompt to the positioning agent, which is guided to output the structured sites to be migrated within each scope, each site to be migrated including the following information: the name of the code region containing the site, which is a function name, a class name or a module name; the start line number and end line number of that code region; and the code block of the site, i.e., its specific code content.
  5. The multi-agent code change generation method for library migration scenarios of claim 1, wherein step (4) is specifically implemented as follows: 4.1 using the information of the sites to be migrated and the source library name as a prompt, prompting the planning agent to output a list of APIs to be migrated, the list including how certain the agent is of the migration knowledge for each API; when the planning agent is certain how to migrate an API, it is deemed to already possess the knowledge for that API and the API is marked as not requiring a search; otherwise the API is marked as requiring a search; 4.2 for each API in the list marked as requiring a search, constructing a search request and calling the search tool to find out how that API is migrated to the target library, and keyword-filtering the knowledge content returned by the search to obtain the final key API knowledge list; 4.3 inputting the sites to be migrated, the list of APIs to be migrated, the key API knowledge list, the source library name, the target library name and the source code file to the planning agent, and guiding it to output a structured migration plan for each site to be migrated, the structured migration plan comprising the migration steps, the possible API mappings, the lines to be deleted, the content to be added and the confidence of the plan.
  6. The multi-agent code change generation method for library migration scenarios of claim 1, wherein in step (5) the change generation agent, for each site to be migrated given in the prompt and in combination with the content of the source code file, performs the migration according to the migration plan, and generates and outputs the migration-related code changes.
  7. The multi-agent code change generation method for library migration scenarios of claim 1, wherein step (6) is specifically implemented as follows: 6.1 inputting the code change, the source library name, the target library name and the multi-dimensional scoring criteria as a prompt to the verification agent, which analyzes the score of the code change in four dimensions according to the scoring criteria and takes the sum of the four dimension scores as the score of the code change for that round; 6.2 the full score being 10 points and the passing score being 6 points: when the score output by the verification agent for the current round is greater than or equal to 6, or the current number of verification rounds exceeds 3, verification exits and the currently generated code change is output; when the score output by the verification agent for the current round is less than 6 and the current number of verification rounds is less than 3, the verification agent outputs a verification report and returns it to the change generation agent, which regenerates a new code change according to the verification report and starts a new round of verification.
  8. The multi-agent code change generation method for library migration scenarios of claim 7, wherein the four dimensions of the scoring criteria comprise: correctness of the migration, worth 4 points, which measures whether the code change is functionally correct, for example whether a non-existent API or a non-existent parameter is introduced; completeness of the migration, worth 3 points, which measures whether the code change migrates everything, i.e., whether every use of the source library is removed and replaced by a use of the target library; a third dimension, worth 1 point, which measures whether the code change generated during migration follows the recommended usage of the target library and avoids deprecated or non-recommended methods; and correctness of the code change itself, worth 2 points, which measures whether the generated code change contains syntax errors or format errors.
  9. A computer device comprising a memory and a processor, wherein the memory stores a computer program and the processor is configured to execute the computer program to implement the multi-agent code change generation method for library migration scenarios according to any one of claims 1 to 8.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the multi-agent code change generation method for library migration scenarios according to any one of claims 1 to 8.
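
Step 1.2 of claim 2 names the difflib tool for producing code change file pairs but does not fix an output format. As a minimal sketch, assuming a unified diff is an acceptable representation, the pre- and post-migration versions of a file could be compared as follows. The helper name make_change_text, the file name client.py and the requests/httpx sample are illustrative assumptions, not taken from the patent.

```python
import difflib


def make_change_text(before: str, after: str, path: str) -> str:
    """Return a unified diff describing how `before` becomes `after`.

    Hypothetical helper: the patent only states that difflib is used.
    """
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Illustrative pre- and post-migration contents of one file.
before = "import requests\n\nresp = requests.get(url)\n"
after = "import httpx\n\nresp = httpx.get(url)\n"

change = make_change_text(before, after, "client.py")
print(change)
```

Identical inputs yield an empty string, which gives a simple way to discard records in which the file did not actually change.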
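
Steps 2.2 and 2.3 of claim 3 can be illustrated with Python's standard ast module. The sketch below is a simplification under stated assumptions: it collects the local names bound to the source library by import statements, then reports each function or class scope that references one of those names, with its start and end line numbers. It does not build the call dependency structure, does not propagate marks along call edges, and does not merge module-level lines into fragments. The function name find_migration_scopes and the sample code are hypothetical.

```python
import ast


def find_migration_scopes(source: str, import_name: str):
    """Return (kind, name, start_line, end_line) for scopes using the library."""
    tree = ast.parse(source)

    # Collect every local name that an import binds to the source library.
    aliases = set()
    prefix = import_name + "."
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for a in node.names:
                if a.name == import_name or a.name.startswith(prefix):
                    aliases.add(a.asname or a.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module == import_name or node.module.startswith(prefix):
                for a in node.names:
                    aliases.add(a.asname or a.name)

    def uses_library(scope):
        # Attribute access and calls such as rq.get(...) contain a Name node
        # for `rq`, so checking Name nodes covers all three access kinds.
        return any(
            isinstance(n, ast.Name) and n.id in aliases for n in ast.walk(scope)
        )

    scope_types = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    scopes = []
    for node in ast.walk(tree):
        if isinstance(node, scope_types) and uses_library(node):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            scopes.append((kind, node.name, node.lineno, node.end_lineno))
    return sorted(scopes, key=lambda s: s[2])


SAMPLE = '''import requests as rq

def fetch(url):
    return rq.get(url).text

def add(a, b):
    return a + b
'''

print(find_migration_scopes(SAMPLE, "requests"))
# [('function', 'fetch', 3, 4)]
```

Because the lookup goes through the import alias, the call rq.get(url) is found even though the literal text "requests" never appears inside fetch.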
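
The scoring and retry logic of claims 7 and 8 can be sketched as a small loop. The point weights (4, 3, 1, 2), the full score of 10 and the passing score of 6 come from the claims; the dictionary keys, the function names and the generate and verify callables (stand-ins for the change generation agent and the verification agent) are hypothetical. The claims word the round limit in two slightly different ways, so this sketch assumes at most three verification rounds.

```python
from typing import Callable, Dict, Optional, Tuple

# Maximum points per dimension as listed in claim 8 (they sum to 10).
# The key names are illustrative labels, not terms from the patent.
MAX_POINTS = {
    "migration_correctness": 4,
    "migration_completeness": 3,
    "recommended_usage": 1,
    "change_well_formedness": 2,
}
PASS_SCORE = 6   # passing score from claim 7
MAX_ROUNDS = 3   # assumption: at most three verification rounds


def total_score(scores: Dict[str, float]) -> float:
    """Sum the dimension scores, clamping each to its allowed range."""
    return sum(
        min(max(scores.get(name, 0), 0), cap) for name, cap in MAX_POINTS.items()
    )


def generate_with_verification(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Tuple[Dict[str, float], str]],
) -> Tuple[str, int]:
    """Regenerate until the change passes or the round limit is reached."""
    report: Optional[str] = None
    change = ""
    for round_no in range(1, MAX_ROUNDS + 1):
        change = generate(report)        # the first round has no report yet
        scores, report = verify(change)  # dimension scores plus a text report
        if total_score(scores) >= PASS_SCORE:
            return change, round_no
    return change, MAX_ROUNDS            # give up and keep the last attempt
```

In use, generate would wrap a call to the change generation agent that takes the previous verification report into account, and verify would wrap a call to the verification agent that returns the four dimension scores together with its report.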

Description

Code change generation method based on multi-agent

Technical Field

The invention belongs to the interdisciplinary field of artificial intelligence and software engineering, and particularly relates to a multi-agent code change generation method for library migration scenarios.

Background

In modern software development, third-party software libraries are widely used in all kinds of software systems to improve development efficiency, reuse mature functionality and reduce development cost. However, as the software ecosystem evolves rapidly, an existing library may gradually cease to meet system requirements because maintenance has stopped, performance is insufficient, interfaces have changed, or security risks have emerged, so the source library on which the source code depends has to be migrated to a target library with similar functionality or better performance. Library migration generally involves modifying a large amount of source code, including interface replacement, parameter adjustment and restructuring of call logic, so generating migration-related code changes efficiently and accurately has become an important problem in software maintenance and evolution. As code files grow more complex, library migration is no longer limited to simple interface replacement and often involves code adjustments across modules; completing it manually is time-consuming and laborious, and new defects are easily introduced by missed migration sites or misused interfaces. In addition, library migration usually requires developers to understand both the source library and the target library in depth, which further raises the cost and technical threshold of migration. Assisting library migration code modification by automated or intelligent means is therefore of considerable practical significance.
Existing library migration methods rely primarily on human experience or rule-based tools. Such tools typically replace specific interface calls in the code through predefined rules or pattern matching, e.g., by mapping function names in the source library directly to the corresponding function names in the target library. However, because different libraries often differ considerably in design philosophy, interface semantics and usage patterns, simple rule replacement can hardly cover complex migration scenarios, and when the code contains conditional logic, context dependence or multiple calling patterns, the applicability and accuracy of these methods are clearly limited. With the development of large language models, studies have attempted to generate or modify code with such models to assist tasks such as API (application programming interface) replacement or code refactoring. For example, the literature [M. Islam, A. K. Jha, S. Nadi and I. Akhmetov, "PyMigBench: A Benchmark for Python Library Migration," 2023 IEEE/ACM 20th International Conference on Mining Software Repositories, Melbourne, Australia, 2023, pp. 511-515] generates migrated code directly by providing source code fragments and target library information to the language model; this shows a degree of flexibility at the level of local code fragments and can handle interface replacements that are not simple mappings. However, this approach usually lacks a systematic analysis of the overall code file structure and its dependencies, and generation typically proceeds function by function or fragment by fragment, which makes it difficult to accurately identify all code locations associated with the library migration.
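The limitation of direct name mapping described above can be made concrete with a small sketch. It assumes a hypothetical rule-based tool holding a single textual rule; the requests-to-httpx pair is used purely as an illustration and is not taken from the patent.

```python
import re

# Hypothetical one-to-one rule of the kind a rule-based tool might hold.
RULES = {r"\brequests\.get\(": "httpx.get("}


def apply_rules(code: str) -> str:
    """Apply every textual replacement rule to the code."""
    for pattern, replacement in RULES.items():
        code = re.sub(pattern, replacement, code)
    return code


direct = "resp = requests.get(url)"
aliased = "import requests as rq\nresp = rq.get(url)"

print(apply_rules(direct))              # resp = httpx.get(url)
print(apply_rules(aliased) == aliased)  # True: the aliased call is missed
```

The rule rewrites the literal call but leaves the aliased call and the import statement untouched, which is the kind of omission that file-level structural analysis is meant to avoid.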
The difficulty of generating library migration code changes for a single file is that the abstract syntax structure, control flow and library dependency usage of the source code must be analyzed comprehensively at file level so that all code locations related to the library migration are located accurately. However, existing methods generally treat migration as a one-shot, whole-file generation task and do not process migration site identification, migration planning and code generation in separate stages, which makes it difficult to guarantee the consistency and reliability of the generated results. In addition, existing methods generally lack a verification mechanism for single-file migration results: a generated code change may be syntactically correct and still suffer from incomplete migration, incorrect interface calls or the use of deprecated methods, which harms the quality and maintainability of the migrated code. Effective technical means for systematically evaluating and screening the library migration code changes generated within a single source code file are therefore still lacking. Existing methods thus still have significant limitations in the single-file library migration scenario: on the one hand, code replacement methods based on rules or string matching lack understan