CN-121979505-A - Code generation method based on knowledge topology alignment and dynamic complexity reduction
Abstract
The invention discloses a code generation method based on knowledge topology alignment and dynamic complexity reduction. The method comprises: performing semantic analysis on user demands, constructing a demand dependency graph, and calculating demand complexity; using the demand dependency graph as an index to search an expert code library and extracting a standard control semantic topology, while parsing the ST code generated by a large model into a generated code control topology; performing three-party topology alignment verification on the three to obtain an alignment score; constructing a complexity-alignment correlation threshold auditing flow based on the demand complexity, the code complexity, and the alignment score; constructing a reward function from the alignment score and the model prediction variance and fine-tuning the large model with group relative policy optimization (GRPO); and selecting the candidate code that passes the audit and has the highest comprehensive acceptance rate as the final output. The invention enables quantitative evaluation of generated code quality and complexity-adaptive adjustment, and improves the reliability and engineering applicability of ST code in industrial control scenarios.
Inventors
- REN JINGJUN
- ZHANG DONGPING
- LIU XUAN
- YU JIABIN
- SUN LONG
Assignees
- China Jiliang University (中国计量大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20260407
Claims (9)
- 1. A code generation method based on knowledge topology alignment and dynamic complexity reduction, comprising the following steps: step 1, performing semantic analysis on user demands, extracting control intents, constructing a demand dependency graph RDG, and quantitatively analyzing the demand dependency graph to calculate demand complexity, wherein the user demands are derived from structured text ST code writing demands in the field of industrial automation control; step 2, using the demand dependency graph as an index, retrieving matching high-quality code from an expert code library, and abstracting the control logic of the code to form a standard control semantic topology SCST; parsing the ST code generated by a large model into a generated code control topology GCT, and performing three-party topology alignment verification on the RDG, the SCST and the GCT to obtain an alignment score; step 3, constructing a complexity-alignment correlation threshold auditing flow based on the demand complexity, the GCT complexity and the alignment score, determining acceptance thresholds for complexity and alignment, constructing a reward function from the alignment score and the model prediction variance, and fine-tuning the large model by group relative policy optimization GRPO; step 4, selecting, within the same user demand group, the candidate code that passes the audit and has the highest comprehensive acceptance rate as the final output result.
- 2. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 1, wherein constructing the demand dependency graph RDG in step 1 further comprises: step 1.1, after receiving user demand input, performing semantic analysis on the user demand based on a large model and segmenting the demand text into tokens; step 1.2, extracting key control features from the tokens obtained by segmentation to characterize the core control intent in the user demand; step 1.3, constructing the demand dependency graph RDG from the extracted key control features, wherein the demand dependency graph is a graph structure describing the control intent, its nodes are divided into entity nodes, action nodes and constraint nodes, and its edges represent the logical relations among the nodes; step 1.4, performing structural analysis on the demand dependency graph, calculating the structural features of the graph, and outputting a quantized index, the demand complexity, based on those features.
- 3. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 1, wherein the three-party topology alignment check in step 2 further comprises: step 2.1, using the demand dependency graph RDG constructed in step 1 as an index, performing semantic retrieval in a high-quality PLC database, wherein the high-quality PLC database comprises expert code verified by engineering practice, and obtaining the high-quality code closest to the control intent of the current user demand by calculating the similarity between the demand dependency graph and the control semantics of the expert code; step 2.2, abstracting the control logic of the high-quality code and extracting its control behavior pattern to form a standard control semantic topology SCST; step 2.3, generating a plurality of groups of initial ST code based on the user demand by using a large model for ST code generation, checking whether each compiles, filtering out code that fails to compile, performing lightweight parsing on the initial ST code that passes the compilation check according to a preset input format, and extracting only its control structure relations and key logic links to form a generated code control topology GCT.
- 4. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 1, wherein the three-party topology alignment check in step 2 further comprises: step 2.4, performing a standardization alignment check between the standard control semantic topology SCST and the generated code control topology GCT, wherein the standardization alignment check detects whether the control structure of the generated code conforms to the standard control paradigm extracted from high-quality expert code; step 2.5, performing a coverage alignment check between the demand dependency graph RDG and the generated code control topology GCT, wherein the coverage alignment check detects whether the generated code completely covers the key control elements and constraint relations in the user demand; step 2.6, based on the results of the three-party topology alignment verification, namely the standardization alignment and the coverage alignment, scoring the generated code comprehensively according to preset weights, and calculating and outputting the corresponding alignment score.
- 5. The method according to claim 4, wherein in step 2.4 the standardization alignment check evaluates structural conformity by calculating the graph edit distance GED between the generated code control topology GCT and its corresponding standard control semantic topology SCST, the graph edit distance being the number of edit operations required to transform the GCT into the SCST; the check also verifies whether the generated code control topology contains the core subgraph of its corresponding standard control semantic topology, and if the corresponding core subgraph is not found, or a path within the core subgraph is broken, a logic error is determined directly and a feedback signal is generated.
- 6. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 4, wherein in step 2.5 the coverage alignment comprises the following steps: step 2.5.1, mapping each node of the demand dependency graph RDG and of the generated code control topology GCT into a vector by using a graph neural network, so that the two are comparable in the same vector space; step 2.5.2, calculating the cosine similarity between each node in the demand dependency graph and each node in the generated code control topology; step 2.5.3, searching for optimal matching pairs based on a bipartite graph matching algorithm, wherein an optimal matching pair consists of the vector mapped from a key node in the RDG and the vector mapped from the node in the GCT that has the maximum cosine similarity with it, and calculating a demand coverage score based on the optimal matching pairs; step 2.5.4, checking whether the generated code control topology contains nodes corresponding to the key nodes in the demand dependency graph, and if no corresponding node can be found, that is, if the cosine similarity is smaller than a preset threshold, directly determining that logic is missing and generating a feedback signal.
- 7. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 1, wherein constructing the complexity-alignment correlation threshold auditing flow in step 3 further comprises: step 3.1, constructing triplet data from the demand complexity, the code complexity and the alignment score, building a probabilistic mapping model by Gaussian process regression, and outputting the predicted mean and the uncertainty variance of the alignment score; step 3.2, fixing the demand complexity of the current task, extracting the Pareto front of code complexity versus alignment score, identifying the elbow point, and defining the Pareto tail region; step 3.3, constructing a discriminant function that introduces a complexity regularization term, comprehensively auditing the generated code, and screening the candidate code that passes the audit; step 3.4, performing reinforcement learning fine-tuning on the large model for ST code generation by group relative policy optimization, constructing a reward function based on the alignment score of the generated code, the Pareto tail penalty term and the uncertainty variance of the prediction model, and calculating the reward value for all compilable code.
- 8. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 7, wherein the comprehensive acceptance conditions of the discriminant function in step 3.3 comprise: the predicted alignment score reaches the qualification standard; the code complexity does not fall into the Pareto tail region; and the uncertainty variance of the prediction model is within a preset safety threshold.
- 9. The code generation method based on knowledge topology alignment and dynamic complexity reduction of claim 7, wherein performing reinforcement learning fine-tuning on the large model by group relative policy optimization in step 3.4 comprises: constructing a reward function based on the alignment score, the Pareto tail penalty term, the uncertainty variance and a compilation reward; normalizing the reward values of the code generated within the same demand group and calculating the relative advantage within the group; and updating the policy gradient based on the relative advantage, guiding the model to generate ST code of moderate complexity and stable quality.
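The coverage alignment of claim 6 (steps 2.5.2 to 2.5.4) can be illustrated with a minimal Python sketch. It is not the patented implementation: the node embeddings are assumed to come from a graph neural network upstream, the names `cosine` and `coverage_alignment`, the 0.8 threshold, and the use of the mean matched similarity as the coverage score are all hypothetical choices, and the optimal bipartite matching is found by brute force over permutations, which is only practical for very small graphs.

```python
import itertools
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def coverage_alignment(rdg_vecs, gct_vecs, threshold=0.8):
    """Match each key RDG node to a distinct GCT node, maximising total similarity.

    rdg_vecs, gct_vecs: dict mapping node name -> embedding vector
    (assumed to be produced by a GNN; not computed here).
    Returns (coverage_score, matching, missing_nodes).
    """
    rdg = list(rdg_vecs)
    if not rdg:
        return 1.0, {}, []
    # Pad the GCT side with None so every RDG node gets a slot,
    # even when the GCT has fewer nodes than the RDG.
    gct = list(gct_vecs) + [None] * max(0, len(rdg) - len(gct_vecs))
    sim = {
        (r, g): cosine(rdg_vecs[r], gct_vecs[g]) if g is not None else 0.0
        for r in rdg
        for g in gct
    }
    # Brute-force optimal assignment (stand-in for a bipartite matching algorithm).
    best = max(
        itertools.permutations(gct, len(rdg)),
        key=lambda perm: sum(sim[(r, g)] for r, g in zip(rdg, perm)),
    )
    matching = dict(zip(rdg, best))
    # Step 2.5.4: a key node whose best match is below the threshold counts as missing logic.
    missing = [r for r, g in matching.items() if sim[(r, g)] < threshold]
    score = sum(sim[(r, g)] for r, g in matching.items()) / len(rdg)
    return score, matching, missing


if __name__ == "__main__":
    rdg = {"motor": [1.0, 0.0], "limit": [0.0, 1.0]}
    gct = {"Motor_Run": [0.9, 0.1], "Pressure_Limit": [0.1, 0.9], "Timer": [0.7, 0.7]}
    print(coverage_alignment(rdg, gct))
```

A non-empty `missing` list plays the role of the feedback signal in step 2.5.4; for larger graphs a polynomial-time assignment algorithm would replace the permutation search.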
Description
Code generation method based on knowledge topology alignment and dynamic complexity reduction
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a code generation method based on knowledge topology alignment and dynamic complexity reduction.
Background
Structured Text (ST) is one of the main programming languages for programmable logic controllers (PLCs) defined in the IEC 61131-3 standard. Compared with graphical programming languages such as ladder diagrams, ST is text-based, offers stronger logic expression capability and good extensibility, and can conveniently describe complex control logic, conditional judgments and mathematical operations. It is therefore suited to industrial control programs with complex functions and many logic layers, and is widely used in industrial automation control systems.
As the functional complexity of industrial control systems keeps increasing, ST code has become harder to write and maintain in practical engineering. In the prior art, ST code is usually designed, written and debugged manually; the development process places high demands on the experience and professional ability of engineers, and code reusability and maintainability are insufficient. Meanwhile, the means of inspecting and evaluating ST code quality are relatively limited, relying mainly on manual inspection or rule-based static analysis. It is difficult to evaluate code quality systematically in terms of overall structural complexity, logical soundness and the like, so potential logic defects may go undetected, affecting the stability and safety of the industrial control system.
In recent years, with the development of large language models (LLMs) in natural language processing and code generation, the prior art has attempted to use large language models for automatic generation or assisted programming of ST code in order to reduce development costs and improve programming efficiency. However, because high-quality ST code samples in the industrial field are limited in number, and programming specifications and engineering constraints differ greatly between application scenarios, existing large language models still suffer from unstable generation quality, redundant code structure and excessive complexity when generating ST code. In addition, the lack of effective quality evaluation and screening mechanisms for generated code makes it difficult to judge the generated results quantitatively and to form effective feedback, which limits the practical effect of large language models in PLC programming.
Disclosure of Invention
The prior art lacks a unified quantitative standard for evaluating ST code quality, and ST code generated by large language models suffers from unstable quality and uncontrolled complexity. To solve these problems, to achieve quantitative evaluation and complexity-adaptive adjustment of generated ST code quality, and to improve the reliability, maintainability and engineering applicability of generated ST code in industrial control scenarios, the invention adopts the following technical scheme:
A code generation method based on knowledge topology alignment and dynamic complexity reduction comprises the following steps:
Step 1, performing semantic analysis on user demands, extracting control intents, constructing a demand dependency graph RDG, and quantitatively analyzing the demand dependency graph to calculate demand complexity;
Step 2, using the demand dependency graph as an index, retrieving matching high-quality code from an expert code library, and abstracting the control logic of the code to form a standard control semantic topology SCST; parsing the ST code generated by a large model into a generated code control topology GCT, and performing three-party topology alignment verification on the RDG, the SCST and the GCT to obtain an alignment score;
Step 3, constructing a complexity-alignment correlation threshold auditing flow based on the demand complexity, the GCT complexity and the alignment score, determining acceptance thresholds for complexity and alignment, constructing a reward function from the alignment score and the model prediction variance, and fine-tuning the large model by group relative policy optimization GRPO;
Step 4, selecting, within the same user demand group, the candidate code that passes the audit and has the highest comprehensive acceptance rate as the final output result.
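Steps 3 and 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the method's actual formulas: the `Candidate` fields, the reward weights, the audit thresholds, and the use of the reward value as a stand-in for the "comprehensive acceptance rate" are all hypothetical, and `alignment` is taken to be the alignment score predicted for the candidate. The group-relative advantage follows the usual GRPO form, which standardizes each reward against the mean and standard deviation of its own demand group.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Candidate:
    code_id: str
    compiles: bool
    alignment: float       # (predicted) alignment score, assumed to lie in [0, 1]
    in_pareto_tail: bool   # True if code complexity falls in the Pareto tail region
    variance: float        # uncertainty variance of the prediction model


def reward(c, w_align=1.0, w_tail=0.5, w_var=0.5, compile_bonus=0.2):
    """Reward from alignment, tail penalty, variance and compilation (weights are assumptions)."""
    if not c.compiles:
        return 0.0  # assumption: code that fails to compile earns no reward
    return (compile_bonus + w_align * c.alignment
            - w_tail * float(c.in_pareto_tail) - w_var * c.variance)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one non-empty demand group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def passes_audit(c, min_alignment=0.8, max_variance=0.1):
    """Acceptance conditions: qualified alignment, outside the tail, bounded variance."""
    return (c.compiles and c.alignment >= min_alignment
            and not c.in_pareto_tail and c.variance <= max_variance)


def select_output(group):
    """Step 4: best audited candidate in the group, or None if nothing passes."""
    accepted = [c for c in group if passes_audit(c)]
    return max(accepted, key=reward) if accepted else None


if __name__ == "__main__":
    group = [
        Candidate("a", True, 0.90, False, 0.02),
        Candidate("b", True, 0.95, True, 0.02),
        Candidate("c", True, 0.60, False, 0.30),
    ]
    rewards = [reward(c) for c in group]
    print(group_relative_advantages(rewards))
    print(select_output(group))
```

In this sketch the advantages would feed the policy-gradient update during fine-tuning, while `select_output` is applied at inference time to the candidates generated for a single user demand.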
Further, constructing the demand dependency graph RDG in step 1 further includes: step 1.1, after receiving user demand input, performing semantic analysis on the user demand based on a large model and segmenting the demand text into tokens; step 1.2, extracting key control characteristics