CN-122018887-A - Simulink code optimization method and device based on large language model

Abstract

The application provides a Simulink code optimization method and device based on a large language model, and relates to the technical field of code optimization. The method comprises: performing code generation on a target Simulink model and supplementing the generated code with semantic information to obtain code to be optimized, wherein the semantic information is used for identifying equivalent code segments; using a large language model to identify semantically equivalent code segments in the code to be optimized to obtain a plurality of candidate code segments, and generating a shared function corresponding to each of the candidate code segments; verifying the semantic equivalence between each shared function and its corresponding candidate code segment; and using the verified shared functions to perform code rewriting and compilation optimization to generate the final optimized code. By using the large language model to identify equivalent code segments and consolidating the identified segments into shared functions, the method and device reduce memory usage.

Inventors

  • YU ZEHONG
  • SU ZHUO
  • Pu Jinxiao
  • SHI DALONG
  • JIANG YU

Assignees

  • Tsinghua University (清华大学)
  • 中航国际金网(北京)科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-24

Claims (10)

  1. A Simulink code optimization method based on a large language model, characterized by comprising the following steps: performing code generation on a target Simulink model, and supplementing the generated code with semantic information to obtain code to be optimized, wherein the semantic information is used for identifying equivalent code segments; performing semantically equivalent code segment identification on the code to be optimized by using a large language model to obtain a plurality of candidate code segments, and generating a shared function corresponding to each candidate code segment in the plurality of candidate code segments; and verifying the semantic equivalence between each shared function and the corresponding candidate code segment, and performing code rewriting and compilation optimization by using the verified shared functions to generate final optimized code.
  2. The method according to claim 1, wherein supplementing the generated code with semantic information to obtain the code to be optimized comprises: replacing hard-coded constants in the generated code with variables based on a pre-constructed parameterized mapping table to obtain replaced code; and determining the input ports and output ports of each subsystem based on the model structure and connection relations of the target Simulink model, and adding comments to the replaced code based on the input ports and output ports of each subsystem to obtain the code to be optimized; wherein the comments in the code to be optimized are used to identify different code regions.
  3. The method according to claim 1, wherein the large language model comprises an equivalent extraction agent and a constraint checking agent; and performing semantically equivalent code segment identification on the code to be optimized by using the large language model to obtain a plurality of candidate code segments, and generating a shared function corresponding to each candidate code segment in the plurality of candidate code segments, comprises: constructing a first prompt for the equivalent extraction agent and a second prompt for the constraint checking agent based on first prompt information; based on the first prompt, using the equivalent extraction agent to identify candidate code segments in the code to be optimized, obtaining a plurality of code segments, and generating a shared function corresponding to each code segment; and based on the second prompt, using the constraint checking agent to verify the plurality of code segments and the shared function corresponding to each code segment, and repairing the shared function where an anomaly is found, so as to finally obtain the plurality of candidate code segments and the shared function corresponding to each candidate code segment; wherein the first prompt information comprises at least one of equivalent cases, subsystem semantics, formatting specifications, and counterexamples.
  4. The method according to claim 3, wherein using the constraint checking agent to verify the plurality of code segments and the shared function corresponding to each code segment, and repairing the shared function where an anomaly is found, comprises: compiling a target shared function with a compiler; if compilation succeeds, determining that the target shared function is verified; otherwise, constructing a third prompt based on second prompt information, and repairing the target shared function with the constraint checking agent based on the third prompt; wherein the target shared function is any one of the plurality of shared functions corresponding to the plurality of code segments, and the second prompt information comprises at least one of compilation error information, the target shared function, and counterexamples.
  5. The method according to claim 1, wherein verifying the semantic equivalence between each shared function and the corresponding candidate code segment comprises: generating a test function corresponding to each candidate code segment based on a data flow graph, wherein the data flow graph is used for representing the source ends and target ends of connections in the target Simulink model; and performing semantic equivalence verification on the test function and the shared function corresponding to each candidate code segment by using a satisfiability modulo theories (SMT) solver, and generating a verification result for each candidate code segment.
  6. The method according to claim 5, wherein performing code rewriting and compilation optimization by using the verified shared functions to generate the final optimized code comprises: where the verification result of a target candidate code segment indicates that the corresponding test function and shared function pass verification, rewriting the target candidate code segment with the shared function corresponding to the target candidate code segment to obtain a rewritten code segment; and compiling the target candidate code segment and the rewritten code segment separately with a compiler, and, if the compilation results indicate that the shared function corresponding to the target candidate code segment reduces the code size, generating the final optimized code based on the shared function corresponding to the target candidate code segment; wherein the target candidate code segment is any one of the plurality of candidate code segments.
  7. The method according to claim 5, wherein after generating the verification result for each candidate code segment, the method further comprises: where the verification result of a target candidate code segment indicates that the corresponding test function and shared function do not pass verification, sending failure feedback information to the equivalent extraction agent to regenerate the shared function, until the shared function generated by the equivalent extraction agent passes verification; wherein the failure feedback information comprises at least one of the target candidate code segment, the shared function corresponding to the target candidate code segment, and counterexample information output by the SMT solver.
  8. A Simulink code optimization apparatus based on a large language model, the apparatus comprising: a code generation module, configured to perform code generation on a target Simulink model and supplement the generated code with semantic information to obtain code to be optimized, wherein the semantic information is used for identifying equivalent code segments; an equivalent identification module, configured to perform semantically equivalent code segment identification on the code to be optimized by using a large language model to obtain a plurality of candidate code segments, and to generate a shared function corresponding to each candidate code segment in the plurality of candidate code segments; and a code optimization module, configured to verify the semantic equivalence between each shared function and the corresponding candidate code segment, and to perform code rewriting and compilation optimization by using the verified shared functions to generate final optimized code.
  9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the large language model based Simulink code optimization method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored thereon, wherein the computer program, when executed by a processor, implements the steps of the large language model based Simulink code optimization method according to any one of claims 1 to 7.
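
The verification and feedback steps of claims 5 to 7, and the size check of claim 6, can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the SMT solver is replaced by an exhaustive comparison over a small bounded integer domain, the `propose` callable is a hypothetical stand-in for the equivalent extraction agent, and all function names and code sizes are invented for the example.

```python
from itertools import product
from typing import Callable, Optional

Fn = Callable[[int, int], int]


def find_counterexample(test_fn: Fn, shared_fn: Fn,
                        lo: int = -8, hi: int = 8) -> Optional[tuple[int, int]]:
    """Return an input on which the two functions differ, or None.

    Stand-in for an SMT query: it only checks a small bounded domain,
    so it does not prove equivalence the way a solver would.
    """
    for a, b in product(range(lo, hi + 1), repeat=2):
        if test_fn(a, b) != shared_fn(a, b):
            return (a, b)
    return None


def verify_with_feedback(test_fn: Fn,
                         propose: Callable[[Optional[tuple[int, int]]], Fn],
                         max_rounds: int = 3) -> Optional[Fn]:
    """Request shared functions until one matches `test_fn` or rounds run out.

    `propose(feedback)` is hypothetical: feedback is None on the first call
    and the previous counterexample on later calls.
    """
    feedback = None
    for _ in range(max_rounds):
        candidate = propose(feedback)
        feedback = find_counterexample(test_fn, candidate)
        if feedback is None:
            return candidate
    return None


def reduces_code_size(original_size: int, rewritten_size: int) -> bool:
    """Accept a rewrite only if the compiled size actually shrinks."""
    return rewritten_size < original_size


def segment(a: int, b: int) -> int:
    """Test function modelling a candidate segment: saturating add."""
    return max(-5, min(5, a + b))


def make_agent():
    """Toy agent: first proposal forgets the lower bound, second is correct."""
    attempts = [lambda a, b: min(5, a + b),
                lambda a, b: max(-5, min(5, a + b))]
    calls = []

    def propose(feedback):
        calls.append(feedback)
        return attempts[len(calls) - 1]

    return propose, calls


propose, calls = make_agent()
accepted = verify_with_feedback(segment, propose)
print("feedback passed to the agent on each call:", calls)
```

In this sketch the counterexample is the only feedback item; the claims also allow the candidate segment and the rejected shared function to be sent back. The sizes passed to `reduces_code_size` would come from compiling the original and rewritten segments, which is not modelled here.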

Description

Simulink code optimization method and device based on large language model

Technical Field

The application relates to the technical field of code optimization, and in particular to a Simulink code optimization method and device based on a large language model.

Background

Simulink is a key tool for model-driven development (MDD), which has become a cornerstone of safety-critical system engineering in fields such as automotive, aerospace, and medical systems. Abstracting complex system behavior into high-level models improves development productivity, reliability, and maintainability. Automatic code generation is a core component of Simulink; it can significantly reduce manual effort and minimize potential coding errors. However, strict resource constraints, particularly in embedded scenarios, place high demands on the quality and size of the generated code, and the code generation schemes in the related art struggle to achieve efficient and accurate code optimization.

Disclosure of Invention

The application aims to provide a Simulink code optimization method and device based on a large language model, which use the large language model to identify equivalent code segments and consolidate the identified segments so as to reduce memory usage.
The application provides a Simulink code optimization method based on a large language model, comprising: performing code generation on a target Simulink model, and supplementing the generated code with semantic information to obtain code to be optimized, wherein the semantic information is used for identifying equivalent code segments; performing semantically equivalent code segment identification on the code to be optimized by using a large language model to obtain a plurality of candidate code segments, and generating a shared function corresponding to each candidate code segment in the plurality of candidate code segments; and verifying the semantic equivalence between each shared function and the corresponding candidate code segment, and performing code rewriting and compilation optimization by using the verified shared functions to generate final optimized code.
Optionally, supplementing the generated code with semantic information to obtain the code to be optimized comprises: replacing hard-coded constants in the generated code with variables based on a pre-constructed parameterized mapping table to obtain replaced code; and determining the input ports and output ports of each subsystem based on the model structure and connection relations of the target Simulink model, and adding comments to the replaced code based on the input ports and output ports of each subsystem to obtain the code to be optimized, wherein the comments in the code to be optimized are used for identifying different code regions.
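
The semantic supplementation step can be illustrated with a short sketch. It assumes the generated code is available as a C source string; the mapping table contents, the identifier names (`rtb_Out`, `rtU_In`, `Gain1_Gain`, `Saturation_UpperSat`), and the comment layout are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical parameterized mapping table: literal text -> parameter name.
PARAM_TABLE = {"0.5": "Gain1_Gain", "100.0": "Saturation_UpperSat"}

# Standalone decimal literals such as 0.5 or 100.0, not parts of identifiers.
LITERAL = re.compile(r"(?<![\w.])\d+\.\d+(?![\w.])")


def parameterize(code: str, table: dict[str, str]) -> str:
    """Replace hard-coded literals found in `table` with their variable names."""
    return LITERAL.sub(lambda m: table.get(m.group(0), m.group(0)), code)


def annotate(code: str, subsystem: str,
             inputs: list[str], outputs: list[str]) -> str:
    """Wrap a code region in comments naming the subsystem and its ports."""
    header = (f"/* BEGIN {subsystem} | in: {', '.join(inputs)}"
              f" | out: {', '.join(outputs)} */")
    footer = f"/* END {subsystem} */"
    return "\n".join([header, code, footer])


# Illustrative generated code for one subsystem (not real tool output).
generated = "rtb_Out = 0.5 * rtU_In;\nif (rtb_Out > 100.0) rtb_Out = 100.0;"

to_optimize = annotate(parameterize(generated, PARAM_TABLE),
                       "Subsystem1", ["rtU_In"], ["rtb_Out"])
print(to_optimize)
```

Replacing literals with names makes two segments that differ only in constant values look structurally alike, and the region comments give the later identification step explicit boundaries to work with.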
Optionally, the large language model comprises an equivalent extraction agent and a constraint checking agent, and performing semantically equivalent code segment identification on the code to be optimized by using the large language model to obtain a plurality of candidate code segments, and generating a shared function corresponding to each candidate code segment in the plurality of candidate code segments, comprises: constructing a first prompt for the equivalent extraction agent and a second prompt for the constraint checking agent based on first prompt information; based on the first prompt, using the equivalent extraction agent to identify candidate code segments in the code to be optimized, obtaining a plurality of code segments, and generating a shared function corresponding to each code segment; and based on the second prompt, using the constraint checking agent to verify the plurality of code segments and the shared function corresponding to each code segment, repairing the shared function where an anomaly is found, and finally obtaining the plurality of candidate code segments and the shared function corresponding to each candidate code segment; wherein the first prompt information comprises at least one of equivalent cases, subsystem semantics, formatting specifications, and counterexamples.
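
A rough sketch of how prompts might be assembled from the first prompt information, and of a check-and-repair loop for the constraint checking agent, is given below. Everything here is an assumption made for illustration: the prompt layout, the `PromptInfo` fields, and the toy `toy_compile` and `toy_repair` functions, which stand in for a real compiler invocation and a real language model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PromptInfo:
    """First prompt information; any subset of the four kinds may be present."""
    equivalent_cases: list[str] = field(default_factory=list)
    subsystem_semantics: list[str] = field(default_factory=list)
    formatting_specs: list[str] = field(default_factory=list)
    counterexamples: list[str] = field(default_factory=list)


def build_prompt(role: str, info: PromptInfo, code: str) -> str:
    """Assemble a prompt, skipping any kind of information that is empty."""
    sections = [
        ("Equivalent cases", info.equivalent_cases),
        ("Subsystem semantics", info.subsystem_semantics),
        ("Formatting specifications", info.formatting_specs),
        ("Counterexamples", info.counterexamples),
    ]
    parts = [role]
    for title, items in sections:
        if items:
            parts.append(title + ":\n" + "\n".join("- " + i for i in items))
    parts.append("Code:\n" + code)
    return "\n\n".join(parts)


def check_and_repair(shared_fn: str,
                     compile_fn: Callable[[str], str],
                     repair_fn: Callable[[str, str], str],
                     max_repairs: int = 3) -> Optional[str]:
    """Compile; on failure ask for a repair and retry, up to `max_repairs`."""
    errors = compile_fn(shared_fn)
    rounds = 0
    while errors and rounds < max_repairs:
        shared_fn = repair_fn(shared_fn, errors)
        errors = compile_fn(shared_fn)
        rounds += 1
    return None if errors else shared_fn


def toy_compile(src: str) -> str:
    """Toy compiler stand-in: only checks that braces are balanced."""
    return "" if src.count("{") == src.count("}") else "error: expected '}'"


def toy_repair(src: str, errors: str) -> str:
    """Toy agent stand-in: appends a closing brace."""
    return src + "}"


info = PromptInfo(equivalent_cases=["x * 2 and x + x"],
                  formatting_specs=["Return one C function per segment"])
extract_prompt = build_prompt(
    "Identify semantically equivalent code segments.", info, "y = x * 2;")

broken = "static int sat(int v) { return v > 5 ? 5 : v;"
fixed = check_and_repair(broken, toy_compile, toy_repair)
print(fixed)
```

Passing the compiler and the repair step in as callables keeps the loop independent of any particular toolchain or model interface, neither of which is specified in the text above.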
Optionally, using the constraint checking agent to verify the plurality of code segments and the shared function corresponding to each code segment, and repairing the shared function where an anomaly is found, comprises: compiling a target shared function with a compiler; if compilation succeeds, determining that the target shared function is verified; otherwise, constructing a third prompt based on second prompt information, and repairing the target shared function with the constraint checking agent based on the third prompt, wherein the target shared function is any one of the plurality of sh