CN-121979507-A - Method and related device for generating code based on a multi-semantic-domain domain-specific language
Abstract
Embodiments of the invention relate to the technical field of artificial intelligence and disclose a method and related device for generating code based on a multi-semantic-domain domain-specific language. The method comprises: receiving a formal specification file describing a financial system, wherein the specification file is written in the multi-semantic-domain domain-specific language and the semantic domains comprise at least an entity domain, a transaction domain, a strategy domain and an attribute domain; running a specification processing engine to parse the specification file, extract system-level attributes, and generate a structured code generation instruction through a multi-layer prompt template; calling an artificial intelligence code generation model to generate source code; abstracting the behavior of the source code into a finite state model; exhaustively verifying the finite state model against the system-level attributes; and outputting the source code as qualified source code after the verification passes. The method can improve the reliability, security and compliance of code generation in the financial field.
Inventors
- LI MINGJUN
- YE XIAOJUN
Assignees
- 国信证券股份有限公司 (Guosen Securities Co., Ltd.)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-04-09
Claims (10)
- 1. A method of generating code based on a multi-semantic-domain domain-specific language, the method comprising: receiving a formal specification file describing a financial system, wherein the specification file is written in a multi-semantic-domain domain-specific language, the semantic domains comprise at least an entity domain, a transaction domain, a strategy domain and an attribute domain, the entity domain comprises entity declarations, each defining a data model of the system, state variables, and invariants that always hold on the data model, and the transaction domain comprises transaction declarations, each defining a business operation of the system and specifying a precondition and a postcondition for that business operation; running a preset specification processing engine on the specification file to parse the specification file, extract system-level attributes from the parsed specification file, and generate a structured code generation instruction through a multi-layer prompt template; abstracting the behavior of the source code into a finite state model, exhaustively verifying the finite state model against the system-level attributes, and outputting the source code as qualified source code after the verification passes.
- 2. The method of claim 1, wherein the semantic domains further comprise an integration domain, an interaction domain, and an organization domain; the integration domain defines interface contracts, service calls and communication protocols between the financial system and external systems; the interaction domain defines user interface components, user-executable operations, and multi-step manual approval workflows; and the organization domain defines user roles, the operation permissions of the user roles, and separation-of-duties rules within the financial system.
- 3. The method of claim 1, wherein parsing the specification file and extracting system-level attributes from the parsed specification file comprises: performing lexical analysis and syntax analysis on the specification file to generate an abstract syntax tree, and performing multi-pass semantic analysis on the abstract syntax tree to obtain a semantically annotated abstract syntax tree; and converting the semantically annotated abstract syntax tree into a syntax-independent intermediate representation, and extracting the system-level attributes from the intermediate representation.
- 4. The method of claim 3, wherein the multi-pass semantic analysis comprises: traversing the abstract syntax tree, identifying declaration statements, recording the declared symbols and their basic information in a hierarchical symbol table, checking types and references against the symbol table, and performing consistency and logic verification.
- 5. The method of claim 3, wherein the intermediate representation is a set of mutually referencing data objects including an entity model object, a transaction contract object, a compliance policy object and a temporal property object.
- 6. The method of claim 1, wherein the multi-layer prompt template comprises a system layer, a specification layer, a task layer, and a constraint layer, and generating the structured code generation instruction through the multi-layer prompt template comprises: setting the role and global instructions of the code generation instruction based on the system layer, injecting specification information into the code generation instruction based on the specification layer, defining the specific generation target of the code generation instruction based on the task layer, and defining the output format of the code generation instruction based on the constraint layer, to obtain the structured code generation instruction.
- 7. The method of claim 1, wherein abstracting the behavior of the source code into a finite state model and exhaustively verifying the finite state model against the system-level attributes comprises: analyzing the variables in the source code that relate to the entities in the entity domain, enumerating the combinations of values of those variables as a state set, extracting the function call paths and conditional branch logic that change those variables as a state transition relation, and mapping the atomic propositions in the attribute domain to labeling functions on states, so as to abstract the behavior of the source code into the finite state model; and exhaustively exploring all reachable states of the finite state model with a model checking algorithm to verify whether all execution paths of the finite state model satisfy the system-level attributes.
- 8. An apparatus for generating code based on a multi-semantic-domain domain-specific language, the apparatus comprising: a receiving module, configured to receive a formal specification file describing a financial system, wherein the specification file is written in a multi-semantic-domain domain-specific language, the semantic domains comprise at least an entity domain, a transaction domain, a strategy domain and an attribute domain, the entity domain comprises entity declarations, each defining a data model of the system, state variables, and invariants that always hold on the data model, the transaction domain comprises transaction declarations, each defining a business operation of the system and specifying a precondition and a postcondition for that business operation, the strategy domain comprises strategy declarations defining global compliance and risk-control rules that span individual transactions, and the attribute domain comprises attribute declarations, expressed in temporal logic, that specify the safety and liveness properties the financial system must satisfy on all execution paths; a specification processing module, configured to run a preset specification processing engine on the specification file, parse the specification file, extract system-level attributes from the parsed specification file, and generate a structured code generation instruction through a multi-layer prompt template; and a verification module, configured to abstract the behavior of the source code into a finite state model, exhaustively verify the finite state model against the system-level attributes, and output the source code as qualified source code after the verification passes.
- 9. A computer device, comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus; and the memory is configured to store at least one executable instruction that causes the processor to perform the method of any one of claims 1-7.
- 10. A computer readable storage medium having stored therein at least one executable instruction which, when executed on a computer device, causes the computer device to perform the method of any of claims 1-7.
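As an illustration only, and not part of the claims, the four-layer prompt template recited in claim 6 might be assembled as sketched below. The class name `PromptLayers`, the `render` method, the section headers and the sample layer texts are all hypothetical; the source names the four layers and their roles but does not disclose a concrete format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptLayers:
    """Hypothetical container for the four layers named in claim 6."""
    system: str         # role and global instructions
    specification: str  # specification information injected from the parsed file
    task: str           # the specific generation target
    constraint: str     # required output format

    def render(self) -> str:
        """Join the layers in a fixed order into one structured instruction."""
        sections = [
            ("SYSTEM", self.system),
            ("SPECIFICATION", self.specification),
            ("TASK", self.task),
            ("CONSTRAINTS", self.constraint),
        ]
        return "\n\n".join(
            f"## {title}\n{body.strip()}" for title, body in sections
        )


# Sample layer texts are invented for illustration.
instruction = PromptLayers(
    system="You generate code for a financial system.",
    specification="entity Account: balance >= 0 always holds.",
    task="Implement the withdraw transaction.",
    constraint="Return a single source file with no commentary.",
).render()
```

Keeping each layer as a separate field means the specification layer can be regenerated from the intermediate representation without touching the role or output-format text.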
Description
Method and related device for generating code based on a multi-semantic-domain domain-specific language
Technical Field
Embodiments of the invention relate to the technical field of artificial intelligence, and in particular to a method and related device for generating code based on a multi-semantic-domain domain-specific language.
Background
Artificial intelligence techniques, represented by large language models (LLMs), are changing traditional software development. They offer large efficiency gains but have an inherent, fundamental drawback: their output is probabilistic and unpredictable. AI-generated code may appear functionally correct while concealing subtle logic errors, security vulnerabilities, or deviations from complex business rules and regulatory requirements. In general software development, these problems can be corrected through traditional testing and manual inspection. A financial system, however, is complex and spans multiple dimensions, including data, transactions, compliance, system integration, human-computer interaction and organizational permissions, so traditional code generation, testing and manual inspection are inefficient, inaccurate and insufficiently reliable. In core business systems such as trading, clearing and risk control in particular, even a tiny error can trigger a chain reaction and cause substantial systemic risk. The insufficient reliability, security and compliance of AI code generation in the financial field therefore remain to be solved.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a method and related device for generating code based on a multi-semantic-domain domain-specific language, which address the problems in the prior art described above.
According to an aspect of an embodiment of the present invention, there is provided a method of generating code based on a multi-semantic-domain domain-specific language, the method comprising: receiving a formal specification file describing a financial system, wherein the specification file is written in a multi-semantic-domain domain-specific language, the semantic domains comprise at least an entity domain, a transaction domain, a strategy domain and an attribute domain, the entity domain comprises entity declarations, each defining a data model of the system, state variables, and invariants that always hold on the data model, and the transaction domain comprises transaction declarations, each defining a business operation of the system and specifying a precondition and a postcondition for that business operation; running a preset specification processing engine on the specification file to parse the specification file, extract system-level attributes from the parsed specification file, and generate a structured code generation instruction through a multi-layer prompt template; abstracting the behavior of the source code into a finite state model, exhaustively verifying the finite state model against the system-level attributes, and outputting the source code as qualified source code after the verification passes.
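The verification step above can be illustrated with a minimal sketch, assuming a hand-written toy model rather than one abstracted from generated source code. The account-balance state, the `Transaction` class, the `check` function and the `max_balance` bound are all hypothetical. The sketch checks only a safety invariant and per-transaction postconditions by breadth-first search; it does not handle temporal-logic liveness properties.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

State = int  # hypothetical state: a single account balance


@dataclass(frozen=True)
class Transaction:
    """A business operation with a precondition and a postcondition."""
    name: str
    pre: Callable[[State], bool]
    apply: Callable[[State], State]
    post: Callable[[State, State], bool]


def invariant(balance: State) -> bool:
    """Entity invariant: the balance never goes negative."""
    return balance >= 0


def check(initial: State, transactions: List[Transaction],
          inv: Callable[[State], bool],
          max_balance: int) -> Optional[List[Tuple[str, State]]]:
    """Explore every reachable state breadth-first.

    Returns None if the invariant and all postconditions hold, otherwise
    the shortest path of (transaction name, resulting state) steps that
    leads to a violation.
    """
    seen = {initial}
    queue = deque([(initial, [])])
    while queue:
        state, path = queue.popleft()
        if not inv(state):
            return path
        for tx in transactions:
            if not tx.pre(state):
                continue  # transaction not enabled in this state
            nxt = tx.apply(state)
            if nxt > max_balance:
                continue  # bound the state space so exploration terminates
            step = path + [(tx.name, nxt)]
            if not tx.post(state, nxt):
                return step
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, step))
    return None


deposit = Transaction("deposit", lambda s: True,
                      lambda s: s + 1, lambda old, new: new == old + 1)
withdraw = Transaction("withdraw", lambda s: s >= 2,
                       lambda s: s - 2, lambda old, new: new == old - 2)
# Same operation with the precondition omitted, as faulty code might do.
withdraw_unchecked = Transaction("withdraw_unchecked", lambda s: True,
                                 lambda s: s - 2, lambda old, new: new == old - 2)

ok = check(0, [deposit, withdraw], invariant, max_balance=5)
bad = check(0, [deposit, withdraw_unchecked], invariant, max_balance=5)
```

Here `ok` is `None` because every reachable balance from 0 to 5 satisfies the invariant, while `bad` is the one-step counterexample `[("withdraw_unchecked", -2)]`.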
In an optional manner, the semantic domains further comprise an integration domain, an interaction domain and an organization domain; the integration domain defines interface contracts, service calls and communication protocols between the financial system and external systems; the interaction domain defines user interface components, user-executable operations, and multi-step manual approval workflows; and the organization domain defines user roles, the operation permissions of the user roles, and separation-of-duties rules within the financial system.
In an optional manner, parsing the specification file and extracting the system-level attributes from the parsed specification file comprises: performing lexical analysis and syntax analysis on the specification file to generate an abstract syntax tree, and performing multi-pass semantic analysis on the abstract syntax tree to obtain a semantically annotated abstract syntax tree; and converting the semantically annotated abstract syntax tree into a syntax-independent intermediate representation, and extracting the system-level attributes from the intermediate representation.
In an optional manner, the multi-pass semantic analysis comprises: traversing the abstract syntax tree, identifying declaration statements, recording the declared symbols and their basic information in a hierarchical symbol table, checking types and references against the symbol table, and performing consistency and logic verification.
In an optional manner, the intermediate representation is a set of mutually referencing data objects including an entity model object, a trade c