CN-122019567-A - Training method of code generation model, query code generation method and related device

Abstract

The application provides a training method of a code generation model, a query code generation method, and a related device. The training method acquires a filtering field and a return field from a target database and determines attribute information of the filtering field and attribute information of the return field; constructs a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field, and takes the target mapping pair as a training sample, wherein the target mapping pair comprises a natural language query statement and a query code corresponding to the natural language query statement; and uses the training sample to perform query code generation training on a pre-constructed code generation model to obtain a trained code generation model. With this technical solution, a code generation model can be constructed that automatically generates the corresponding database query code from a natural language query statement, which improves the efficiency of generating query code and, in turn, the efficiency of database queries.

Inventors

  • WANG XU
  • LIANG HUADONG
  • XU FEIYANG
  • LI XIN
  • LIU JUNHUA
  • WANG SHIJIN
  • LIU CONG
  • HU GUOPING

Assignees

  • iFLYTEK Co., Ltd. (科大讯飞股份有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-01-20

Claims (13)

  1. A method of training a code generation model, comprising: acquiring a filtering field and a return field from a target database, and determining attribute information of the filtering field and attribute information of the return field; constructing a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field, and taking the target mapping pair as a training sample, wherein the target mapping pair comprises a natural language query statement and a query code corresponding to the natural language query statement; and performing query code generation training on a pre-constructed code generation model using the training sample to obtain a trained code generation model.
  2. The training method of a code generation model according to claim 1, wherein the attribute information of the filtering field includes all values corresponding to the filtering field; and constructing a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field comprises: constructing a field combination based on the filtering field and the return field, wherein the field combination comprises at least one filtering field and at least one return field; determining a filtering condition corresponding to the field combination based on all values of the filtering field, and constructing a target combination based on the filtering condition and the return field in the field combination; and constructing a target mapping pair corresponding to the target combination.
  3. The training method of a code generation model according to claim 2, wherein the attribute information of the return field includes a code calling rule of the return field; and constructing a target mapping pair corresponding to the target combination comprises: generating a natural language query statement corresponding to the target combination according to a pre-constructed natural language query statement template; generating a query code corresponding to the target combination based on a pre-constructed query code generation rule and the code calling rule of the return field; and taking the natural language query statement corresponding to the target combination and the query code corresponding to the target combination as the target mapping pair corresponding to the target combination.
  4. The training method of a code generation model according to claim 2, wherein, when the field combination contains exactly one filtering field, determining the filtering condition corresponding to the field combination based on all values of the filtering field, and constructing the target combination based on the filtering condition and the return field in the field combination, comprises: assigning each value corresponding to the filtering field in the field combination to the filtering field in turn, to obtain a filtering condition corresponding to each value; and combining each filtering condition with the return field in the field combination to form a target combination.
  5. The training method of a code generation model according to claim 2, wherein, when the field combination contains a plurality of filtering fields, determining the filtering condition corresponding to the field combination based on all values of the filtering fields, and constructing the target combination based on the filtering condition and the return field in the field combination, comprises: constructing a value queue corresponding to each filtering field based on all values of that filtering field; assigning the value output by the value queue corresponding to each filtering field in the field combination to that filtering field, to obtain the filtering condition corresponding to each filtering field in the field combination; and forming a target combination from the filtering conditions corresponding to the filtering fields in the field combination and the return field in the field combination.
  6. The training method of a code generation model according to claim 2, wherein, after determining the filtering condition corresponding to the field combination based on all values of the filtering field and constructing the target combination based on the filtering condition and the return field in the field combination, the method further comprises: if the field combination comprises a plurality of filtering fields or a plurality of return fields, adjusting the order of the plurality of filtering conditions and/or the plurality of return fields in the target combination to obtain a reordered target combination.
  7. The training method of a code generation model according to claim 2, wherein the attribute information of the filtering field includes a data type corresponding to the filtering field; and constructing a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field further comprises: if the data type corresponding to the filtering field in the field combination matches a preset complex value type, extracting an edge value from all values corresponding to the filtering field in the field combination; assigning the edge value corresponding to the filtering field in the field combination to the filtering field to obtain an edge filtering condition; forming an edge target combination from the edge filtering condition and the return field in the field combination; and constructing a target mapping pair corresponding to the edge target combination.
  8. The training method of a code generation model according to claim 1, wherein, after constructing the target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field, the method further comprises: rewriting the natural language query statement in the target mapping pair according to a preset statement rewriting rule, and taking the target mapping pair with the rewritten natural language query statement as a training sample.
  9. A method for generating a query code, comprising: acquiring a user natural language query statement; and inputting the user natural language query statement into a pre-trained code generation model to obtain a target query code corresponding to the user natural language query statement; wherein the code generation model is obtained using the training method of a code generation model according to any one of claims 1 to 8.
  10. A training device for a code generation model, comprising: a field acquisition module, configured to acquire a filtering field and a return field from a target database and to determine attribute information of the filtering field and attribute information of the return field; a construction module, configured to construct a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field, and to take the target mapping pair as a training sample, wherein the target mapping pair comprises a natural language query statement and a query code corresponding to the natural language query statement; and a training module, configured to perform query code generation training on a pre-constructed code generation model using the training sample to obtain a trained code generation model.
  11. A query code generation apparatus, comprising: a statement acquisition module, configured to acquire a user natural language query statement; and a code generation module, configured to input the user natural language query statement into a pre-trained code generation model to obtain a target query code corresponding to the user natural language query statement; wherein the code generation model is obtained using the training method of a code generation model according to any one of claims 1 to 8.
  12. An electronic device, comprising a memory and a processor; wherein the memory is connected to the processor and is configured to store a program; and the processor is configured to implement the training method of a code generation model according to any one of claims 1 to 8, or the query code generation method according to claim 9, by running the program in the memory.
  13. A computer program product comprising computer program instructions which, when executed by a processor, cause the processor to implement the training method of a code generation model according to any one of claims 1 to 8 or the query code generation method according to claim 9.
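
As an illustration of claims 3, 5 and 6, the following is a minimal Python sketch and not the applicant's implementation. Everything named here is an assumption made for the example: the sample values, the `companies` table, the sentence template, the use of SQL as the query code, and the treatment of a code calling rule as a SELECT expression. The wrap-around behaviour of the value queue is also assumed, since the claims do not say how queues of different lengths are handled.

```python
from collections import deque
from itertools import permutations

# Hypothetical data: all values of each filtering field, plus a code calling
# rule (here simply a SELECT expression) for each return field.
VALUES = {"city": ["Hefei", "Beijing", "Shanghai"], "status": ["active", "closed"]}
CALL_RULES = {"name": "name", "revenue": "ROUND(revenue, 2)"}


def value_queue_conditions(filters, values):
    """Claim 5 sketch: draw one value per filtering field from its own queue."""
    queues = {f: deque(values[f]) for f in filters}
    rounds = max(len(q) for q in queues.values())
    for _ in range(rounds):
        conditions = []
        for f in filters:
            v = queues[f].popleft()
            queues[f].append(v)  # assumption: shorter queues wrap around
            conditions.append((f, v))
        yield tuple(conditions)


def build_mapping_pair(conditions, returns, call_rules, table="companies"):
    """Claim 3 sketch: fill a sentence template and a query code rule."""
    cond_text = " and ".join(f"{f} is {v}" for f, v in conditions)
    question = f"Show the {' and '.join(returns)} of records where {cond_text}."
    select = ", ".join(call_rules[r] for r in returns)
    where = " AND ".join(f"{f} = '{v}'" for f, v in conditions)
    return question, f"SELECT {select} FROM {table} WHERE {where};"


def reorderings(conditions, returns):
    """Claim 6 sketch: every ordering of the conditions and return fields."""
    for conds in permutations(conditions):
        for rets in permutations(returns):
            yield conds, rets


drawn = list(value_queue_conditions(("city", "status"), VALUES))
question, code = build_mapping_pair(drawn[0], ("name", "revenue"), CALL_RULES)
variants = list(reorderings(drawn[0], ("name", "revenue")))
```

Here `drawn` holds three target-combination conditions (one per round of the queues), `question` and `code` form one mapping pair, and `variants` holds the four reordered versions of the first combination. Drawing from queues keeps the number of samples linear in the number of values, whereas a full Cartesian product over several filtering fields would grow multiplicatively.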

Description

Training method of code generation model, query code generation method and related device

Technical Field

The present application relates to the field of database query technologies, and in particular to a training method of a code generation model, a query code generation method, and a related device.

Background

In the era of data-driven digitization, databases are widely used as the core carriers for data storage and management in fields such as scientific research, industry, finance, and healthcare. Currently, the mainstream way of querying a database relies on a specialized query language: the user needs a thorough understanding of the structure of the target database, writes a query statement that conforms to the grammar specification according to the query requirement, submits the query request through a database client or an API, and finally obtains the query result.
However, this approach demands considerable expertise. A user who needs to query the database must either learn the query language or rely on the assistance of specialists, which severely restricts database query efficiency. Even when specialists write the query statements, doing so takes time, which also affects database query efficiency.

Disclosure of Invention

In view of the above, the application provides a training method of a code generation model, a query code generation method, and a related device, which can construct a code generation model that automatically generates the corresponding database query code from a natural language query statement, thereby improving the efficiency of generating query code and, in turn, the efficiency of database queries.
To achieve the above purpose, the present application proposes the following technical solutions.
According to a first aspect of an embodiment of the present application, there is provided a training method of a code generation model, including: acquiring a filtering field and a return field from a target database, and determining attribute information of the filtering field and attribute information of the return field; constructing a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field, and taking the target mapping pair as a training sample, wherein the target mapping pair comprises a natural language query statement and a query code corresponding to the natural language query statement; and performing query code generation training on a pre-constructed code generation model using the training sample to obtain a trained code generation model.
Optionally, the attribute information of the filtering field includes all values corresponding to the filtering field; and constructing a target mapping pair based on the filtering field, the return field, the attribute information of the filtering field, and the attribute information of the return field includes: constructing a field combination based on the filtering field and the return field, wherein the field combination comprises at least one filtering field and at least one return field; determining a filtering condition corresponding to the field combination based on all values of the filtering field, and constructing a target combination based on the filtering condition and the return field in the field combination; and constructing a target mapping pair corresponding to the target combination.
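
The field-combination and target-combination steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as implemented by the applicant: the field names, the sample values, and the `max_filters` and `max_returns` limits are hypothetical, and the sketch only handles field combinations with exactly one filtering field.

```python
from itertools import combinations

# Hypothetical attribute information: every value of each filtering field.
FILTER_VALUES = {
    "city": ["Hefei", "Beijing"],
    "status": ["active", "closed"],
}
RETURN_FIELDS = ["name", "revenue"]


def field_combinations(filter_fields, return_fields, max_filters=1, max_returns=1):
    """Yield (filters, returns) with at least one field of each kind."""
    for i in range(1, max_filters + 1):
        for filters in combinations(filter_fields, i):
            for j in range(1, max_returns + 1):
                for returns in combinations(return_fields, j):
                    yield filters, returns


def target_combinations(filters, returns, values):
    """One target combination per value of the single filtering field."""
    (field,) = filters  # this sketch only covers exactly one filtering field
    for value in values[field]:
        # A filtering condition is the field with one of its values assigned.
        yield ((field, value),), returns


combos = list(field_combinations(list(FILTER_VALUES), RETURN_FIELDS))
targets = [
    target
    for filters, returns in combos
    for target in target_combinations(filters, returns, FILTER_VALUES)
]
```

With the sample data, `combos` contains four field combinations and `targets` contains eight target combinations, each of which would then be turned into one natural language query statement and one query code.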
Optionally, the attribute information of the return field includes a code calling rule of the return field; and constructing a target mapping pair corresponding to the target combination includes: generating a natural language query statement corresponding to the target combination according to a pre-constructed natural language query statement template; generating a query code corresponding to the target combination based on a pre-constructed query code generation rule and the code calling rule of the return field; and taking the natural language query statement corresponding to the target combination and the query code corresponding to the target combination as the target mapping pair corresponding to the target combination.
Optionally, when the field combination contains exactly one filtering field, determining the filtering condition corresponding to the field combination based on all values of the filtering field, and constructing the target combination based on the filtering condition and the return field in the field combination, includes: assigning each value corresponding to the filtering field in the field combination to the filtering field in turn, to obtain a filtering condition corresponding to each value; and combining each filtering condition with the return field in the field combination to form a target combination.
Optionally, when the n