CN-122019597-A - Data query method, related device and computer program product

Abstract

The application discloses a data query method, a related device, and a computer program product. After a user's query request is received, the target business scenario to which the request belongs is determined, and a first data table set corresponding to that scenario is obtained by looking up the correspondence between database tables and business scenarios. Table structure information is then acquired for each data table in the first data table set, a large model is called to generate a structured query statement based on the query request and the table structure information, and the statement is executed in the database to obtain the query result. Because the first data table set is screened according to the business scenario of the query request, the method avoids operating on the full database and reduces interference from irrelevant data tables. By calling the large model, a structured query statement that fits both the semantics of the query request and the structure of the data tables can be generated, so executing it in the database yields a more accurate query result.
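The flow summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the scenario-to-table mapping, the keyword-based scenario classifier, the prompt layout, and the `fake_llm` stub are all hypothetical stand-ins for the large-model calls described in the application, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical scenario -> tables correspondence (the application derives it with a large model).
SCENE_TABLES = {"sales": ["orders"], "hr": ["employees"]}
# Hypothetical keywords for the stand-in scenario classifier.
SCENE_KEYWORDS = {
    "sales": ["order", "revenue", "sales"],
    "hr": ["employee", "salary", "headcount"],
}


def determine_scene(request: str) -> str:
    """Stand-in classifier: pick the scenario with the most keyword hits."""
    text = request.lower()
    return max(SCENE_KEYWORDS, key=lambda s: sum(k in text for k in SCENE_KEYWORDS[s]))


def table_structure(conn: sqlite3.Connection, table: str) -> str:
    """Read column names and declared types from SQLite metadata.

    The table name comes from the trusted mapping above, never from user input.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return f"{table}(" + ", ".join(f"{r[1]} {r[2]}" for r in rows) + ")"


def build_prompt(request: str, structures: list) -> str:
    """Assumed prompt layout: only the screened tables are shown to the model."""
    return "Tables:\n" + "\n".join(structures) + f"\nQuestion: {request}\nSQL:"


def fake_llm(prompt: str) -> str:
    """Placeholder for the large-model call; returns a canned statement."""
    return "SELECT SUM(amount) FROM orders WHERE region = 'east'"


def answer(conn, request, llm=fake_llm):
    scene = determine_scene(request)                 # 1. target business scenario
    tables = SCENE_TABLES[scene]                     # 2. first data table set
    structures = [table_structure(conn, t) for t in tables]  # 3. table structure info
    sql = llm(build_prompt(request, structures))     # 4. structured query statement
    return scene, sql, conn.execute(sql).fetchall()  # 5. execute in the database


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);
INSERT INTO orders VALUES (1, 'east', 10.0), (2, 'east', 5.5), (3, 'west', 7.0);
""")
scene, sql, rows = answer(conn, "What is the total order revenue in the east region?")
print(scene, rows)
```

In this sketch the `employees` table never reaches the prompt, which is the screening effect the abstract describes.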

Inventors

  • ZHA HONGYU
  • WANG YONGHAI
  • HUANG TAO

Assignees

  • iFLYTEK Co., Ltd. (科大讯飞股份有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. A data query method, comprising: acquiring a query request of a user, and determining a target business scenario to which the query request belongs; querying a configured correspondence between database tables and business scenarios to obtain a first data table set corresponding to the target business scenario, wherein the correspondence between database tables and business scenarios is determined as follows: database metadata is acquired; a large model is instructed, through a second prompt, to induce a group of business scenarios based on the database metadata, to give scenario parameters for each induced business scenario, and to determine the business scenario to which each database table in the database belongs; and the correspondence between database tables and business scenarios is generated, wherein the scenario parameters comprise at least one of a scenario definition, keywords, typical tables, and boundary rules; acquiring table structure information of each data table in the first data table set, and instructing, through a first prompt, a large model to generate a structured query statement based on the query request and the table structure information; and executing the structured query statement in a database to obtain a query result.
  2. The method of claim 1, wherein the table structure information of each data table comprises a data dictionary and a configuration file; the data dictionary records the definitions of the data table and of the fields therein, the definition of a field comprises a field type, and the field types comprise an index class and a dimension class; and the configuration file includes enumeration values for at least some of the dimension-class fields in the data table.
  3. The method of claim 1, further comprising, before acquiring the query request of the user: instructing a large model, through a third prompt, to generate a paraphrase (definition) of each data table according to the table structure of each data table in the database; instructing a large model, through a fourth prompt, to perform semantic analysis on each field according to the table name, field name, and field type of each data table to obtain a definition of the field, wherein the definition of the field comprises a field type, and the field types comprise an index class and a dimension class; for a dimension-class field in a data table, if the number of enumeration values of the field after deduplication is greater than a set threshold, vectorizing the enumeration values and writing them into a vector library, and if the number of enumeration values after deduplication is not greater than the set threshold, writing the enumeration values into a configuration file of the data table; wherein the definition of the data table and the definitions of the fields in the data table form a data dictionary, and the data dictionary and the configuration file form the table structure information of the data table.
  4. The method of claim 1, wherein the first prompt includes: success samples, each comprising a query request sample and the corresponding correct structured query statement; and failure samples with corrections, each comprising a query request sample, an erroneous structured query statement corresponding to the query request sample, and the structured query statement obtained by correcting the erroneous statement.
  5. The method of claim 4, further comprising, before acquiring the query request of the user: acquiring existing real structured query statements, and calling a large model to reversely generate diversified query requests corresponding to each real structured query statement; and for each query request, generating multiple candidate structured query statements corresponding to the query request, screening out the structured query statements that are executable and whose execution results are consistent, and combining them with the query request to form success samples.
  6. The method of claim 1, further comprising, before instructing, through the first prompt, the large model to generate a structured query statement based on the query request and the table structure information: acquiring the configured slot types corresponding to the target business scenario, and calling a large model to extract, from the query request, slot information for the slots of those types; for the slot information, retrieving, by vector search in the vector library of the corresponding slot type, the top N values that best match the slot information, where N is greater than or equal to 1; and calling a large model to identify, based on the query request, the best-matching target value among the top N values, and replacing the slot information in the query request with the target value to obtain a rewritten query request; wherein instructing, through the first prompt, the large model to generate a structured query statement based on the query request and the table structure information comprises: instructing, through the first prompt, the large model to generate a structured query statement based on the rewritten query request and the table structure information.
  7. The method of any one of claims 1-6, further comprising: calling a large model to generate a natural language reply based on the query request and the query result, and outputting the natural language reply.
  8. An electronic device, comprising a memory and a processor; the memory is configured to store a program; and the processor is configured to execute the program to implement the steps of the data query method according to any one of claims 1 to 7.
  9. A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data query method according to any one of claims 1 to 7.
  10. A computer program product comprising a computer program which, when executed by a processor, implements the steps of the data query method according to any one of claims 1 to 7.
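
The enumeration-value routing of claim 3 and the top-N slot retrieval of claim 6 can be illustrated together with a small Python sketch. Everything named here is hypothetical: the threshold value, the character-bigram `embed` function (a toy stand-in for a real embedding model), and the dictionaries that play the roles of the vector library and the configuration file. The large-model step that picks the best target value is replaced by simply taking the top hit.

```python
import math
from collections import Counter

THRESHOLD = 3  # hypothetical "set threshold" on the number of distinct enumeration values


def embed(text: str) -> Counter:
    """Toy embedding: character-bigram counts (stand-in for a real embedding model)."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route_enum_values(field, values, vector_library, config_file):
    """Deduplicate the values, then route them by how many distinct values remain."""
    distinct = sorted(set(values))
    if len(distinct) > THRESHOLD:
        # High cardinality: vectorize and store in the vector library.
        vector_library[field] = [(v, embed(v)) for v in distinct]
    else:
        # Low cardinality: list the values directly in the table's configuration file.
        config_file[field] = distinct


def top_n(field, slot_text, vector_library, n=2):
    """Return the n stored values most similar to the extracted slot text."""
    query = embed(slot_text)
    scored = sorted(vector_library[field], key=lambda item: cosine(query, item[1]), reverse=True)
    return [value for value, _ in scored[:n]]


def rewrite_request(request: str, slot_text: str, target: str) -> str:
    """Replace the extracted slot text with the chosen canonical value."""
    return request.replace(slot_text, target)


vector_library, config_file = {}, {}
route_enum_values("status", ["open", "closed", "open"], vector_library, config_file)
route_enum_values("city", ["Hefei", "Beijing", "Shanghai", "Shenzhen", "Hefei"],
                  vector_library, config_file)

candidates = top_n("city", "shanghai city", vector_library, n=2)
# The claim has a large model choose among the candidates; this sketch takes the first.
rewritten = rewrite_request("total orders in shanghai city", "shanghai city", candidates[0])
print(config_file, candidates, rewritten)
```

Routing by cardinality keeps short value lists visible in the prompt while long ones are reached only through retrieval, which is one plausible reading of why the claims separate the two stores.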

Description

Data query method, related device and computer program product

Technical Field

The present application relates to the field of data query technologies, and in particular to a data query method, a related device, and a computer program product.

Background

As digital transformation continues to deepen, traditional data analysis methods generally require users to have strong technical skills (such as writing SQL, operating visualization tools, and designing reports), so many product, business, operations, and management staff cannot obtain the data they need independently and in time.

Some schemes use a large language model (LLM) to assist data query: the LLM searches all data tables in the database for an answer that matches the user's query request and outputs it to the user. However, when the database contains a very large number of data tables, such schemes easily lose part of the data, which makes the query result inaccurate.

Disclosure of Invention

In view of the foregoing, the present application provides a data query method, a related device, and a computer program product for improving the accuracy of data query results.
The specific scheme is as follows:

In a first aspect, a data query method is provided, including:
acquiring a query request of a user, and determining a target business scenario to which the query request belongs;
querying a configured correspondence between database tables and business scenarios to obtain a first data table set corresponding to the target business scenario, wherein the correspondence between database tables and business scenarios is determined as follows: database metadata is acquired; a large model is instructed, through a second prompt, to induce a group of business scenarios based on the database metadata, to give scenario parameters for each induced business scenario, and to determine the business scenario to which each database table in the database belongs; and the correspondence between database tables and business scenarios is generated, wherein the scenario parameters comprise at least one of a scenario definition, keywords, typical tables, and boundary rules;
acquiring table structure information of each data table in the first data table set, and instructing, through a first prompt, a large model to generate a structured query statement based on the query request and the table structure information; and
executing the structured query statement in a database to obtain a query result.

In another implementation of the first aspect of the embodiments of the present application, the table structure information of each data table includes a data dictionary and a configuration file. The data dictionary records the definitions of the data table and of the fields therein; the definition of a field comprises a field type, and the field types comprise an index class and a dimension class. The configuration file includes enumeration values for at least some of the dimension-class fields in the data table.
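
One way to picture the table structure information described above is the following Python sketch. It is an illustration only: the class names, attribute names, and the text layout produced by `to_prompt` are hypothetical, and it assumes that "index class" denotes metric-like fields (values to be aggregated) while "dimension class" denotes fields used for grouping or filtering.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDef:
    name: str
    definition: str
    field_type: str  # "index" (assumed: a metric) or "dimension"


@dataclass
class TableStructure:
    table: str
    definition: str  # data dictionary entry for the table itself
    fields: list[FieldDef]  # data dictionary entries for the fields
    # Plays the role of the configuration file: enumeration values of some dimension fields.
    enum_values: dict[str, list[str]] = field(default_factory=dict)

    def to_prompt(self) -> str:
        """Render the structure as text that could be placed in the first prompt."""
        lines = [f"Table {self.table}: {self.definition}"]
        for f in self.fields:
            line = f"- {f.name} [{f.field_type}]: {f.definition}"
            if f.field_type == "dimension" and f.name in self.enum_values:
                line += " (values: " + ", ".join(self.enum_values[f.name]) + ")"
            lines.append(line)
        return "\n".join(lines)


orders = TableStructure(
    table="orders",
    definition="One row per customer order",
    fields=[
        FieldDef("amount", "Order amount", "index"),
        FieldDef("region", "Sales region", "dimension"),
    ],
    enum_values={"region": ["east", "west"]},
)
print(orders.to_prompt())
```

Listing the allowed values next to a dimension field gives the model the literal strings it would need in a filter condition, which is presumably why the configuration file is part of the table structure information.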
In another implementation of the first aspect of the embodiments of the present application, before the query request of the user is acquired, the method further includes:
instructing a large model, through a third prompt, to generate a paraphrase (definition) of each data table according to the table structure of each data table in the database;
instructing a large model, through a fourth prompt, to perform semantic analysis on each field according to the table name, field name, and field type of each data table to obtain a definition of the field, wherein the definition of the field comprises a field type, and the field types comprise an index class and a dimension class;
for a dimension-class field in a data table, if the number of enumeration values of the field after deduplication is greater than a set threshold, vectorizing the enumeration values and writing them into a vector library, and if the number of enumeration values after deduplication is not greater than the set threshold, writing the enumeration values into a configuration file of the data table;
wherein the definition of the data table and the definitions of the fields in the data table form a data dictionary, and the data dictionary and the configuration file form the table structure information of the data table.

In another implementation of the first aspect of the embodiments of the present application, the first prompt includes:
success samples, each comprising a query request sample and the corresponding correct structured query statement;
failure samples with corrections, each comprising a query request sample and a structured query statement corresponding to the query request sample, and the structured query statement after the correction is car