CN-122021572-A - Method, device, equipment and medium for automatically generating report by batch database table

Abstract

The application belongs to the technical field of data processing and relates to a method, apparatus, device and medium for automatically generating reports from batch database tables. The method comprises: obtaining a batch of tables for the report to be generated and generating a knowledge graph; obtaining business text data and deriving enhancement vectors; obtaining a hybrid index from the knowledge graph and the enhancement vectors; obtaining a user's query statement and outputting a fusion vector; quantizing and reconstructing the fusion vector and splicing it with a semantic tag vector to obtain a subspace; searching according to the subspace and the hybrid index, with internal recall performed by a two-stage strategy of coarse recall followed by fine ranking, to obtain a priority context; inserting a mark at the splicing position; aligning the intent and pattern of the priority context according to the inserted mark and checking the result to obtain a query result; and taking the query result as the input of a large language model to obtain the report. With this application, reports can be generated automatically for batches of tables.

Inventors

  • LIU YUEHUA

Assignees

  • 湖南正宇软件技术开发有限公司

Dates

Publication Date
20260512
Application Date
20260409

Claims (10)

  1. A method for automatically generating reports from batch database tables, comprising: acquiring batch database tables for a report to be generated, exporting and parsing the corresponding dynamic link library to obtain the fields, meanings and relations of the dynamic link library and a semantic tag for each field, so as to generate a knowledge graph; acquiring business text data and encoding it to obtain data vectors, and splicing the data vectors with semantic tag vectors to obtain enhancement vectors; acquiring a query statement of a user, obtaining a JSON string, feeding the query statement and the JSON string respectively into a dual-tower Transformer with a shared frozen backbone, and outputting a fusion vector; quantizing and reconstructing the fusion vector, and splicing it with the semantic tag vector to form an entry vector so as to obtain a subspace; aligning the intent and pattern of the priority context according to the inserted mark, and checking to generate an executable statement; and, taking the query result as the input of a large language model, focusing and matching the core dimensions to obtain a report.
  2. The method for automatically generating reports from batch database tables according to claim 1, wherein acquiring the batch database tables for the report to be generated, exporting and parsing the corresponding dynamic link library to obtain the fields, meanings and relations of the dynamic link library and a semantic tag for each field, so as to generate a knowledge graph, comprises: acquiring the batch database tables for the report to be generated, and exporting the corresponding dynamic link library; using a parser to extract the primary and foreign keys, table field values, table field names and data types of the database tables in the dynamic link library, so as to obtain the fields, meanings and relations of the dynamic link library, and adding a semantic tag to each field; and generating the knowledge graph with the fields as nodes, the relations as edges, and the data types, meanings and semantic tags as node attributes.
  3. The method for automatically generating reports from batch database tables according to claim 2, wherein storing the knowledge graph in a graph database, storing the enhancement vectors in a vector database, and obtaining a hybrid index of the graph and the vectors comprises: storing the knowledge graph in the graph database, storing the enhancement vectors in the vector database, and associating the graph database with the vector database by the identification number (ID) of each node to obtain the hybrid index of the graph and the vectors.
  4. The method for automatically generating reports from batch database tables according to any one of claims 1 to 3, wherein acquiring the query statement of the user, obtaining the JSON string, feeding them respectively into the dual-tower Transformer with a shared frozen backbone, and outputting the fusion vector comprises: acquiring the query statement of the user, extracting keywords, and combining them with the knowledge graph to obtain the JSON string; and feeding the user's query statement and the JSON string respectively into the dual-tower Transformer with the shared frozen backbone, and outputting the fusion vector.
  5. The method for automatically generating reports from batch database tables according to claim 4, wherein quantizing and reconstructing the fusion vector and splicing it with the semantic tag vector to form the entry vector so as to obtain the subspace comprises: quantizing and reconstructing the fusion vector to obtain a residual vector, and splicing the residual vector with the semantic tag vector to form the entry vector; and obtaining the subspace from the entry vector.
  6. The method for automatically generating reports from batch database tables according to claim 5, wherein retrieving according to the subspace and the hybrid index, performing internal recall with a two-stage strategy of coarse recall and fine ranking, obtaining the priority context and inserting the mark comprises: retrieving according to the subspace and the hybrid index, and performing internal recall with the two-stage strategy of coarse recall and fine ranking to output candidate vectors; and obtaining the priority context from the candidate vectors, and inserting the mark.
  7. The method for automatically generating reports from batch database tables according to any one of claims 1 to 3, wherein aligning the intent and pattern of the priority context according to the inserted mark, checking to generate the executable statement, and obtaining the query result according to the executable statement comprises: aligning the intent and pattern of the priority context with a regular expression according to the inserted mark; after alignment, calling a DSL-Skeleton generator, taking the IDs of the nodes corresponding to the candidate vectors as leaf nodes, running a minimum connected subgraph algorithm on the knowledge graph, and generating the simplest table join path along the edges; after the simplest table join path is generated, checking to generate the executable statement; sending the executable statement to the database to obtain an original result set; and, when the original result set is determined to be non-empty, generating a statistical JSON statement to obtain the query result.
  8. An apparatus for automatically generating reports from batch database tables, comprising: a first module, configured to acquire batch database tables for a report to be generated, and to export and parse the corresponding dynamic link library to obtain the fields, meanings and relations of the dynamic link library and a semantic tag for each field, so as to generate a knowledge graph; a second module, configured to acquire a query statement of a user, obtain a JSON string, feed the query statement and the JSON string respectively into a dual-tower Transformer with a shared frozen backbone, output a fusion vector, quantize and reconstruct the fusion vector, and splice it with the semantic tag vector to form an entry vector so as to obtain a subspace; a third module, configured to align the intent and pattern of the priority context according to the inserted mark, and to check so as to generate an executable statement; and a fourth module, configured to take the query result as the input of a large language model, and to focus and match the core dimensions to obtain a report.
  9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
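
The join-path step recited in claim 7 can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: nodes are simplified to table names rather than field IDs, the graph is an undirected adjacency map, and the "minimum connected subgraph algorithm" is replaced by a plain heuristic (the union of breadth-first shortest paths from the first leaf to every other leaf), which connects all leaves but is not guaranteed to be minimal. The names `shortest_path`, `connecting_edges` and the example tables are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def connecting_edges(graph, leaves):
    """Heuristic stand-in for a minimum connected subgraph:
    union of shortest paths from the first leaf to each other leaf."""
    edges = set()
    root = leaves[0]
    for leaf in leaves[1:]:
        path = shortest_path(graph, root, leaf)
        if path is None:
            raise ValueError(f"{leaf} is not reachable from {root}")
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return edges


# Hypothetical knowledge graph: edges stand for key relations between tables.
graph = {
    "customers": {"orders"},
    "orders": {"customers", "items"},
    "items": {"orders", "products"},
    "products": {"items"},
}

# Leaves are the nodes matched by the candidate vectors (assumed here).
join_edges = connecting_edges(graph, ["customers", "products"])
```

Each edge in `join_edges` would correspond to one join condition in the generated statement; how the DSL-Skeleton generator turns such a path into an executable statement is not described in the claims and is left out here.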

Description

Method, device, equipment and medium for automatically generating report by batch database table

Technical Field

The present application relates to the field of data processing technologies, and in particular to a method, apparatus, device and medium for automatically generating reports from batch database tables.

Background

With the advancement of science and technology, the way data is processed has also changed. In the prior art, SQL query statements are generally written by hand to extract data from a database, and the report is then produced by manual summary and analysis. This approach to report generation has the following problems:

  1. Data retrieval relies on traditional database queries and cannot understand user semantics, so retrieval is inefficient and the results are inaccurate.
  2. Report generation depends on manual work, so the degree of automation is low and the generation cycle is long.
  3. When the user's intent changes, it is not possible to respond in real time and generate a new report.

Disclosure of Invention

Based on the above, it is necessary to provide a method, apparatus, device and medium for automatically generating reports from batch database tables, which can understand user semantics for batches of tables, quickly retrieve data relevant to the user's needs from the database, and automatically generate high-quality statistical analysis reports by using a large language model.
A method for automatically generating reports from batch database tables, comprising: acquiring batch database tables for a report to be generated, exporting and parsing the corresponding dynamic link library to obtain the fields, meanings and relations of the dynamic link library and a semantic tag for each field, so as to generate a knowledge graph; acquiring business text data and encoding it to obtain data vectors, and splicing the data vectors with semantic tag vectors to obtain enhancement vectors; acquiring a query statement of a user, obtaining a JSON string, feeding the query statement and the JSON string respectively into a dual-tower Transformer with a shared frozen backbone, and outputting a fusion vector; quantizing and reconstructing the fusion vector, and splicing it with the semantic tag vector to form an entry vector so as to obtain a subspace; aligning the intent and pattern of the priority context according to the inserted mark, and checking to generate an executable statement; and, taking the query result as the input of a large language model, focusing and matching the core dimensions to obtain a report.
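
As a rough illustration of the quantize, reconstruct and splice step, the following Python sketch assumes a single-codebook nearest-centroid quantizer and follows claim 5 in splicing the residual (fusion vector minus its reconstruction) with the semantic tag vector. The codebook, the vector values and the function names are hypothetical, and the rule that maps an entry vector to a subspace is not given in the text, so it is not modeled.

```python
def nearest_centroid(vector, codebook):
    """Return the index of the codebook entry closest to vector (squared L2)."""
    def squared_distance(centroid):
        return sum((v - c) ** 2 for v, c in zip(vector, centroid))
    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))


def entry_vector(fusion, codebook, tag_vector):
    """Quantize, reconstruct, take the residual, then splice with the tag vector."""
    code = nearest_centroid(fusion, codebook)        # quantize
    reconstructed = codebook[code]                   # reconstruct
    residual = [f - r for f, r in zip(fusion, reconstructed)]
    return code, residual + list(tag_vector)         # splice (concatenate)


# Hypothetical values chosen so the arithmetic is exact in floating point.
codebook = [[0.0, 0.0], [1.0, 1.5]]
fusion = [1.0, 2.0]
tag_vector = [1.0, 0.0]

code, entry = entry_vector(fusion, codebook, tag_vector)
# code is 1 and entry is [0.0, 0.5, 1.0, 0.0]
```

The residual keeps the detail that the coarse code discards, which is one common motivation for residual quantization; whether the application uses one codebook or several stages is not stated.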
In one embodiment, acquiring the batch database tables for the report to be generated, exporting and parsing the corresponding dynamic link library to obtain the fields, meanings and relations of the dynamic link library and a semantic tag for each field, so as to generate a knowledge graph, comprises: acquiring the batch database tables for the report to be generated, and exporting the corresponding dynamic link library; using a parser to extract the primary and foreign keys, table field values, table field names and data types of the database tables in the dynamic link library, so as to obtain the fields, meanings and relations of the dynamic link library, and adding a semantic tag to each field; and generating the knowledge graph with the fields as nodes, the relations as edges, and the data types, meanings and semantic tags as node attributes.
In one embodiment, storing the knowledge graph in a graph database, storing the enhancement vectors in a vector database, and obtaining a hybrid index of the graph and the vectors comprises: storing the knowledge graph in the graph database, storing the enhancement vectors in the vector database, and associating the graph database with the vector database by the identification number (ID) of each node to obtain the hybrid index of the graph and the vectors.
In one embodiment, acquiring the query statement of the user, obtaining the JSON string, feeding them respectively into the dual-tower Transformer with a shared frozen backbone, and outputting the fusion vector comprises: acquiring the query statement of the user, extracting keywords, and combining them with the knowledge graph to obtain the JSON string; and feeding the user's query statement and the JSON string respectively into the dual-tower Transformer with the shared frozen backbone, and outputting the fusion vector.
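
The knowledge-graph and hybrid-index embodiments above can be sketched in Python as follows. This is a minimal in-memory stand-in, not a graph or vector database: it assumes the parser output is already available as plain dictionaries, that node IDs take the form `table.field`, and that enhancement vectors are precomputed. All class, function and field names here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FieldNode:
    node_id: str        # shared key between the graph side and the vector side
    data_type: str
    meaning: str
    semantic_tag: str


@dataclass
class HybridIndex:
    nodes: dict = field(default_factory=dict)    # node_id -> FieldNode
    edges: dict = field(default_factory=dict)    # node_id -> set of neighbor IDs
    vectors: dict = field(default_factory=dict)  # node_id -> enhancement vector

    def add_field(self, node, vector):
        self.nodes[node.node_id] = node
        self.edges.setdefault(node.node_id, set())
        self.vectors[node.node_id] = vector

    def add_relation(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def lookup(self, node_id):
        """Resolve one ID against both stores at once."""
        return self.nodes[node_id], sorted(self.edges[node_id]), self.vectors[node_id]


def build_index(schema, foreign_keys, vectors):
    """Fields become nodes, key relations become edges, vectors join by node ID."""
    index = HybridIndex()
    for table, columns in schema.items():
        for name, (data_type, meaning, tag) in columns.items():
            node_id = f"{table}.{name}"
            index.add_field(FieldNode(node_id, data_type, meaning, tag), vectors[node_id])
    for src, dst in foreign_keys:
        index.add_relation(src, dst)
    return index


# Hypothetical parser output for two tables.
schema = {
    "customers": {"id": ("INT", "customer identifier", "customer key")},
    "orders": {
        "customer_id": ("INT", "ordering customer", "customer key"),
        "amount": ("DECIMAL", "order total", "money"),
    },
}
foreign_keys = [("orders.customer_id", "customers.id")]
vectors = {
    "customers.id": [0.2, 0.8],
    "orders.customer_id": [0.1, 0.9],
    "orders.amount": [0.7, 0.3],
}
index = build_index(schema, foreign_keys, vectors)
```

Keying both stores by the same node ID is what lets a vector hit be followed straight into the graph, and a graph node be resolved back to its vector.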
In one embodiment, quantizing and reconstructing the fusion vector and splicing it with the semantic tag vector to form the entry vector so as to obtain the subspace comprises: quantizing and reconstructing the fusion vector to obtain a residual vector, and splicing the residual vector with the semantic tag vector to form the entry vector; and obtaining the subspace from the entry vector.
In one embodiment, retrieving according to subspace and hybrid index, and internal recal