CN-121980008-A - Content generation method and device, electronic equipment and storage medium


Abstract

The application discloses a content generation method and device, an electronic device, and a storage medium. The method comprises: retrieving, from a vector database and according to a content generation instruction, a plurality of candidate text blocks semantically related to the instruction; retrieving a corresponding target subgraph from a knowledge graph according to the instruction, the target subgraph containing the target entities in the instruction; for each candidate text block, calculating, according to the target subgraph, a semantic topological consistency coefficient for a target entity pair in that block, the coefficient representing the degree of consistency between the semantic similarity and the logical relationship strength of the two target entities; determining the candidate text blocks whose coefficient is greater than or equal to a preset consistency coefficient threshold as target text blocks; and generating the content corresponding to the instruction according to the target text blocks and a generation model. The application improves the accuracy of content generation.

Inventors

LI TAO
XING CHENG

Assignees

郑州阿帕斯数云信息科技有限公司

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-20

Claims (10)

1. A content generation method, comprising: retrieving, according to a content generation instruction input by a user, a plurality of candidate text blocks semantically related to the content generation instruction from a vector database; retrieving, according to the content generation instruction, a corresponding target subgraph from a knowledge graph of a target field, wherein the target subgraph comprises a target entity in the content generation instruction; for each candidate text block, calculating a semantic topological consistency coefficient of a target entity pair in the candidate text block according to the target subgraph, wherein the target entity pair comprises two target entities, and the semantic topological consistency coefficient represents the degree of consistency between the semantic similarity and the logical relationship strength of the two target entities; determining the candidate text blocks whose semantic topological consistency coefficient is greater than or equal to a preset consistency coefficient threshold as target text blocks; and generating content corresponding to the content generation instruction according to the target text blocks and a generation model.
2. The method of claim 1, wherein calculating the semantic topological consistency coefficient of the target entity pair in the candidate text block according to the target subgraph comprises: identifying the target entity pair in the candidate text block; calculating the similarity of the two target entities of the target entity pair in a vector space; calculating the shortest path hop count between the two target entities in the target subgraph; and calculating the semantic topological consistency coefficient according to the similarity and the shortest path hop count.
3. The method of claim 2, wherein calculating the semantic topological consistency coefficient according to the similarity and the shortest path hop count comprises: calculating the semantic topological consistency coefficient using the following formula: [formula missing from this text]; wherein C_st is the semantic topological consistency coefficient, α is a preset first weight coefficient, V_sim is the similarity, β is a preset second weight coefficient, Decay is a nonlinear decay function, and D_graph is the shortest path hop count.
4. The method of claim 1, further comprising: performing graph similarity verification on the content according to the knowledge graph; if the verification passes, determining that the content is qualified; and if the verification fails, generating a correction instruction and feeding the correction instruction back to the generation model to control the generation model to regenerate the content.
5. The method of claim 4, wherein performing graph similarity verification on the content according to the knowledge graph comprises: parsing the content into corresponding fact triples; mapping the fact triples into a temporary graph structure; retrieving a corresponding standard subgraph from the knowledge graph according to the fact triples; calculating the graph similarity between the temporary graph structure and the standard subgraph; and judging whether the content passes the verification according to the graph similarity.
6. The method of claim 5, wherein calculating the graph similarity between the temporary graph structure and the standard subgraph comprises: calculating a graph edit distance between the temporary graph structure and the standard subgraph.
7. The method of claim 6, wherein judging whether the content passes the verification according to the graph similarity comprises: if the graph edit distance is equal to zero, judging that the content passes the verification; and if the graph edit distance is greater than zero, judging that the content fails the verification.
8. A content generation device, comprising: a vector retrieval module configured to retrieve, according to a content generation instruction input by a user, a plurality of candidate text blocks semantically related to the content generation instruction from a vector database; a graph retrieval module configured to retrieve, according to the content generation instruction, a corresponding target subgraph from a knowledge graph of a target field, wherein the target subgraph comprises a target entity in the content generation instruction; a calculation module configured to calculate, for each candidate text block, a semantic topological consistency coefficient of a target entity pair in the candidate text block according to the target subgraph, wherein the target entity pair comprises two target entities, and the semantic topological consistency coefficient represents the degree of consistency between the semantic similarity and the logical relationship strength of the two target entities; a determination module configured to determine the candidate text blocks whose semantic topological consistency coefficient is greater than or equal to a preset consistency coefficient threshold as target text blocks; and a generation module configured to generate content corresponding to the content generation instruction according to the target text blocks and a generation model.
9. An electronic device, comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method of any one of claims 1-7.
10. A readable storage medium, storing a program or instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-7.
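The verification loop of claims 4 to 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that the fact triples have already been extracted from the generated content, that the standard subgraph has already been retrieved and is given as triples, and that entity names identify nodes uniquely. Under those assumptions a simplified edit distance with unit insert and delete costs reduces to counting the nodes and edges present in only one of the two graphs; general graph edit distance requires a search over node mappings. All function names are hypothetical.

```python
def triples_to_graph(triples):
    """Map (head, relation, tail) fact triples to a (nodes, edges) graph."""
    nodes, edges = set(), set()
    for head, relation, tail in triples:
        nodes.update((head, tail))
        edges.add((head, relation, tail))
    return nodes, edges


def simplified_edit_distance(graph_a, graph_b):
    """Count node and edge insertions/deletions needed to turn A into B.

    Assumption: nodes are matched by entity name, so no search over
    node mappings is needed. This is a simplification, not general GED.
    """
    nodes_a, edges_a = graph_a
    nodes_b, edges_b = graph_b
    return len(nodes_a ^ nodes_b) + len(edges_a ^ edges_b)


def verify_content(fact_triples, standard_triples):
    """Return (passed, distance); passes only when the distance is zero."""
    temporary_graph = triples_to_graph(fact_triples)
    standard_subgraph = triples_to_graph(standard_triples)
    distance = simplified_edit_distance(temporary_graph, standard_subgraph)
    return distance == 0, distance


def build_correction_instruction(fact_triples, standard_triples):
    """Hypothetical correction text to feed back to the generation model."""
    unsupported = sorted(set(fact_triples) - set(standard_triples))
    missing = sorted(set(standard_triples) - set(fact_triples))
    return (
        "Regenerate the content. "
        f"Unsupported facts: {unsupported}. Missing facts: {missing}."
    )
```

In use, a caller would run `verify_content` on each generated draft and, on failure, pass the text from `build_correction_instruction` back to the generation model, repeating until the distance is zero or a retry limit (not specified in the claims) is reached.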

Description

Content generation method and device, electronic equipment and storage medium

Technical Field

The application belongs to the technical field of artificial intelligence, and particularly relates to a content generation method, a content generation device, an electronic device, and a storage medium.

Background

At present, model-based content generation usually adopts retrieval-augmented generation (RAG), which typically either performs semantic retrieval over unstructured data using a vector database (Vector DB) or performs relational retrieval over structured data using a knowledge graph (KG). However, text blocks retrieved from the vector database (based on semantic similarity) often conflict with the logical facts in the knowledge graph (based on topological connections). The related art generally adopts simple concatenation or a fixed-priority strategy, and therefore cannot dynamically decide which source should prevail when the semantic similarity is high but no graph path exists. As a result, the model generates plausible-sounding but hallucinated content, and the accuracy of content generation is reduced.

Disclosure of Invention

An object of the embodiments of the present application is to provide a content generation method, a content generation device, an electronic device, and a storage medium, so as to solve the problem of reduced content generation accuracy in the related art.
To achieve the above object, the embodiments of the present application adopt the following technical solutions. In a first aspect, an embodiment of the application provides a content generation method, comprising: retrieving, according to a content generation instruction input by a user, a plurality of candidate text blocks semantically related to the content generation instruction from a vector database; retrieving, according to the content generation instruction, a corresponding target subgraph from a knowledge graph of a target field, wherein the target subgraph comprises the target entities in the content generation instruction; for each candidate text block, calculating a semantic topological consistency coefficient of a target entity pair in the candidate text block according to the target subgraph, wherein the target entity pair comprises two target entities, and the semantic topological consistency coefficient represents the degree of consistency between the semantic similarity and the logical relationship strength of the two target entities; determining the candidate text blocks whose semantic topological consistency coefficient is greater than or equal to a preset consistency coefficient threshold as target text blocks; and generating content corresponding to the content generation instruction according to the target text blocks and a generation model.
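The coefficient calculation and threshold filtering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's formula. It assumes the coefficient is a weighted sum of the form alpha * V_sim + beta * Decay(D_graph), that Decay is an exponential function of the hop count, that vector-space similarity is cosine similarity, and that an entity pair with no connecting path contributes a decay value of zero. The function names, the default weights, and the block layout (a dict with an `entity_pair` key) are hypothetical.

```python
import math
from collections import deque


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def shortest_path_hops(subgraph, source, target):
    """Breadth-first search hop count in an adjacency dict; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbor in subgraph.get(node, ()):
            if neighbor == target:
                return hops + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


def consistency_coefficient(v_sim, d_graph, alpha=0.5, beta=0.5, rate=1.0):
    """Assumed form: alpha * V_sim + beta * exp(-rate * D_graph)."""
    decay = 0.0 if d_graph is None else math.exp(-rate * d_graph)
    return alpha * v_sim + beta * decay


def select_target_blocks(blocks, subgraph, embeddings, threshold):
    """Keep candidate blocks whose coefficient is >= the preset threshold."""
    selected = []
    for block in blocks:
        first, second = block["entity_pair"]
        v_sim = cosine_similarity(embeddings[first], embeddings[second])
        d_graph = shortest_path_hops(subgraph, first, second)
        if consistency_coefficient(v_sim, d_graph) >= threshold:
            selected.append(block)
    return selected
```

With this assumed form, a block whose two entities are semantically close but unconnected in the target subgraph scores lower than a block whose entities are both close and directly linked, so a suitable threshold discards the former.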
In a second aspect, an embodiment of the application provides a content generation device, comprising a vector retrieval module, a graph retrieval module, a calculation module, a determination module, and a generation module, wherein: the vector retrieval module is configured to retrieve, according to a content generation instruction input by a user, a plurality of candidate text blocks semantically related to the content generation instruction from a vector database; the graph retrieval module is configured to retrieve, according to the content generation instruction, a corresponding target subgraph from a knowledge graph of a target field, the target subgraph comprising the target entities in the content generation instruction; the calculation module is configured to calculate, for each candidate text block, a semantic topological consistency coefficient of a target entity pair in the candidate text block according to the target subgraph, the target entity pair comprising two target entities, and the semantic topological consistency coefficient representing the degree of consistency between the semantic similarity and the logical relationship strength of the two target entities; the determination module is configured to determine the candidate text blocks whose semantic topological consistency coefficient is greater than or equal to a preset consistency coefficient threshold as target text blocks; and the generation module is configured to generate content corresponding to the content generation instruction according to the target text blocks and a generation model.
In a third aspect, an embodiment of the present application provides an electronic device, comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method according to the embodiment of the first aspect of the present application.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor perform the steps of the method according to the embodiments of the first aspect of the present application. The above at l