CN-122021901-A - Training data generation method and content auditing method


Abstract

The application relates to a training data generation method and a content auditing method. The training data generation method comprises: constructing an information map based on webpage mention entities respectively associated with a plurality of webpage nodes; iteratively performing the following steps: sampling a plurality of evidence paths from the information map; for each evidence path, generating a question-answer pair corresponding to the evidence path through a question generation model and an inference response model; selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths; and updating model parameters of the question generation model and model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths, wherein each question-answer pair comprises a question generated by the question generation model based on path nodes in the evidence path and a question answer generated by the inference response model based on the path nodes in the evidence path; and constructing a training data set based on the training question-answer pairs, the training data set being used to train a content auditing agent. The method can improve the content auditing quality of the content auditing agent.

Inventors

YIN YUYANG
CHEN XI

Assignees

Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技（深圳）有限公司)

Dates

Publication Date: 20260512
Application Date: 20260129

Claims (16)

1. A training data generation method, the method comprising: constructing an information map based on webpage mention entities respectively associated with a plurality of webpage nodes, wherein each webpage node in the information map and its adjacent webpage nodes are associated with a same webpage mention entity; iteratively performing the following steps: sampling a plurality of evidence paths from the information map; for each evidence path, generating a question-answer pair corresponding to the evidence path through a question generation model and an inference response model; selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths; and updating model parameters of the question generation model and model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths, wherein each question-answer pair comprises a question generated by the question generation model based on path nodes in the evidence path and a question answer generated by the inference response model based on the path nodes in the evidence path and the question; and constructing a training data set based on the training question-answer pairs selected during the iterative execution, wherein the training data set is used for training a content auditing agent.
2. The method of claim 1, wherein updating the model parameters of the question generation model and the model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths comprises: for each evidence path, performing answer accuracy evaluation according to the question and the question answer in the question-answer pair corresponding to the evidence path, to obtain an accuracy evaluation result; determining, according to the accuracy evaluation result, a question generation reward signal and an inference response reward signal corresponding to the evidence path; determining a first policy gradient corresponding to the evidence path based on the question generation reward signal, and determining a second policy gradient corresponding to the evidence path based on the inference response reward signal; and updating the model parameters of the question generation model based on the first policy gradient corresponding to each evidence path, and updating the model parameters of the inference response model based on the second policy gradient corresponding to each evidence path.
3. The method of claim 2, wherein determining the first policy gradient corresponding to the evidence path based on the question generation reward signal, and determining the second policy gradient corresponding to the evidence path based on the inference response reward signal, comprises: determining the first policy gradient corresponding to the evidence path based on the question generation reward signal and a first generation probability of the question in the question-answer pair corresponding to the evidence path; and determining the second policy gradient corresponding to the evidence path based on the inference response reward signal and a second generation probability of the question answer in the question-answer pair corresponding to the evidence path.
4. The method according to claim 2, wherein the method further comprises: calculating a first expected reward for the question generation model according to the question generation reward signal corresponding to each evidence path and the number of sampled evidence paths; calculating a second expected reward for the inference response model according to the inference response reward signal corresponding to each evidence path and the number of sampled evidence paths; and performing an iteration stop evaluation based on the first expected reward and the second expected reward to obtain an iteration stop evaluation result, and terminating the iterative execution when the iteration stop evaluation result is a pass.
5. The method according to claim 1, wherein the method further comprises: sampling, from the information map, logical paths respectively corresponding to a plurality of target questions; for each logical path, predicting, through an initial generation model and based on the logical path, the target question corresponding to the logical path, to determine a question generation conditional probability; determining a sampling probability of the logical path, and calculating a question generation quality score corresponding to the logical path according to the sampling probability and the question generation conditional probability; determining a question generation quality score of the initial generation model based on the question generation quality scores corresponding to the logical paths; and obtaining the question generation model when the question generation quality score of the initial generation model satisfies a generation training stop condition.
6. The method of claim 5, wherein determining the sampling probability of the logical path comprises: for each logical path, accumulating the weights of the path edges connecting the logical path nodes in the logical path to obtain a path score of the logical path; and calculating the sampling probability of the logical path based on the path score of the logical path and a predefined temperature parameter.
7. The method of any one of claims 1 to 6, wherein selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths comprises: determining, from the question-answer pairs respectively corresponding to the plurality of evidence paths, a first question-answer pair with a correct answer and a second question-answer pair with a wrong answer; selecting a positive sample question-answer pair from the first question-answer pair, and selecting a negative sample question-answer pair from the second question-answer pair; and obtaining the training question-answer pairs according to the positive sample question-answer pair and the negative sample question-answer pair.
8. The method according to any one of claims 1 to 6, wherein constructing the information map based on the webpage mention entities respectively associated with the plurality of webpage nodes comprises: determining adjacent webpage nodes of each webpage node based on the webpage mention entities respectively associated with the plurality of webpage nodes; for each webpage node, performing webpage association degree analysis according to the same webpage mention entity associated with both the webpage node and the adjacent webpage node, to determine a weight of an association edge between the webpage node and the adjacent webpage node; and constructing the information map according to each webpage node, the adjacent webpage nodes of each webpage node, and the weights of the association edges between each webpage node and its adjacent webpage nodes.
9. The method of claim 8, wherein, for each webpage node, performing the webpage association degree analysis according to the same webpage mention entity associated with both the webpage node and the adjacent webpage node, to determine the weight of the association edge between the webpage node and the adjacent webpage node, comprises: for each webpage node, performing entity rarity analysis according to the same webpage mention entity associated with both the webpage node and the adjacent webpage node, to determine an entity rarity of the same webpage mention entity; performing release time association analysis based on the webpage release times respectively associated with the webpage node and the adjacent webpage node, to determine a release time association degree; and determining the weight of the association edge between the webpage node and the adjacent webpage node based on the entity rarity of the same webpage mention entity and the release time association degree.
10. The method of claim 9, wherein, for each webpage node, performing the entity rarity analysis according to the same webpage mention entity associated with both the webpage node and the adjacent webpage node, to determine the entity rarity of the same webpage mention entity, comprises: for each webpage node, performing associated webpage node statistics according to the same webpage mention entity associated with both the webpage node and the adjacent webpage node, to determine the number of webpage nodes associated with the same webpage mention entity and the total number of webpage nodes; and determining the entity rarity of the same webpage mention entity according to the number of webpage nodes associated with the same webpage mention entity and the total number of webpage nodes.
11. The method of claim 9, wherein performing the release time association analysis based on the webpage release times respectively associated with the webpage node and the adjacent webpage node, to determine the release time association degree, comprises: determining a release time difference based on the webpage release times respectively associated with the webpage node and the adjacent webpage node; and determining the release time association degree according to the release time difference.
12. A content auditing method, the method comprising: obtaining content to be audited, and inputting the content to be audited into a trained content auditing agent, wherein the content auditing agent is obtained by training based on a training data set, and the training data set is constructed by the method according to any one of claims 1 to 11; and performing, through the content auditing agent, semantic understanding on the content to be audited to generate a conclusion to be audited, querying an information map according to the conclusion to be audited to obtain map evidence data, and performing an audit judgment on the conclusion to be audited based on the map evidence data to obtain a content auditing result.
13. A training data generation apparatus, the apparatus comprising: a map construction module, configured to construct an information map based on webpage mention entities respectively associated with a plurality of webpage nodes; an iterative execution module, configured to iteratively perform the following steps: sampling a plurality of evidence paths from the information map; for each evidence path, generating a question-answer pair corresponding to the evidence path through a question generation model and an inference response model; selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths; and updating model parameters of the question generation model and model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths, wherein each question-answer pair comprises a question generated by the question generation model based on path nodes in the evidence path and a question answer generated by the inference response model based on the path nodes in the evidence path and the question; and a data set construction module, configured to construct a training data set based on the training question-answer pairs selected during the iterative execution, wherein the training data set is used for training a content auditing agent.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 12 when the computer program is executed.
15. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 12.
16. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any one of claims 1 to 12.
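
As an illustration of the edge weighting in claims 8 to 11 and the path sampling probability in claim 6, the following Python sketch may help. The claims do not fix concrete formulas, so everything specific here is an assumption: entity rarity is taken as an inverse-document-frequency style logarithm, the release time association degree as an exponential decay of the time difference with a hypothetical scale `tau`, the edge weight as their product, and the sampling probability as a softmax of path scores divided by the temperature. All function names are hypothetical.

```python
import math
from itertools import combinations


def entity_rarity(n_nodes_with_entity, n_total_nodes):
    # Assumed form: rarer entities (mentioned by fewer webpage nodes) score higher.
    return math.log(n_total_nodes / n_nodes_with_entity)


def release_time_association(t_a, t_b, tau=30.0):
    # Assumed form: association decays with the release time difference (in days).
    return math.exp(-abs(t_a - t_b) / tau)


def build_information_map(pages):
    """pages maps node id -> (set of mentioned entities, release time in days)."""
    n_total = len(pages)
    entity_count = {}
    for entities, _ in pages.values():
        for entity in entities:
            entity_count[entity] = entity_count.get(entity, 0) + 1

    edges = {}
    for a, b in combinations(sorted(pages), 2):
        shared = pages[a][0] & pages[b][0]
        if not shared:
            continue  # nodes are adjacent only if they share a mentioned entity
        rarity = sum(entity_rarity(entity_count[e], n_total) for e in shared)
        association = release_time_association(pages[a][1], pages[b][1])
        edges[(a, b)] = rarity * association  # assumed combination: product
    return edges


def path_score(path, edges):
    # Accumulate the weights of the edges connecting consecutive path nodes.
    return sum(edges[tuple(sorted(pair))] for pair in zip(path, path[1:]))


def sampling_probabilities(paths, edges, temperature=1.0):
    # Assumed form: softmax over path scores scaled by the temperature.
    scaled = [path_score(p, edges) / temperature for p in paths]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


pages = {
    "p1": ({"A", "B"}, 0.0),
    "p2": ({"B", "C"}, 10.0),
    "p3": ({"C"}, 40.0),
    "p4": ({"D"}, 5.0),
}
edges = build_information_map(pages)
probs = sampling_probabilities([["p1", "p2"], ["p1", "p2", "p3"]], edges, temperature=0.5)
```

In this toy data `p4` shares no entity with any other node, so it gets no edges, and the longer path accumulates a larger score and therefore a larger sampling probability. A lower temperature sharpens that preference.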

Description

Training data generation method and content auditing method

Technical Field

The present application relates to the field of computer technology, and in particular to a training data generation method, apparatus, computer device, computer-readable storage medium, and computer program product, and to a content auditing method, apparatus, computer device, computer-readable storage medium, and computer program product.

Background

With the development of computer technology, content auditing agents have emerged. A content auditing agent is an agent that uses artificial intelligence technology to automatically detect and judge content such as text, images, and video; it can assess the quality of input content to be audited and produce a content auditing result. A content auditing agent can be obtained through training. In conventional technology, training a content auditing agent usually involves collecting the full content of a website, automatically generating a plurality of question-answer pairs from that content, and training on the generated question-answer pairs to obtain the content auditing agent.

However, the question-answer pairs generated by the conventional method are too simple and differ considerably from real content auditing scenarios, so the content auditing quality of the trained content auditing agent struggles to meet requirements.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a training data generation method, apparatus, computer device, computer-readable storage medium, and computer program product that help improve the content auditing quality of a content auditing agent, and a content auditing method, apparatus, computer device, computer-readable storage medium, and computer program product that can improve the content auditing quality of a content auditing agent.
In a first aspect, the present application provides a training data generation method, including: constructing an information map based on webpage mention entities respectively associated with a plurality of webpage nodes, wherein each webpage node in the information map and its adjacent webpage nodes are associated with a same webpage mention entity; iteratively performing the following steps: sampling a plurality of evidence paths from the information map; for each evidence path, generating a question-answer pair corresponding to the evidence path through a question generation model and an inference response model; selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths; and updating model parameters of the question generation model and model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths, wherein each question-answer pair comprises a question generated by the question generation model based on path nodes in the evidence path and a question answer generated by the inference response model based on the path nodes in the evidence path and the question; and constructing a training data set based on the training question-answer pairs selected during the iterative execution, wherein the training data set is used for training a content auditing agent.
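
The iterative procedure of the first aspect can be sketched, under heavy simplification, as the toy Python loop below. It is not the application's implementation: the two models are replaced by hypothetical two-output categorical policies (`ToyPolicy`), evidence paths are sampled uniformly, the accuracy evaluation is a toy rule, both reward signals are assumed to equal 1 for a correct answer and 0 otherwise, the update is a plain REINFORCE step, every pair is kept as a training pair with a correctness label, and iteration stops once the mean reward of both models over the sampled paths reaches an assumed threshold `stop_at`.

```python
import math
import random


def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyPolicy:
    """Hypothetical stand-in for a generative model: a categorical policy."""

    def __init__(self, n_actions, lr=0.5):
        self.logits = [0.0] * n_actions
        self.lr = lr

    def sample(self, rng):
        probs = softmax(self.logits)
        return rng.choices(range(len(probs)), weights=probs)[0]

    def reinforce(self, action, reward):
        # REINFORCE step: reward * grad log p(action); for softmax logits the
        # gradient of log p(action) is onehot(action) - probs.
        probs = softmax(self.logits)
        for i, p in enumerate(probs):
            self.logits[i] += self.lr * reward * ((1.0 if i == action else 0.0) - p)


def run_iterations(evidence_paths, max_iters=200, batch=16, stop_at=0.9, seed=0):
    rng = random.Random(seed)
    questioner = ToyPolicy(2)  # toy outputs: 0 = question ignores the path, 1 = grounded in it
    responder = ToyPolicy(2)   # toy outputs: 0 = guess, 1 = reason over the path nodes
    dataset = []
    exp_q = exp_a = 0.0
    iters = 0
    for iters in range(1, max_iters + 1):
        # Sample evidence paths (uniformly here, which is an assumption).
        paths = [rng.choice(evidence_paths) for _ in range(batch)]
        records = []
        for path in paths:
            q = questioner.sample(rng)
            a = responder.sample(rng)
            correct = (q == 1 and a == 1)      # toy accuracy evaluation
            reward = 1.0 if correct else 0.0   # assumed reward mapping, same for both models
            records.append((path, q, a, correct, reward, reward))
        # Select training pairs: this toy keeps every pair, labelled correct or not.
        dataset.extend((p, q, a, c) for p, q, a, c, _, _ in records)
        # Update each model from its own reward signal, one path at a time.
        for _, q, a, _, q_reward, a_reward in records:
            questioner.reinforce(q, q_reward)
            responder.reinforce(a, a_reward)
        # Expected rewards: mean reward over the sampled evidence paths.
        exp_q = sum(r[4] for r in records) / len(records)
        exp_a = sum(r[5] for r in records) / len(records)
        if exp_q >= stop_at and exp_a >= stop_at:
            break
    return dataset, (exp_q, exp_a), iters, (questioner, responder)


dataset, (exp_q, exp_a), iters, (questioner, responder) = run_iterations(
    [["p1", "p2"], ["p2", "p3"]]
)
```

The point of the sketch is the control flow (sample, generate, evaluate, select, update, check the stop condition); in a real system the policies would be language models and the rewards would come from the accuracy evaluation the application describes.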
In a second aspect, the present application further provides a training data generation apparatus, including: a map construction module, configured to construct an information map based on webpage mention entities respectively associated with a plurality of webpage nodes; an iterative execution module, configured to iteratively perform the following steps: sampling a plurality of evidence paths from the information map; for each evidence path, generating a question-answer pair corresponding to the evidence path through a question generation model and an inference response model; selecting training question-answer pairs from the question-answer pairs respectively corresponding to the plurality of evidence paths; and updating model parameters of the question generation model and model parameters of the inference response model according to the question-answer pairs respectively corresponding to the plurality of evidence paths, wherein each question-answer pair comprises a question generated by the question generation model based on path nodes in the evidence path and a question answer generated by the inference response model based on the path nodes in the evidence path and the question; and a data set construction module, configured to construct a training data set based on the training question-answer pairs selected during the iterative execution, and the training data set is used for verifying the intelligent agent throug