CN-121981084-A - Text generation method, device, electronic equipment, storage medium and product

CN121981084A

Abstract

The disclosure provides a text generation method, a text generation apparatus, an electronic device, a storage medium, and a product. The method comprises: obtaining a plurality of candidate text fragments corresponding to a user query request; sorting the candidate text fragments to obtain a candidate list; selecting at least one target text fragment from the candidate list; and inputting the target text fragment into a target generation model to generate an output text corresponding to the user query request, the target generation model being obtained through retrieval-augmented fine-tuning and reinforcement learning training. This ensures that the output text matches the user query request accurately, avoids factual deviation and omission of key information, and yields high-quality output text.

Inventors

  • HU QIAN

Assignees

  • 北京中科金得助智能科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-29

Claims (10)

  1. A text generation method, comprising: acquiring a plurality of candidate text fragments corresponding to a user query request; sorting the plurality of candidate text fragments to obtain a candidate list; selecting at least one target text fragment from the candidate list; and inputting the target text fragment into a target generation model to generate an output text corresponding to the user query request, wherein the target generation model is obtained through retrieval-augmented fine-tuning and reinforcement learning training.
  2. The method of claim 1, wherein obtaining the plurality of candidate text fragments corresponding to the user query request comprises: acquiring a user query request; and retrieving a target knowledge base based on the user query request to obtain the plurality of candidate text fragments corresponding to the user query request.
  3. The method of claim 2, wherein retrieving the target knowledge base based on the user query request to obtain the plurality of candidate text fragments corresponding to the user query request comprises: performing keyword matching retrieval and semantic similarity retrieval on the target knowledge base based on the user query request to obtain a first candidate text and a second candidate text corresponding to the user query request; and fusing the first candidate text and the second candidate text to obtain the plurality of candidate text fragments corresponding to the user query request.
  4. The method of claim 1, wherein sorting the plurality of candidate text fragments to obtain the candidate list comprises: determining a basic relevance score between each candidate text fragment and the user query request, a matching degree score between metadata information associated with each candidate text fragment and the user query request, and an edit distance relevance score between the user query request and a file identifier; determining a target relevance score based on the basic relevance score, the matching degree score, and the edit distance relevance score; and sorting the candidate text fragments by target relevance score to obtain the candidate list.
  5. The method of claim 1, wherein before inputting the target text fragment into the target generation model to generate the output text corresponding to the user query request, the method comprises: acquiring a training data set, wherein the training data set comprises a user query request, a positive sample associated with the user query request, a negative sample associated with the user query request, and a standard output text corresponding to the user query request; performing supervised fine-tuning on an initial generation model using the training data set; and iteratively optimizing the supervised fine-tuned initial generation model based on a reinforcement learning strategy to obtain the target generation model.
  6. The method of claim 5, wherein the reinforcement learning strategy employs a multi-dimensional reward mechanism comprising at least one of: a reward based on the length rationality of the training output text generated during model training; a reward based on how well the training output text generated during model training matches the standard output text; a credibility reward based on the reference priority of the target text fragments; and a reward based on the format normalization of the training output text generated during model training.
  7. A text generation apparatus, comprising: an acquisition unit configured to acquire a plurality of candidate text fragments corresponding to a user query request; a sorting unit configured to sort the plurality of candidate text fragments to obtain a candidate list; a selection unit configured to select at least one target text fragment from the candidate list; and a generation unit configured to input the target text fragment into a target generation model to generate an output text corresponding to the user query request, wherein the target generation model is obtained through retrieval-augmented fine-tuning and reinforcement learning training.
  8. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.
  9. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
  10. A computer program product comprising a computer program which, when executed by a processor, implements the method of any one of claims 1 to 6.
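Claim 5 describes a two-stage training pipeline: supervised fine-tuning on (query, positive sample, negative sample, standard output text) examples, followed by reinforcement-learning iteration. The toy sketch below is entirely hypothetical — the patent names no concrete model, loss, or update rule — and uses a lookup table as a stand-in "model" with best-of-n selection as a crude proxy for reward-driven iterative optimization, so only the control flow of the two stages is shown.

```python
def supervised_fine_tune(examples):
    # SFT stage stand-in: learn a direct mapping query -> standard output text
    return {ex["query"]: ex["reference"] for ex in examples}

def rl_iterate(model, examples, candidates, reward_fn, iterations=2):
    # RL stage stand-in: per query, repeatedly keep the highest-reward output
    # from a candidate pool (a best-of-n proxy for iterative optimization)
    for _ in range(iterations):
        for ex in examples:
            pool = [model[ex["query"]]] + candidates.get(ex["query"], [])
            model[ex["query"]] = max(pool, key=lambda c: reward_fn(c, ex["reference"]))
    return model

# Hypothetical training example; names and texts are illustrative only
examples = [{"query": "refund window", "reference": "Refunds within 14 days."}]
model = supervised_fine_tune(examples)

# Toy reward: token overlap with the standard output text
reward = lambda out, ref: len(set(out.split()) & set(ref.split()))
model = rl_iterate(
    model, examples,
    {"refund window": ["Refunds are issued within 14 days.", "No."]},
    reward,
)
print(model["refund window"])
```

Both SFT and best-of-n here compress the real method drastically; the claim only fixes the two-stage structure, not these mechanics.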
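Claim 6 lists four reward components without concrete formulas. The following minimal sketch assumes illustrative heuristics and weights for each component — length rationality, match against the standard output text, citation credibility, and format normalization — purely to make the multi-dimensional structure concrete; none of these formulas come from the patent.

```python
def length_reward(text, lo=30, hi=400):
    # 1.0 inside an assumed reasonable length band, decaying outside it
    n = len(text)
    if lo <= n <= hi:
        return 1.0
    return max(0.0, 1.0 - abs(n - (lo if n < lo else hi)) / hi)

def match_reward(text, reference):
    # crude token-overlap proxy for "matches the standard output text"
    t, r = set(text.split()), set(reference.split())
    return len(t & r) / max(len(r), 1)

def credibility_reward(cited_fragments, priorities):
    # reward outputs that cite high-priority target text fragments
    if not cited_fragments:
        return 0.0
    return sum(priorities.get(f, 0.0) for f in cited_fragments) / len(cited_fragments)

def format_reward(text):
    # toy normalization check: output must end with terminal punctuation
    return 1.0 if text.rstrip().endswith((".", "!", "?")) else 0.0

def total_reward(text, reference, cited, priorities, weights=(0.2, 0.4, 0.2, 0.2)):
    parts = (length_reward(text), match_reward(text, reference),
             credibility_reward(cited, priorities), format_reward(text))
    return sum(w * p for w, p in zip(weights, parts))
```

The weights are assumptions; the claim only requires that at least one of the four components be present.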

Description

Text generation method, device, electronic equipment, storage medium and product

Technical Field

The disclosure relates to the technical field of artificial intelligence, and in particular to a text generation method, a text generation apparatus, an electronic device, a storage medium, and a product.

Background

In related text generation schemes in which a generation model responds directly to a user query request, all candidate text fragments returned by retrieval are typically fed straight into the generation model. As a result, the generated text matches the user's actual needs poorly, and factual deviation or omission of key information easily occurs. Although some schemes attempt to combine retrieval with the generation model, noise in the retrieval results still makes the output text inaccurate, and high-quality output text is difficult to generate.

Disclosure of Invention

The present disclosure provides a text generation method, apparatus, electronic device, storage medium, and product to solve the above problems in the related art. An embodiment of a first aspect of the present disclosure provides a text generation method, comprising: acquiring a plurality of candidate text fragments corresponding to a user query request; sorting the candidate text fragments to obtain a candidate list; selecting at least one target text fragment from the candidate list; and inputting the target text fragment into a target generation model to generate an output text corresponding to the user query request, the target generation model being obtained through retrieval-augmented fine-tuning and reinforcement learning training.
In one embodiment, obtaining the plurality of candidate text fragments corresponding to the user query request includes: acquiring a user query request; and retrieving a target knowledge base based on the user query request to obtain the plurality of candidate text fragments corresponding to the user query request. In one embodiment, retrieving the target knowledge base based on the user query request to obtain the plurality of candidate text fragments includes: performing keyword matching retrieval and semantic similarity retrieval on the target knowledge base based on the user query request to obtain a first candidate text and a second candidate text corresponding to the user query request; and fusing the first candidate text and the second candidate text to obtain the plurality of candidate text fragments corresponding to the user query request. In one embodiment, sorting the plurality of candidate text fragments to obtain the candidate list includes: determining a basic relevance score between each candidate text fragment and the user query request, a matching degree score between metadata information associated with each candidate text fragment and the user query request, and an edit distance relevance score between the user query request and a file identifier; determining a target relevance score based on the basic relevance score, the matching degree score, and the edit distance relevance score; and sorting the candidate text fragments by target relevance score to obtain the candidate list.
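The hybrid retrieval described above fuses keyword-matching results with semantic-similarity results into one candidate list. The patent does not name a fusion algorithm; as a hypothetical sketch, reciprocal rank fusion (RRF) is a common choice for exactly this step. The ranked lists and fragment ids below are stand-ins.

```python
def reciprocal_rank_fusion(keyword_hits, semantic_hits, k=60):
    """Fuse two ranked lists of fragment ids into one candidate list.

    Each fragment scores 1/(k + rank + 1) per list it appears in; fragments
    found by both retrievers accumulate score from both lists.
    """
    scores = {}
    for hits in (keyword_hits, semantic_hits):
        for rank, frag_id in enumerate(hits):
            scores[frag_id] = scores.get(frag_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

candidates = reciprocal_rank_fusion(
    keyword_hits=["frag_a", "frag_b", "frag_c"],   # first candidate texts
    semantic_hits=["frag_b", "frag_d", "frag_a"],  # second candidate texts
)
print(candidates)  # ['frag_b', 'frag_a', 'frag_d', 'frag_c']
```

Fragments retrieved by both methods (frag_a, frag_b) outrank fragments found by only one, which is the usual rationale for fusing the two retrieval paths.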
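The three-part scoring above (basic relevance, metadata match, edit-distance relevance to a file identifier) could be combined as a weighted sum. The weights, the use of `difflib` as an edit-distance proxy, and all example data below are assumptions; the patent only states that the target score is determined from the three components.

```python
import difflib

def edit_distance_score(query, file_id):
    # SequenceMatcher ratio as a stand-in for edit-distance relevance
    return difflib.SequenceMatcher(None, query.lower(), file_id.lower()).ratio()

def target_relevance(base, metadata_match, query, file_id, w=(0.6, 0.25, 0.15)):
    # Assumed weighted sum of the three component scores from claim 4
    return w[0] * base + w[1] * metadata_match + w[2] * edit_distance_score(query, file_id)

# Hypothetical candidate fragments with precomputed component scores
fragments = [
    {"id": "f1", "base": 0.9, "meta": 0.2, "file": "billing_faq.md"},
    {"id": "f2", "base": 0.7, "meta": 0.8, "file": "refund_policy.md"},
]
query = "refund policy"
ranked = sorted(
    fragments,
    key=lambda f: target_relevance(f["base"], f["meta"], query, f["file"]),
    reverse=True,
)
print([f["id"] for f in ranked])  # ['f2', 'f1']
```

Here f2 overtakes f1 despite a lower basic relevance score, because its metadata and file identifier both match the query well — the behavior the three-signal ranking is designed to produce.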
In one embodiment, before inputting the target text fragment into the target generation model to generate the output text corresponding to the user query request, the method includes: acquiring a training data set, wherein the training data set comprises a user query request, a positive sample associated with the user query request, a negative sample associated with the user query request, and a standard output text corresponding to the user query request; performing supervised fine-tuning on an initial generation model using the training data set; and iteratively optimizing the supervised fine-tuned initial generation model based on a reinforcement learning strategy to obtain the target generation model. In one embodiment, the reinforcement learning strategy employs a multi-dimensional reward mechanism including at least one of: a reward based on the length rationality of the training output text generated during model training; a reward based on how well the training output text matches the standard output text; a credibility reward based on the reference priority of the target text fragments; and a reward based on the format normalization of the training output text. An embodiment of a second aspect of the present disclosure provides a text generation apparatus, comprising: an acquisition unit configured to acquire a plurality of candidate text fragments corresponding to a user query request; a sorting unit configured to sort the plurality of candidate text fragments to obtain a candidate list; a selection unit,