CN-121981088-A - Text generation method, device, equipment and storage medium


Abstract

Embodiments of the present disclosure provide a text generation method, apparatus, device, and storage medium. Specifically, a text generation request is received; model parameters related to semantic association in a pre-trained large language model are activated according to a first dynamic weight mask to obtain a messenger unit, and model parameters related to causal logic verification in the large language model are activated according to a second dynamic weight mask to obtain a skeptic unit, the first dynamic weight mask and the second dynamic weight mask being determined based on the functions of the parameters of the large language model; an adversarial game between the messenger unit and the skeptic unit is executed in a latent space of the large language model, in which candidate semantic vectors generated by the messenger unit are submitted to the skeptic unit for verification and target semantic vectors are screened out according to the verification results; and the target semantic vectors are decoded to generate text.

Inventors

LI TAO
SHAN XIN

Assignees

郑州阿帕斯数云信息科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260123

Claims (10)

1. A text generation method, comprising: receiving a text generation request; activating, according to a first dynamic weight mask, model parameters related to semantic association in a pre-trained large language model to obtain a messenger unit, and activating, according to a second dynamic weight mask, model parameters related to causal logic verification in the large language model to obtain a skeptic unit, wherein the first dynamic weight mask and the second dynamic weight mask are determined based on the functions of the parameters of the large language model; executing an adversarial game between the messenger unit and the skeptic unit in a latent space of the large language model, wherein the adversarial game comprises submitting candidate semantic vectors generated by the messenger unit to the skeptic unit for verification, and screening out a target semantic vector according to the verification result; and decoding the target semantic vector to generate text.
2. The method of claim 1, wherein the verification by the skeptic unit comprises: obtaining the candidate semantic vectors generated by the messenger unit and obtaining the corresponding adversarial scores, wherein the adversarial score of a candidate semantic vector is the absolute value of the difference between a first evaluation score and a second evaluation score of that candidate semantic vector, the first evaluation score is generated by the messenger unit based on a first evaluation function, the second evaluation score is generated by the skeptic unit based on a second evaluation function, the first evaluation function is used to evaluate the semantic fluency of the text generated by decoding the candidate semantic vector and its relevance to the text generation request, and the second evaluation function is used to evaluate the causal-logic self-consistency of the text content generated by decoding the candidate semantic vector and whether that content conforms to a specified logic rule and a specified causal relation, wherein the specified logic rule and the specified causal relation are determined in a training stage of the large language model based on training data of the large language model; and wherein screening out the target semantic vector according to the verification result comprises: determining that a candidate semantic vector passes verification when its adversarial score is not higher than a preset adversarial threshold; and determining the candidate semantic vector that passes verification as the target semantic vector.
3. The method of claim 2, wherein executing the adversarial game between the messenger unit and the skeptic unit comprises: executing the adversarial game through multiple rounds of iteration until a target semantic vector meeting a preset termination condition is screened out, wherein in each round of iteration the messenger unit adjusts its generation strategy based on the verification results of the candidate semantic vectors in the previous round to generate a new set of candidate semantic vectors, and the skeptic unit verifies the new set of candidate semantic vectors, and wherein the preset termination condition is that verification is passed in consecutive rounds of iteration and the change in the adversarial score between those rounds does not exceed a preset stability threshold.
4. The method of claim 3, wherein the method further comprises: after a preset number of consecutive rounds of iteration, if none of the candidate semantic vectors meets the preset termination condition, terminating execution of the adversarial game and outputting preset refusal information.
5. The method of claim 1, wherein before executing the adversarial game between the messenger unit and the skeptic unit, the method further comprises: determining the task type of the current text generation task according to the text generation request, wherein the task type comprises a knowledge-intensive task and a creative generation task; and dynamically adjusting a verification strictness of the adversarial game based on the task type, the verification strictness being determined by at least one of a preset adversarial threshold and the number of iteration rounds of the adversarial game, wherein, if the task type is the knowledge-intensive task, at least one of decreasing the preset adversarial threshold and increasing the number of iteration rounds is performed to increase the verification strictness, and, if the task type is the creative generation task, at least one of increasing the preset adversarial threshold and decreasing the number of iteration rounds is performed to decrease the verification strictness.
6. The method of claim 1, wherein before screening out the target semantic vector according to the verification result, the method further comprises: acquiring an original request semantic representation formed by the text generation request in the latent space of the large language model; obtaining a reconstructed initial semantic representation through reverse mapping based on each candidate semantic vector that currently passes verification; calculating a consistency score between the reconstructed initial semantic representation and the original request semantic representation; and removing, from the verified vector set of the current iteration, each candidate semantic vector whose consistency score is lower than a preset consistency threshold.
7. The method of claim 1, wherein executing the adversarial game between the messenger unit and the skeptic unit comprises: monitoring the confidence of the candidate semantic vectors generated by the messenger unit; and removing, from the verified vector set of the current iteration, each candidate semantic vector whose monitored confidence is lower than a preset propagation threshold.
8. A text generation apparatus, comprising: a receiving module, configured to receive a text generation request; an activation module, configured to, in response to the text generation request, activate model parameters related to semantic association in a pre-trained large language model according to a first dynamic weight mask to obtain a messenger unit, and activate model parameters related to causal logic verification in the large language model according to a second dynamic weight mask to obtain a skeptic unit, wherein the first dynamic weight mask and the second dynamic weight mask are determined based on the functions of the parameters of the large language model; an execution module, configured to execute an adversarial game between the messenger unit and the skeptic unit in a latent space of the large language model, wherein the adversarial game comprises submitting candidate semantic vectors generated by the messenger unit to the skeptic unit for verification, and screening out a target semantic vector according to the verification result; and a generation module, configured to decode the target semantic vector to generate text.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program implementing the steps of the method according to any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium for storing computer-executable instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 7.
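
By way of non-limiting illustration, the screening logic of claims 2 to 5 can be sketched in Python as follows. The sketch is not part of the claimed subject matter. The names `GameConfig`, `adversarial_score`, `config_for_task`, and `run_game`, all numeric defaults and scaling factors, and the choice to track only the best-scoring candidate of each round are assumptions made for illustration. The two evaluation functions (the first belonging to the messenger unit, the second to the verifying unit, called the skeptic in the comments) and the messenger's proposal strategy are passed in as plain callables.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
ScoreFn = Callable[[Vector], float]
# Feedback returned to the messenger: (candidate, adversarial score, passed?)
Feedback = List[Tuple[Vector, float, bool]]


@dataclass(frozen=True)
class GameConfig:
    # All defaults are illustrative assumptions, not values from the disclosure.
    adversarial_threshold: float = 0.2  # claim 2: preset adversarial threshold
    stability_threshold: float = 0.05   # claim 3: preset stability threshold
    stable_rounds: int = 2              # consecutive passing rounds required
    max_rounds: int = 5                 # claim 4: preset number of rounds


def adversarial_score(v: Vector, first_eval: ScoreFn, second_eval: ScoreFn) -> float:
    """Claim 2: absolute difference between the messenger's and skeptic's scores."""
    return abs(first_eval(v) - second_eval(v))


def config_for_task(task_type: str, base: GameConfig) -> GameConfig:
    """Claim 5: stricter for knowledge-intensive tasks, looser for creative ones."""
    if task_type == "knowledge_intensive":
        return replace(base,
                       adversarial_threshold=base.adversarial_threshold * 0.5,
                       max_rounds=base.max_rounds + 3)
    if task_type == "creative":
        return replace(base,
                       adversarial_threshold=base.adversarial_threshold * 1.5,
                       max_rounds=max(base.stable_rounds, base.max_rounds - 2))
    return base


def run_game(propose: Callable[[Feedback], List[Vector]],
             first_eval: ScoreFn, second_eval: ScoreFn,
             cfg: GameConfig) -> Optional[Vector]:
    """Claims 3 and 4: iterate until a stable passing candidate is found, else None."""
    feedback: Feedback = []
    prev_best: Optional[float] = None
    streak = 0
    for _ in range(cfg.max_rounds):
        candidates = propose(feedback)  # messenger adapts to last round's results
        if not candidates:
            feedback, prev_best, streak = [], None, 0
            continue
        scored = [(adversarial_score(v, first_eval, second_eval), v) for v in candidates]
        feedback = [(v, s, s <= cfg.adversarial_threshold) for s, v in scored]
        best_score, best_vec = min(scored, key=lambda pair: pair[0])
        passed = best_score <= cfg.adversarial_threshold
        stable = (prev_best is not None
                  and abs(best_score - prev_best) <= cfg.stability_threshold)
        streak = streak + 1 if (passed and stable) else (1 if passed else 0)
        prev_best = best_score
        if streak >= cfg.stable_rounds:
            return best_vec  # target semantic vector
    return None  # caller outputs the preset refusal information
```

A return value of `None` corresponds to the refusal branch of claim 4; a caller would select the configuration with `config_for_task` before invoking `run_game`.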

Description

Text generation method, device, equipment and storage medium

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular to a text generation method, apparatus, device, and storage medium.

Background

Currently, a large language model, such as one based on the Transformer architecture, can be used to generate relatively high-quality text. However, such models exhibit a significant "hallucination" problem when generating text, that is, the generated content is inconsistent with facts or with the context. To correct hallucination errors in generated text, Retrieval-Augmented Generation (RAG) is commonly adopted, which checks and corrects errors in the generated text by searching an external knowledge base. This approach can only correct errors after the text has been generated and cannot identify and suppress a hallucination in real time when it first arises, which leads to problems such as the inability to correct errors in real time, high computational cost, and delayed responses. In addition, because a large language model generates text unidirectionally, errors produced at an early stage are further amplified and propagated in subsequent generation steps, which makes correction more difficult.

Disclosure of Invention

The main purpose of the invention is to provide a text generation method, apparatus, device, and storage medium that solve the problems of existing text generation technology, which, because it relies on a post-hoc correction mechanism and on the unidirectional generation of the model, cannot correct errors in real time, incurs high computational cost, and propagates and amplifies errors.
In a first aspect, an embodiment of the present disclosure provides a text generation method, including: receiving a text generation request; activating, according to a first dynamic weight mask, model parameters related to semantic association in a pre-trained large language model to obtain a messenger unit, and activating, according to a second dynamic weight mask, model parameters related to causal logic verification in the large language model to obtain a skeptic unit, wherein the first dynamic weight mask and the second dynamic weight mask are determined based on the functions of the parameters of the large language model; executing an adversarial game between the messenger unit and the skeptic unit in a latent space of the large language model, wherein the adversarial game comprises submitting candidate semantic vectors generated by the messenger unit to the skeptic unit for verification, and screening out a target semantic vector according to the verification result; and decoding the target semantic vector to generate text.
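
As a non-limiting sketch of the activation step of the first aspect, the following Python fragment derives two masks from per-group function labels and applies them to one shared set of parameters. It is a toy illustration only: the group names, the `PARAM_FUNCTIONS` labels, the helpers `build_mask` and `apply_mask`, and the use of binary 0/1 mask values are all assumptions, and the disclosure does not state how the function of a parameter is identified or how the masks vary with the request. The second masked view corresponds to the verifying unit, named `skeptic_params` here.

```python
from typing import Dict, List

Params = Dict[str, List[float]]

# Hypothetical labels recording the function each parameter group mainly serves.
PARAM_FUNCTIONS: Dict[str, str] = {
    "attn.semantic_heads": "semantic_association",
    "mlp.causal_block": "causal_logic_verification",
}


def build_mask(functions: Dict[str, str], role: str) -> Dict[str, float]:
    """Return 1.0 for parameter groups whose function matches `role`, else 0.0."""
    return {name: 1.0 if fn == role else 0.0 for name, fn in functions.items()}


def apply_mask(params: Params, mask: Dict[str, float]) -> Params:
    """Scale every weight by its group's mask value; masked-out groups become zero."""
    return {name: [w * mask[name] for w in weights] for name, weights in params.items()}


# One shared set of pre-trained weights (toy values), viewed through two masks.
params: Params = {"attn.semantic_heads": [0.4, 1.2], "mlp.causal_block": [0.7, 0.3]}
messenger_params = apply_mask(params, build_mask(PARAM_FUNCTIONS, "semantic_association"))
skeptic_params = apply_mask(params, build_mask(PARAM_FUNCTIONS, "causal_logic_verification"))
```

Both units are views of the same underlying model, so no second model has to be loaded; only the masks differ.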
In a second aspect, an embodiment of the present disclosure provides a text generation apparatus, including: a receiving module, configured to receive a text generation request; an activation module, configured to, in response to the text generation request, activate model parameters related to semantic association in a pre-trained large language model according to a first dynamic weight mask to obtain a messenger unit, and activate model parameters related to causal logic verification in the large language model according to a second dynamic weight mask to obtain a skeptic unit, wherein the first dynamic weight mask and the second dynamic weight mask are determined based on the functions of the parameters of the large language model; an execution module, configured to execute an adversarial game between the messenger unit and the skeptic unit in a latent space of the large language model, wherein the adversarial game comprises submitting candidate semantic vectors generated by the messenger unit to the skeptic unit for verification, and screening out a target semantic vector according to the verification result; and a generation module, configured to decode the target semantic vector to generate text.
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising a processor and a memory configured to store computer-executable instructions that, when executed, cause the processor to implement the steps of the method of the first aspect described above.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium for storing computer-executable instructions which, when executed by a processor, implement the steps of the method of the first aspect described above.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of the first aspect described above. The at least one technical solution provided by the embodiments of the present invention can achieve the following technical effects: In the e