
CN-122021559-A - Non-parallel text style transfer method and device based on prompt fine-tuning


Abstract

The invention provides a non-parallel text style transfer method and device based on prompt fine-tuning, relates to the technical field of artificial intelligence, and aims to solve the technical problem that existing non-parallel text style transfer models perform poorly when transferring between similar styles. The method comprises: obtaining a source-style text to be transferred and a target style instruction; encoding the target style instruction into a style control symbol; fusing the style control symbol with the source-style text to obtain a structured prompt sequence; and inputting the prompt sequence into a target non-parallel text style transfer model obtained in advance through prompt fine-tuning training, which generates the target text token by token in an autoregressive decoding manner under the dual constraints of learnable prefix vectors and the style control symbol. During generation, the learnable prefix vectors continuously steer the model's token probability distribution, suppressing the generation probability of tokens that do not conform to the target style.

Inventors

  • ZHOU XI
  • BI RAN
  • YANG FENGYI
  • YANG YATING
  • MA BO
  • ABIBULA ATAWURA
  • AIHETAMUJIANG AIHEMETI
  • SHAO YIMIN
  • HAN BANGJU

Assignees

  • Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences

Dates

Publication Date
2026-05-12
Application Date
2026-01-30

Claims (10)

  1. A non-parallel text style transfer method based on prompt fine-tuning, the method comprising: acquiring a source-style text to be transferred and a target style instruction; encoding the target style instruction into style control symbols corresponding one-to-one to target styles, and fusing the style control symbols with the source-style text to obtain a structured prompt sequence; inputting the prompt sequence into a target non-parallel text style transfer model obtained in advance through prompt fine-tuning training, wherein the target non-parallel text style transfer model comprises a plurality of Transformer layers, each Transformer layer is inserted with a trained learnable prefix vector, and the learnable prefix vector is configured to participate in attention computation as virtual tokens (see the illustrative sketch following the claims); and performing the following operations: generating candidate texts in an autoregressive decoding manner under the dual constraints of the learnable prefix vectors and the style control symbols based on the prompt sequence, wherein during generation of the candidate texts the token probability distribution of the model is continuously steered by the learnable prefix vectors, so that the generation probability of tokens that do not conform to the target style is suppressed; and outputting a candidate text as a target-style text in response to a preset condition being satisfied, wherein the target-style text conforms to the expression characteristics of the target style while preserving the core semantics of the source-style text.
  2. The method of claim 1, wherein training the target non-parallel text style transfer model in advance through prompt fine-tuning comprises: acquiring a non-parallel text style transfer corpus, and preprocessing the corpus to obtain training data; generating non-parallel text style transfer task evaluation metrics based on the requirements of the text style transfer task; constructing a prompt template, wherein the prompt template is used for guiding a large language model to generate de-stylized paraphrases; screening, from the paraphrases and using the evaluation metrics, those above a semantic-similarity threshold, to form a pseudo-parallel data set; inserting trainable prompt vectors into the large language model as a prefix, and fine-tuning the large language model by training the prompt vectors to obtain a large-language-model prompt fine-tuning framework; selecting, for each source-style text, similar positive samples of the same style and randomly sampling negative samples from other styles, and constructing a first loss function based on a language modeling method; optimizing the tunable parameters of the prompt fine-tuning framework according to the pseudo-parallel data set and the first loss function, wherein during optimization only the parameters of the prefix part are updated and the parameters of the large language model are kept frozen; and sampling a fixed number of paraphrase samples and generating target-style texts from them using the prompt fine-tuning framework, wherein during generation the framework automatically establishes a mapping between the source style and the target style through the paraphrases, finally yielding the target non-parallel text style transfer model.
  3. The method of claim 2, wherein training the target non-parallel text style transfer model in advance through prompt fine-tuning further comprises: obtaining quantitative evaluation results according to the non-parallel text style transfer task evaluation metrics, sampling positive and negative samples from the quantitative evaluation results using a Pareto optimization method, and constructing an initial preference-alignment data set; fusing a penalty term based on the odds ratio into the prompt fine-tuning framework, increasing the distinction between positive and negative examples by suppressing negative-example generation, and constructing a second loss function based on a contrastive learning method; and training the prompt fine-tuning framework on the preference-alignment data set, iteratively updating the preference-alignment data set and the prefix parameters so as to continuously improve the text style transfer effect, finally obtaining the target non-parallel text style transfer model.
  4. The method of claim 2, wherein preprocessing the non-parallel text style transfer corpus to obtain training data comprises: deleting invalid fields from the corpus; performing noise-elimination processing on the corpus after invalid-field deletion; and performing privacy-information removal and deduplication on the corpus after noise elimination.
  5. The method of claim 2, wherein generating non-parallel text style transfer task evaluation metrics based on the requirements of the text style transfer task comprises: evaluating target-style strength using a pre-trained style classifier; evaluating semantic similarity using the cosine similarity between text semantic embeddings; evaluating grammatical fluency using a pre-trained grammaticality classifier; and generating the non-parallel text style transfer task evaluation metrics from the target-style strength, the semantic similarity and the grammatical fluency (see the illustrative sketch following the claims).
  6. The method of claim 2, wherein constructing the prompt template comprises: establishing a mapping dictionary containing style labels and brief style descriptions; defining a system instruction, wherein the system instruction is used for constraining the role and output format of the large language model; defining a user instruction, wherein the user instruction is used for eliciting the internal parameterized knowledge of the large language model; and loading a well-performing large language model in an online or offline manner, packaging the system instruction and the user instruction into prompt messages, and sending the prompt messages to the large language model for batch reply generation (see the illustrative sketch at the end of the Description).
  7. The method of claim 2, wherein selecting similar positive samples of the same style for each source-style text, randomly sampling negative samples from other styles, and constructing a first loss function based on a language modeling method comprises: embedding the training data through a style classifier to form an embedding pool; for each source-style text, selecting from the same-style embedding pool a specified number of samples above a similarity threshold as positive samples; uniformly sampling from the embedding pools of other styles to obtain negative samples; computing a contrastive loss between the large language model's hidden-layer output and the positive and negative samples based on cosine similarity; and combining the contrastive loss with the standard causal language modeling cross-entropy loss to construct the first loss function (see the illustrative sketch following the claims).
  8. The method according to claim 3, wherein sampling positive and negative samples from the quantitative evaluation results using a Pareto optimization method comprises (see the illustrative sketch following the claims): defining a priority order based on the non-parallel text style transfer task evaluation metrics; extracting the source-style text from a single sample, and traversing and extracting the evaluation scores of all generated target-style texts; for any two texts, judging that one text dominates the other when all of its evaluation metrics are no worse than the other's and at least one metric strictly exceeds the other's; sorting all generated target-style texts, initializing a Pareto front list, traversing the sorted texts to check whether the current text is dominated by any text, and if not, removing from the Pareto front all texts dominated by the current text and adding the current text to the front, repeating this process and returning the sorted Pareto front texts; traversing all texts, checking whether each text not in the Pareto front is dominated by any text in the front, and if so, marking it as a dominated text, repeating this process and returning an ordered dominated-text list; measuring text difference using the Chebyshev distance, namely computing the absolute differences of the scores on all metrics for two texts and taking the largest absolute difference as the distance, where a smaller distance means the two texts are closer in metric space; searching the text set for the Pareto front, selecting the best text in the front as a positive example, selecting from the dominated samples the text closest to the positive example, namely the text with the smallest Chebyshev distance, as a negative example, verifying whether the positive example dominates the negative example, and returning the positive-negative text pair if the condition is met; and, when there is no Pareto front or no dominated text, assigning a weight to each metric, computing each text's weighted metric-score ranking, and selecting the highest-scoring text as the positive example and the highest-scoring text different from the positive example as the negative example.
  9. The method of claim 8, wherein fusing the odds-ratio-based penalty term into the large-language-model prompt fine-tuning framework, increasing the distinction between positive and negative examples by suppressing negative-example generation, and constructing a second loss function based on a contrastive learning method comprises: wrapping the logarithm of the odds ratio in a log-sigmoid function so as to maximize the odds ratio between the generated chosen response and the rejected response; integrating the odds-ratio-based penalty term into the conventional negative log-likelihood loss, so that, while maximizing the token probabilities of the chosen response, the generation gap between chosen and rejected responses is effectively widened; and constructing the second loss function from this generation gap in combination with contrastive learning (see the illustrative sketch following the claims).
  10. A non-parallel text style transfer device based on prompt fine-tuning, characterized in that the device comprises: an acquisition module for acquiring the source-style text to be transferred and the target style instruction; an encoding module for encoding the target style instruction into style control symbols corresponding one-to-one to target styles, and fusing the style control symbols with the source-style text to obtain a structured prompt sequence; an input module for inputting the prompt sequence into a target non-parallel text style transfer model obtained in advance through prompt fine-tuning training, wherein the target non-parallel text style transfer model comprises a plurality of Transformer layers, each Transformer layer is inserted with a trained learnable prefix vector, and the learnable prefix vector is configured to participate in attention computation as virtual tokens; a generation module for generating candidate texts in an autoregressive decoding manner under the dual constraints of the learnable prefix vectors and the style control symbols based on the prompt sequence, wherein during generation of the candidate texts the token probability distribution of the model is continuously steered by the learnable prefix vectors, so that the generation probability of tokens that do not conform to the target style is suppressed; and an output module for outputting a candidate text as a target-style text in response to a preset condition being satisfied, wherein the target-style text conforms to the expression characteristics of the target style while preserving the core semantics of the source-style text.
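The following is a minimal PyTorch sketch of the prefix mechanism recited in claims 1-2: trainable prefix vectors are prepended to an attention layer's keys and values so that they take part in attention computation as virtual tokens, and only the prefix parameters receive gradient updates. The class and hyperparameter names (PrefixAttention, n_prefix, d_model) are illustrative assumptions, not taken from the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrefixAttention(nn.Module):
    """Self-attention whose keys/values are prepended with trainable prefix
    vectors, so the prefix acts as virtual tokens during attention (claim 1)."""
    def __init__(self, d_model: int = 64, n_heads: int = 4, n_prefix: int = 8):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.prefix_k = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Prepend the prefix so every real token can attend to the virtual
        # tokens; a causal mask (omitted here) would still expose the prefix.
        k = torch.cat([self.prefix_k.expand(b, -1, -1), k], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), v], dim=1)
        heads = lambda z: z.view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(heads(q), heads(k), heads(v))
        return out.transpose(1, 2).reshape(b, t, d)

layer = PrefixAttention()
for name, p in layer.named_parameters():
    # Only the prefix is trainable; the backbone stays frozen (claim 2).
    p.requires_grad = name.startswith("prefix_")
print(layer(torch.randn(2, 5, 64)).shape)  # torch.Size([2, 5, 64])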
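Claim 5's three-part evaluation metric can be pictured with the short sketch below. It is a hedged illustration: the style-strength and fluency probabilities are assumed to come from unspecified pre-trained classifiers, the embeddings from an unspecified sentence encoder, and the equal weighting is our assumption, since the patent only says the three scores are combined into one evaluation index.

import torch
import torch.nn.functional as F

def transfer_score(src_emb, out_emb, p_target_style, p_grammatical,
                   weights=(1.0, 1.0, 1.0)):
    """Combine style strength, semantic similarity and fluency (claim 5).
    src_emb / out_emb: semantic embeddings of the source and output texts;
    p_target_style:    style classifier probability of the target style;
    p_grammatical:     grammaticality classifier probability of fluent text."""
    semantic = F.cosine_similarity(src_emb, out_emb, dim=-1).item()
    w_style, w_sem, w_flu = weights  # assumed weighting
    return w_style * p_target_style + w_sem * semantic + w_flu * p_grammatical

# Usage with dummy embeddings and classifier outputs:
src, out = torch.randn(384), torch.randn(384)
print(transfer_score(src, out, p_target_style=0.91, p_grammatical=0.88))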
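Claim 7's first loss function, a cosine-similarity contrastive term on top of the standard causal-LM cross entropy, might look like the sketch below. The multi-positive InfoNCE form and the temperature tau are our assumptions; the patent only specifies cosine similarity against positive and negative samples combined with the cross-entropy loss.

import torch
import torch.nn.functional as F

def first_loss(hidden, pos_embs, neg_embs, lm_logits, target_ids, tau=0.1):
    """hidden:    (d,) pooled hidden-layer output for the source text
    pos_embs:  (P, d) same-style samples above the similarity threshold
    neg_embs:  (N, d) samples drawn uniformly from other styles' pools
    lm_logits: (T, V) next-token logits; target_ids: (T,) gold tokens."""
    sims = F.cosine_similarity(hidden.unsqueeze(0),
                               torch.cat([pos_embs, neg_embs]), dim=-1) / tau
    # Contrastive term: pull positives together, push negatives apart.
    contrastive = torch.logsumexp(sims, 0) - torch.logsumexp(sims[:len(pos_embs)], 0)
    ce = F.cross_entropy(lm_logits, target_ids)  # causal-LM cross entropy
    return ce + contrastive

h, pos, neg = torch.randn(64), torch.randn(4, 64), torch.randn(16, 64)
logits, tgt = torch.randn(10, 100), torch.randint(0, 100, (10,))
print(first_loss(h, pos, neg, logits, tgt))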
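Claim 8's positive/negative sampling reduces to Pareto dominance over per-candidate metric scores plus a Chebyshev-distance search for the hardest negative; a minimal sketch follows. Score tuples are assumed to be (style strength, semantic similarity, fluency) with higher values better, and the sum-based choice of the "best" front member is our assumption, since the claim does not fix one.

def dominates(a, b):
    """a dominates b: no worse on every metric, strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    return [i for i, s in enumerate(scores)
            if not any(dominates(scores[j], s)
                       for j in range(len(scores)) if j != i)]

def chebyshev(a, b):
    """Largest absolute per-metric score difference (claim 8)."""
    return max(abs(x - y) for x, y in zip(a, b))

def sample_pair(scores):
    """Positive: best text on the front; negative: dominated text nearest to it."""
    front = pareto_front(scores)
    dominated = [i for i in range(len(scores)) if i not in front]
    if not front or not dominated:
        return None  # claim 8 then falls back to a weighted-score ranking
    pos = max(front, key=lambda i: sum(scores[i]))  # tie-break assumption
    neg = min(dominated, key=lambda i: chebyshev(scores[i], scores[pos]))
    return (pos, neg) if dominates(scores[pos], scores[neg]) else None

# Candidate scores: (style strength, semantic similarity, fluency)
cands = [(0.90, 0.80, 0.70), (0.60, 0.90, 0.70),
         (0.50, 0.50, 0.50), (0.85, 0.75, 0.65)]
print(sample_pair(cands))  # -> (0, 3)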
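Claim 9's odds-ratio penalty reads like an ORPO-style formulation: the log odds ratio between the chosen and rejected response is wrapped in a log-sigmoid and added to the chosen response's negative log-likelihood. Below is a minimal sketch under that reading; the weighting factor lam and the use of length-normalized sequence log-probabilities are assumptions.

import torch
import torch.nn.functional as F

def second_loss(logp_chosen, logp_rejected, nll_chosen, lam=0.1):
    """logp_*: length-normalized sequence log-probabilities (log p < 0);
    nll_chosen: standard negative log-likelihood of the chosen response."""
    def log_odds(logp):
        # log(p / (1 - p)) computed stably from log p
        return logp - torch.log1p(-torch.exp(logp))
    log_or = log_odds(logp_chosen) - log_odds(logp_rejected)
    penalty = -F.logsigmoid(log_or)  # maximizing the odds ratio (claim 9)
    return nll_chosen + lam * penalty

print(second_loss(torch.tensor(-0.5), torch.tensor(-2.0), torch.tensor(0.5)))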

Description

Non-parallel text style transfer method and device based on prompt fine-tuning

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a non-parallel text style transfer method and device based on prompt fine-tuning.

Background

With increasingly complex application scenarios, generative artificial intelligence (Generative AI) is required both to avoid producing misleading or inappropriate content and to meet specific user needs. These demands have greatly driven the development of controllable text generation (CTG) technology, which ensures that the output satisfies the control conditions while maintaining a high degree of applicability, fluency and diversity. As an important controllable-text-generation task, text style transfer (TST) aims to change specific stylistic attributes such as a text's sentiment, genre or tone while preserving its core semantic information, and therefore has broad application prospects in fields such as text polishing, poetry composition, human-machine dialogue and the improvement of community comment environments. In the course of realizing the inventive concept, the inventors found at least the following defects in the related art: (1) because professional linguistic knowledge and a large amount of labor are required, constructing high-quality parallel corpora is difficult, and the small number of existing parallel data sets is hard to adapt to more complex application scenarios; (2) the definition of text style depends heavily on context or group consensus, lacks a clear linguistic definition, and the adopted standards usually change with the task, while texts with similar semantics but different styles share many lexical, syntactic and other multi-level language features, which easily confuses the model and markedly degrades the transfer effect.

Disclosure of Invention

In view of the above, the invention provides a non-parallel text style transfer method and device based on prompt fine-tuning, aiming to solve the technical problem that existing non-parallel text style transfer models perform poorly when transferring between similar styles.
The invention provides a non-parallel text style transfer method based on prompt fine-tuning, which comprises: obtaining the source-style text to be transferred and the target style instruction; encoding the target style instruction into style control symbols corresponding one-to-one to target styles; fusing the style control symbols with the source-style text to obtain a structured prompt sequence; inputting the prompt sequence into a target non-parallel text style transfer model obtained in advance through prompt fine-tuning training, wherein the model comprises a plurality of Transformer layers, each Transformer layer is inserted with a trained learnable prefix vector, and the learnable prefix vector is configured to participate in attention computation as virtual tokens; and performing the following operations: generating candidate texts in an autoregressive decoding manner under the dual constraints of the learnable prefix vectors and the style control symbols based on the prompt sequence, wherein during generation the token probability distribution of the model is continuously steered by the learnable prefix vectors, so that the generation probability of tokens that do not conform to the target style is suppressed, and outputting a candidate text as the target-style text when a preset condition is satisfied. According to an embodiment of the invention, the target non-parallel text style transfer model is trained in advance through prompt fine-tuning, comprising: acquiring a non-parallel text style transfer corpus and preprocessing it to obtain training data; generating non-parallel text style transfer task evaluation metrics based on the requirements of the text style transfer task; constructing a prompt template for guiding a large language model to generate de-stylized paraphrases (sketched below); screening, from the paraphrases and using the evaluation metrics, those above a semantic-similarity threshold to form a pseudo-parallel data set; inserting trainable prompt vectors into the large language model as a prefix and fine-tuning the model to obtain a large-language-model prompt fine-tuning framework; and selecting similar positive samples of the same style for each source-style text, randomly sampling negative samples from other styles, and constructing a first loss function based on a language modeling method.
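As referenced in claim 6, a prompt template of the kind described above can be sketched as follows, assuming a chat-style LLM interface. The style dictionary entries, the system and user instruction wording, and the <formal>/<informal> control symbols are all illustrative assumptions; the patent only specifies a label-to-description mapping plus a role-constraining system instruction and a knowledge-eliciting user instruction.

STYLE_DICT = {  # mapping dictionary: style label -> brief style description
    "<formal>":   "formal written style",
    "<informal>": "casual conversational style",
}

SYSTEM = ("You are a text style transfer assistant. Rewrite the input in the "
          "requested style, keep its meaning, and output only the rewrite.")

def build_prompt(source_text: str, style_symbol: str) -> list:
    """Fuse the style control symbol with the source text (claims 1 and 6)."""
    user = f"{style_symbol} Rewrite in {STYLE_DICT[style_symbol]}: {source_text}"
    return [{"role": "system", "content": SYSTEM},
            {"role": "user", "content": user}]

print(build_prompt("hey, send me that file asap", "<formal>"))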