CN-122021660-A - Condition-aware semantic enhancement method based on the zero-shot capability of a large language model


Abstract

The application provides a condition-aware semantic enhancement method based on the zero-shot capability of a large language model, belonging to the technical field of semantic enhancement. The method comprises: obtaining a dialogue text to be enhanced and a context condition; processing the context condition with an end-side (on-device) large language model that has zero-shot semantic understanding capability to obtain a natural language question; processing the dialogue text to be enhanced and the natural language question with the end-side large language model to obtain a hidden state of the output layer of the end-side large language model, the hidden state representing the model's understanding result for the dialogue text and the question; determining a condition-aware semantic vector for the understanding result according to the hidden state; inputting the dialogue text to be enhanced into a general embedding model to obtain a basic semantic vector; and fusing the condition-aware semantic vector and the basic semantic vector to obtain an enhanced semantic representation vector. The method can enhance the representation of dialogue text without relying on manual labeling and without fine-tuning the underlying general embedding model.

Inventors

HE ZHAOFENG
HUANG ZIXI
Xiang Liuyu
WU HUIJIA
LI PEIPEI

Assignees

Beijing University of Posts and Telecommunications (北京邮电大学)

Dates

Publication Date: 20260512
Application Date: 20260131

Claims (10)

1. A condition-aware semantic enhancement method based on the zero-shot capability of a large language model, comprising: acquiring a dialogue text to be enhanced and a context condition of the dialogue text to be enhanced; processing the context condition based on an end-side large language model with zero-shot semantic understanding capability to obtain a natural language question corresponding to the context condition; processing the dialogue text to be enhanced and the natural language question based on the end-side large language model to obtain a hidden state of an output layer of the end-side large language model, wherein the hidden state represents the end-side large language model's understanding result for the dialogue text to be enhanced and the natural language question; determining a condition-aware semantic vector for the understanding result according to the hidden state; inputting the dialogue text to be enhanced into a general embedding model to obtain a basic semantic vector; and fusing the condition-aware semantic vector and the basic semantic vector to obtain an enhanced semantic representation vector corresponding to the dialogue text to be enhanced.
2. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 1, wherein processing the context condition based on the end-side large language model with zero-shot semantic understanding capability to obtain the natural language question corresponding to the context condition comprises: embedding the context condition into a first placeholder in a first preset prompt to obtain a first prompt corresponding to the context condition, wherein the first preset prompt instructs the end-side large language model to output a question corresponding to the text in the first placeholder; and inputting the first prompt into the end-side large language model with zero-shot semantic understanding capability to obtain the natural language question corresponding to the context condition.
3. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 1, wherein processing the dialogue text to be enhanced and the natural language question based on the end-side large language model to obtain the hidden state of the output layer of the end-side large language model comprises: embedding the dialogue text to be enhanced into a second placeholder in a second preset prompt, and embedding the natural language question into a third placeholder in the second preset prompt, to obtain a second prompt corresponding to the dialogue text to be enhanced and the natural language question, wherein the second preset prompt instructs the end-side large language model to take the text in the second placeholder as the contextual basis and constrains the end-side large language model to respond to the text in the third placeholder; and inputting the second prompt into the end-side large language model to obtain the hidden state of the output layer of the end-side large language model.
4. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 3, wherein determining the condition-aware semantic vector for the understanding result according to the hidden state comprises: determining the position of the last valid token in the second prompt; and determining the hidden state vector at the position of the last valid token in the hidden state as the condition-aware semantic vector.
5. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 1, wherein fusing the condition-aware semantic vector and the basic semantic vector to obtain the enhanced semantic representation vector corresponding to the dialogue text to be enhanced comprises: fusing the condition-aware semantic vector and the basic semantic vector in any one of the following modes to obtain the enhanced semantic representation vector corresponding to the dialogue text to be enhanced: residual fusion, concatenation-projection fusion, or gated fusion.
6. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 5, wherein performing concatenation-projection fusion on the condition-aware semantic vector and the basic semantic vector to obtain the enhanced semantic representation vector corresponding to the dialogue text to be enhanced comprises: concatenating the condition-aware semantic vector and the basic semantic vector to obtain a concatenated vector; and projecting the concatenated vector with a preset projection matrix to obtain the enhanced semantic representation vector corresponding to the dialogue text to be enhanced.
7. The condition-aware semantic enhancement method based on the zero-shot capability of a large language model according to claim 5, wherein performing gated fusion on the condition-aware semantic vector and the basic semantic vector to obtain the enhanced semantic representation vector corresponding to the dialogue text to be enhanced is implemented according to the following formula: [formula and symbols missing from the source text] wherein the symbols of the formula denote, respectively, the gating weight, the activation function, the basic semantic vector, the condition-aware semantic vector, a linear weight matrix, and the enhanced semantic representation vector.
8. A condition-aware semantic enhancement device based on the zero-shot capability of a large language model, comprising: a data acquisition module for acquiring a dialogue text to be enhanced and a context condition of the dialogue text to be enhanced; a question determination module for processing the context condition based on an end-side large language model with zero-shot semantic understanding capability to obtain a natural language question corresponding to the context condition; a hidden state acquisition module for processing the dialogue text to be enhanced and the natural language question based on the end-side large language model to obtain a hidden state of an output layer of the end-side large language model, wherein the hidden state represents the end-side large language model's understanding result for the dialogue text to be enhanced and the natural language question; a condition-aware semantic vector acquisition module for determining a condition-aware semantic vector for the understanding result according to the hidden state; a basic semantic vector acquisition module for inputting the dialogue text to be enhanced into a general embedding model to obtain a basic semantic vector; and a semantic enhancement module for fusing the condition-aware semantic vector and the basic semantic vector to obtain an enhanced semantic representation vector corresponding to the dialogue text to be enhanced.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and running on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
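
The three fusion modes named in claims 5 to 7 can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the claimed implementation: the source text names the modes but its gated-fusion formula did not survive extraction, so the residual form (base + alpha * cond) and the gated form (g = sigmoid(W [base; cond]), output = g * cond + (1 - g) * base) are common textbook forms chosen here as assumptions. The function names, the scaling factor alpha, and the requirement that both vectors share one dimension for residual and gated fusion are likewise hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # row-major: one inner list per output dimension


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (rows x cols) by vector x (length cols)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def residual_fusion(base: Vector, cond: Vector, alpha: float = 1.0) -> Vector:
    """Assumed form: base + alpha * cond; both vectors must have equal length."""
    return [b + alpha * c for b, c in zip(base, cond)]


def concat_projection_fusion(base: Vector, cond: Vector, W_proj: Matrix) -> Vector:
    """Concatenate [cond; base], then project with a preset matrix W_proj."""
    return matvec(W_proj, cond + base)


def gated_fusion(base: Vector, cond: Vector, W_gate: Matrix) -> Vector:
    """Assumed form: g = sigmoid(W_gate [base; cond]); out = g*cond + (1-g)*base."""
    logits = matvec(W_gate, base + cond)
    gate = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [g * c + (1.0 - g) * b for g, b, c in zip(gate, base, cond)]
```

In this sketch the concatenation-projection mode is the only one that tolerates different dimensions for the two input vectors, since the projection matrix fixes the output size; the other two modes assume the hidden state has already been brought to the dimension of the basic semantic vector.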

Description

Condition-aware semantic enhancement method based on the zero-shot capability of a large language model

Technical Field

The application belongs to the technical field of semantic enhancement, and particularly relates to a condition-aware semantic enhancement method based on the zero-shot capability of a large language model.

Background

In a multi-turn dialogue system deployed on the end side, the system typically needs to memorize and retrieve the user's historical dialogues efficiently in order to maintain context consistency and support personalized interaction. The current mainstream scheme generally adopts the "retrieval-augmented generation" paradigm: historical dialogue segments are encoded into semantic vectors by a general embedding model and stored in a vector database. When a user initiates a new query, the system retrieves the most relevant historical segments by computing vector similarity according to the current context conditions (e.g., user intent, specific topic restrictions, etc.). This approach has become the mainstream scheme for current intelligent systems because it is computationally efficient and easy to integrate.

Although general embedding models offer some cross-domain generality, their training objectives are usually oriented toward general semantic matching tasks, and they lack the ability to explicitly model dialogue-specific semantic elements. This creates a significant technical bottleneck, particularly for complex semantic matching tasks that depend on a condition. To address these problems, the prior art generally applies supervised fine-tuning on task-specific data.
However, this approach requires constructing a large amount of high-quality "condition-text-label" paired data, so the manual labeling cost is extremely high. Moreover, the fine-tuned model often overfits the training data; once the application scenario changes, its semantic discrimination capability in the new domain drops sharply.

Disclosure of Invention

The application aims to provide a condition-aware semantic enhancement method based on the zero-shot capability of a large language model, so as to enhance the representation of dialogue sentences without relying on manual labeling and without fine-tuning the underlying general embedding model.

In a first aspect of the embodiments of the present application, a condition-aware semantic enhancement method based on the zero-shot capability of a large language model is provided, including: acquiring a dialogue text to be enhanced and a context condition of the dialogue text to be enhanced; processing the context condition based on an end-side large language model with zero-shot semantic understanding capability to obtain a natural language question corresponding to the context condition; processing the dialogue text to be enhanced and the natural language question based on the end-side large language model to obtain a hidden state of the output layer of the end-side large language model, wherein the hidden state represents the end-side large language model's understanding result for the dialogue text to be enhanced and the natural language question; determining a condition-aware semantic vector for the understanding result according to the hidden state; inputting the dialogue text to be enhanced into a general embedding model to obtain a basic semantic vector; and fusing the condition-aware semantic vector and the basic semantic vector to obtain an enhanced semantic representation vector corresponding to the dialogue text to be enhanced.
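
The prompt construction and hidden-state selection steps of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not give the wording of the preset prompts, so both templates below are hypothetical, as are all function names. The model call itself is omitted; the sketch assumes the model returns one hidden-state vector per token position together with an attention mask in which 1 marks a valid token and 0 marks padding.

```python
from typing import List

# Hypothetical templates: the actual wording of the preset prompts is not given in the source.
FIRST_PROMPT = "Rewrite the following condition as a question:\n{condition}\nQuestion:"
SECOND_PROMPT = (
    "Dialogue:\n{dialogue}\n"
    "Using only the dialogue above as context, answer:\n{question}\nAnswer:"
)


def build_first_prompt(condition: str) -> str:
    """Fill the first placeholder with the context condition."""
    return FIRST_PROMPT.format(condition=condition)


def build_second_prompt(dialogue: str, question: str) -> str:
    """Fill the second placeholder with the dialogue and the third with the question."""
    return SECOND_PROMPT.format(dialogue=dialogue, question=question)


def last_valid_index(attention_mask: List[int]) -> int:
    """Return the index of the last position whose mask value is 1."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("attention mask contains no valid token")


def condition_aware_vector(
    hidden_states: List[List[float]], attention_mask: List[int]
) -> List[float]:
    """Pick the output-layer hidden state at the last valid token position."""
    return hidden_states[last_valid_index(attention_mask)]
```

Scanning the mask from the end, rather than assuming the final position is valid, keeps the selection correct when a batch is right-padded; with left padding the last position is already the last valid token and the same function returns it.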
In a second aspect of the embodiments of the present application, a condition-aware semantic enhancement device based on the zero-shot capability of a large language model is provided, including: a data acquisition module for acquiring a dialogue text to be enhanced and a context condition of the dialogue text to be enhanced; a question determination module for processing the context condition based on an end-side large language model with zero-shot semantic understanding capability to obtain a natural language question corresponding to the context condition; a hidden state acquisition module for processing the dialogue text to be enhanced and the natural language question based on the end-side large language model to obtain a hidden state of the output layer of the end-side large language model, wherein the hidden state represents the end-side large language model's understanding result for the dialogue text to be enhanced and the natural language question; a condition-aware semantic vector acquisition module for determining a condition-aware semantic vector for the understanding result according to the hidden state; a basic semantic vector acquisition module for inputting the dialogue text to be enhanced into the general embedding model to obta