CN-122021891-A - Dynamic prototype coding and prompt self-construction method for zero-shot relation extraction
Abstract
The invention discloses a dynamic prototype coding and prompt self-construction method for zero-shot relation extraction, comprising the following steps: S1, defining the task; S2, generating suitable prompt words through a dynamic prompt generator to construct a dynamic prompt template; S3, dynamically generating and aggregating context-dependent prototype representations through a dynamic prototype aggregator; and S4, training and testing. The method aligns sentence and prototype semantics accurately while improving the model's ability to distinguish complex and similar relations.
Inventors
- LU LING
- SUN WENJUN
- ZHU HONGHAI
- HUANG DAN
- HU CHONG
- GONG PENG
- LEI JIANLI
Assignees
- Chongqing University of Technology (重庆理工大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260120
Claims (9)
- 1. A dynamic prototype coding and prompt self-construction method for zero-shot relation extraction, characterized by comprising the following steps: S1, defining a task; S2, generating suitable prompt words through a dynamic prompt generator and constructing a dynamic prompt template; S3, dynamically generating and aggregating context-dependent prototype representations through a dynamic prototype aggregator; S4, training and testing.
- 2. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein step S1 comprises: S1-1, setting a seen relation set, an unseen relation set and a seen dataset: let $R_s = \{r_1^s, r_2^s, \dots, r_m^s\}$ denote the set of seen (trained) relations, wherein $r_i^s$ represents the $i$-th seen relation; let $R_u = \{r_1^u, r_2^u, \dots, r_n^u\}$ denote the set of unseen (untrained) relations, wherein $r_j^u$ represents the $j$-th unseen relation; let $D_s$ be the seen dataset and $D_u$ the unseen dataset; S1-2, training a model M on the seen dataset; S1-3, testing the model M on the unseen dataset.
- 3. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein step S2 comprises: S2-1, inputting a sentence and marking it; S2-2, constructing a prompt template for a T5 model: a dynamic prompt generator generates a verb phrase from the entity markers and automatically constructs a prompt carrying [MASK] tokens; S2-3, converting the input sentence into the input of the T5 model; S2-4, predicting a relation phrase from the generated prompt and the input sentence through the T5 model; S2-5, refining the relation phrase predicted by the T5 model through a beam search algorithm, with the formula: $\hat{y} = \arg\max_{y_i \in \{y_1, \dots, y_k\}} \prod_{t=1}^{|y_i|} P(w_t \mid w_1, \dots, w_{t-1})$, wherein $s_t$ denotes the decoding state at the current time step, $\mathrm{BeamSearch}(s_t)$ denotes the beam-search operation performed on the current decoding state $s_t$, $\arg\max$ selects, from the candidate sequences $\{y_1, \dots, y_k\}$, the sequence that maximizes the expression, $k$ is the beam width, $y_i$ is the candidate phrase at the $i$-th position, and $P(w_t \mid w_1, \dots, w_{t-1})$ is the probability of the $t$-th word given all preceding words, whose product over $t$ scores the candidate phrase; S2-6, after obtaining the optimal candidate words, constructing a personalized prompt template for each input sentence; S2-7, averaging the hidden states of the three [MASK] tokens of the prompt template as the final sentence representation, with the formula: $h = \mathrm{Pool}\big(h_{[MASK]}^{head}, h_{[MASK]}^{tail}, h_{[MASK]}^{rel}\big)$, wherein $h$ denotes the finally aggregated feature, $\mathrm{Pool}$ denotes the pooling operation, $h_{[MASK]}^{head}$ denotes the [MASK]-tagged feature corresponding to the head entity, $h_{[MASK]}^{tail}$ the [MASK]-tagged feature corresponding to the tail entity, and $h_{[MASK]}^{rel}$ the [MASK]-tagged feature representing the relation.
- 4. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein the format converted in step S2-3 is the marked sentence concatenated with the constructed prompt, wherein $e_{head}$ and $e_{tail}$ respectively denote the head entity and the tail entity of the sentence $s$, and $\oplus$ denotes concatenation.
- 5. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein step S2-1 comprises: inputting a sentence $s = \{w_1, w_2, \dots, w_n\}$, wherein $s$ denotes the input sentence, $w_1$ denotes the 1st token of sentence $s$, and $w_n$ denotes the $n$-th token of sentence $s$; marking the positions of the head entity and the tail entity in sentence $s$ with four special tokens [E1], [/E1], [E2] and [/E2] to obtain $\hat{s}$, as follows: $\hat{s} = \{w_1, w_2, \dots, [E1], e_{head}, [/E1], \dots, [E2], e_{tail}, [/E2], \dots, w_n\}$, wherein $\hat{s}$ denotes the processed sentence, $w_1$, $w_2$ and $w_n$ respectively denote the 1st, 2nd and $n$-th tokens of the original sentence, [E1] denotes the start token of the head entity, $e_{head}$ denotes the original word content corresponding to the head entity, [/E1] denotes the end token of the head entity, [E2] denotes the start token of the tail entity, $e_{tail}$ denotes the original word content of the tail entity, and [/E2] denotes the end token of the tail entity.
- 6. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein step S3 comprises: S3-1, relation prototype processing; S3-2, extracting text features; and S3-3, aggregating the features processed in steps S3-1 and S3-2 to obtain the final prototype embedding.
- 7. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 6, wherein step S3-1 comprises: S3-1-1, inputting, for a relation $r$, its label name, its description, and three randomly selected label-name aliases, all five kinds of prototype information being encoded by the same BERT encoder; S3-1-2, stacking the embedding vectors corresponding to the five kinds of prototype information to form a prototype feature matrix, as follows: $M_r = \mathrm{MaxPool}(\mathrm{Enc}(\cdot)) \in \mathbb{R}^{5 \times d}$, wherein $M_r$ is the prototype feature matrix corresponding to relation $r$, $d$ is the hidden-layer dimension, $\mathrm{Enc}$ is the encoder, and $\mathrm{MaxPool}$ is the max-pooling operation; S3-1-3, capturing the key information through a multi-head self-attention mechanism: the prototype feature matrix $M_r$ is fed into the self-attention mechanism, and the Query, Key and Value are obtained by linear projection, as follows: $Q_r = M_r W_Q$, $K_r = M_r W_K$, $V_r = M_r W_V$, wherein $W_Q$, $W_K$ and $W_V$ are all learnable parameter matrices, $Q_r$ is the query matrix, $K_r$ is the key matrix and $V_r$ is the value matrix; dot-product attention is then computed in parallel across the multiple heads, so that the importance of the different side information is dynamically captured; S3-1-4, normalizing through a LayerNorm layer; the context-adaptive relation prototype obtained through attention-weighted aggregation and residual connection is represented as follows: $\tilde{p}_r = \mathrm{LayerNorm}\!\Big(M_r + \big\Vert_{i=1}^{h}\, \mathrm{softmax}\!\Big(\tfrac{Q_r^{(i)} {K_r^{(i)}}^{\top}}{\sqrt{d_k}}\Big) V_r^{(i)}\Big)$, wherein $\tilde{p}_r$ is the context-adaptive prototype representation, $h$ is the number of attention heads, each head independently computing attention to capture semantic relations in different subspaces, $\mathrm{softmax}$ is the normalizing exponential function used to compute the attention weights, $d_k$ is the per-head dimension used to scale the dot product and stabilize the gradient, $Q_r$ is the query matrix of relation $r$, $K_r$ is the key matrix of relation $r$, ${K_r}^{\top}$ is the transpose of the key matrix, $V_r$ is the value matrix of relation $r$, and $M_r$ is the prototype feature matrix corresponding to relation $r$.
- 8. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 6, wherein step S3-2 comprises: S3-2-1, inputting a sentence containing the target entities; S3-2-2, encoding the input sentence through a BERT encoder and extracting a context-aware semantic vector representation; S3-2-3, processing the information encoded in step S3-2-2 through a pooling layer and extracting the global text features.
- 9. The dynamic prototype coding and prompt self-construction method for zero-shot relation extraction of claim 1, wherein step S4 comprises: S4-1, obtaining the sentence representation and the prototype embeddings; S4-2, determining the positive-sample relation prototype $p^{+}$ corresponding to the current sample; S4-3, selecting a set of negative-sample relation prototypes $\{p_j^{-}\}$ unrelated to the positive sample; S4-4, setting a hyperparameter $\gamma$; S4-5, computing similarities; S4-5-1, computing the similarity $\mathrm{sim}(h, p^{+})$ between the sentence representation $h$ and the positive-sample prototype $p^{+}$; S4-5-2, computing the maximum similarity between the sentence and the negative-sample prototypes, $\max_j \mathrm{sim}(h, p_j^{-})$, wherein $\max$ takes the maximum value and $\mathrm{sim}(h, p_j^{-})$ denotes the similarity between the sentence and negative-sample prototype $p_j^{-}$; S4-6, computing the margin loss through the following function: $L = \frac{1}{N}\sum \max\big(0,\; \gamma - \mathrm{sim}(h, p^{+}) + \max_j \mathrm{sim}(h, p_j^{-})\big)$, wherein $L$ is the loss value, $h$ is the representation of the current input sentence, $p^{+}$ is the positive-sample prototype embedding corresponding to the current input sentence, $p_j^{-}$ is a negative-sample embedding, $K$ is the number of randomly sampled negative samples, $\gamma$ is the hyperparameter controlling the minimum distance between positive and negative samples, and $N$ denotes the number of samples; S4-7, when $L = 0$, the separation between positive and negative samples meets the requirement and the model need not be updated; when $L > 0$, the separation is insufficient and the model parameters are updated through back-propagation.
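The entity-marking step of claim 5 (S2-1) can be sketched as follows. This is a minimal illustration: the function name `mark_entities`, the span convention (start index, end index exclusive), and the assumption that the two entity spans do not overlap or touch are choices made for this sketch, not taken from the patent.

```python
def mark_entities(tokens, head_span, tail_span):
    """Insert the four special tokens [E1], [/E1], [E2], [/E2] around the
    head and tail entity spans of a tokenized sentence.

    head_span / tail_span: (start, end) token indices, end exclusive.
    Assumes the spans are non-overlapping and non-adjacent."""
    (hs, he), (ts, te) = head_span, tail_span
    # Insert from the rightmost position first so earlier indices stay valid.
    inserts = sorted(
        [(hs, "[E1]"), (he, "[/E1]"), (ts, "[E2]"), (te, "[/E2]")],
        key=lambda x: x[0], reverse=True)
    out = list(tokens)
    for pos, tok in inserts:
        out.insert(pos, tok)
    return out

tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", "."]
marked = mark_entities(tokens, head_span=(0, 2), tail_span=(5, 6))
# → [E1] Barack Obama [/E1] was born in [E2] Hawaii [/E2] .
```

The marked sequence is what the claim calls $\hat{s}$; the downstream prompt generator reads off the head and tail spans from these markers.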
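The beam-search refinement of claim 3 (S2-5) keeps the $k$ highest-scoring partial sequences at each step and finally returns the candidate maximizing the product of conditional probabilities. The toy sketch below assumes the per-step token distributions have already been computed; a real T5 decoder would condition each distribution on the decoded prefix and the decoding state $s_t$.

```python
import math

def beam_search(step_log_probs, k=3):
    """Return the candidate sequence maximizing the product of per-step
    conditional probabilities P(w_t | w_1..w_{t-1}), keeping the k best
    partial sequences (the beam) at every time step.

    step_log_probs: list over time steps; each entry maps a token to its
    log-probability at that step (a simplification for this sketch)."""
    beam = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for dist in step_log_probs:
        candidates = [
            (seq + [tok], score + logp)
            for seq, score in beam
            for tok, logp in dist.items()
        ]
        # Keep only the k highest-scoring partial sequences.
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    # arg max over the k surviving candidates.
    best_seq, best_score = max(beam, key=lambda c: c[1])
    return best_seq, math.exp(best_score)

# Toy per-step distributions for a two-word relation phrase:
dists = [
    {"located": math.log(0.6), "born": math.log(0.4)},
    {"in": math.log(0.9), "at": math.log(0.1)},
]
phrase, prob = beam_search(dists, k=2)
# phrase == ["located", "in"], prob ≈ 0.6 * 0.9 = 0.54
```

Summing log-probabilities instead of multiplying raw probabilities is the standard numerically stable equivalent of the product in the claim's formula.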
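The prototype aggregation of claim 7 (stack five encoded prototype fields, project to Query/Key/Value, apply multi-head scaled dot-product attention, then residual connection and LayerNorm) can be sketched in plain NumPy. Random vectors stand in for the BERT embeddings of the label, description and three aliases; the dimensions and parameter initializations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_info, n_heads = 8, 5, 2   # hidden size, 5 prototype fields, head count
d_k = d // n_heads             # per-head dimension

# Stand-ins for the BERT-encoded label, description, and three aliases.
M = rng.normal(size=(n_info, d))           # prototype feature matrix M_r
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(X):
    # (n_info, d) -> (n_heads, n_info, d_k)
    return X.reshape(n_info, n_heads, d_k).transpose(1, 0, 2)

Q, K, V = split_heads(M @ W_q), split_heads(M @ W_k), split_heads(M @ W_v)

# Scaled dot-product attention per head, then concatenate the heads.
attn = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d_k))   # (heads, 5, 5)
heads = attn @ V                                          # (heads, 5, d_k)
out = heads.transpose(1, 0, 2).reshape(n_info, d)

# Residual connection + LayerNorm over the hidden dimension.
res = M + out
proto = (res - res.mean(-1, keepdims=True)) / (res.std(-1, keepdims=True) + 1e-6)

# One pooled, context-adaptive prototype vector for the relation.
p_r = proto.max(axis=0)
```

Each row of `attn` is a probability distribution over the five prototype fields, which is how the mechanism weights the side information by importance before pooling.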
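The per-sample margin loss of claim 9 (steps S4-5 and S4-6) reduces to a hinge over the hardest negative. The sketch below assumes cosine similarity as the `sim` function, which the claims do not specify; the function names are chosen for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def margin_loss(sent, pos_proto, neg_protos, gamma=0.2):
    """Hinge-style margin loss for one sample: penalize the case where the
    hardest negative prototype is not at least `gamma` less similar to the
    sentence than the positive prototype is."""
    s_pos = cosine(sent, pos_proto)
    s_neg = max(cosine(sent, p) for p in neg_protos)  # hardest negative
    return max(0.0, gamma - s_pos + s_neg)
```

A loss of exactly 0 corresponds to the claim's S4-7 case where the positive/negative separation already meets the margin and no parameter update is needed; a positive loss triggers back-propagation.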
Description
Dynamic prototype coding and prompt self-construction method for zero-shot relation extraction
Technical Field
The invention relates to the technical field of data processing, in particular to a dynamic prototype coding and prompt self-construction method for zero-shot relation extraction.
Background
The aim of the ZSRE (zero-shot relation extraction) task is to identify new relations without training examples. Existing ZSRE studies largely fall into three categories: classification-based, generation-based and prototype-based methods. Classification-based methods model the ZSRE task as a multi-class classification problem, encode the text and the relation description with a pre-trained model, and predict by comparing the semantic similarity of the sentence and the relation description; they depend strongly on the relation description and generalize poorly. Generation-based methods directly generate the relation label with a generative model, without constructing an explicit classifier, but are easily disturbed by textual noise, and the accuracy and stability of the generated results are insufficient. As one of the mainstream approaches, prototype-based methods model the ZSRE task as a semantic matching problem: sentences are aligned with the corresponding relation descriptions in vector space, and unseen relations are predicted by minimizing the distance between the sentence embedding and the corresponding prototype embedding. Because they focus only on sentence-level representations, these methods handle similar and complex relations poorly. RE-Matching provides a fine-grained semantic matching method that decomposes the sentence-level similarity score into an entity matching score and a context matching score, so that the model attends more to the distinguishing features within sentences, but its handling of similar and complex relations is still limited.
AlignRE constructs a more robust prototype representation using side information beyond the relation description, and builds a unified prompt template for sentences to reduce the encoding gap between sentence and prototype; however, the fixed prompt template is difficult to adapt to diverse linguistic expressions, which restricts the generalization ability of the model. CE-DA improves the representation capability of the prototype to a certain extent by aggregating side information, but its aggregation strategy is simple, the semantic relations and importance differences between different pieces of information are hard to model fully, and the prototype representation adapts insufficiently to the ZSRE task. Recently, ZSRE methods based on large language models have developed significantly: they use the multi-turn dialogue capability of LLMs for feedback-driven sample enhancement and predict without labeled data, offering a new direction for the ZSRE task, but at a higher computational cost. Although existing prototype-based zero-shot relation extraction methods achieve good results by exploiting side information, constructing prompt templates and similar techniques, they generally rely on static prototype coding and fixed prompt templates, struggle with complex or semantically similar relation types, and easily suffer from inaccurate alignment between the prototype and the context semantics.
Disclosure of Invention
The invention aims to solve the above technical problems in the prior art, and provides a dynamic prototype coding and prompt self-construction method for zero-shot relation extraction that aligns sentence and prototype semantics accurately and improves the model's ability to distinguish complex and similar relations.
In order to achieve the above object, the present invention provides a dynamic prototype coding and prompt self-construction method for zero-shot relation extraction, comprising the following steps: S1, defining a task; S2, generating suitable prompt words through a dynamic prompt generator and constructing a dynamic prompt template; S3, dynamically generating and aggregating context-dependent prototype representations through a dynamic prototype aggregator; S4, training and testing. In this scheme, step S1 comprises: S1-1, setting a seen relation set, an unseen relation set and a seen dataset: let $R_s = \{r_1^s, \dots, r_m^s\}$ denote the set of seen (trained) relations, where $r_i^s$ represents the $i$-th seen relation; let $R_u = \{r_1^u, \dots, r_n^u\}$ denote the set of unseen (untrained) relations, where $r_j^u$ represents the $j$-th unseen relation; let $D_s$ be the seen dataset and $D_u$ the unseen dataset; S1-2, training a model M on the seen dataset; S1-3, testing the model M on the unseen dataset.
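The zero-shot protocol of step S1 (disjoint seen/unseen relation sets, train only on the seen split, test only on the unseen split) can be sketched as follows; the function names and the sample tuple layout are illustrative assumptions, not part of the patent.

```python
def split_relations(relations, unseen):
    """Partition a relation inventory into disjoint seen (R_s) and unseen
    (R_u) sets: the model is trained only on relations in R_s and
    evaluated only on relations in R_u."""
    r_u = set(unseen)
    r_s = [r for r in relations if r not in r_u]
    assert r_u.isdisjoint(r_s), "seen and unseen relations must not overlap"
    return r_s, sorted(r_u)

def split_dataset(samples, r_u):
    """samples: (sentence, head_entity, tail_entity, relation) tuples.
    Returns the seen dataset D_s and the unseen dataset D_u."""
    d_s = [s for s in samples if s[3] not in r_u]
    d_u = [s for s in samples if s[3] in r_u]
    return d_s, d_u
```

Because the two relation sets are disjoint, every relation in the test split is genuinely unseen at training time, which is what makes the evaluation zero-shot.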