
CN-122019776-A - Document level relation extraction method based on multistage feature collaborative modeling

CN 122019776 A

Abstract

The present invention relates to the field of Natural Language Processing (NLP), and in particular to document-level relation extraction. When handling long documents, cross-sentence relations, and long-distance dependencies, traditional relation extraction methods suffer from insufficient local semantic modeling, insufficient global semantic modeling, and difficulty in reasoning about complex relations. To solve these technical problems, the invention provides a document-level relation extraction method based on multistage feature collaborative modeling. The method obtains contextual semantic representations of a document with a pre-trained language model, constructs entity representations by dynamically aggregating each entity's multiple mentions, captures fine-grained neighborhood interaction features between entity pairs with a local interaction convolution module, and extracts the key context relevant to entity relation inference through a global attention mechanism. Furthermore, the invention designs a multistage stacked feature fusion structure that realizes progressive collaborative modeling of local and global semantics, enhancing the model's ability to express cross-sentence relations, long-distance reasoning, and complex relations. In addition, the invention provides a hybrid adaptive loss function to improve robustness to hard-to-classify samples and low-frequency relations, and introduces a teacher-student knowledge distillation mechanism that guides the student model with pseudo labels and evidence distributions, improving overall relation inference performance. The method has strong relation modeling, reasoning, and generalization capability, and can be widely applied to tasks such as knowledge graph construction and information extraction.

Inventors

  • ZHANG PENG
  • HU WEI
  • ZHANG YANPING
  • GUI YIKAI
  • ZHANG JIAWEI

Assignees

  • Chongqing University of Posts and Telecommunications (重庆邮电大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-08

Claims (12)

  1. A document-level relation extraction method based on multistage feature collaborative modeling, characterized by comprising the following steps: (a) a document encoding step: inputting a target document and generating, with a pre-trained language model, word-level representations containing document context information, for capturing long-distance dependencies and inter-sentence semantic structure; (b) an entity representation construction step: aggregating the multiple mention positions of each entity in the document to obtain entity-level representations and form the initial feature vector of each entity pair; (c) a local interaction modeling step: extracting neighborhood interaction information from the initial entity-pair vectors through a local interaction convolution module, to represent the fine-grained interaction pattern of the entity pair within a local semantic range; (d) a global semantic modeling step: based on the correlation between the entity-pair features and the document's overall semantic representation, attending through an attention mechanism to the global semantic information most relevant to the current entity pair and generating a global semantic interaction representation; (e) a multi-level feature fusion step: stacking the local interaction convolution module and the global semantic module over multiple levels, so that semantic information at different scales is gradually fused between layers, yielding deep, multi-scale entity-pair semantic features; (f) an evidence selection step: estimating an evidence weight for each sentence in the document based on the deep fused features, to obtain the distribution of key sentences supporting the current entity pair's relation inference; (g) a relation prediction step: combining the fused features and the evidence distribution to classify the relation of each entity pair and output the set of relation classes to which it belongs; (h) a hybrid adaptive loss optimization step: giving the model higher sensitivity and robustness to hard-to-classify and low-frequency relations through a hybrid adaptive loss function with dynamic adjustment capability; (i) a knowledge distillation step: adopting a teacher-student framework in which pseudo relation labels and pseudo distributions generated by the teacher model guide the student model to improve its inference of long-distance and complex relations.
  2. The method of claim 1, wherein the local interaction convolution module captures neighborhood interaction features of entity pairs within a local scope to enhance the model's recognition of fine-grained semantic cues.
  3. The method of claim 1, wherein the global semantic modeling step aggregates semantic information over the whole document through an attention mechanism to highlight important contexts related to entity relations.
  4. The method of claim 1, wherein the multi-level feature fusion step employs an alternating stacking approach to gradually strengthen local and global features in a multi-level structure, thereby capturing cross-sentence, long-distance semantic relations.
  5. The method of claim 1, wherein the evidence selection step calculates an evidence importance score for each sentence based on its sentence-level semantic representation, forming a sentence-level evidence distribution that assists the relation inference result.
  6. The method of claim 1, wherein the hybrid adaptive loss function comprises: (a) a dynamic sample adjustment term, for adaptively adjusting the weights of hard-to-classify and long-tail samples; (b) an inter-class boundary adjustment term, for constructing learnable separation boundaries between similar relation classes and improving the discriminative power of relation prediction.
  7. The loss function of claim 6, wherein the inter-class boundary adjustment term dynamically adjusts the minimum inter-class separation interval based on the confusion structure between relation classes, reducing the relation prediction error rate.
  8. The method of claim 1, wherein the knowledge distillation step guides the student model to learn the attention distribution over key sentences in the document through the pseudo-evidence distribution provided by the teacher model, improving the accuracy of evidence selection.
  9. The method of claim 1, wherein the teacher model is trained on a small amount of genuinely labeled data, and its relation predictions and evidence distributions serve as pseudo-supervision signals for training the student model.
  10. A system for performing the document-level relation extraction method of any one of claims 1 to 9, comprising: (a) a document encoding module; (b) an entity aggregation module; (c) an entity pair construction module; (d) a local interaction convolution module; (e) a global semantic interaction module; (f) a multistage feature fusion module; (g) an evidence selection module; (h) a relation prediction module; (i) a loss optimization module; (j) a knowledge distillation module; the modules cooperating to perform the method of claim 1.
  11. The system of claim 10, wherein the evidence selection module generates sentence importance scores from sentence-level semantic representations combined with the entity-pair fusion features, providing an interpretable basis for relation classification.
  12. A computer-readable storage medium having stored thereon a program which, when executed by a processor, causes the processor to perform the method of any one of claims 1 to 9.
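The claims above do not give the hybrid adaptive loss a closed form. The following is a minimal Python sketch of one plausible instantiation, assuming a focal-style dynamic sample-weighting term (for hard and long-tail samples) plus a probability-margin penalty between classes; `gamma` and `margin` are illustrative hyper-parameters, not values taken from the patent.

```python
import math

def hybrid_adaptive_loss(logits, target, gamma=2.0, margin=0.1):
    """Sketch of a hybrid adaptive loss: a focal-style dynamic
    sample-weighting term plus an inter-class margin penalty.
    gamma and margin are illustrative, not from the patent."""
    # Softmax over the relation logits (max-shifted for stability).
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Focal-style term: hard samples (low target probability)
    # receive a larger effective weight.
    p_t = probs[target]
    focal = -((1.0 - p_t) ** gamma) * math.log(p_t)

    # Margin term: penalise competing classes whose probability
    # comes within `margin` of the target class probability.
    margin_penalty = sum(
        max(0.0, p + margin - p_t)
        for i, p in enumerate(probs) if i != target
    )
    return focal + margin_penalty
```

On this sketch, a confidently correct prediction incurs almost no loss, while a sample whose competing classes are nearly as probable as the target is up-weighted by both terms, which is the behaviour claims 6 and 7 describe.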

Description

Document level relation extraction method based on multistage feature collaborative modeling

Technical Field

The invention relates to the field of Natural Language Processing (NLP), and in particular to feature modeling for document-level relation extraction tasks. Specifically, the invention provides a document-level relation extraction method based on multistage feature collaborative modeling, suitable for scenarios such as entity relation identification in long texts, knowledge graph construction, information extraction, and question-answering systems.

Background

Relation extraction (Relation Extraction) is one of the core tasks in natural language processing, aimed at automatically identifying semantic relations between pairs of entities in text. Compared with sentence-level tasks, document-level relation extraction (Document-level Relation Extraction, DocRE) requires cross-sentence reasoning over multiple sentences or even the whole document, placing higher demands on the model's global semantic modeling capability. Existing methods fall broadly into three categories: sequence-encoding methods, graph-structure methods, and pre-trained-language-model methods.
The first two categories work to some extent on short texts but show clear deficiencies in complex scenarios such as long documents, cross-sentence relations, and imbalanced classes: (1) insufficient local and global semantic modeling: a single sentence-level model struggles to capture cross-sentence dependencies between entities, reducing reasoning precision; (2) an extremely imbalanced relation class distribution: a few high-frequency relations dominate training, and low-frequency or long-tail relations are easily ignored; (3) a static evidence selection mechanism: the model cannot dynamically focus on the key sentences in a document that support relation inference, giving poor interpretability; (4) scarce labeled data: high-quality manual labeling is expensive, weakly supervised or distantly supervised data often contain noise, and model generalization is insufficient. Existing document-level relation extraction technology therefore needs breakthroughs in fusing local interaction features with global semantic information, adaptively optimizing for hard samples, and transferring knowledge under weak supervision, so as to improve the precision and interpretability of relation extraction.

Disclosure of Invention

The invention aims to solve the problems of insufficient local semantic modeling, class imbalance, and inaccurate evidence selection in existing document-level relation extraction technology, and provides a document-level relation extraction method based on multistage feature collaborative modeling.
By introducing a hierarchical multistage feature modeling network (Hierarchical Collaborative Attention Network, HCAN) that combines local interaction convolution with a global semantic attention mechanism, dynamically selecting relation evidence sentences, and designing a hybrid adaptive loss function and a knowledge distillation framework, the cross-sentence reasoning capability and robustness of the model are improved. To achieve the above object, the method of the present invention comprises the following steps: 1) Document representation and entity embedding generation: the input document is split into sentences and words and fed into a pre-trained language model (such as BERT) to obtain a contextual representation of each word; contextual entity embeddings are then generated through a dynamic weighted aggregation mechanism. 2) Local interaction convolution modeling: a local interaction convolution module performs convolution over the neighborhood semantic matrix of an entity pair, extracting local interaction features that reflect fine-grained semantic relations between entities. 3) Global semantic interaction modeling: a global semantic interaction module dynamically aggregates, through a co-attention mechanism, the global semantic information in the document most relevant to the entity pair, realizing cross-sentence global dependency modeling. 4) Multistage feature collaborative fusion: local convolution and global attention are alternately combined in a multi-layer stack to achieve multi-layer semantic fusion and feature reinforcement, yielding multi-scale feature representations.
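Step 1 leaves the "dynamic weighted aggregation mechanism" unspecified. A minimal Python sketch of one common choice in document-level relation extraction, logsumexp pooling over an entity's mention vectors (an illustrative assumption, not the operator fixed by the patent), together with the entity-pair feature construction of step 2, is:

```python
import math

def logsumexp_pool(mention_vecs):
    """Aggregate the contextual vectors of an entity's mentions into one
    entity representation. Logsumexp pooling is used here as a plausible
    'dynamic weighted aggregation' (an illustrative choice): mentions
    with larger activations dominate smoothly, dimension by dimension."""
    dim = len(mention_vecs[0])
    pooled = []
    for d in range(dim):
        vals = [v[d] for v in mention_vecs]
        m = max(vals)  # max-shift for numerical stability
        pooled.append(m + math.log(sum(math.exp(x - m) for x in vals)))
    return pooled

def entity_pair_features(head_mentions, tail_mentions):
    """Initial entity-pair feature vector: concatenation of the two
    pooled entity representations (the input to step 2)."""
    return logsumexp_pool(head_mentions) + logsumexp_pool(tail_mentions)
```

For a single-mention entity the pooled vector equals the mention vector itself; with several mentions each dimension is a soft maximum over the mentions, so no single mention position is discarded.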
5) Evidence selection mechanism: a dynamic attention score is assigned to each sentence, and the set of sentences that supports the entity pair's relation inference is selected as evidence, improving the interpretability of reasoning. Hybrid ad