CN-122021625-A - Document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths
Abstract
A document-level relation extraction method that fuses subgraphs with explicitly constructed reasoning paths comprises the following steps: an encoder converts the input text sequence into a word-vector sequence; a document-graph construction layer builds a heterogeneous graph containing entity nodes, mention nodes, and sentence nodes; three types of reasoning paths are defined explicitly; a local subgraph is extracted around the target entity pair and reasoning is performed over it with an R-GCN network; the feature information of the global encoder and of the local subgraph is fed into a fused relation classification layer to predict the relation probability distribution; and a weighted adaptive loss function is optimized to dynamically adjust the loss contribution of different sample classes. The invention improves the ability to reason over complex relations, the capture of long-distance dependencies, and the recognition accuracy of long-tail relations, and thereby markedly improves the accuracy, efficiency, and generalization capacity of document-level relation extraction.
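The heterogeneous document graph summarized above can be illustrated with a small sketch. This is illustrative only, not the patent's implementation: node-type embeddings are omitted, and the two-dimensional "word vectors", spans, and function names are toy assumptions.

```python
# Sketch of the document-graph construction: mention nodes average the word
# vectors they span, entity nodes average their mention nodes, sentence nodes
# average the words of their sentence; three edge types connect them.
# (Type-embedding concatenation from the claims is omitted for brevity.)

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def build_document_graph(word_vecs, sentences, mentions, entities):
    """word_vecs: list of word vectors; sentences: list of (start, end) spans;
    mentions: list of (start, end, sent_idx); entities: list of mention-index lists."""
    mention_nodes = [mean(word_vecs[s:e]) for s, e, _ in mentions]
    entity_nodes = [mean([mention_nodes[m] for m in ms]) for ms in entities]
    sentence_nodes = [mean(word_vecs[s:e]) for s, e in sentences]
    edges = []
    # mention-sentence edges: a mention connects to its containing sentence
    for m_idx, (_, _, s_idx) in enumerate(mentions):
        edges.append(("mention-sentence", m_idx, s_idx))
    # entity-mention edges: an entity connects to each of its mentions
    for e_idx, ms in enumerate(entities):
        for m_idx in ms:
            edges.append(("entity-mention", e_idx, m_idx))
    # sentence-sentence edges: only adjacent sentences are linked
    for s_idx in range(len(sentences) - 1):
        edges.append(("sentence-sentence", s_idx, s_idx + 1))
    return {"mention": mention_nodes, "entity": entity_nodes,
            "sentence": sentence_nodes, "edges": edges}

# Toy document: 4 words in 2 sentences, one entity with two mentions.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
sentences = [(0, 2), (2, 4)]
mentions = [(0, 1, 0), (2, 3, 1)]   # (word span, sentence index)
entities = [[0, 1]]
graph = build_document_graph(word_vecs, sentences, mentions, entities)
```

Because sentence-sentence edges link only adjacent sentences, the graph keeps global sequence information without densely connecting the document.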
Inventors
- DONG SHIXIAN
- YU DUNHUI
Assignees
- Hubei University (湖北大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-09
Claims (8)
- 1. A document-level relation extraction method that fuses subgraphs with explicitly constructed reasoning paths, characterized by comprising the following steps: Step S1, encoding the document: converting the input document data set into a sequence of vectors through a neural-network encoder, converting that sequence into a word-vector sequence through a pre-trained language model, and concatenating the word embedding vector with the entity-type embedding vector to obtain a comprehensive word representation; Step S2, constructing, through a document-graph construction layer, a heterogeneous graph containing three node types, namely entity nodes, mention nodes, and sentence nodes, together with the edge types mention-sentence, entity-mention, and sentence-sentence, wherein a mention node is represented by the average of the word vectors it contains, an entity node by the average of all its mention nodes, and a sentence node by the average of the word vectors it contains; Step S3, introducing a heuristic reasoning-path strategy that explicitly constructs three types of reasoning paths based on the positional relationship and semantic association of the target entity pair, so that every target entity pair can be connected through at least one path, comprising: when the mentions of the entity pair lie in the same sentence, constructing an intra-sentence path of the form entity node - mention node - sentence node - mention node - entity node, which models the two cases of pattern recognition and commonsense reasoning; when the mentions of the entity pair do not share a sentence, constructing inter-sentence reasoning paths, comprising a logical reasoning path connected in series through a bridge entity and a coreference reasoning path based on the coreference relation between adjacent sentences; and constructing a comprehensive reasoning path, i.e., when the entity pair satisfies neither of the above path conditions, constructing paths based on all mention combinations of the entity pair; Step S4, based on the document graph and the reasoning paths and centered on the target entity pair, extracting a local subgraph comprising the target entity nodes, the related mention nodes, the related sentence nodes, and the reasoning paths; introducing a super node connected to the target entity nodes in the subgraph, whose embedding is initialized to the max pooling of the target entity node embeddings; Step S5, feeding the obtained global encoder features and local subgraph features into a fused relation classification layer to predict the relation probability distribution, comprising: concatenating the initial entity-pair representation, the super-node encoding features, the max-pooled sentence-node features of the subgraph, and the entity-distance embedding, inputting the result into a multi-layer-perceptron classification layer, and outputting the relation probability distribution of the entity pair, wherein the entity-distance embedding is the embedding of the relative distance between the first mentions of the target entity pair; and Step S6, optimizing the model parameters with a weighted adaptive loss function, comprising: Step S61, defining a special TH class as the threshold for relation facts; Step S62, for each target entity pair, predicting as positive the relation classes whose predicted distribution probability exceeds the threshold and as negative those that do not exceed it, the weighted adaptive loss function being defined over the set of relation-fact classes, with a weight that adjusts the contribution of positive samples, in terms of the predicted distribution probabilities of each relation class and of the TH class for the target entity pair; and Step S63, dynamically adjusting the loss contribution of samples of different classes.
- 2. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, characterized in that step S1 comprises the following steps: Step S11, encoding the document: inputting the document data set into the encoder to obtain a vector sequence of paragraph length, the encoder types including but not limited to BERT and BiLSTM, wherein the vector sequence is produced by a text-encoding function that maps the input document to a sequence of vectors of fixed dimension; Step S12, converting the obtained vector sequence into word-vector representations through pre-trained GloVe Chinese word vectors, obtaining a word-embedding vector matrix; Step S13, using the word-embedding vector matrix to represent the word-embedded entity vector of each document and concatenating it with the entity-type embedding that represents the entity-type information of the words in each document, the comprehensive word representation being the concatenation of the word embedding and the entity-type embedding.
- 3. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein in step S2 the initial representations of entity nodes, mention nodes, and sentence nodes are defined as follows: a mention node is defined by averaging the representations of the words it contains and concatenating this average with the type-information embedding vector of the mention node, yielding the feature representation of the mention node; an entity node is defined by averaging the feature representations of all its mention nodes and concatenating this average with the entity-node type vector, yielding the feature representation of the entity node; a sentence node is defined by averaging the word vectors of all words in the sentence and concatenating this average with the sentence-node type vector, yielding the feature representation of the sentence node.
- 4. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein in step S2 three types of edges are constructed to model the interactions between nodes: sentence-mention edges: when a mention node occurs in a sentence, an edge is constructed between the mention node and the sentence node, which also handles the case where the same mention appears in several sentences; entity-mention edges: when an entity node has different but related mention nodes in several sentences, edges are constructed between the entity node and those mention nodes, modeling the coreference relationship among the mentions; sentence-sentence edges: an edge is added only between two adjacent sentence nodes in the document, preserving the global sequence information.
- 5. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein in step S3 the construction of inter-sentence reasoning paths specifically comprises: constructing a logical reasoning path: for a target entity pair connected in series through a bridge entity, the logical reasoning path takes the form entity node - mention node - sentence node - bridge mention node - sentence node - mention node - entity node, the path passing from a mention node of the head entity in its sentence, through the mention nodes of the bridge entity in the sentences it shares with the head and tail entities, to a mention node of the tail entity in its sentence; and constructing a coreference reasoning path: performing coreference reasoning over two adjacent sentences that contain the target entities, the coreference reasoning path taking the form entity node - mention node - sentence node - coreferent mention node - sentence node - mention node - entity node.
- 6. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein in step S3 the construction of the comprehensive reasoning path specifically comprises: collecting all sentence pairs over which the mentions of the target entity pair are distributed; if the head entity has mentions in one set of sentences and the tail entity has mentions in another, one comprehensive reasoning path is formed for each combination of a head-entity mention and a tail-entity mention, each path connecting a mention node of the head entity in its sentence to a mention node of the tail entity in its sentence.
- 7. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein step S4 specifically comprises the following steps: Step S41, introducing a super node and connecting it to the target entity nodes in the subgraph; Step S42, initializing the super-node embedding to the max pooling of the target entity node embeddings; Step S43, applying to the subgraph an R-GCN with stacked layers, each layer applying message-passing iterations separately to the different edge types, the new feature of node v after layer l being expressed by the formula h_v^(l+1) = σ( Σ_{r∈R} Σ_{u∈N_v^r} (1/c_{v,r}) W_r^(l) h_u^(l) + W_0^(l) h_v^(l) ); wherein h_v^(l+1) is the new feature information of node v at layer l+1 after aggregating the layer-l node information, σ is the activation function, r is one particular edge type, R is the set of all edge types in the graph, u is a node in the neighborhood of node v, N_v^r is the set of neighbors of node v under edge type r, c_{v,r} is the normalization constant, and W_r^(l) and W_0^(l) are trainable parameters.
- 8. The document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths according to claim 1, wherein step S5 specifically comprises the following steps: Step S51, computing the linear representation obtained by concatenating the initial target entity node embeddings, providing global entity-aware information for the target entity pair; Step S52, computing the super-node embedding learned in the subgraph, providing local entity-aware information for the target entity pair; Step S53, applying max pooling over all learned sentence-node embeddings, providing local context information for the target entity pair, wherein the pooled representation aggregates, over all sentence nodes of the subgraph extracted from the document graph for the target entity pair, their feature vectors after the final graph-neural-network layer; Step S54, computing the entity-distance embedding by passing the relative distance between the first mentions of the target entity pair in the document through a relative-distance embedding layer; Step S55, concatenating the computed feature vectors, feeding them into an MLP, and computing the relation classification probability, wherein the relation classification probability is the output of the multi-layer perceptron applied to the concatenation of the linear representation of the initial target entity node embeddings, the final super-node embedding after the R-GCN layers, the pooled sentence-node representation, and the entity-distance embedding.
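The per-edge-type message passing of claim 7 follows the standard R-GCN update. A minimal dependency-free sketch (toy dimensions and identity weight matrices, not the patent's trained parameters) could look like this:

```python
# One R-GCN layer: for each edge type r, neighbors of node v are aggregated
# with a per-type weight W_r and a 1/|N_v^r| normalization; a self-loop
# weight W_0 transforms the node itself; an activation is applied at the end.

def matvec(W, x):
    return [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(W))]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def rgcn_layer(h, edges_by_type, W_rel, W_self):
    """h: list of node feature vectors; edges_by_type: {type: [(src, dst), ...]}
    with undirected edges listed in both directions; W_rel: {type: matrix}."""
    out = []
    for v in range(len(h)):
        acc = matvec(W_self, h[v])                    # self-loop term W_0 h_v
        for etype, edges in edges_by_type.items():
            nbrs = [u for u, w in edges if w == v]    # neighbors of v via etype
            if not nbrs:
                continue
            c = float(len(nbrs))                      # normalization constant
            for u in nbrs:
                acc = vadd(acc, [x / c for x in matvec(W_rel[etype], h[u])])
        out.append(relu(acc))
    return out

# Toy subgraph: 2 nodes, one edge type, identity weights.
I = [[1.0, 0.0], [0.0, 1.0]]
h = [[1.0, 2.0], [3.0, 4.0]]
edges = {"entity-mention": [(0, 1), (1, 0)]}
h1 = rgcn_layer(h, edges, {"entity-mention": I}, I)
```

Stacking several such layers, as claim 7 describes, lets information propagate along the multi-hop reasoning paths of the subgraph.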
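The weighted adaptive loss of step S6 is described but its formula is not reproduced in this text. One plausible instantiation, in the spirit of adaptive-threshold losses that use a TH class to separate positive from negative relations, is sketched below; the class names, logits, and the specific weighting scheme are assumptions, not the patent's exact definition.

```python
import math

def weighted_adaptive_loss(logits, positive, w_pos=1.0):
    """logits: dict class->score, including the special "TH" threshold class.
    positive: set of relation classes that hold for the entity pair.
    Positive part: for each positive class r, -log softmax of r over {r, TH}
    (pushes r above the threshold). Negative part: -log softmax of TH over
    {TH} plus the negative classes (pushes them below). w_pos scales the
    positive part to counter long-tail label imbalance."""
    def log_softmax(target, pool):
        m = max(logits[c] for c in pool)
        z = sum(math.exp(logits[c] - m) for c in pool)
        return (logits[target] - m) - math.log(z)

    classes = [c for c in logits if c != "TH"]
    loss_pos = -sum(log_softmax(r, [r, "TH"]) for r in positive)
    negatives = [c for c in classes if c not in positive]
    loss_neg = -log_softmax("TH", ["TH"] + negatives)
    return w_pos * loss_pos + loss_neg

# At inference, classes scoring above the TH logit would be predicted positive.
logits = {"TH": 0.0, "born_in": 2.0, "works_for": -1.0}
loss = weighted_adaptive_loss(logits, {"born_in"}, w_pos=2.0)
```

Raising the logit of a true relation (or lowering a false one relative to TH) reduces the loss, which matches the threshold behavior described in steps S61 and S62.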
Description
Document-level relation extraction method fusing subgraphs and explicitly constructed reasoning paths

Technical Field

The invention relates to the technical field of natural language processing, in particular to a document-level relation extraction method that fuses subgraphs with explicitly constructed reasoning paths.

Background

In recent years, with the rapid development of internet technology, data has grown explosively, and the information world is full of structured, semi-structured, and unstructured data. Unstructured data (e.g., medical diagnostic reports, judicial documents, encyclopedia entries) are difficult for a computer to use directly, and how to efficiently extract valuable structured knowledge from them has become a pressing technical problem. Information extraction emerged as an intelligent data-processing tool; relation extraction, as a core research direction within it, plays a key role in downstream applications such as knowledge-graph construction and intelligent question answering, and is thus an important support for converting data into value. Early relation extraction research focused on sentence-level scenarios and could only extract relations among entities within a single sentence; in practical applications, however, a large number of entity relations are expressed through semantic associations spanning multiple sentences. Document-level relation extraction (Doc-RE) has therefore become a current research hotspot: its goal is to identify complex relations among entities from an entire document, which is particularly suited to professional scenarios with deep semantic associations across sentences.
However, document-level relation extraction faces three core challenges. First, the ability to reason over complex relations between entity pairs is insufficient: existing methods construct reasoning paths from predefined syntactic or co-occurrence rules, struggle to model the implicit dynamic logic chains in a document, and extract latent relations poorly. Second, the ability to capture long-distance dependencies in text is weak: the entities of a document are scattered across many sentences, and when the head and tail entities are far apart the semantic association tends to decay, so the recognition accuracy of cross-sentence relations is low. Third, long-tail relations are recognized poorly: entities and relations are unevenly distributed in the data sets, low-frequency long-tail relations have few samples, and existing models have difficulty mitigating this label imbalance, so such relations are easily missed.
At present, three mainstream lines of research have formed in the prior art, each with notable shortcomings. First, sequence-based models treat the document as a linear sequence and capture features implicitly with models such as BiLSTM or Transformer; they can exploit multi-granularity information, but their ability to capture long-distance dependencies and multi-hop relations is limited, and their reasoning process lacks interpretability. Second, models based on graph neural networks explicitly model the dependencies among entities by constructing a document graph; while they improve the capture of cross-sentence associations, their structures are complex, their learning efficiency is low, their scalability is poor, and they are ill-suited to large-scale document processing. Third, methods based on pre-trained models obtain contextual semantic representations from large-scale pre-trained models such as BERT and combine graph structures or attention mechanisms for cross-sentence modeling; however, they are limited by the sequence-length capacity of the model, remain limited in multi-hop relational reasoning, and cannot effectively solve core problems such as coreference resolution and long-tail relation recognition.
Disclosure of Invention

The invention aims to provide a document-level relation extraction method that fuses subgraphs with explicitly constructed reasoning paths. It explicitly constructs three types of reasoning paths (intra-sentence, inter-sentence, and comprehensive), extracts a local subgraph focused on the target entity pair and introduces a super node, and designs a weighted adaptive loss function to dynamically adjust sample weights, thereby comprehensively improving document-level relation extraction performance and remedying the defects of heavy noise interference in full-graph modeling, poorly targeted reasoning paths, and insufficie