CN-121981119-A - Entity relationship identification method incorporating explicit entity information
Abstract
The invention belongs to the interdisciplinary field of intelligent technology and natural language processing, and in particular relates to an entity relationship identification method incorporating explicit entity information. The method adopts a three-level architecture of feature extraction, multi-modal reasoning and credibility verification, and achieves accurate and efficient extraction of military entity relationships through domain knowledge injection and model lightweighting. The explicit entity information extraction layer serves as the first level of the entity relationship extraction scheme: it is responsible for accurately extracting explicit entity information from the raw data and constructing its representations, laying the foundation for the subsequent multi-modal reasoning and credibility verification.
Inventors
- JI SIYUAN
- CHEN XIAODONG
- MA XIAOLE
- GUO XIAOLIN
- WEN YUE
Assignees
- 航天科工智能运筹与信息安全研究院(武汉)有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20251230
Claims (10)
- 1. An entity relationship identification method incorporating explicit entity information, characterized in that the method adopts a three-level architecture of feature extraction, multi-modal reasoning and credibility verification, and achieves accurate and efficient extraction of military entity relationships through domain knowledge injection and model lightweighting, wherein an explicit entity information extraction layer serves as the first level of the entity relationship extraction scheme, is responsible for accurately extracting explicit entity information from the raw data and constructing its representations, and lays the foundation for the subsequent multi-modal reasoning and credibility verification.
- 2. The entity relationship identification method incorporating explicit entity information according to claim 1, wherein the entity relationship extraction procedure is as follows. Input: military-themed documents, reports and operation manual texts from which relationships are to be extracted. Output: relation-entity triples. Operations: Step 1, construct the verb trigger lexicon and the explicit entity information extraction layer; Step 2, construct the multi-modal fusion dynamic reasoning layer: a) entity position encoding, type constraint mechanism, data cleaning and structuring, and multi-granularity annotation system; b) BERT model training; c) entity perception enhancement module; d) construction of a three-level weight adjustment system based on text features; Step 3, construct the post-processing optimization layer; Step 4, train multiple times using Step 2 and calibrate the data using Step 3 to obtain the triple results.
- 3. The entity relationship identification method incorporating explicit entity information according to claim 2, wherein in Step 1 the explicit entity information extraction layer serves as the first level of the entity relationship extraction scheme, is responsible for accurately extracting explicit entity information from the raw data and constructing its representations so as to lay the foundation for the subsequent multi-modal reasoning and credibility verification, and builds the entity feature system through a three-level processing procedure.
- 4. The entity relationship identification method incorporating explicit entity information according to claim 3, wherein in Step 1 a BERT-CRF model is selected as the core algorithm for entity type recognition: the BERT model, pre-trained on massive text data, generates word vectors rich in semantic information and captures deep semantic associations in the text, and on this basis the CRF layer models the transition probabilities and constraint relations among labels to achieve globally optimal label sequence decoding. For complex and varied scenarios involving rich heterogeneous feature information, multi-source heterogeneous features such as entity numbers and GPS (global positioning system) coordinates are integrated; these features depict the entity from different dimensions, the number reflecting the identity of the entity and the GPS coordinates determining its spatio-temporal position and reflecting its state and urgency, and together they build a rich, multi-dimensional feature profile of the entity. Features with distinctive structure and semantics, such as spatial coordinates, are mapped to a high-dimensional continuous vector space using a Gaussian kernel mapping, whose mathematical expression is $K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$, where $x_i$ and $x_j$ are the input vectors, $\lVert x_i - x_j \rVert$ is the Euclidean distance between the two vectors, and $\sigma$ is the bandwidth parameter controlling the locality of the kernel function. By virtue of its nonlinear mapping capability, the Gaussian kernel converts coordinate points that are not linearly separable in the original space into linearly separable vectors in a high-dimensional space. This addresses the difficulty of processing spatial coordinates directly, so that entities at different spatial positions can be compared for similarity and analyzed for association in a unified vector space. For example, the spatial proximity between entities at different GPS coordinate points can be computed directly through this mapping, potential cooperative relationships or situation distribution characteristics can then be mined, and key spatial information is provided in support of subsequent decisions.
- 5. The entity relationship identification method incorporating explicit entity information according to claim 4, wherein in Step 2 a multi-modal fusion dynamic reasoning layer is constructed. This layer adopts a dual-engine collaborative reasoning mechanism to improve the domain adaptability and accuracy of entity relationship extraction, and is the core link of the entity relationship extraction scheme incorporating explicit entity information. A traditional single model, when handling complex entity relationship extraction tasks, often adapts poorly to the knowledge of a specific domain and produces one-sided reasoning results; by combining the strengths of two different types of models, the dual-engine collaborative reasoning mechanism infers entity relationships more comprehensively and accurately.
- 6. The entity relationship identification method incorporating explicit entity information according to claim 5, wherein Step 2 comprises the following improvements on the BERT-CRF model:
  1) Entity position encoding: the relative positions of entities are marked using a sine function, so that the model can capture the position of each entity in the text, better understand the relative relationships between entities, and obtain richer spatial semantic features for relation extraction;
  2) Type constraint mechanism: prior domain knowledge is used to reduce erroneous model outputs during inference and improve the accuracy of relation extraction;
  3) Data cleaning and structuring: entity recognition is performed on unstructured tactical reports, quadruples of "time, spatial coordinates, entity, action type" are extracted, and regular-expression matching and a rule engine are used to eliminate textual ambiguity;
  4) An entity relation label system is defined based on a knowledge graph, and a two-stage annotation strategy ensures annotation quality and annotation content;
  5) For rare scenarios, adversarial samples are generated by back translation and entity replacement, and a military term dictionary is introduced for synonym replacement.
  The method adopts a dual-channel heterogeneous model architecture: the left channel integrates a traditional CRF model for structural feature extraction, and the right channel deploys a Qwen2.5-7B model fine-tuned for the military domain to handle complex semantic understanding tasks. The outputs of the two channels are fused nonlinearly through a dynamic gating mechanism (formula not reproduced in the source text), whose terms are the fused semantic vector, the entity relation confidence vector output by the CRF model, the Softmax-normalized classification probability distribution of the large model, and a bias term. The learnable weight parameters α, β and γ are learned automatically on the validation set by a Bayesian optimization algorithm, with the initial search space set to a uniform distribution over the interval [0, 1];
  6) Entity perception enhancement module: a learnable entity type embedding matrix is introduced; by looking up entity type codes, the module explicitly injects domain knowledge into the fusion process, addressing the insufficient sensitivity of traditional attention mechanisms to military terms;
  7) A three-level weight adjustment system based on text features is constructed:
     Term density perception: an improved TF-IDF algorithm computes a text professionalism index, in which the IDF weights are built on a military term ontology (formula not reproduced in the source text); in the formula, i is the index of a word in the document, m is the number of words in the document, TF(·) is the frequency of the term, n is the total number of documents in the corpus, and the remaining quantity is the document frequency of the term. When the index satisfies the trigger condition (threshold not reproduced in the source text), a β1 weight adaptive boosting mechanism is triggered and the large-model channel weight β1 is adjusted from a baseline value using the gain factor k = 0.2 (formula not reproduced in the source text). This mechanism acts directly on the β parameter in the fusion formula: term density drives a dynamic increase in the large-model weight, so that when the model processes professional text, the contribution of the corresponding term is increased by more than 30 percent;
     Relation complexity assessment: dependency tree depth and entity co-occurrence density are used as syntactic complexity indicators, and a multi-stage fusion strategy is triggered for multi-hop relations. Complexity is quantified as follows: for dependency tree depth, the sentence dependency structure is parsed by the SPM algorithm and the length of the longest dependency path is computed to reflect syntactic complexity; entity co-occurrence density is defined in terms of the number of co-occurrences of an entity pair and the sentence length (formula not reproduced in the source text). When both indicators satisfy their threshold conditions (not reproduced in the source text), the scenario is judged to be a multi-hop relation scenario;
     Fusion strategy adjustment: two-stage fusion is triggered for multi-hop relations. The initial stage sets the fusion weight to 0.4 to extract basic relations; the recursive stage takes the initial output as context and resets the fusion weight to 0.7 to mine implicit relations. By dynamically adjusting these parameters, the fusion formula captures explicit and implicit semantics in a multi-hop relationship stage by stage, solving the "relationship fracture" problem caused by traditional single-stage fusion.
- 7. The entity relationship identification method incorporating explicit entity information according to claim 6, wherein in Step 3 a post-processing optimization layer is constructed. The post-processing optimization layer is designed for military entity relationship extraction: by building a three-level verification system of spatio-temporal consistency verification, confidence calibration and logic correction, it resolves knowledge conflicts between the model output and the military domain as well as probability estimation bias, thereby ensuring that the output of entity relationship extraction is reasonable and reliable.
- 8. The entity relationship identification method incorporating explicit entity information according to claim 7, wherein the method belongs to the interdisciplinary field of intelligent technology and natural language processing.
- 9. The entity relationship identification method incorporating explicit entity information according to claim 7, wherein, by dynamically adjusting the combination of the α and β parameters, the method enables the fusion formula to capture explicit and implicit semantics in multi-hop relationships stage by stage, thereby solving the "relationship fracture" problem caused by conventional single-stage fusion.
- 10. The entity relationship identification method incorporating explicit entity information according to claim 7, wherein the method addresses the rigid rules and data dependence of traditional methods and the domain adaptation deficiencies of deep learning models, optimizes entity relationship extraction, and enhances the model's ability to reason about implicit relationships.
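As an illustration of the Gaussian kernel mapping of spatial coordinates recited in claim 4, the following Python sketch computes the standard Gaussian kernel between coordinate vectors and uses it to compare the spatial proximity of entities. It is a minimal sketch under stated assumptions, not the implementation of the invention: the function names `gaussian_kernel` and `kernel_features`, the landmark-based feature vector, the bandwidth value and the sample coordinates are all hypothetical, and the coordinates are assumed to have already been projected into a planar system in which Euclidean distance is meaningful.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Standard Gaussian kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_features(point, landmarks, sigma=1.0):
    """Represent a coordinate by its kernel similarity to each landmark.

    This landmark construction is an assumption made for illustration;
    the claims do not state how the explicit feature vector is built.
    """
    return [gaussian_kernel(point, lm, sigma) for lm in landmarks]


# Hypothetical planar coordinates in kilometres (not taken from the patent).
entities = {
    "unit_a": (0.0, 0.0),
    "unit_b": (0.5, 0.5),
    "unit_c": (10.0, 10.0),
}

sim_ab = gaussian_kernel(entities["unit_a"], entities["unit_b"], sigma=1.0)
sim_ac = gaussian_kernel(entities["unit_a"], entities["unit_c"], sigma=1.0)
features_a = kernel_features(entities["unit_a"], [(0.0, 0.0), (10.0, 10.0)])

print(f"similarity a-b: {sim_ab:.4f}")
print(f"similarity a-c: {sim_ac:.3e}")
```

A smaller bandwidth `sigma` makes the similarity fall off faster with distance, which is the locality control that the claim attributes to the bandwidth parameter.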
Description
Entity relationship identification method incorporating explicit entity information
Technical Field
The invention belongs to the interdisciplinary field of intelligent technology and natural language processing, and in particular relates to an entity relationship identification method incorporating explicit entity information.
Background
Entity relationship recognition technology is applicable to scenarios such as data analysis, command decision support systems and simulation platforms. Incorporating explicit entity information significantly improves the accuracy and real-time performance of entity relationship recognition, addressing problems such as insufficient domain adaptability, complex context dependence and strict real-time requirements.
Conventional entity relationship extraction techniques mainly include rule-based methods and statistical machine learning methods. Rule-based methods extract relations by matching regular expressions against a library of grammatical patterns; for example, "X is attached to Y" can be mapped directly to an attachment relation. However, the rule maintenance cost of this approach grows exponentially as the diversity of expression increases, and it is difficult to cope with implicit expressions in reports. Statistical machine learning methods rely on manually designed feature engineering, such as dependency paths between entities or trigger word positions, but they place extremely high demands on high-quality labeled data; data sparsity is particularly pronounced in the military field, which limits the application range of these methods.
Disclosure of Invention
(I) Technical problem to be solved
The invention aims to provide an entity relationship identification method incorporating explicit entity information.
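The rule-based approach discussed in the Background, in which a fixed surface pattern is mapped directly to a relation label, can be illustrated with a short Python sketch. The pattern library, relation labels and sample sentences below are hypothetical and chosen only for illustration; the sketch also shows why such rules miss expressions that no pattern anticipates.

```python
import re

# Hypothetical pattern library: each surface pattern maps to one relation label.
PATTERNS = [
    (re.compile(r"(?P<head>\w+) is subordinate to (?P<tail>\w+)"), "subordinate_to"),
    (re.compile(r"(?P<head>\w+) is deployed at (?P<tail>\w+)"), "deployed_at"),
]


def extract_relations(text):
    """Return (head, relation, tail) triples for every pattern match in text."""
    triples = []
    for pattern, label in PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("head"), label, match.group("tail")))
    return triples


explicit_text = "Company_B is subordinate to Battalion_3. Battalion_3 is deployed at Hill_402."
implicit_text = "Company_B takes its orders from Battalion_3."

print(extract_relations(explicit_text))
print(extract_relations(implicit_text))  # no pattern covers this phrasing
```

Covering the second sentence would require another hand-written pattern, which is the maintenance burden the Background attributes to rule-based methods.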
(II) Technical solution
In order to solve the above technical problem, the invention provides an entity relationship identification method incorporating explicit entity information. The method adopts a three-level architecture of feature extraction, multi-modal reasoning and credibility verification, and achieves accurate and efficient extraction of military entity relationships through domain knowledge injection and model lightweighting. The explicit entity information extraction layer serves as the first level of the entity relationship extraction scheme; it is responsible for accurately extracting explicit entity information from the raw data and constructing its representations, and lays the foundation for the subsequent multi-modal reasoning and credibility verification.
The entity relationship extraction procedure of the method is as follows. Input: military-themed documents, reports and operation manual texts from which relationships are to be extracted. Output: relation-entity triples. Operations: Step 1, construct the verb trigger lexicon and the explicit entity information extraction layer; Step 2, construct the multi-modal fusion dynamic reasoning layer: a) entity position encoding, type constraint mechanism, data cleaning and structuring, and multi-granularity annotation system; b) BERT model training; c) entity perception enhancement module; d) construction of a three-level weight adjustment system based on text features; Step 3, construct the post-processing optimization layer; Step 4, train multiple times using Step 2 and calibrate the data using Step 3 to obtain the triple results.
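The type constraint mechanism named in Step 2 a) can be sketched as a filter that discards candidate triples whose entity types are incompatible with the predicted relation. This is only one plausible reading, offered as a sketch: the signature table, entity types, relation names and scores below are hypothetical, and the text does not specify how the prior knowledge is actually encoded.

```python
# Hypothetical prior knowledge: relation -> allowed (head type, tail type) pairs.
ALLOWED_SIGNATURES = {
    "subordinate_to": {("unit", "unit")},
    "deployed_at": {("unit", "location"), ("equipment", "location")},
    "equipped_with": {("unit", "equipment")},
}


def apply_type_constraints(candidates, entity_types):
    """Keep only candidates whose (head type, tail type) fits the relation."""
    kept = []
    for head, relation, tail, score in candidates:
        signature = (entity_types.get(head), entity_types.get(tail))
        if signature in ALLOWED_SIGNATURES.get(relation, set()):
            kept.append((head, relation, tail, score))
    return kept


# Hypothetical entity types and model outputs (candidate triples with scores).
entity_types = {"Battalion_3": "unit", "Hill_402": "location", "Radar_7": "equipment"}
candidates = [
    ("Battalion_3", "deployed_at", "Hill_402", 0.91),
    ("Hill_402", "subordinate_to", "Battalion_3", 0.55),  # a location cannot be subordinate
    ("Battalion_3", "equipped_with", "Radar_7", 0.78),
]

filtered = apply_type_constraints(candidates, entity_types)
print(filtered)
```

Filtering after prediction is the simplest placement; the same table could in principle be applied earlier, for example by masking disallowed relation labels before decoding.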
In Step 1, the explicit entity information extraction layer serves as the first level of the entity relationship extraction scheme. It is responsible for accurately extracting explicit entity information from the raw data and constructing its representations, laying the foundation for the subsequent multi-modal reasoning and credibility verification, and it builds the entity feature system through a three-level processing procedure.
In Step 1, based on in-depth analysis and systematic organization of the military domain, an ontology library covering 8 major classes and 57 subclasses is built. The ontology library provides a detailed and authoritative reference standard for determining entity types, giving the model a reliable basis during recognition. A BERT-CRF model is selected as the core algorithm for entity type recognition: the BERT model, pre-trained on massive text data, generates word vectors rich in semantic information and captures deep semantic associations in the text; on this basis, the CRF layer models the transition probabilities and constraint relations among labels to achieve globally optimal label sequence decoding. Combining the two achieves fine-grained and precise entity type classification of complex military grouping text by fully utilizing the deep