CN-121980035-A - Geological text-oriented space-time ore-forming event and relation extraction method
Abstract
The invention discloses a spatiotemporal ore-forming event and relation extraction method for geological text, belonging to the technical field of artificial intelligence and natural language processing. The method comprises: obtaining geological text data and corresponding geological map data; performing domain knowledge enhancement processing and graph neural network processing, respectively, to generate an enhanced text semantic vector and spatial topological features; screening the spatial topological features based on the enhanced text semantic vector to generate multimodal fusion features; identifying ore-forming event trigger words based on the multimodal fusion features; constructing a dynamic spatiotemporal dependency domain based on the ore-forming event trigger words and the multimodal fusion features; extracting, within the feature range defined by the dynamic spatiotemporal dependency domain, the temporal arguments and spatial arguments that have a dependency relationship with the ore-forming event trigger words; and outputting pairs of ore-forming event trigger words and spatiotemporal arguments. By asymmetrically interlocking and fusing the geological text with the geological map data, and by using a focus diffusion mechanism to construct a dynamic spatiotemporal dependency domain within the multimodal fusion features, highly robust joint extraction of ore-forming events and their spatiotemporal arguments can be achieved.
Inventors
- DING ZHENGJIANG
- BAO ZHONGYI
- WANG BIN
- LV JUNYANG
- ZHANG QIBIN
- LI JINGBO
Assignees
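- No. 6 Geological Brigade of Shandong Provincial Bureau of Geology and Mineral Resources (Shandong Provincial No. 6 Institute of Geology and Mineral Exploration); original name: 山东省地质矿产勘查开发局第六地质大队(山东省第六地质矿产勘查院)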
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-26
Claims (10)
- 1. A spatiotemporal ore-forming event and relation extraction method for geological text, characterized by comprising the following steps: obtaining geological text data to be processed and geological map data corresponding to the spatial extent of the geological text data; performing domain knowledge enhancement processing on the geological text data to generate an enhanced text semantic vector; performing graph neural network processing on the geological map data to extract spatial topological features; inputting the enhanced text semantic vector and the spatial topological features into a preset asymmetric interlocking fusion module, and using text semantics to guide the screening and aggregation of spatial features to generate multimodal fusion features; inputting the multimodal fusion features into a preset spatiotemporal-aware focus diffusion joint extraction network, and identifying ore-forming event trigger words from the features corresponding to the geological text data by using a focus positioning mechanism; taking the ore-forming event trigger word as a central anchor point, performing attention energy diffusion calculation based on the multimodal fusion features, and constructing a dynamic spatiotemporal dependency domain according to the interaction strength among the features; and extracting, within the feature range defined by the dynamic spatiotemporal dependency domain, temporal arguments and spatial arguments that have a dependency relationship with the ore-forming event trigger word, and outputting pairs of ore-forming event trigger words and spatiotemporal arguments.
- 2. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, wherein generating the enhanced text semantic vector comprises: loading a preset geological domain knowledge base, the geological domain knowledge base comprising a geological entity dictionary and a synonym set; performing entity recognition and linking on the geological text data by using the geological domain knowledge base, mapping geological terms in the geological text data to knowledge base entity identifiers, and performing semantic disambiguation; and extracting contextual semantic information based on the disambiguated text sequence to generate the enhanced text semantic vector.
- 3. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, wherein extracting the spatial topological features comprises: performing vectorization parsing on the geological map data to construct a geological spatial graph structure, wherein nodes of the geological spatial graph structure represent geological entities and edges represent adjacency, containment, or fault-contact spatial relations; performing multi-layer message passing on the geological spatial graph structure by using a graph convolutional network, and aggregating attribute information of neighboring nodes to generate node-level spatial features; and performing a global pooling operation on the node-level spatial features to generate the spatial topological features.
- 4. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 3, wherein generating the multimodal fusion features comprises: generating a text query vector based on the enhanced text semantic vector; constructing a spatial key-value memory bank from the spatial topological features and the node-level spatial features; performing multi-hop attention retrieval over the spatial key-value memory bank by using the text query vector, and calculating relevance scores between the text and the spatial entities; and performing weighted fusion of the node-level spatial features according to the relevance scores, and injecting the weighted fusion result back into the text representation to generate the multimodal fusion features.
- 5. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, wherein identifying ore-forming event trigger words from the features corresponding to the geological text data by using a focus positioning mechanism comprises: inputting the multimodal fusion features into a preset sequence labeling model, wherein the sequence labeling model comprises a conditional random field layer or a binary perceptron layer, and using the sequence labeling model to calculate, for each word in the geological text data, a probability score that the word is the core action of an ore-forming event; and screening out the words whose probability scores exceed a preset confidence threshold, and marking the screened words as ore-forming event trigger words.
- 6. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, wherein constructing the dynamic spatiotemporal dependency domain comprises: taking the feature vector of the ore-forming event trigger word as an energy diffusion source point, and calculating dot-product attention weights between the energy diffusion source point and, respectively, the feature vectors of the remaining text words and the spatial node feature vectors contained in the multimodal fusion features; defining the dot-product attention weights as the interaction strength between features, and generating an energy distribution map centered on the ore-forming event trigger word; and intercepting the text segments and the set of spatial nodes whose interaction strength in the energy distribution map is higher than a preset coverage threshold, and defining the intercepted text segments and set of spatial nodes as the dynamic spatiotemporal dependency domain.
- 7. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, wherein outputting the pairs of ore-forming event trigger words and spatiotemporal arguments comprises: inputting the feature vectors within the range defined by the dynamic spatiotemporal dependency domain into a preset pointer generation network; using the pointer generation network to predict, respectively, the start and end character index positions of the temporal argument in the geological text data, and the start and end character index positions of the spatial argument in the geological text data or in a tag sequence corresponding to the geological map data; and extracting the corresponding entity text according to each predicted index position, and structurally combining the extracted entity text, as spatiotemporal arguments, with the ore-forming event trigger word to output a pair consisting of the ore-forming event trigger word and its spatiotemporal arguments.
- 8. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 1, further comprising: based on the pairs of ore-forming event trigger words and spatiotemporal arguments, and in combination with geological logic element rules obtained from a preset knowledge base storing geological rules, calculating model uncertainty and rule violation degree through an active double-loop learning strategy, and selecting samples for updating so as to iteratively optimize the model parameters and the geological logic element rules.
- 9. The spatiotemporal ore-forming event and relation extraction method for geological text according to claim 8, wherein the active double-loop learning strategy comprises: predicting unlabeled geological text samples by using the parameter-configured model, calculating the information entropy or variance of the predicted probability distribution, and taking the information entropy or variance as the model uncertainty; mapping the predicted pairs of ore-forming event trigger words and spatiotemporal arguments into structured logic expressions, comparing the structured logic expressions with the geological logic element rules for consistency, and quantifying the comparison result as the rule violation degree; marking samples whose model uncertainty is higher than a preset uncertainty threshold and whose rule violation degree is higher than a preset violation threshold as high-value samples, and submitting the high-value samples, as the samples for updating, to expert annotation; and obtaining the expert annotation results for the samples for updating, updating the model parameters by back-propagation using the expert annotation results, and dynamically adjusting the confidence weights of the geological logic element rules in subsequent training according to conflicts between the expert annotation results and the geological logic element rules.
- 10. A spatiotemporal ore-forming event and relation extraction system for geological text, applied to the spatiotemporal ore-forming event and relation extraction method for geological text according to any one of claims 1-9, the system comprising: a multi-source geological data acquisition module, configured to obtain geological text data to be processed and geological map data corresponding to the spatial extent of the geological text data; a domain knowledge enhanced text processing module, configured to perform domain knowledge enhancement processing on the geological text data to generate an enhanced text semantic vector; a spatial topological feature extraction module, configured to perform graph neural network processing on the geological map data and extract spatial topological features; an asymmetric interlocking fusion module, configured to receive the enhanced text semantic vector and the spatial topological features and to use text semantics to guide the screening and aggregation of spatial features, generating multimodal fusion features; a spatiotemporal-aware focus diffusion extraction module, configured to input the multimodal fusion features into a preset spatiotemporal-aware focus diffusion joint extraction network and to identify ore-forming event trigger words from the features corresponding to the geological text data by using a focus positioning mechanism; a dynamic spatiotemporal dependency domain construction module, configured to take the ore-forming event trigger word as a central anchor point, perform attention energy diffusion calculation based on the multimodal fusion features, and construct a dynamic spatiotemporal dependency domain according to the interaction strength among the features; and a spatiotemporal argument pair extraction and output module, configured to extract, within the feature range defined by the dynamic spatiotemporal dependency domain, temporal arguments and spatial arguments that have a dependency relationship with the ore-forming event trigger word, and to output pairs of ore-forming event trigger words and spatiotemporal arguments.
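As a non-limiting illustration of the sample screening step recited in claim 9, the following minimal Python sketch marks a sample as high-value when both its information entropy and its rule violation degree exceed preset thresholds. The names `Prediction`, `violation_degree`, and `select_high_value`, and the choice to quantify the rule violation degree as the fraction of applicable rules that are contradicted, are assumptions made for illustration only; the claims do not specify how the violation degree is quantified.

```python
import math
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical container for one model prediction on an unlabeled sample."""
    sample_id: str
    probs: list[float]     # predicted probability distribution over labels
    violated_rules: int    # rules contradicted by the predicted event/argument pair
    checked_rules: int     # rules that were applicable to the prediction


def entropy(probs: list[float]) -> float:
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def violation_degree(pred: Prediction) -> float:
    """Assumed quantification: share of applicable rules that are violated."""
    if pred.checked_rules == 0:
        return 0.0
    return pred.violated_rules / pred.checked_rules


def select_high_value(preds: list[Prediction],
                      entropy_threshold: float,
                      violation_threshold: float) -> list[str]:
    """Return ids of samples exceeding BOTH thresholds, as in claim 9."""
    return [
        p.sample_id
        for p in preds
        if entropy(p.probs) > entropy_threshold
        and violation_degree(p) > violation_threshold
    ]
```

In this sketch the selected identifiers would then be passed to expert annotation; the subsequent back-propagation update and rule confidence adjustment are outside its scope.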
Description
Geological text-oriented space-time ore-forming event and relation extraction method
Technical Field
The invention relates to the field of artificial intelligence and natural language processing, and in particular to a spatiotemporal ore-forming event and relation extraction method for geological text.
Background
When artificial intelligence is used to mine geological big data, automatically extracting ore-forming events and their spatiotemporal elements from massive unstructured geological data is a core task in constructing a geological knowledge graph. The task requires a computational model both to understand the semantic information in natural language text and to process the complex spatial topological relations implicit in geological maps, which makes it a difficult cross-modal information extraction and reasoning problem.
In the related art, Chinese patent publication No. CN116089629A discloses a text data mining method and system for the ore-forming regularities of phosphorite, comprising: a Chinese text word segmentation method based on vocabulary trees and the characteristics of phosphorite deposits; annotating various phosphorite geological report text data; constructing a spatial relationship knowledge base of the ore-forming characteristics of phosphorite deposits; matching semantic similarity by using a spatiotemporal convolutional neural network model; extracting spatiotemporal relationship information from the text; and constructing a spatial relationship knowledge graph of phosphorite geological entities.
However, the above prior art solution focuses mainly on mining a single text modality. It does not make effective use of the spatial topological features in geological map data for multimodal complementation and asymmetric interlocking fusion, which limits its understanding of complex spatial relationships between geological entities. It also lacks a fine-grained extraction mechanism based on focus positioning and attention energy diffusion, and therefore cannot accurately lock the argument boundaries of an ore-forming event by dynamically constructing a spatiotemporal dependency domain, as the present solution does. In addition, it does not introduce an active double-loop learning strategy that incorporates geological logic element rules, and cannot screen high-value samples based on model uncertainty and rule violation degree in order to iteratively optimize model parameters and rule confidence, which limits the improvement of the model in logical consistency and training efficiency.
Disclosure of Invention
In order to solve the above problems, the invention provides a spatiotemporal ore-forming event and relation extraction method for geological text. By asymmetrically interlocking and fusing the geological text with the geological map data, and by using a focus diffusion mechanism to construct a dynamic spatiotemporal dependency domain within the multimodal fusion features, highly robust joint extraction of ore-forming events and their spatiotemporal arguments can be achieved.
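As a non-limiting illustration of the focus diffusion idea, the following minimal Python sketch takes the trigger word feature vector as the diffusion source, scores every remaining text word and spatial node by dot product, and keeps those whose weight exceeds a threshold. The function name `dependency_domain`, the joint softmax normalization over words and nodes, and the use of plain Python lists as feature vectors are assumptions made for illustration; the source only states that dot-product attention weights are compared against a preset threshold.

```python
import math


def dot(u: list[float], v: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dependency_domain(trigger_vec: list[float],
                      token_vecs: list[list[float]],
                      node_vecs: list[list[float]],
                      threshold: float) -> tuple[list[int], list[int]]:
    """Return (token indices, node indices) inside the dependency domain.

    Assumption: attention weights are normalized jointly over text words
    and spatial nodes, so the weights form one energy distribution
    centered on the trigger word.
    """
    candidates = [("token", i, v) for i, v in enumerate(token_vecs)]
    candidates += [("node", j, v) for j, v in enumerate(node_vecs)]
    weights = softmax([dot(trigger_vec, v) for _, _, v in candidates])

    tokens = [i for (kind, i, _), w in zip(candidates, weights)
              if kind == "token" and w > threshold]
    nodes = [j for (kind, j, _), w in zip(candidates, weights)
             if kind == "node" and w > threshold]
    return tokens, nodes
```

A fixed threshold on normalized weights is only one possible reading of the threshold; a cumulative-mass cutoff over the sorted weights would be an alternative design.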
The above object can be achieved by the following scheme: A spatiotemporal ore-forming event and relation extraction method for geological text comprises: obtaining geological text data to be processed and geological map data corresponding to the spatial extent of the geological text data; performing domain knowledge enhancement processing on the geological text data to generate an enhanced text semantic vector; performing graph neural network processing on the geological map data to extract spatial topological features; inputting the enhanced text semantic vector and the spatial topological features into a preset asymmetric interlocking fusion module, and using text semantics to guide the screening and aggregation of spatial features to generate multimodal fusion features; inputting the multimodal fusion features into a preset spatiotemporal-aware focus diffusion joint extraction network, and identifying ore-forming event trigger words from the features corresponding to the geological text data by using a focus positioning mechanism; taking the ore-forming event trigger word as a central anchor point, performing attention energy diffusion calculation based on the multimodal fusion features, and constructing a dynamic spatiotemporal dependency domain according to the interaction strength among the features; and extracting, within the feature range defined by the dynamic spatiotemporal dependency domain, temporal arguments and spatial arguments that have a dependency relationship with the ore-forming event trigger word, and outputting pairs of ore-forming event trigger words and spatiotemporal arguments.
Optionally, generating the enhanced text semantic vector comprises: loading a preset geological domain knowledge base; performing entity recognition and linking on the geological text data by using the geological domain knowledge base, mapping geological terms in the geological text data to a knowledge base entity identifi