CN-121980370-A - Self-supervision geochemical anomaly identification method

Abstract

The invention provides a self-supervised geochemical anomaly identification method, belonging to the field of geochemical anomaly identification. The method comprises: inputting an attribute graph comprising a feature matrix X and an adjacency matrix A, and constructing a graph structure; generating low-dimensional node embeddings using the global self-attention mechanism of a Transformer architecture; reconstructing an adjacency matrix A' from the low-dimensional node embeddings using a decoder; using a loss function to force the Transformer encoder, in turn, to learn and generate high-quality embedding vectors; training a GCN classifier based on the embedding vectors, the adjacency matrix A, and node data with mineral occurrence labels; and identifying geochemical anomalies using the trained GCN classifier. The method fully integrates the attribute features of geochemical data with global spatial associations, and achieves high-precision, highly robust regional geochemical anomaly identification while relying on only a very small number of labeled samples.
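The first two stages summarized above, graph construction and global self-attention over nodes, can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the coordinates, element contents, threshold value, and the function names `build_adjacency` and `self_attention` are hypothetical, the distance-threshold rule follows the description in claim 2, and the attention uses identity projections (Q = K = V) in place of the learned linear layers described in claim 4.

```python
import math

def build_adjacency(coords, threshold):
    """Link every pair of sample points whose Euclidean distance is <= threshold."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                adj[i][j] = adj[j][i] = 1  # undirected edge
    return adj

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(h):
    """Single-head scaled dot-product attention with Q = K = V = h.

    Assumption: identity projections stand in for the learned linear layers.
    Every node attends to every other node, so the aggregation is global.
    """
    d = len(h[0])
    weights, out = [], []
    for query in h:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in h]
        row = softmax(scores)
        weights.append(row)
        out.append([sum(row[j] * h[j][c] for j in range(len(h))) for c in range(d)])
    return out, weights

# Hypothetical sample locations and element contents (not data from the source).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
X = [[0.2, 1.1], [0.3, 0.9], [0.1, 1.0], [2.5, 0.1]]

A = build_adjacency(coords, threshold=1.5)
H, attn = self_attention(X)
print(A)
```

In this sketch the adjacency matrix only records which sample points are close to each other, whereas each row of `attn` assigns a nonzero weight to every node, including the isolated fourth point. That dense weighting is what distinguishes global self-attention from neighbor-only aggregation.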

Inventors

SUN YARU
LIU BIN
PAN KE
LUO TONGHUI
TANG LONG
HUANG KELIN
LI JINYE

Assignees

Chengdu University of Technology (成都理工大学)

Dates

Publication Date: 20260505
Application Date: 20260119

Claims (8)

1. A self-supervised geochemical anomaly identification method, comprising the steps of: S1, inputting an attribute graph comprising a feature matrix X and an adjacency matrix A, and constructing a graph structure, wherein the feature matrix X stores the geochemical element contents of each node, the adjacency matrix A represents the connection relations between nodes, and each geochemical sample point is regarded as one node in the graph structure; S2, introducing a Transformer encoder into a variational graph auto-encoder (VGAE) to replace the conventional GCN encoder, and aggregating and compressing the feature matrix X of the nodes together with the global graph structure information using the global self-attention mechanism of the Transformer architecture to generate low-dimensional node embeddings, wherein the Transformer encoder has a stacked structure; S3, reconstructing an adjacency matrix A' from the low-dimensional node embeddings using a decoder; S4, based on the reconstructed adjacency matrix A', using a loss function to force the Transformer encoder, in turn, to learn and generate high-quality embedding vectors; S5, training a GCN classifier based on the embedding vectors, the adjacency matrix A, and node data with mineral occurrence labels, and identifying geochemical anomalies using the trained GCN classifier.
2. The self-supervised geochemical anomaly identification method of claim 1, wherein said S1 comprises the steps of: S101, inputting an attribute graph comprising a feature matrix X and an adjacency matrix A; S102, calculating the Euclidean distance D between nodes based on the attribute graph, setting a geographic distance threshold, and dynamically constructing the graph structure.
3. The self-supervised geochemical anomaly identification method of claim 1, wherein said S2 comprises the steps of: S201, introducing a Transformer encoder into the variational graph auto-encoder (VGAE) to replace the conventional GCN encoder; S202, based on the Transformer encoder obtained after the replacement, injecting structural information using a learnable position encoding: $H^{(0)} = X W_{e} + P$; wherein $X$ denotes the feature matrix, $W_{e}$ denotes a learnable projection weight matrix, $P$ denotes a learnable position-encoding matrix that assigns absolute position identity information to each node, and $H^{(0)}$ denotes the matrix containing feature and position information that is input to the first Transformer encoder layer, each Transformer encoder layer comprising multi-head self-attention and a position-wise feed-forward network; S203, based on the injected structural information, generating the low-dimensional node embeddings $Z$ by capturing global spatial dependencies in the geochemical data using the global self-attention mechanism of the Transformer architecture.
4. The self-supervised geochemical anomaly identification method of claim 3, wherein said S203 comprises the steps of: passing the input matrix $H^{(0)}$, which contains feature and position information, through three different linear layers to generate a query (Query), a key (Key), and a value (Value) vector for each attention head $h$; calculating the scaled dot-product attention based on the generated vectors; fusing the information of all attention heads, based on the calculated scaled dot-product attention, to obtain a dynamic, global information aggregation; applying, based on the information aggregation, a nonlinear transformation to the vector representation of each node using a position-wise feed-forward network to obtain a node representation matrix $H$; mapping the node representation matrix $H$ to a mean matrix $\mu$ and a logarithmic standard deviation matrix $\log\sigma$ of the latent space; and generating the low-dimensional node embeddings $Z$ based on the mean matrix $\mu$ and the logarithmic standard deviation matrix $\log\sigma$: $Z = \mu + \exp(\log\sigma) \odot \epsilon$; wherein $\epsilon$ denotes a random noise vector sampled from a standard normal distribution, and $\odot$ denotes element-wise multiplication.
5. The self-supervised geochemical anomaly identification method of claim 1, wherein the reconstructed adjacency matrix A' is expressed as follows: $A'_{ij} = p(A_{ij} = 1 \mid z_i, z_j) = \sigma(z_i^{\top} z_j)$; wherein $A'_{ij}$ denotes the predicted probability that an edge exists between node $i$ and node $j$, $p(A_{ij} = 1 \mid z_i, z_j)$ denotes the conditional probability that an edge exists between $z_i$ and $z_j$, $A_{ij}$ denotes the connection relation between node $i$ and node $j$, $A_{ij} = 1$ denotes that an edge exists between node $i$ and node $j$, $z_i$ and $z_j$ denote the low-dimensional latent vectors of node $i$ and node $j$ respectively, $\sigma(\cdot)$ denotes the sigmoid function, and $z_i^{\top} z_j$ denotes the inner product of the latent vectors.
6. The self-supervised geochemical anomaly identification method of claim 1, wherein the loss function is expressed as follows: $\mathcal{L} = \mathcal{L}_{rec} + \beta \mathcal{L}_{KL}$; $\mathcal{L}_{rec} = -\frac{1}{N^{2}} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ w_{pos} A_{ij} \log A'_{ij} + (1 - A_{ij}) \log (1 - A'_{ij}) \right]$; $\mathcal{L}_{KL} = -\frac{1}{2N} \sum_{i=1}^{N} \sum_{k=1}^{d} \left( 1 + \log \sigma_{ik}^{2} - \mu_{ik}^{2} - \sigma_{ik}^{2} \right)$; wherein $\mathcal{L}$ denotes the loss function, $\mathcal{L}_{rec}$ denotes the reconstruction loss, $\mathcal{L}_{KL}$ denotes the KL-divergence loss, $\beta$ denotes a hyperparameter, $N$ denotes the total number of nodes, $A_{ij}$ denotes the connection relation between node $i$ and node $j$, $A'_{ij}$ denotes the predicted probability that an edge exists between node $i$ and node $j$, $w_{pos}$ denotes the positive-sample weight, such that a term with $A_{ij} = 1$ is weighted by $w_{pos}$ and a term with $A_{ij} = 0$ has weight 1, $d$ denotes the dimension of the latent space vector, $\mu_{ik}$ denotes the $k$-th component of the mean vector $\mu_i$ of node $i$, and $\sigma_{ik}^{2}$ denotes the $k$-th component of the variance vector $\sigma_i^{2}$ of node $i$.
7. The self-supervised geochemical anomaly identification method of claim 1, wherein said S5 comprises the steps of: S501, taking the embedding vectors and the adjacency matrix A as the input of the GCN classifier; S502, building a training set and a test set from the node data with mineral occurrence labels, and training the GCN classifier on the training set; S503, using the embedding vectors, the adjacency matrix A, and the test set to predict the anomalous areas for all nodes in the graph structure, thereby completing the identification of geochemical anomalies.
8. The self-supervised geochemical anomaly identification method of claim 7, wherein the output of the GCN classifier is expressed as follows: $H^{(l+1)} = \sigma\left(\hat{A} H^{(l)} W^{(l)}\right)$; wherein $H^{(l)}$ and $H^{(l+1)}$ denote the node feature matrices of the $l$-th and $(l+1)$-th layers respectively, $\sigma$ denotes a nonlinear activation function, $\hat{A}$ denotes the normalized adjacency matrix, and $W^{(l)}$ denotes the learnable weight matrix of the $l$-th layer.
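The numerical steps named in claims 4 to 8 (reparameterized sampling, inner-product decoding, the combined reconstruction and KL loss, and a GCN propagation step) can be sketched in pure Python as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: the source text does not preserve the formulas, so the $1/N^2$ and $1/N$ normalizations, the self-loops and symmetric normalization of the adjacency matrix, the ReLU activation, and all function names and numeric values are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reparameterize(mu, log_sigma, rng):
    """z = mu + exp(log_sigma) * eps with eps ~ N(0, 1), element-wise."""
    return [[m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu_i, ls_i)]
            for mu_i, ls_i in zip(mu, log_sigma)]

def decode(z):
    """Inner-product decoder: edge probability = sigmoid(z_i . z_j)."""
    return [[sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in z] for zi in z]

def vgae_loss(adj, adj_hat, mu, log_sigma, pos_weight, beta, eps=1e-12):
    """Weighted binary cross-entropy plus beta * KL (normalizations assumed)."""
    n = len(adj)
    recon = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(adj_hat[i][j], eps), 1.0 - eps)  # guard against log(0)
            recon -= (pos_weight * adj[i][j] * math.log(p)
                      + (1 - adj[i][j]) * math.log(1.0 - p))
    recon /= n * n
    kl = 0.0
    for mu_i, ls_i in zip(mu, log_sigma):
        for m, s in zip(mu_i, ls_i):
            # log(sigma^2) = 2 * s and sigma^2 = exp(2 * s)
            kl += -0.5 * (1.0 + 2.0 * s - m * m - math.exp(2.0 * s))
    kl /= n
    return recon + beta * kl

def gcn_layer(adj, h, w):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W) (assumed form)."""
    n, out_dim = len(adj), len(w[0])
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    a_norm = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]
    hw = [[sum(h[i][k] * w[k][c] for k in range(len(w))) for c in range(out_dim)]
          for i in range(n)]
    return [[max(0.0, sum(a_norm[i][j] * hw[j][c] for j in range(n)))
             for c in range(out_dim)] for i in range(n)]

# Hypothetical 3-node path graph and encoder outputs (not data from the source).
rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
mu = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
log_sigma = [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0]]

z = reparameterize(mu, log_sigma, rng)
adj_hat = decode(z)
loss = vgae_loss(adj, adj_hat, mu, log_sigma, pos_weight=2.0, beta=1.0)
h1 = gcn_layer(adj, z, [[1.0, 0.0], [0.0, 1.0]])
```

Geochemical sampling graphs are typically sparse, so edges are far outnumbered by non-edges; weighting the positive terms more heavily keeps the reconstruction loss from being dominated by the absent edges.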

Description

Self-supervision geochemical anomaly identification method

Technical Field

The invention belongs to the field of geochemical anomaly identification, and particularly relates to a self-supervised geochemical anomaly identification method.

Background

Geochemical data obtained by sampling media such as regional rocks, soil, stream sediments, or lake sediments reflect the underlying geological processes of a region. Statistical analysis of such data can distinguish features related to mineralization from background variation caused by other geological or surficial effects, which gives geochemical data strong application potential in mineral exploration. Mineralization is a distinctive element-differentiation phenomenon in the evolution of Earth materials, and appears as an anomalous enrichment of chemical elements against the background of multi-sphere migration among the crust, mantle, and hydrosphere. Identifying regions where element concentrations deviate significantly from background values, and thereby separating geochemical background from anomaly, helps delineate anomalous areas and improves exploration efficiency. The spatial distribution characteristics and association patterns of geochemical anomalies play a key guiding role in mineral resource prediction. Metallogenic prediction has now entered a new stage of multi-modal data fusion, in which researchers widely combine multi-source information such as geology, geochemistry, geophysics, and remote sensing to improve prediction accuracy. As geological exploration data become increasingly abundant, combining model-driven and data-driven methods, and in particular applying machine learning to metallogenic prediction, has become a key way to improve data processing efficiency and predictive capability.
In exploration geochemistry, anomaly detection techniques that incorporate machine learning show distinct advantages. They can effectively characterize the complex nonlinear relationships within geochemical data and thereby identify geochemical anomalies with high precision under complex geological backgrounds. In terms of models, machine learning methods such as convolutional neural networks, the K-nearest neighbor (KNN) algorithm, and deep autoencoder networks have been widely used for geochemical anomaly identification and have achieved significant results. By analyzing the spatial distribution characteristics of geochemical data in depth, these methods not only identify geochemical anomaly patterns related to the ore-forming process but also provide important guidance for mineral exploration. In the geochemical anomaly identification task of metallogenic prediction, machine learning has become a key technical path for extracting effective anomaly information from high-dimensional, complex data owing to its strong nonlinear fitting and pattern recognition capabilities. In recent years, to overcome the scarcity of labeled samples that is common in geochemical data, unsupervised and self-supervised learning methods that do not rely on manual labels have received a great deal of attention. Graph neural networks, which are strong at modeling the topological and spatial relations between nodes, have been introduced to construct spatial correlation graphs between sampling points and thus learn more discriminative geochemical representations.
However, in existing graph-based self-supervised methods, such as the variational graph auto-encoder and its variants, the latent space learned during representation learning often suffers from overlapping node embedding distributions and insufficient separation between classes, which affects the accuracy and robustness of subsequent anomaly identification. Meanwhile, attention-based models represented by the Transformer are beginning to be explored for modeling graph-structured data because of their advantages in capturing long-range dependencies and global context information, offering new possibilities for learning complex spatial dependencies in geochemical data.

Disclosure of Invention

To overcome the above deficiencies in the prior art, the self-supervised geochemical anomaly identification method provided by the invention addresses the scarcity of labeled data in conventional supervised learning and the difficulty conventional graph models have in effectively capturing global features. To achieve this purpose, the technical scheme adopted by the invention is a self-supervised geochemical anomaly identification method comprising the following steps: S1, inputting an attribute graph comprising a feature matrix X and an adjacency matrix A, and constructing a graph structure, wh