CN-120610309-B - Seismic data augmentation method and system based on generative adversarial network and semi-supervised learning
Abstract
The invention belongs to the field of seismic data processing and discloses a seismic data augmentation method and system based on a generative adversarial network and semi-supervised learning. The method comprises: acquiring seismic signals with a geophone to obtain time series data, and obtaining real seismic data after data preprocessing; generating unlabeled seismic data from the real seismic data using a constructed data generation module; labeling the unlabeled seismic data using a constructed data labeling module to obtain pseudo-labeled seismic data; and combining the manually labeled seismic data with the pseudo-labeled seismic data to form a new seismic dataset, thereby augmenting the seismic data. While maintaining data quality, the invention greatly improves the efficiency of seismic dataset construction and provides data and technical support for seismic signal processing.
Inventors
- ZHU YADONGYANG
- ZHAO XIAOMIN
- SHI YANDA
Assignees
- Beijing Institute of Petrochemical Technology (北京石油化工学院)
Dates
- Publication Date
- 20260508
- Application Date
- 20250604
Claims (7)
- 1. A seismic data augmentation method based on a generative adversarial network and semi-supervised learning, the method comprising: acquiring seismic signals with a geophone to obtain time series data, and obtaining real seismic data after data preprocessing; generating unlabeled seismic data from the real seismic data using a constructed data generation module; labeling the unlabeled seismic data using a constructed data labeling module to obtain pseudo-labeled seismic data; and combining the manually labeled seismic data with the pseudo-labeled seismic data to form a new seismic dataset, thereby augmenting the seismic data; wherein generating the unlabeled seismic data from the real seismic data using the constructed data generation module comprises: in the training phase, generating seismic data $x_1$ with a generator, comparing the degree of similarity between the real seismic data $\sigma$ and the seismic data $x_1$ with a discriminator, and adjusting the parameters of the generator according to the comparison result; and generating unlabeled seismic data $x_n$ with the trained data generation module; wherein generating the seismic data $x_1$ with the generator comprises: inputting random noise $z$ into a multi-layer perceptron and obtaining initial time series data $x$ by mapping; applying instance normalization to the initial time series data to obtain normalized time series data $x'$; scanning the normalized time series data $x'$ as a whole with a lightweight neural network composed of a pooling layer and two perceptron layers, dividing the sequence into $P$ segments, and outputting a size weight matrix $W$; presetting a set of $N$ different sizes $F = \{F_1, F_2, \ldots, F_N\}$, each size corresponding to the length of one segment, and representing the size set $F$ as a row vector; multiplying the size weight matrix $W$ by the row vector $F$ to obtain the dynamic segment size $F'_q$ corresponding to the $q$-th segment of time series data; assembling the dynamic segment sizes of all segments into a sequence $F'$; sliding along the input sequence $x'$ and cutting it according to the sequence $F'$ to form a plurality of time series segments $x''$ of unequal length; processing each segment with an attention-mechanism structure to extract features at different scales; aggregating the extracted multi-scale features by weighted summation to obtain aggregated features; and restoring the aggregated features, using 1D deconvolution, to seismic data $x_1$ of the same dimensions as the initial time series data $x$, where $x_1$ denotes the seismic data generated by the generator; wherein, in the training phase, processing the partial data of the seismic data comprises: for the same piece of seismic data, applying shared noise perturbation multiple times and then applying different enhancements to obtain enhanced seismic data; predicting labels for the enhanced seismic data with a label prediction network to obtain label prediction results; and screening the label prediction results with a confidence threshold, where the quantities in the screening relation are: the predicted label of the seismic data; a comparison function; and the predicted label retained after comparison with the threshold; wherein the confidence threshold is adaptively and dynamically adjusted, the quantities in the adjustment relation being: an initial confidence threshold; a final confidence threshold; the current training data amount; the total training data amount; a flexibility factor for the confidence adjustment; a control parameter for confidence smoothing; and an amplification factor for the training batch data volume adjustment.
- 2. The method of claim 1, wherein comparing the degree of similarity between the real seismic data $\sigma$ and the seismic data $x_1$ with the discriminator comprises: inputting the generated seismic data $x_1$ and the real seismic data $\sigma$; extracting features of local information within a short time window, using a convolution kernel of a set size with a stride of 1, to obtain local features of the data; extracting features of global information within a medium time window, using a convolution kernel of a set size with a stride of 3, to obtain global features of the data; capturing the long-term dependencies of the data within a long time window, using a convolution kernel of a set size with a stride of 5; concatenating the local features, the global features, and the long-term dependencies to obtain a complete seismic data feature vector; and inputting the complete seismic data feature vector into a fully connected layer for feature mapping, then passing the result through a function to compare the degree of similarity between the generated seismic data $x_1$ and the real seismic data $\sigma$.
- 3. The method of claim 1, wherein labeling the unlabeled seismic data using the constructed data labeling module to obtain the pseudo-labeled seismic data comprises: in the training phase, in addition to the seismic data, adding the partial data of the seismic data to the training set; weighting and summing the losses obtained for the seismic data and for the partial data after network training, and back-propagating to update the network parameters; and labeling the unlabeled seismic data $x_n$ with the trained data labeling module to obtain the pseudo-labeled seismic data.
- 4. The method of claim 3, wherein, in the training phase, processing the seismic data comprises: obtaining the true label of the seismic data by manual labeling; predicting the label of the seismic data with a label prediction network to obtain a predicted label result; and computing the cross entropy between the predicted label and the true label, where the quantities in the cross-entropy relation are: the label prediction result of the seismic data; the manually assigned label of the seismic data; and the probability density function of each of the two labels.
- 5. The method of claim 4, wherein a mean square error is computed over the screened prediction results for the same piece of data, and a total mean square error is obtained from the per-piece mean square errors, where the quantities in the total-error relation are: the total number of data samples; and the number of samples obtained after random enhancement of a given piece of data.
- 6. The method of claim 5, wherein weighting and summing the losses obtained for the seismic data and for the partial data after network training, and back-propagating to update the network parameters, comprises: taking a weighted sum of the obtained cross entropy and the mean square error as the loss function of the label prediction network, where the sum uses weighting coefficients and $L$ denotes the loss function.
- 7. A seismic data augmentation system based on a generative adversarial network and semi-supervised learning, the system being configured to implement the method of any of claims 1-6 and comprising an acquisition module, a generation module, a labeling module, and a synthesis module; the acquisition module is configured to acquire seismic signals with geophones to obtain time series data, and to obtain real seismic data after data preprocessing; the generation module is configured to generate unlabeled seismic data from the real seismic data using the constructed data generation module; the labeling module is configured to label the unlabeled seismic data using the constructed data labeling module to obtain pseudo-labeled seismic data; and the synthesis module is configured to combine the manually labeled seismic data with the pseudo-labeled seismic data to form a new seismic dataset, thereby augmenting the seismic data.
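The dynamic segmentation recited in claim 1 (a weight matrix $W$ combined with preset sizes $F$ to give per-segment lengths $F'_q$, followed by a sliding cut) can be illustrated with a small pure-Python sketch. This is not the patented implementation: the function names, the softmax normalisation of each row of $W$, the rounding of $F'_q$ to an integer, and the non-overlapping cut with a leftover segment are all assumptions made for illustration, and the raw scores stand in for the output of the lightweight network.

```python
import math


def softmax(row):
    """Normalise one row of raw scores into weights that sum to 1."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_sizes(scores, sizes):
    """Return one segment length F'_q per row of `scores`.

    scores: P rows of N raw scores (hypothetical stand-in for the
            lightweight network that scans the normalised sequence).
    sizes:  the N preset segment lengths F_1..F_N.
    """
    result = []
    for row in scores:
        weights = softmax(row)  # row q of W (softmax is an assumption)
        blended = sum(w * f for w, f in zip(weights, sizes))  # W_q . F
        result.append(max(1, round(blended)))  # integer length (assumption)
    return result


def cut_sequence(x, seg_sizes):
    """Slide along x, cutting consecutive segments of the given lengths."""
    segments, start = [], 0
    for size in seg_sizes:
        if start >= len(x):
            break
        segments.append(x[start:start + size])
        start += size
    if start < len(x):  # keep leftover samples as a final segment (assumption)
        segments.append(x[start:])
    return segments


if __name__ == "__main__":
    preset = [4, 8, 16]                                # N = 3 preset sizes
    raw = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 0.0]]  # P = 3
    lengths = dynamic_sizes(raw, preset)
    pieces = cut_sequence(list(range(24)), lengths)
    print(lengths, [len(p) for p in pieces])
```

Each resulting segment would then be passed to the attention structure described in the claim; that part is omitted here because the source text gives no further detail on it.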
Description
Seismic data augmentation method and system based on generative adversarial network and semi-supervised learning
Technical Field
The invention belongs to the field of seismic data processing, and in particular relates to a seismic data augmentation method and system based on a generative adversarial network and semi-supervised learning.
Background
Deep learning has been widely applied to seismic signal processing, where models learn from large amounts of labeled seismic data to improve the accuracy of seismic signal analysis. At present, seismic data are mainly collected by geophones and labeled manually. This approach is inefficient, susceptible to human error, and makes consistent labeling difficult to guarantee. Analyzing seismic signals with deep learning therefore requires substantial manpower and time to build the seismic dataset. Manual labeling is also costly, which makes frequent updating and expansion of the seismic training set nearly impossible. When a new seismic event occurs, traditional manual labeling often cannot be completed quickly, which prevents deep learning methods from rapidly analyzing the new seismic data. How to significantly improve dataset construction efficiency while maintaining labeling quality has become a key difficulty for deep learning in seismic signal processing. In recent years, generative adversarial networks and semi-supervised learning have shown great potential in data generation and automatic labeling. A generative adversarial network can generate data with realistic characteristics, and semi-supervised learning can use a small number of labeled samples to label large-scale data automatically, which offers a new approach to seismic data augmentation.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a seismic data augmentation method and system based on a generative adversarial network and semi-supervised learning, which greatly improve the efficiency of seismic dataset construction while maintaining data quality, and provide data and technical support for seismic signal processing. To achieve the above object, the invention provides the following solution: a seismic data augmentation method based on a generative adversarial network and semi-supervised learning, the method comprising: acquiring seismic signals with a geophone to obtain time series data, and obtaining real seismic data after data preprocessing; generating unlabeled seismic data from the real seismic data using a constructed data generation module; labeling the unlabeled seismic data using a constructed data labeling module to obtain pseudo-labeled seismic data; and combining the manually labeled seismic data with the pseudo-labeled seismic data to form a new seismic dataset, thereby augmenting the seismic data.
Preferably, generating the unlabeled seismic data from the real seismic data using the constructed data generation module comprises: in the training phase, generating seismic data $x_1$ with a generator, comparing the degree of similarity between the real seismic data $\sigma$ and the seismic data $x_1$ with a discriminator, and adjusting the parameters of the generator according to the comparison result; and generating unlabeled seismic data $x_n$ with the trained data generation module.
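As a minimal sketch of the last two steps of the method above (pseudo-labeling the generated data and merging it with the manually labeled data), the following Python fragment may help. The names `augment_dataset` and `toy_labeler`, the `"manual"`/`"pseudo"` provenance tag, and the amplitude rule inside the toy labeler are hypothetical and are not taken from the patent; a trained label prediction network would take the place of `toy_labeler`.

```python
from typing import Callable, List, Sequence, Tuple

Trace = List[float]               # one preprocessed time series
Labeled = Tuple[Trace, int, str]  # (trace, label, label source)


def augment_dataset(
    manual: Sequence[Tuple[Trace, int]],
    generated: Sequence[Trace],
    labeler: Callable[[Trace], int],
) -> List[Labeled]:
    """Merge manually labeled traces with pseudo-labeled generated traces."""
    # Manually labeled samples are carried over unchanged.
    dataset: List[Labeled] = [(t, y, "manual") for t, y in manual]
    # Generated samples receive a pseudo label from the labeling module.
    for trace in generated:
        dataset.append((trace, labeler(trace), "pseudo"))
    return dataset


def toy_labeler(trace: Trace) -> int:
    """Hypothetical stand-in for the labeling module: 1 if any sample is large."""
    return 1 if max(abs(v) for v in trace) > 0.5 else 0


if __name__ == "__main__":
    manual = [([0.1, 0.9, 0.2], 1)]
    generated = [[0.0, 0.1, 0.05], [0.2, 0.8, 0.1]]
    for trace, label, source in augment_dataset(manual, generated, toy_labeler):
        print(source, label, trace)
```

Keeping the label source alongside each sample is a design choice of this sketch: it lets downstream training weight or audit pseudo labels separately from manual ones.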
Preferably, generating the seismic data $x_1$ with the generator comprises: inputting random noise $z$ into a multi-layer perceptron and obtaining initial time series data $x$ by mapping; applying instance normalization to the initial time series data to obtain normalized time series data $x'$; scanning the normalized time series data $x'$ as a whole with a lightweight neural network composed of a pooling layer and two perceptron layers, dividing the sequence into $P$ segments, and outputting a size weight matrix $W$; presetting a set of $N$ sizes $F = \{F_1, F_2, \ldots, F_N\}$, each size corresponding to the length of one segment, and representing the size set $F$ as a row vector; multiplying the size weight matrix $W$ by the row vector $F$ to obtain the dynamic segment size $F'_q$ corresponding to the $q$-th segment of time series data; assembling the dynamic segment sizes $F'_q$ of all segments into a sequence $F'$; sliding along the input sequence $x'$ and cutting it according to the sequence $F'$ to form a plurality of time series segments $x''$ of unequal length; processing each segment with an attention-mechanism structure to extract features at different scales; the extracted multi-scale features are aggregated by weighted summation