CN-122024748-A - DAS noise reduction method based on noise type perception reinforcement learning


Abstract

A DAS noise reduction method based on noise type perception reinforcement learning, belonging to the technical field of machine learning and geophysical exploration signal processing. The method constructs a reinforcement learning agent with a fully convolutional network at its core. The invention introduces a semi-supervised co-training strategy: for labeled synthetic data, it adopts a supervised reward based on the mean square error to ensure signal fidelity; for unlabeled real data, it designs a no-reference quality evaluator, DAS-BRISQUE (DBQ), to generate an unsupervised reward. The two reward signals are combined to guide the optimization of the agent.

Inventors

LIN HONGBO
GAO XIN

Assignees

Jilin University

Dates

Publication Date: 20260512
Application Date: 20260212

Claims (2)

1. A DAS noise reduction method based on noise type perception reinforcement learning, characterized by comprising the following steps:
S1: the noise type perception reinforcement learning noise reduction model (NTARL) performs noise reduction with a multi-step iteration mechanism, taking the noisy DAS data Y as the initial state, where H is the number of time sampling points contained in each channel of Y and W is the number of channels contained in Y; the iterative flow at time step t is as follows:
S1.1 Noise type perception: the state is input to the policy network and the value network of the NTARL agent. The policy network perceives the noise type of the state and, for the k-th 3 × 3 region of the state, outputs the probability of selecting each action under the parameters of the policy network, where N denotes the number of actions in the action set. Taking the action number with the maximum probability along the action dimension yields a region policy map, which is upsampled by nearest-neighbor interpolation to obtain the high-resolution policy map M. The value network outputs the state value, which is downsampled by a 3 × 3 convolution with a stride of 3 to obtain the region state value under the parameters of the value network.
S1.2 Adaptive action selection and noise reduction: the NTARL action set A consists, in order, of a random noise reducer, a checkerboard noise reducer, a low-frequency noise reducer, an aliasing noise reducer, an enhancement action and a hold action. According to the high-resolution policy map M, the corresponding action is selected from the action set A to perform the noise reduction operation on the state, and the processing result is taken as the next state.
S2: network optimization based on semi-supervised learning, specifically:
S2.1 Semi-supervised instant reward calculation: the semi-supervised instant reward is obtained by weighting the supervised instant reward and the unsupervised instant reward with a weight coefficient. The supervised instant reward is calculated on the synthetic data set from the states before and after the single noise reduction step of S1 and the clean DAS data, and is measured by the reduction in MSE before and after the single-step noise reduction. The unsupervised instant reward is calculated on the real data set and is measured by the no-reference signal quality evaluator DBQ. DBQ comprises two steps, feature extraction and quality score mapping: the signal is input to the DBQ module and subjected to multi-scale downsampling; at each scale, statistical features are extracted according to the BRISQUE feature definition; the features of all scales are concatenated into a feature vector, which is input to a support vector regression model based on DAS data; the model outputs a quality score normalized to the interval [0, 1]. The DBQ scoring flow is executed on the states before and after the noise reduction step, and the decrease in the score is taken as the unsupervised instant reward.
S2.2 Discounted cumulative reward calculation: the discounted cumulative reward is computed by n-step backtracking with a discount factor, where n is the number of time steps of forward backtracking and its maximum value is the maximum time step of the current round. The region reward is obtained by downsampling the instant reward with a 3 × 3 sliding window with a stride of 3, summing the reward within each region.
S3: loss function calculation: the loss function of the agent is obtained according to the asynchronous advantage actor-critic (A3C) algorithm, where L denotes the number of regions obtained by the division; the policy network and the value network are updated by back-propagating the loss function, and the optimized network parameters are obtained and stored.
S4: DAS data noise reduction based on NTARL: the noisy DAS signal Y is input to the trained agent and taken as the initial reinforcement learning state; steps S1.1 and S1.2 are performed to obtain the next state, which is then input to the agent again; the iteration is repeated for the specified number of steps, and the final state is taken as the final noise reduction result.
2. The DAS noise reduction method based on noise type perception reinforcement learning of claim 1, wherein the noise type perception reinforcement learning noise reduction model (NTARL) in step S1 comprises an agent and an action set, specifically:
S5: the agent consists of a policy network and a value network. The policy network is designed as an 8-layer fully convolutional neural network: the first 6 layers are Conv+ReLU layers with 3 × 3 convolution kernels, dilation rates of 1, 2, 3, 4, 3 and 2 in sequence, and 64 output channels, with padding keeping the input and output the same size; the 7th layer is a gated recurrent unit (GRU); the 8th layer is a Conv+Softmax layer with a 3 × 3 convolution kernel and a stride of 3. The value network consists of 6 Conv+ReLU layers and one Conv layer: its first 4 layers share parameters with the policy network; its last 3 layers have 3 × 3 convolution kernels with dilation rates of 2, 1 and 1; the convolution stride of its 7th layer is 3; its first 6 layers have 64 output channels and its last layer has 1 output channel.
S5.1: construction and pre-training of the action set, as follows:
S5.1.1 The action set comprises 6 discrete actions: a random noise reducer, a checkerboard noise reducer, a low-frequency noise reducer, an aliasing noise reducer, an enhancement action and a hold action. The random noise reducer consists of two cascaded residual blocks, each comprising two Conv+BN+ReLU layers and a residual connection, followed by one Conv layer as the output layer. The checkerboard noise reducer consists of three cascaded residual blocks with the same residual block structure as the random noise reducer, followed by one Conv layer as the output layer. The low-frequency noise reducer adopts a U-Net structure with two encoding stages and two decoding stages: each downsampling in the encoding stage consists of Conv+BN+ReLU and MaxPool, and the output features of Conv+BN+ReLU are retained for feature concatenation in the decoding stage; the decoding stage consists of ConvTranspose+ReLU and Conv+BN+ReLU, where the output of ConvTranspose+ReLU is used for feature concatenation and the concatenated features are fused by Conv+BN+ReLU; the output is finally produced by one Conv+BN+ReLU layer.
S5.1.2 Action set pre-training: the random noise reducer, checkerboard noise reducer, low-frequency noise reducer and aliasing noise reducer in the action set are pre-trained networks. Each network learns the characteristics of a specific type of noise in order to perceive that noise type; after pre-training, its parameters remain fixed and it takes part in the reinforcement learning process as a noise reducer with a specific function. The random noise reducer is trained with a clean data set and a random noise training set, the checkerboard noise reducer with a clean data set and a checkerboard noise training set, the low-frequency noise reducer with a clean data set and a low-frequency noise training set, and the aliasing noise reducer with a clean data set and an aliasing noise training set. All networks are trained with the mean square error as the loss function; each training set is constructed from clean signals and noise data of the specific noise type, so that the network learns the mapping from noisy data to clean data.
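Steps S1.1 and S1.2 of claim 1 (region-wise action choice, nearest-neighbor upsampling of the policy map, and per-sample application of the selected action) can be illustrated with a minimal NumPy sketch. This is not the patented implementation: the function names, the 6 × 6 toy state, the random probabilities and the simple scaling functions that stand in for the pre-trained denoisers are all assumptions made for illustration.

```python
import numpy as np

N_ACTIONS = 6   # four denoisers, enhancement, hold (order as listed in claim 1)
REGION = 3      # 3 x 3 regions with stride 3

def region_policy_map(probs):
    """probs: (N, H/3, W/3) action probabilities -> (H/3, W/3) action indices."""
    return np.argmax(probs, axis=0)

def upsample_nearest(region_map, factor=REGION):
    """Nearest-neighbor upsampling: copy each region label over a factor x factor block."""
    return np.repeat(np.repeat(region_map, factor, axis=0), factor, axis=1)

def apply_actions(state, policy_map, actions):
    """Run each selected action on the full state and keep its output where the map picks it."""
    next_state = state.copy()
    for idx, action in enumerate(actions):
        mask = policy_map == idx
        if mask.any():
            next_state[mask] = action(state)[mask]
    return next_state

# Hypothetical stand-ins for the action set: the real denoisers are pre-trained networks.
actions = [lambda s: 0.9 * s] * 4 + [lambda s: 1.1 * s, lambda s: s]

rng = np.random.default_rng(0)
state = rng.standard_normal((6, 6))               # toy state with H = W = 6
logits = rng.standard_normal((N_ACTIONS, 2, 2))   # stand-in for the policy network output
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)

M_region = region_policy_map(probs)   # region policy map
M = upsample_nearest(M_region)        # high-resolution policy map M
next_state = apply_actions(state, M, actions)
```

Running every candidate action on the whole state and masking afterwards keeps the sketch short; a practical implementation would more likely skip actions that no region selects, as the `mask.any()` check does here.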

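The reward computation of step S2 can be sketched in the same spirit. The claim text names the ingredients (MSE reduction, DBQ score decrease, a weight coefficient, n-step backtracking, 3 × 3 region sums) but its formulas are not reproduced in this record, so the sketch below makes explicit assumptions: a convex combination for the weighting, the standard n-step return with a bootstrapped value, a per-sample squared-error reduction as the supervised reward, and a toy `toy_score` function in place of the DBQ evaluator. All names are hypothetical.

```python
import numpy as np

def supervised_reward(s_t, s_next, clean):
    """Per-sample squared-error reduction achieved by one denoising step."""
    return (s_t - clean) ** 2 - (s_next - clean) ** 2

def unsupervised_reward(s_t, s_next, score):
    """Decrease of a no-reference quality score (lower score = better quality)."""
    return score(s_t) - score(s_next)

def semi_supervised_reward(r_sup, r_unsup, lam):
    """Assumed weighting: convex combination with weight coefficient lam."""
    return lam * r_sup + (1.0 - lam) * r_unsup

def region_sum(reward, k=3):
    """Sum rewards inside non-overlapping k x k windows (stride k)."""
    h, w = reward.shape
    return reward.reshape(h // k, k, w // k, k).sum(axis=(1, 3))

def n_step_return(rewards, bootstrap_value, gamma):
    """Assumed standard n-step return: r_t + gamma*r_{t+1} + ... + gamma^n * V."""
    ret = bootstrap_value
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret

# Toy data: a clean all-zero section and states that halve the error at each step.
clean = np.zeros((6, 6))
s0, s1, s2 = np.ones((6, 6)), 0.5 * np.ones((6, 6)), 0.25 * np.ones((6, 6))

r0 = region_sum(supervised_reward(s0, s1, clean))   # 9 * 0.75 per region
r1 = region_sum(supervised_reward(s1, s2, clean))   # 9 * 0.1875 per region
R = n_step_return([r0, r1], bootstrap_value=np.zeros((2, 2)), gamma=0.5)

# Hypothetical stand-in for DBQ: mean absolute amplitude clipped to [0, 1].
toy_score = lambda s: float(np.clip(np.mean(np.abs(s)), 0.0, 1.0))
r_u = unsupervised_reward(s0, s1, toy_score)
r_mix = semi_supervised_reward(supervised_reward(s0, s1, clean), r_u, lam=0.7)
```

Broadcasting the scalar unsupervised reward over the per-sample supervised reward in `r_mix` is only a convenience of this sketch; the claim describes the supervised reward for synthetic data and the unsupervised reward for real data without specifying how they are aligned.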
Description

DAS noise reduction method based on noise type perception reinforcement learning

Technical Field

The invention belongs to the technical field of reinforcement learning and seismic signal processing, and particularly relates to a DAS noise reduction method based on noise type perception reinforcement learning.

Background

Oil and gas are among the main energy sources of modern society, providing essential energy for industrial production, transportation, household heating and power supply. Oil and gas supply in China has long been in shortage, and national economic development places a huge demand on oil and gas energy. Distributed acoustic sensing (DAS) is a rapidly developing seismic acquisition technology with broad application prospects. Noise in DAS data includes random noise, optical system noise, coupling noise, horizontal noise, fading noise, checkerboard noise and others, which poses challenges for subsequent data processing such as seismic imaging, inversion and interpretation.
To remove the various noises contained in DAS signals, geophysicists at home and abroad have proposed methods for suppressing seismic noise. One class of traditional noise reduction methods targets specific noise types, such as background noise, fading noise, abnormal-amplitude noise and coupling noise, based on the frequency-domain, statistical, structural and sparse characteristics of DAS noise; the multiple types of noise aliased in DAS data are then jointly suppressed by cascading filters, but the order and parameters of the cascaded filters strongly affect the noise suppression result. Another class removes seismic noise by exploiting the difference between effective reflections and noise in a transform domain, for example f-x deconvolution, the seislet transform, the wavelet transform, the curvelet transform, the shearlet transform and the dreamlet transform. Denoising with f-x deconvolution or the wavelet transform ignores the characteristics of individual sampling points; the shearlet transform decomposes the seismic data into signals in multiple dimensions and multiple domains through a sparse transform; and the dreamlet transform preserves important structural information in the seismic data to a certain degree, although the imaging quality can still be improved.
The development of deep learning provides new theory and methods for DAS noise suppression. Convolutional neural networks have proven to be a powerful tool for DAS data noise reduction: by automatically learning structural features and complex nonlinear mappings from large amounts of labeled noisy data, and by designing multi-scale, multi-dimensional noise reduction model structures, they improve the noise reduction of DAS exploration data. Because some DAS noise is close to the effective signal in structural pattern and frequency-domain characteristics, the denoised data can suffer from signal leakage. Introducing attention mechanisms highlights signal features and implicit noise information, and attention-guided noise reduction models effectively improve the suppression of background noise, optical noise, abnormal-amplitude noise and coupling noise. Although deep learning has greatly advanced seismic data noise reduction, it still suffers from poor interpretability and limited generalization ability.

Disclosure of Invention

The invention aims to solve the technical problems that distributed acoustic sensing seismic data contain many noise types, are strongly non-stationary, and lack labeled real training samples, and provides a DAS noise reduction method based on noise type perception reinforcement learning.
The DAS noise reduction method based on noise type perception reinforcement learning of the invention comprises the following steps: S1: the noise type perception reinforcement learning noise reduction model (NTARL) performs noise reduction with a multi-step iteration mechanism, taking the noisy DAS data Y as the initial state, where H is the number of time sampling points contained in each channel of Y and W is the number of channels contained in Y; the iterative flow at time step t is as follows: S1.1 Noise type perception: the state is input to the policy network and the value network of the NTARL agent; the policy network perceives the noise type of the state and, for the k-th 3 × 3 region of the state, outputs the probability of selecting each action under the parameters of the policy network, where N denotes the number of actions in the action set; taking the action number corresponding to the maximum probability alon