
CN-121996933-A - Bidirectional gating cyclic neural network logging curve reconstruction method based on multistage wavelet decomposition

CN121996933A

Abstract

The invention belongs to the technical field of oil and gas geophysical exploration and logging data processing, and discloses a logging curve reconstruction method based on a bidirectional gated recurrent neural network with multi-level wavelet decomposition. The method improves reconstruction accuracy and reservoir identification reliability by mining the sequence dependence of logging data, captures multi-scale geological features in a layered manner, enhances model interpretability through the natural correspondence between wavelet decomposition and stratum signals, and provides technical support for oil and gas exploration and development. The method comprises the steps of: 1) preprocessing the original logging data and constructing a dataset; 2) constructing the mWDN-BiGRU model framework; 3) training and evaluating the model; and 4) reconstructing the logging curves of well sections with missing data. The beneficial technical effect is that, aiming at logging curve reconstruction, missing acoustic logging data are repaired with a multi-level wavelet decomposition bidirectional gated recurrent neural network algorithm, improving the accuracy of the reconstructed data and thereby the accuracy of reservoir parameter interpretation and the reliability of sedimentary facies identification.

Inventors

  • Huang Chengyushu
  • Zhao Pengfei
  • Meng Fan
  • Fan Xiangyu
  • Al Hamza H. Jasim
  • Zhang Qiangui
  • Chen Yufei
  • Xu Jia
  • Liao Silan

Assignees

  • Southwest Petroleum University (西南石油大学)

Dates

Publication Date
2026-05-08
Application Date
2026-04-02

Claims (5)

  1. A bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition, characterized by comprising the following steps:
     Step 1, preprocessing the original logging data and constructing a dataset: perform quality inspection and outlier cleaning on the original logging data, divide the original logging data of different wells into a training set, a validation set and a test set by well number, and complete the construction of the dataset through data conversion and normalization, dataset encapsulation, data iterator construction and data hyperparameter configuration;
     Step 2, constructing the mWDN-BiGRU model framework: first extract the frequency information of the time-series data through the multi-level wavelet decomposition (mWDN) module, connect the mWDN module with a plurality of GRU modules to form a parallel data flow structure, set the basic network parameters and data flow dimensions, and configure the model parameters;
     Step 3, training and evaluating the model: train the model constructed in Step 2 with the training set constructed in Step 1, and build the training framework of the deep time-series network from three dimensions: loss function, optimizer and evaluation indices;
     Step 4, reconstructing the logging curve of the well section with missing data: verify the generalization performance of the model with the test set, save the model parameter file, and perform reconstruction prediction on the logging curve of the missing well section with the trained model.
  2. The bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition according to claim 1, wherein Step 1 comprises:
     Step 1.1, checking and cleaning the original logging data: remove abnormal values caused by downhole environmental interference and instrument faults, ensure that the value range and distribution of the valid data of each logging curve conform to actual geological and physical meaning, and correct systematic errors caused by instrument differences and changes in the measuring environment;
     Step 1.2, constructing the dataset: according to the number of valid wells in each area, divide the original logging data of the valid well sections into a training set, a validation set and a test set by well number, ensure that the data of the same well appear in only one subset, and store and read them as three files: train.csv, fail.csv and test.csv;
     Step 1.3, data conversion and normalization: convert the resistivity RT curve into an RT_log curve by the logarithmic transformation RT_log = lg(RT + c), where RT_log is the resistivity curve after logarithmic transformation and c is the offset; normalize the converted training set data by the Min-Max method and store the normalization parameters as a JSON file in dictionary form, the linear normalization formula being V_norm = (V_log − V_min) / (V_max − V_min), where V_norm is the normalized value, V_log is the original curve value, V_max is the maximum curve value, and V_min is the minimum curve value;
     Step 1.4, sequence data segmentation: express the original dataset as D ∈ R^(n×m), where n is the number of samples of a well dataset and m is the number of features; segment the dataset based on the total amount of data and the target sequence length to obtain sub-datasets D_i, i = 1, 2, ..., l, where l is the target segmented sequence length; then stack the l sub-datasets D_1, D_2, ..., D_l with the array stacking function np.stack of the numpy library in Python, adding a sequence dimension to obtain a three-dimensional tensor dataset: D_new = np.stack((D_1, D_2, ..., D_l), axis=axis), where D_new is the new three-dimensional tensor dataset, D_1, D_2, ..., D_l are the segmented sub-datasets, and axis is the axial parameter of the stacking operation;
     Step 1.5, dataset encapsulation and data iterator construction: define a logging sequence dataset class well_Dataset by inheriting the torch.utils.data.Dataset abstract class of the PyTorch framework; initialize the dataset, receive the segmented sequence sample data, return the total number of samples, check that an index lies within the valid range given that total, return a single training sample by index and convert it to Tensor format; then call the torch.utils.data.DataLoader class of the PyTorch framework and initialize the data iterator with well_Dataset;
     Step 1.6, configuring the data hyperparameters, specifically including the normalization method, the input sequence length and the sliding step length.
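The preprocessing pipeline of claim 2 (log transform, Min-Max normalization, sliding segmentation, np.stack) can be sketched in plain numpy as below. This is an illustrative sketch, not the patent's code: the function names, the offset value and the toy well data are assumptions, and the PyTorch well_Dataset/DataLoader wrapping of step 1.5 is omitted.

```python
import numpy as np

def preprocess_rt(rt, offset=1e-3):
    """Log-transform a resistivity curve (step 1.3). The patent only states
    that an offset c is added before taking lg; 1e-3 is a hypothetical value."""
    return np.log10(rt + offset)

def min_max_normalize(v):
    """Linear Min-Max normalization: V_norm = (V - V_min) / (V_max - V_min).
    Returns the normalized array plus the parameters to store as JSON."""
    v_min, v_max = v.min(), v.max()
    return (v - v_min) / (v_max - v_min), (float(v_min), float(v_max))

def segment_sequences(data, seq_len):
    """Slice an (n, m) well dataset into overlapping windows of length seq_len
    and stack them with np.stack into a 3-D tensor dataset (step 1.4)."""
    windows = [data[i:i + seq_len] for i in range(len(data) - seq_len + 1)]
    return np.stack(windows, axis=0)  # shape: (n - seq_len + 1, seq_len, m)

rt = np.array([10.0, 100.0, 1000.0, 50.0, 20.0, 500.0])   # toy RT curve
rt_log, norm_params = min_max_normalize(preprocess_rt(rt))
data = np.column_stack([rt_log, np.linspace(0, 1, 6)])    # (6, 2) toy well data
tensor = segment_sequences(data, seq_len=3)
print(tensor.shape)  # (4, 3, 2)
```

Each window of the resulting tensor would then be returned by the well_Dataset class and batched by DataLoader.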
  3. The bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition according to claim 1, wherein Step 2 comprises:
     Step 2.1, constructing the network framework: the network consists of an input layer, a multi-level wavelet decomposition network layer, a bidirectional gated recurrent neural network layer, a fully connected neural network layer and an output layer; layers 1 and 5 are the input and output layers, layer 2 is the multi-level wavelet decomposition network, layer 3 is the bidirectional gated recurrent neural network, and layer 4 is a fully connected neural network composed of two linear layers;
     Step 2.2, implementing the multi-level wavelet decomposition network model: first define the form of the weight matrices W_l(i) and W_h(i), in which each row of W_l(i) contains the low-pass filter coefficients l_1, l_2, ..., l_k shifted one position relative to the previous row, each row of W_h(i) likewise contains the high-pass filter coefficients h_1, h_2, ..., h_k, and the remaining matrix positions are occupied by a filler element ε; here W_l(i) and W_h(i) are the low-pass and high-pass weight matrices of the i-th level wavelet decomposition and k is the length of the filter coefficients. Initialize W_l(i) and W_h(i) with a wavelet basis function, use them as filters, and process the time-series data through multi-level decomposition:
     a_l(i) = σ(W_l(i) · x_l(i−1) + b_l(i)),  a_h(i) = σ(W_h(i) · x_l(i−1) + b_h(i)),
     where a_l(i) and a_h(i) are the low-frequency and high-frequency intermediate variables of the i-th level wavelet decomposition, σ(·) is the activation function, x_l(i−1) is the low-frequency approximate signal obtained by the (i−1)-th level decomposition, and b_l(i) and b_h(i) are trainable bias vectors. The i-th level wavelet decomposition yields a low-frequency approximate signal x_l(i) and a high-frequency detail signal x_h(i); in the (i+1)-th level decomposition, x_l(i) is convolved with the low-pass and high-pass filters by sliding convolution to obtain the intermediate variable sequences:
     a_l(i+1)(n) = Σ_{k=1..K} x_l(i)(n+k−1) · l_k,  a_h(i+1)(n) = Σ_{k=1..K} x_l(i)(n+k−1) · h_k,
     where a_l(i+1)(n) and a_h(i+1)(n) are the n-th elements of the low-frequency and high-frequency intermediate variable sequences of the (i+1)-th level wavelet decomposition, x_l(i)(n+k−1) is the (n+k−1)-th element of the i-th level low-frequency subsequence, K is the length of the filter, k is the internal index of the filter coefficients, and l_k and h_k are the low-pass and high-pass filter coefficients. The intermediate variables are downsampled by average pooling:
     x_l(i)(j) = (a_l(i)(2j−1) + a_l(i)(2j)) / 2,  x_h(i)(j) = (a_h(i)(2j−1) + a_h(i)(2j)) / 2,
     where j indexes the high- and low-frequency subsequences after the i-th level decomposition, x_l(i)(j) and x_h(i)(j) are the j-th elements of the finally obtained low-frequency approximate signal and high-frequency detail signal, and a_l(i)(2j−1), a_l(i)(2j) and a_h(i)(2j−1), a_h(i)(2j) are the (2j−1)-th and (2j)-th elements of the low-frequency and high-frequency intermediate variable sequences. A regularization constraint is then added:
     L(θ) = L + α·L_l + β·L_h, with L_l = Σ_i ‖W_l(i) − W̄_l(i)‖² and L_h = Σ_i ‖W_h(i) − W̄_h(i)‖²,
     where L(θ) is the final total loss function, L is the basic loss function, L_l and L_h are the regularization losses of the low-frequency and high-frequency weight matrices, W̄_l(i) and W̄_h(i) are the initialized states of the weight matrices W_l(i) and W_h(i), and α and β are the low-frequency and high-frequency regularization parameters. The network parameters are then iteratively updated with the back-propagation (BP) algorithm:
     W_l(i) ← W_l(i) − η · (∂L/∂W_l(i) + 2α(W_l(i) − W̄_l(i))),  W_h(i) ← W_h(i) − η · (∂L/∂W_h(i) + 2β(W_h(i) − W̄_h(i))),
     where W̄_l(i) and W̄_h(i) are the initialized states of the weight matrices W_l(i) and W_h(i), α and β are the low-frequency and high-frequency regularization parameters, L is the loss function, and η is the learning rate. During forward propagation, the regularization losses L_l and L_h of each wavelet decomposition level are weighted, stacked and averaged into the final regularization loss value, which is output together with the results and acts on the back-propagation optimization of the loss function;
     Step 2.3, configuring the network skeleton: connect the mWDN module with a plurality of GRU modules to form a parallel data flow structure, and set the basic network parameters and data flow dimensions;
     Step 2.4, configuring the model hyperparameters, specifically including the number of wavelet decomposition levels, the type of wavelet basis function, the mWDN low-frequency matrix regularization parameter, the mWDN high-frequency matrix regularization parameter, the hidden layer dimension of the recurrent neural network (RNN), the number of RNN hidden layers, and the hidden layer dimension of the fully connected neural network (FCN).
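One decomposition level of the mWDN described in step 2.2 (sliding convolution with wavelet-initialized filters, sigmoid activation, then average pooling with stride 2) can be sketched in numpy as below. The function names, the Haar filter coefficients and the toy signal are illustrative assumptions; the trainable weight matrices, biases and regularization of the full model are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mwdn_level(x, l_coef, h_coef, b_l=0.0, b_h=0.0):
    """One mWDN decomposition level: slide the low-pass/high-pass filter
    coefficients over x to form the intermediate sequences a_l, a_h, apply
    the activation, then halve the length by averaging adjacent pairs
    (average pooling with stride 2)."""
    K = len(l_coef)
    n_out = len(x) - K + 1
    a_l = sigmoid(np.array([sum(x[n + k] * l_coef[k] for k in range(K))
                            for n in range(n_out)]) + b_l)
    a_h = sigmoid(np.array([sum(x[n + k] * h_coef[k] for k in range(K))
                            for n in range(n_out)]) + b_h)
    pool = lambda a: (a[0:len(a) // 2 * 2:2] + a[1:len(a) // 2 * 2:2]) / 2.0
    return pool(a_l), pool(a_h)  # low-freq approximation, high-freq detail

# Haar wavelet coefficients, one possible wavelet-basis initialization
l_coef = [1 / np.sqrt(2), 1 / np.sqrt(2)]
h_coef = [1 / np.sqrt(2), -1 / np.sqrt(2)]

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 4 * np.pi, 17)) + 0.05 * rng.normal(size=17)
x_l, x_h = mwdn_level(x, l_coef, h_coef)
print(len(x_l), len(x_h))  # 8 8
```

Feeding x_l back into mwdn_level would give the next decomposition level, and each x_l(i)/x_h(i) pair would feed one of the parallel GRU branches of step 2.3.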
  4. The bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition according to claim 1, wherein Step 3 comprises:
     Step 3.1, selecting MSE as the loss function: MSE = (1/N) Σ_{i=1..N} (y_i − ŷ_i)², where y_i is the true value, ŷ_i is the predicted value, and N is the length of the logging sequence;
     Step 3.2, selecting Adam as the training optimizer and adopting a learning rate scheduling strategy combining warm-up and cosine annealing: in the initial stage of training the learning rate is increased linearly, and in the later stage it is decreased by cosine annealing:
     η_t = η_0 · t / T_warm, for t ≤ T_warm;
     η_t = η_min + (1/2)(η_0 − η_min)(1 + cos(π(t − T_warm)/(T_max − T_warm))), for t > T_warm;
     where T_warm is the number of epochs controlling the switch from warm-up to cosine annealing, T_max is the maximum number of training rounds, t is the number of completed training iterations, η_t is the learning rate of the t-th training round, η_0 is the initial learning rate, and η_min is the minimum learning rate; meanwhile, the evaluation indices are monitored and an early stopping mechanism terminates training in advance;
     Step 3.3, selecting the mean absolute error (MAE), the Pearson correlation coefficient and the coefficient of determination R² as evaluation indices, all computed in the original physical dimension after inverse normalization:
     MAE = (1/N) Σ_{i=1..N} |y_i − ŷ_i|, where y_i is the true value, ŷ_i is the predicted value, and N is the length of the logging sequence;
     Pearson = Σ_{i=1..t} (m_i − m̄)(n_i − n̄) / √(Σ_{i=1..t} (m_i − m̄)² · Σ_{i=1..t} (n_i − n̄)²), where t is the total number of sampling points and m_i and n_i are the sampling point values of the real curve and the reconstructed curve;
     R² = 1 − Σ_{i=1..N} (y_i − ŷ_i)² / Σ_{i=1..N} (y_i − ȳ)², where N is the length of the logging sequence, y_i is the true value, ŷ_i is the predicted value, and ȳ is the mean of the true values;
     Step 3.4, configuring the training hyperparameters, including the training batch size, the initial learning rate, the weight decay coefficient, the gradient clipping ratio, and the number of early-stopping patience epochs to wait when the validation set evaluation index does not improve.
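The training utilities of claim 4 (warm-up plus cosine annealing schedule, MAE, Pearson correlation, R²) can be sketched in plain numpy as below. The function names and toy values are illustrative assumptions; in practice the schedule and metrics would hook into a PyTorch training loop.

```python
import numpy as np

def lr_schedule(t, T_warm, T_max, eta0, eta_min):
    """Warm-up + cosine annealing (step 3.2): linear ramp to eta0 over
    T_warm epochs, then cosine decay from eta0 down to eta_min at T_max."""
    if t <= T_warm:
        return eta0 * t / T_warm
    progress = (t - T_warm) / (T_max - T_warm)
    return eta_min + 0.5 * (eta0 - eta_min) * (1 + np.cos(np.pi * progress))

def mae(y, y_hat):
    """Mean absolute error over a logging sequence."""
    return np.mean(np.abs(y - y_hat))

def pearson(m, n):
    """Pearson correlation between real curve m and reconstructed curve n."""
    m_c, n_c = m - m.mean(), n - n.mean()
    return np.sum(m_c * n_c) / np.sqrt(np.sum(m_c**2) * np.sum(n_c**2))

def r2(y, y_hat):
    """Coefficient of determination R^2."""
    return 1 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)

y = np.array([1.0, 2.0, 3.0, 4.0])
print(lr_schedule(5, 10, 100, 1e-3, 1e-6))   # mid warm-up: half of eta0
print(mae(y, y), pearson(y, y + 1), r2(y, y))
```

A constant offset leaves the Pearson correlation at 1.0, which is why the patent pairs it with MAE and R² computed after inverse normalization.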
  5. The bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition according to claim 1, wherein Step 4 comprises:
     Step 4.1, model saving and reloading: build a hierarchical file management structure organized by prediction target, training dataset, prediction model and input sequence length; save the trained model in full-model form; save the data and training-process parameters as state-dictionary parameters, stored as JSON files in the same directory level as the model files; and manage the full set of run parameters with a command-line parameter management system based on the argparse library;
     Step 4.2, model inference and output: input the preprocessed logging sequence data into the model to execute the forward propagation calculation, directly output the predicted values of the target logging curve, evaluate the precision and analyze the effect of the prediction results on the validation and test wells, and construct the logging curve of the well section with missing data.
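The file management and parameter saving of step 4.1 can be sketched as below. The directory layout, file names, CLI flags and default values are hypothetical illustrations; saving of the actual PyTorch model weights (torch.save of the full model and state dict) is omitted so the sketch stays self-contained.

```python
import argparse
import json
import os
import tempfile

# Hypothetical CLI managed with argparse, as in step 4.1; flag names and
# defaults are illustrative, not taken from the patent.
parser = argparse.ArgumentParser(description="mWDN-BiGRU reconstruction run")
parser.add_argument("--target", default="AC")         # curve to reconstruct
parser.add_argument("--seq_len", type=int, default=64)
parser.add_argument("--lr", type=float, default=1e-3)
args = parser.parse_args([])  # empty list -> defaults, for demonstration

# One directory per (target, sequence length) combination, a hypothetical
# instance of the hierarchical file management structure.
run_dir = os.path.join(tempfile.mkdtemp(), f"{args.target}_seq{args.seq_len}")
os.makedirs(run_dir)

# Store normalization parameters and the full run configuration as JSON
# next to where the model file would live.
norm_params = {"RT_log": {"min": 0.2, "max": 3.7}}  # illustrative values
with open(os.path.join(run_dir, "norm.json"), "w") as f:
    json.dump(norm_params, f)
with open(os.path.join(run_dir, "config.json"), "w") as f:
    json.dump(vars(args), f)

# Reload the configuration at inference time before the forward pass.
with open(os.path.join(run_dir, "config.json")) as f:
    cfg = json.load(f)
print(cfg["target"], cfg["seq_len"])  # AC 64
```

Keeping the JSON sidecar files alongside the model file lets inference (step 4.2) reproduce the exact normalization and sequence-length settings used in training.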

Description

Bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition

Technical Field

The invention belongs to the technical field of oil and gas geophysical exploration and logging data processing, and in particular relates to a bidirectional gated recurrent neural network logging curve reconstruction method based on multi-level wavelet decomposition.

Background

Logging curves such as acoustic, resistivity and natural gamma are the core basis for identifying subsurface lithology, calculating reservoir physical parameters and finely describing oil and gas reservoirs. In actual operation, however, the logging curves of key well sections are distorted or severely lost due to factors such as instrument faults, borehole collapse, harsh drilling environments or data transmission loss. Re-logging is costly and often infeasible for completed well sections, so logging curve reconstruction has become a key means of compensating for missing data and improving data quality.
Traditional curve reconstruction methods mainly comprise interpolation methods and empirical formula methods. Although these can fill missing segments to a certain extent, most are based on linear or stationary-signal assumptions and struggle to capture the complex nonlinear relationships and local abrupt features among subsurface formation parameters. To better mine the sequential dependency of logging curves along the depth sequence, recurrent neural networks and their variants, such as the long short-term memory network and the gated recurrent unit, have gradually been applied. However, traditional unidirectional networks cannot exploit "future" context information, resulting in insufficient perception of formation interfaces and sedimentary cycle changes; bidirectional recurrent neural networks alleviate this problem to a certain extent by combining historical and future information. Nevertheless, existing models remain limited when facing the core challenge of log reconstruction: the decoupling of multi-scale features and noise suppression. Logging data are multi-scale signals, comprising local disturbances caused by high-frequency noise such as borehole wall collapse and overall changes in low-frequency trends such as formation compaction. Conventional time-series modeling can be limited by high-frequency noise interference, low-frequency trend confusion and the difficulty of cross-scale modeling, and purely data-driven models, being based mainly on time-domain features, are insufficiently sensitive to frequency-domain features and easily disturbed by high-frequency noise, making low-frequency geological trends hard to capture accurately.
Disclosure of Invention

The invention aims to provide a logging curve reconstruction method based on a bidirectional gated recurrent neural network with multi-level wavelet decomposition (mWDN-BiGRU), which solves the problems of decoupling the multi-scale features of logging curves and suppressing noise. A frequency-domain method is incorporated into the modeling task of logging curve reconstruction: a multi-level wavelet decomposition module separates the high-frequency noise and low-frequency signals of the logging curve, a bidirectional gated recurrent network module performs deep modeling of the low-frequency principal components and optimizes and screens the high-frequency auxiliary components, and the joint optimization of frequency-domain noise reduction and time-series modeling preserves geological trends and enhances the extraction of useful frequency-band signals. The flow of the method is shown in figure 1. The technical scheme adopted by the invention is as follows: 1.
A logging curve reconstruction method based on multi-level wavelet decomposition and a bidirectional gated recurrent neural network, the method comprising: Step 1, preprocessing the original logging data and constructing a dataset: perform quality inspection and outlier cleaning on the original logging data, divide the original logging data of different wells into a training set, a validation set and a test set by well number, and complete the construction of the dataset through data conversion and normalization, dataset encapsulation, data iterator construction and data hyperparameter configuration; Step 2, constructing the mWDN-BiGRU model framework: first extract the frequency information of the time-series data through the multi-level wavelet decomposition (mWDN) module, connect the mWDN module with a plurality of GRU modules to form a parallel data flow structure, setting the basic network parameters and dat