CN-121997035-A - Ultra-deep gas well pipe string leakage intelligent identification method based on time sequence data and time-frequency fusion BiLSTM model

Abstract

The invention relates to an intelligent method for identifying pipe string leakage in ultra-deep gas wells, based on time-series data and a time-frequency fusion BiLSTM model, and belongs to the technical field of petroleum engineering. The method comprises the following steps: (1) obtaining the input physical features of the model and constructing an experimental dataset; (2) constructing and training a multi-module fusion integrated deep learning model; and (3) identifying ultra-deep gas well pipe string leakage with the trained multi-module fusion integrated deep learning model. The method learns the input physical features efficiently and achieves high-precision regression prediction of leak location and leak size.

Inventors

  • SUN XIAOHUI
  • Qiu Shanhui
  • WANG ZHIYUAN
  • Lou Wenqiang
  • SUN BAOJIANG
  • WANG JINTANG
  • YANG CHEN
  • HE HAIKANG

Assignees

  • China University of Petroleum (East China) (中国石油大学(华东))

Dates

Publication Date
2026-05-08
Application Date
2026-04-10

Claims (8)

  1. An intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model, characterized by comprising the following steps: step 1, acquiring the input physical features of the model and constructing an experimental dataset; step 2, constructing and training a multi-module fusion integrated deep learning model; step 3, performing intelligent identification of ultra-deep gas well pipe string leakage with the trained multi-module fusion integrated deep learning model; wherein the multi-module fusion integrated deep learning model comprises a time-frequency dual-stream feature extraction layer, a global temporal evolution modeling layer, an adaptive key feature aggregation layer and a gated decision regression layer; in the time-frequency dual-stream feature extraction layer, the standardized 9-dimensional physical monitoring signal is processed by a parallel dual-stream structure comprising a time-domain branch and a frequency-domain branch: the time-domain branch uses a multi-scale convolutional network whose convolution kernels of different sizes capture, in parallel, the high-frequency abrupt changes of micro leaks and the low-frequency trends of large leaks; the frequency-domain branch maps the signal to the frequency domain with the fast Fourier transform (FFT) and extracts the Top-K dominant energy features; the features of the two branches are concatenated along the channel dimension to form a mixed feature vector containing local details and global distribution; in the global temporal evolution modeling layer, a bidirectional long short-term memory network (BiLSTM) processes the mixed feature vector and establishes the long-range context dependence of the full leakage life cycle through a forward and reverse dual-channel mechanism; in the adaptive key feature aggregation layer, a learnable global query vector automatically suppresses steady-state background noise, dynamically focuses on the key time steps at which a pressure dip or vibration occurs, and compresses the variable-length temporal features into a fixed-length, highly discriminative context vector; in the gated decision regression layer, a gated residual network (GRN) serves as the regression head, a GLU gating unit performs secondary screening of the aggregated features to suppress redundant information and amplify the core leakage features, and a linear layer mapping finally outputs the leak location and size.
  2. The intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model, characterized in that acquiring the input physical features of the model and constructing the experimental dataset comprises: selecting nine parameters as the model input physical features, namely annulus top pressure, annulus top fluid temperature, tubing top pressure, tubing top fluid temperature, reservoir top pressure, reservoir top fluid temperature, annulus total water content, annulus top void fraction and tubing top total volume flow rate, wherein the experimental dataset consists of these model input physical features.
  3. The intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to claim 1, characterized in that the time-frequency dual-stream feature extraction layer is implemented as follows: the 9-dimensional physical monitoring signal, preprocessed by Z-score standardization, is input into the multi-scale convolutional network; the original signal is $X \in \mathbb{R}^{T \times 9}$, where $T$ is the length of the time series and each dimension represents one type of physical monitoring index; each dimension $i$ is standardized independently to obtain the standardized signal $\tilde{X}$, whose elements $\tilde{x}_{t,i}$ are calculated as $\tilde{x}_{t,i} = (x_{t,i} - \mu_i) / \sigma_i$, where $x_{t,i}$ is the original value of the $i$-th dimension signal at the $t$-th time point, and $\mu_i$ and $\sigma_i$ are respectively the mean and standard deviation of the $i$-th dimension signal on the training set; the multi-scale convolutional network comprises four parallel convolution branches whose convolution kernel sizes are respectively set as follows […]; the number of output channels of every convolution branch is uniformly set to 32, and each branch is followed by a ReLU activation function; the final output of the parallel multi-scale CNN module is obtained by concatenating the branch output tensors along the channel dimension: $H_{\mathrm{cnn}} = \mathrm{Concat}_{k}(H_k)$, where $H_k$ denotes the output feature tensor of the branch with convolution kernel size $k$, finally forming a 128-dimensional local-global fusion feature vector; the real fast Fourier transform (RFFT) is applied to the standardized 9-dimensional physical monitoring signal, the spectral amplitude is computed as the modulus of the complex spectrum, and a Top-K selection strategy retains, for each channel, only the K components with the largest amplitude: $A_p = \mathrm{TopK}(|\mathcal{F}(\tilde{x}_p)|)$, where $A_p$ denotes the finally retained sparse frequency-domain amplitude feature of the $p$-th channel, $\mathcal{F}(\tilde{x}_p)$ is the complete complex spectrum of the input sequence $\tilde{x}_p$ after RFFT, and $\mathrm{TopK}(\cdot)$ denotes the Top-K truncation operation, which sorts the spectral amplitudes and keeps only the K components with the largest energy; the extracted frequency-domain feature vector is flattened and replicated along the time dimension; the time-domain features of the original input $\tilde{X}$ and the expanded frequency-domain features $F_{\mathrm{exp}}$ are concatenated: $X_{\mathrm{freq}} = \mathrm{Concat}(\tilde{X}, F_{\mathrm{exp}})$; finally, an output tensor $X_{\mathrm{freq}}$ of dimension […] is obtained, and $X_{\mathrm{freq}}$ is concatenated with the multi-scale CNN output features $H_{\mathrm{cnn}}$ to form a highly discriminative input containing time-domain local detail and frequency-domain global distribution.
  4. The intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to claim 3, characterized in that the global temporal evolution modeling layer is implemented as follows: a bidirectional long short-term memory network (BiLSTM) is constructed from two stacked bidirectional LSTM layers, each comprising a forward LSTM chain and a backward LSTM chain; at the input, the 128-dimensional time-domain features extracted by the multi-scale CNN are first concatenated along the channel dimension with the 153-dimensional frequency-domain features output by the FFT module to form 281-dimensional mixed feature vectors; invalid padding positions in the variable-length sequences are eliminated by a sequence packing operation; the first LSTM layer processes the input sequence bidirectionally and concatenates the forward and backward hidden states into a 128-dimensional feature at each time step; the second LSTM layer further models the long-range temporal dependencies and finally outputs a 128-dimensional deep temporal feature sequence; specifically: the received input comprises the 128-dimensional time-domain features extracted by the multi-scale CNN and the 153-dimensional frequency-domain features output by the FFT module, so that the input dimension is $D_{\mathrm{in}} = 128 + 153 = 281$; a binary mask matrix $M$ is constructed: $M_{b,t} = \mathbb{I}(t \le L_b)$, where $L_b$ is the true effective length of the $b$-th sample, $t$ is the current time step (the position index in the sequence), and $\mathbb{I}(\cdot)$ is the indicator function; a sequence packing transformation reorganizes the three-dimensional tensor into a compact two-dimensional tensor sequence, in which the input at step $t$ contains only the valid sample data whose current-time mask $M_{b,t}$ equals 1; the BiLSTM performs feature extraction with two stacked LSTM layers and uses a bidirectional propagation mechanism to calculate the forward hidden state sequence $\overrightarrow{h}_t$ and the reverse hidden state sequence $\overleftarrow{h}_t$: $\overrightarrow{h}_t = \mathrm{LSTM}_{\mathrm{fwd}}(x_t, \overrightarrow{h}_{t-1})$, $\overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{bwd}}(x_t, \overleftarrow{h}_{t+1})$; each LSTM layer has 64 hidden units, and the forward and reverse states at the same time step are concatenated as $h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$, where $h_t$ denotes the combined hidden state fusing context information at time step $t$, with feature dimension 128; after the two stacked bidirectional layers, a sequence restoration operation is applied to the packed hidden states to restore them to the original batch structure, invalid positions are filled with zeros, and the feature tensor $H$ is finally output.
  5. The intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to claim 4, characterized in that the adaptive key feature aggregation layer is implemented as follows: an adaptive multi-head attention mechanism (AMA) serves as the temporal aggregation layer; the adaptive key feature aggregation layer receives the hidden state sequence $H$ output by the BiLSTM and introduces a learnable global query parameter $q$; the input sequence generates the Key and Value matrices after a Split operation; the query computes attention scores with the Key through matrix multiplication; invalid padding positions are masked through Masking; adaptive weights are obtained after Softmax normalization; the output feature of each attention head is generated by a weighted sum over the Value; finally, the outputs of the multiple attention heads pass through concatenation, linear projection and layer normalization to output the global context vector; specifically: first, a globally learnable query vector $q$ is defined; the attention mechanism adopts a multi-head design comprising $h$ parallel processing heads; for the $i$-th processing head, the input $H$ and the global query parameter $q$ are respectively projected to a low-dimensional space to generate the corresponding Query, Key and Value matrices: $Q_i = q W_i^{Q}$, $K_i = H W_i^{K}$, $V_i = H W_i^{V}$, where $W_i^{Q}$, $W_i^{K}$ and $W_i^{V}$ are the linear projection matrices of the $i$-th processing head and $d_k$ is the subspace dimension; within each processing head, the dot-product similarity of the Query matrix and the Key matrix is calculated and divided by the scaling factor $\sqrt{d_k}$; a key padding mask mechanism is introduced: for time step $t$, if the data corresponding to that time step belongs to the padding region, the attention score $e_{i,t}$ is set to negative infinity, as follows: $e_{i,t} = Q_i k_{i,t}^{\top} / \sqrt{d_k}$; $e_{i,t} = -\infty$ if time step $t$ is a padding position; where $k_{i,t}$ denotes the Key vector corresponding to time step $t$ in the $i$-th processing head, and $e_{i,t}$ denotes the unnormalized attention score calculated from the global Query and the Key vector of that time step; subsequently, a normalized attention weight distribution $\alpha_{i,t}$ is generated with the Softmax function, and the Value matrix is weighted and summed with the calculated weights to obtain the context vector $c_i$ of the $i$-th processing head: $c_i = \sum_{t=1}^{T} \alpha_{i,t} v_{i,t}$, where $v_{i,t}$ denotes the Value vector corresponding to time step $t$ in the $i$-th processing head; finally, the outputs of all $h$ processing heads are concatenated, linearly projected by $W^{O}$ and layer-normalized to obtain the final fixed-length global context vector $c$: $c = \mathrm{LayerNorm}(\mathrm{Concat}(c_1, \dots, c_h) W^{O})$; the final output $c$ is a high-dimensional feature vector independent of the sequence length $T$.
  6. The intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to claim 5, characterized in that the gated decision regression layer is implemented as follows: a gated residual network (GRN) is introduced as the final regression and information fusion module; the GRN comprises a gated linear unit (GLU), a residual connection and layer normalization, and has two parallel paths; in the nonlinear transformation path, the input vector, i.e. the final fixed-length global context vector $c$, first passes through a linear layer and an ELU activation function to generate the latent feature representation $z$, which is then divided into a data stream and a gate stream; the gate stream generates weight coefficients in the interval (0, 1) through a Sigmoid function, and these are multiplied element by element with the data stream to obtain the transformed feature $\tilde{z}$, as follows: $\tilde{z} = \sigma(W_g z + b_g) \odot (W_v z + b_v)$, where $\sigma$ is the Sigmoid activation function, $W_g$ and $b_g$ are respectively the learnable weight matrix and bias vector that control the opening degree of the gate, and $W_v$ and $b_v$ are respectively the learnable weight matrix and bias vector that apply the linear transformation to the original data; in the residual connection path, the original input, i.e. the final fixed-length global context vector $c$, is added directly to the transformed feature $\tilde{z}$ and layer-normalized to obtain the final output $o$: $o = \mathrm{LayerNorm}(c + \tilde{z})$; based on the highly discriminative feature, i.e. the final output vector $o$, two independent linear heads respectively output the quantitative predictions of the leak location $\hat{y}_{\mathrm{loc}}$ and the leak size $\hat{y}_{\mathrm{size}}$: $\hat{y}_{\mathrm{loc}} = w_{\mathrm{loc}}^{\top} o + b_{\mathrm{loc}}$; $\hat{y}_{\mathrm{size}} = w_{\mathrm{size}}^{\top} o + b_{\mathrm{size}}$; where $w_{\mathrm{loc}}$ and $w_{\mathrm{size}}$ are the weight vectors of the regression heads, used to map the high-dimensional features to scalar outputs, and $b_{\mathrm{loc}}$ and $b_{\mathrm{size}}$ are the bias terms of the regression heads.
  7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to any one of claims 1-6.
  8. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model according to any one of claims 1-6.
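
The frequency-domain branch described in claim 3 (per-channel Z-score standardization, RFFT amplitude, Top-K selection, flattening, replication along the time dimension, and concatenation with the standardized input) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, a naive discrete Fourier transform stands in for a real RFFT routine, and K = 16 and T = 64 are assumed values. The source does not state K; 16 is chosen only because 9 + 9 × 16 = 153 matches the 153-dimensional frequency-domain feature width given in claim 4.

```python
import cmath
import math


def zscore(column, mean, std):
    """Standardize one channel with statistics taken from the training set."""
    return [(x - mean) / std for x in column]


def rfft_amplitudes(signal):
    """Amplitudes |X[k]| for k = 0..T//2 of a real signal.

    Naive O(T^2) DFT used here only to stay dependency-free;
    a real pipeline would call an FFT library instead.
    """
    n = len(signal)
    amps = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        amps.append(abs(acc))
    return amps


def top_k(values, k):
    """Sort amplitudes and keep the k largest (descending order)."""
    return sorted(values, reverse=True)[:k]


def frequency_branch(channels, k):
    """channels: C standardized series, each of length T.

    Returns T rows. Each row holds the C channel values at that time step
    followed by the flattened C*k Top-K amplitudes, which are identical
    in every row (replication along the time dimension).
    """
    t_len = len(channels[0])
    flat = [a for ch in channels for a in top_k(rfft_amplitudes(ch), k)]
    return [[ch[t] for ch in channels] + flat for t in range(t_len)]


# Assumed sizes: T = 64 time steps, C = 9 channels, K = 16 (hypothetical).
T, C, K = 64, 9, 16
channels = [[math.sin(2 * math.pi * (c + 1) * t / T) for t in range(T)]
            for c in range(C)]
features = frequency_branch(channels, K)
print(len(features), len(features[0]))  # 64 153
```

Concatenating each 153-wide row with a 128-wide multi-scale CNN feature would give the 281-dimensional BiLSTM input of claim 4. Keeping the sorted amplitudes while discarding their frequency indices is one reading of the Top-K truncation described in claim 3.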

Description

Ultra-deep gas well pipe string leakage intelligent identification method based on time sequence data and time-frequency fusion BiLSTM model

Technical Field

The invention belongs to the technical field of petroleum engineering, and particularly relates to an intelligent identification method for ultra-deep gas well pipe string leakage based on time-series data and a time-frequency fusion BiLSTM model.

Background

The increasing depletion of shallow hydrocarbon resources has shifted the focus of global exploration toward ultra-deep formations. At the same time, the high temperature, high pressure and complex geological environment of ultra-deep wells significantly worsen wellbore operating conditions, making ultra-deep wells a weak link for gas well seal integrity. As the core production channel for formation fluids, the production string is extremely prone to structural damage or corrosion failure under coupled multi-physics fields and long-term dynamic loading, and its leakage risk is far higher than that of other barriers. Once the string leaks, annulus pressure builds up continuously, which can induce casing failure and annular or even underground blowouts, with serious consequences such as environmental pollution, platform displacement and well abandonment. Gas well leakage can also cause fires and explosions, posing a major threat to public safety and the surrounding environment. Identifying leakage in ultra-deep gas well strings is therefore of great significance.

However, the fluid phase behavior in the wellbore is complex and varied, and gas has distinctive physical properties. A leak in a gas well production string disturbs the local temperature, acoustic and pressure fields, but the resulting signals have unknown characteristics and low intensity. Leak identification in production strings is therefore difficult, and current research faces several problems. The multiphase flow behavior of fluids in the wellbore is the physical basis for understanding leakage mechanisms and the evolution of leakage characteristics. Academia has built a mature theoretical framework for the hydrodynamic mechanism of production string leakage, with research focused mainly on three directions: multiphase flow pattern discrimination, annulus pressure prediction, and numerical simulation of the leakage flow field. Research on transient, whole-process leakage models is still incomplete, however, and existing models show clear deficiencies in accurately predicting changes in wellbore fluid temperature, pressure, gas production and similar characteristics. Because existing models cannot fully capture transient and whole-process leakage characteristics, they struggle to provide reliable data support for building leak point identification algorithms, and their generalization capability is limited. Leak identification is divided into qualitative identification (determining whether a leak has occurred) and quantitative identification (determining the location and size of the leak). Qualitative identification techniques are widely used in industry. Quantitative identification is more challenging, but it has higher engineering value because it provides direct guidance for remedial strategies.

Early identification methods relied mainly on comparing fluid composition; with the development of sensing technology, detection based on physical fields (acoustic, temperature, pressure and flow fields) became mainstream. Although detection means are increasingly abundant, problems remain in quantitatively identifying the location and size of leaks. The existing acoustic array method requires stopping production and pulling the string, which is costly, operationally complex and not real-time. In addition, current techniques mainly make qualitative judgments based on simple parameters such as temperature and pressure; they lack a corresponding transient whole-process physical model and machine learning method, and can hardly identify the specific location and size of a leak. With growing computing power and the development of big data technology, machine learning and deep learning have become core tools for solving complex nonlinear pattern recognition and parameter prediction problems. At present, however, research in this area is concentrated on pipeline transportation systems, and research on gas well production strings is relatively scarce. Compared with the rapid development of artificial intelligence frameworks, related research on wellbores and production strings is still at an early stage, and prior studies offer few results for quantitatively detect