CN-122021223-A - Intelligent well data model and modeling method


Abstract

The invention relates to the technical field of data models and modeling, and in particular to an intelligent well data model comprising a data acquisition module and a data analysis module. The data acquisition module acquires formation pressure, saturation pressure, reservoir porosity, reservoir permeability, bottom-hole flowing pressure, formation pressure, temperature, flow rate, concentration, odor, vibration, oil drainage area, pipeline cleaning frequency, and the viscosity of the medium in the pipeline for a low-pressure underground production well. For a low-pressure, low-yield well, the invention uses the data model to analyze the energy replenishment capability of the gas reservoir, the variation pattern of the bottom-hole natural gas release cycle, and the reduction of system back pressure, and establishes the most economical equipment configuration and the optimal production regime for the booster device according to the pressure, yield, and pipeline distance of each gas well, thereby ensuring that the pressurized natural gas can enter the gathering and transportation pipeline network.

Inventors

REN XIAORONG
XU WENLONG
SHI MINGXIA
SHI LEI
WANG JIAN
ZHANG RUI
CHEN PENG
ZHI RUI
LI MENG
HUANG SHASHA
GUO GUIJIAO
XU ZIQIANG

Assignees

中国石油天然气股份有限公司

Dates

Publication Date: 20260512
Application Date: 20241111

Claims (8)

1. An intelligent well data model, characterized by comprising a data acquisition module and a data analysis module;
the data acquisition module is used for acquiring formation pressure, saturation pressure, reservoir porosity, reservoir permeability, bottom-hole flowing pressure, formation pressure, temperature, flow, concentration, odor, vibration, oil drainage area, pipeline cleaning frequency, and viscosity of the medium in the pipeline of a low-pressure ground production well, and for dividing the data into three categories, wherein the formation pressure, saturation pressure, reservoir porosity, and reservoir permeability constitute gas reservoir energy replenishment capability data; the bottom-hole flowing pressure, formation pressure, temperature, pressure, flow, concentration, odor, vibration, and oil drainage area constitute bottom-hole natural gas release cycle variation data; and the pipeline cleaning frequency and the viscosity of the medium in the pipeline constitute system back-pressure reduction data; the data are converted into feature vectors and input into the data model;
the data analysis module contains a data model, which operates in the following steps:
(1) reading a CSV file containing historical data, and storing the time and the corresponding data in two separate variables;
(2) preprocessing the data, including data cleaning, feature normalization, discrete feature encoding, and continuous feature discretization; the data cleaning comprises missing value handling, outlier handling, and sample imbalance handling; the feature normalization is min-max normalization, which scales values into the range [0, 1] as $x' = \frac{x - \min}{\max - \min}$, where max is the maximum value of the sample, min is the minimum value of the sample, $x'$ is the normalized value, and $x$ is the input value; the discrete feature encoding includes one-hot encoding, numerical encoding, and embedding encoding; the continuous feature discretization converts each continuous feature into a series of 0/1 features that are passed to the model, the discretization modes being equal-frequency, equal-distance, and clustering;
(3) feature selection, feature construction, feature conversion, and feature encoding;
(4) training the model: first, the energy replenishment capability of the gas reservoir is analyzed by inputting the processed gas reservoir energy replenishment capability data into the data model, expressed as $y_1 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed formation pressure, saturation pressure, reservoir porosity, and reservoir permeability, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_1$ is the output of the neuron, representing the energy replenishment capability of the gas reservoir; then the variation pattern of the bottom-hole natural gas release cycle is analyzed by inputting the processed bottom-hole natural gas release cycle variation data into the data model, expressed as $y_2 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed bottom-hole flowing pressure, formation pressure, temperature, flow, concentration, odor, vibration, and oil drainage area, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_2$ is the output of the neuron, representing the variation pattern of the bottom-hole natural gas release cycle; then the reduction of system back pressure is analyzed by inputting the processed system back-pressure reduction data into the data model, expressed as $y_3 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed pipeline cleaning frequency and viscosity of the medium in the pipeline, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_3$ is the output of the neuron, representing the reduction of system back pressure; the result is output to the design module;
(7) the different pressures, yields, and pipeline distances of the gas wells, together with $y_1$, $y_2$, and $y_3$, are input into the data model again; the output results are divided into three categories and, in combination with a configuration scheme library and an optimal production regime library provided in the data analysis module, are matched to the most economical equipment configuration and the most economical production regime of the booster device.
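The min-max normalization of step (2) and the single-neuron computation of step (4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not name an activation function or give trained weights, so the sigmoid activation, the example weights and readings, and the function names `min_max_normalize` and `neuron` are all hypothetical.

```python
import math


def min_max_normalize(values):
    """Scale a list of numbers into [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: the formula is undefined here, so zeros are
        # returned. This fallback is an assumption, not part of the claim.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sigmoid(z):
    """Assumed activation function; the claim only says 'activation function'."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Single neuron: activation(sum_i w_i * x_i + b)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical, already-normalized readings in the order: formation pressure,
# saturation pressure, reservoir porosity, reservoir permeability.
x = [0.6, 0.4, 0.3, 0.5]
w = [0.5, -0.2, 0.8, 0.1]  # illustrative weights, not trained values
y1 = neuron(x, w, bias=0.0)
```

The same `neuron` function would apply to the other two data categories with their own inputs and weights.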
2. The intelligent well data model of claim 1, wherein missing values are handled as follows: when less than 20% of the data is missing, the values are filled directly, using the mean for continuous features and the mode for discrete features; when more than 20% and less than 50% of the data is missing, the missingness is constructed as a new feature by adding a discrete feature column; and when more than 50% of the data is missing, the feature is deleted directly.
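The threshold rule of claim 2 can be sketched as a small Python function. This is an illustrative sketch under assumptions: `None` marks a missing entry, the column is non-empty, missing ratios of exactly 20% and 50% (which the claim leaves unspecified) fall into the middle branch, and the middle branch returns only the new 0/1 indicator column. The name `handle_missing` is hypothetical.

```python
from statistics import mean, mode


def handle_missing(column, is_continuous):
    """Return (action, columns) for one feature; None marks a missing entry."""
    ratio = sum(v is None for v in column) / len(column)
    present = [v for v in column if v is not None]
    if ratio > 0.5:
        # More than 50% missing: delete the feature.
        return "drop", []
    if ratio >= 0.2:
        # 20% to 50% missing: encode missingness as a new discrete column.
        indicator = [1 if v is None else 0 for v in column]
        return "indicator", [indicator]
    # Less than 20% missing: fill with the mean (continuous) or mode (discrete).
    fill = mean(present) if is_continuous else mode(present)
    return "fill", [[fill if v is None else v for v in column]]
```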
3. The intelligent well data model of claim 1, wherein outliers are detected by statistical analysis, box plots, clustering, the 3σ principle, and the isolation forest method, and are handled by one of the following methods: (1) directly deleting samples containing outliers; (2) treating the outlier as a missing value and handling it with the missing value method; (3) nearest-value correction, in which the outlier is corrected using a nearby observed value; (4) no processing, in which data modeling is performed directly on the dataset containing outliers.
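Two of the detection methods named in claim 3, the 3σ principle and the box plot rule, can be sketched with the Python standard library. This is a hedged illustration: the function names are hypothetical, the population standard deviation is assumed for the 3σ check, and the conventional 1.5 × IQR whisker factor is assumed for the box plot rule, since the claim specifies neither.

```python
from statistics import mean, pstdev, quantiles


def three_sigma_outliers(values):
    """Values farther than three standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) > 3 * sigma]


def iqr_outliers(values, k=1.5):
    """Box plot rule: values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]
```

The flagged values can then be deleted, treated as missing, or replaced by a nearby observation, as the claim enumerates.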
4. The intelligent well data model of claim 1, wherein the sample imbalance handling methods comprise: (1) downsampling/undersampling: randomly drawing samples from the majority class so as to reduce the amount of majority-class data and balance the data; (2) upsampling/oversampling: increasing the number of minority-class samples by sampling to balance the data; (3) the SMOTE algorithm: generating minority-class samples synthetically; (4) focal loss: modifying the cross-entropy loss function by adding a class weight and a sample-difficulty weighting factor; (5) loss function weighting: setting the weights so that the loss for misclassifying minority-class data is larger than the loss for misclassifying majority-class data.
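Methods (1) and (4) of claim 4 can be sketched as follows. This is a minimal sketch, not the patent's implementation: the function names are hypothetical, the focal loss is written in its common binary form, and the defaults `alpha=0.25` and `gamma=2.0` are customary choices assumed here because the claim gives no values.

```python
import math
import random


def undersample(samples, labels, seed=0):
    """Randomly keep only as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for y, group in by_class.items():
        for s in rng.sample(group, target):
            balanced.append((s, y))
    return balanced


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p of class 1, label y in {0, 1}.

    alpha is the class weight and (1 - p_t) ** gamma is the sample-difficulty
    factor that down-weights easy, well-classified samples. Requires 0 < p < 1.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the difficulty factor disappears and the expression reduces to class-weighted cross-entropy, which corresponds to method (5).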
5. The intelligent well data model of claim 1, wherein one-hot encoding regards each value of a discrete feature as a state: if a feature has N different values, the feature is abstracted into N different states, exactly one of which has a state bit of 1 while the other state bits are 0; numerical encoding maps the categories directly to 1, 2, 3, and so on; and embedding encoding maps high-dimensional input data to a low-dimensional vector space by a linear transformation.
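One-hot and numerical encoding as described in claim 5 can be illustrated in a few lines of Python. The function names and the example category values are hypothetical, and the sorted order used to assign states and integer codes is an assumption, since the claim does not say how categories are ordered.

```python
def fit_categories(values):
    """The sorted distinct values define the N states of the feature."""
    return sorted(set(values))


def one_hot(value, categories):
    """N state bits, exactly one of which is 1."""
    return [1 if value == c else 0 for c in categories]


def numeric_code(value, categories):
    """Map the k-th category to the integer k, counting from 1."""
    return categories.index(value) + 1


# Hypothetical discrete feature: the drainage process used at each well.
processes = ["foam", "plunger", "foam", "pump"]
cats = fit_categories(processes)
encoded = [one_hot(p, cats) for p in processes]
```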
6. The intelligent well data model of claim 1, wherein equal-distance discretization divides the value range uniformly into n equal parts, each of equal width; equal-frequency discretization divides the observations uniformly into n parts, each containing the same number of observation points; and clustering discretization divides the values into different intervals by clustering.
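The difference between equal-distance and equal-frequency discretization can be seen in a short Python sketch. The function names are hypothetical, and returning a bin index per value (rather than the final 0/1 features) is a simplification assumed here; the indices could be one-hot encoded afterwards.

```python
def equal_width_bins(values, n):
    """Split the value range [min, max] into n intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin n, so it is clamped into bin n - 1.
    return [min(int((v - lo) / width), n - 1) for v in values]


def equal_frequency_bins(values, n):
    """Split the sorted observations into n groups of (nearly) equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n // len(values)
    return bins
```

On skewed data such as `[1, 2, 3, 4, 100, 200]` with `n=2`, equal-width binning places five values in the first bin, while equal-frequency binning places three in each.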
7. The intelligent well data model of claim 1, wherein the feature selection method comprises a variance threshold method, univariate feature selection, recursive feature elimination, tree-model-based feature selection, L1 regularization, embedding, principal component analysis, correlation coefficient method, information gain, and mutual information method.
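As one example from the list in claim 7, the variance threshold method can be sketched as follows. This is an illustration only: the function name, the dictionary-of-columns layout, the example feature names, and the use of population variance are assumptions not stated in the claim.

```python
from statistics import pvariance


def variance_threshold(columns, threshold=0.0):
    """Keep the names of features whose population variance exceeds the threshold."""
    return [name for name, values in columns.items() if pvariance(values) > threshold]


# Hypothetical normalized features; the constant one carries no information.
features = {
    "temperature": [0.2, 0.4, 0.6, 0.8],
    "sensor_status": [1, 1, 1, 1],
}
selected = variance_threshold(features)
```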
8. A modeling method for the intelligent well data model according to claim 1, comprising the following steps:
(1) reading a CSV file containing historical data, and storing the time and the corresponding data in two separate variables;
(2) preprocessing the data, including data cleaning, feature normalization, discrete feature encoding, and continuous feature discretization; the data cleaning comprises missing value handling, outlier handling, and sample imbalance handling; the feature normalization is min-max normalization, which scales values into the range [0, 1] as $x' = \frac{x - \min}{\max - \min}$, where max is the maximum value of the sample, min is the minimum value of the sample, $x'$ is the normalized value, and $x$ is the input value; the discrete feature encoding includes one-hot encoding, numerical encoding, and embedding encoding; the continuous feature discretization converts each continuous feature into a series of 0/1 features that are passed to the model, the discretization modes being equal-frequency, equal-distance, and clustering;
(3) feature selection, feature construction, feature conversion, and feature encoding;
(4) training the model: first, the energy replenishment capability of the gas reservoir is analyzed by inputting the processed gas reservoir energy replenishment capability data into the data model, expressed as $y_1 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed formation pressure, saturation pressure, reservoir porosity, and reservoir permeability, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_1$ is the output of the neuron, representing the energy replenishment capability of the gas reservoir; then the variation pattern of the bottom-hole natural gas release cycle is analyzed by inputting the processed bottom-hole natural gas release cycle variation data into the data model, expressed as $y_2 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed bottom-hole flowing pressure, formation pressure, temperature, flow, concentration, odor, vibration, and oil drainage area, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_2$ is the output of the neuron, representing the variation pattern of the bottom-hole natural gas release cycle; then the reduction of system back pressure is analyzed by inputting the processed system back-pressure reduction data into the data model, expressed as $y_3 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed pipeline cleaning frequency and viscosity of the medium in the pipeline, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_3$ is the output of the neuron, representing the reduction of system back pressure; the result is output to the design module;
(5) evaluating the model: after training is completed, the model is evaluated using a test dataset, that is, data that did not take part in model training, in order to assess the generalization ability of the model;
(6) adjusting and optimizing: the model is adjusted and optimized according to the evaluation results, and its performance is improved by adjusting the model parameters, trying different feature combinations, or using different algorithms.
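Step (1), reading the CSV history into two variables, and the held-out test set of step (5) can be sketched with the Python standard library. This is an illustration under assumptions: the column names and readings are invented, the first column is assumed to hold the time, and the split keeps the last rows of the file for testing, a choice the claim does not specify. The names `read_history` and `holdout_split` are hypothetical.

```python
import csv
import io


def read_history(text):
    """Parse CSV text with a header row into (times, values)."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    times, values = [], []
    for row in reader:
        if not row:
            continue
        times.append(row[0])
        values.append([float(v) for v in row[1:]])
    return times, values


def holdout_split(times, values, test_fraction=0.25):
    """Keep the last test_fraction of rows (in file order) out of training."""
    cut = len(values) - max(1, int(len(values) * test_fraction))
    return (times[:cut], values[:cut]), (times[cut:], values[cut:])


# Hypothetical columns and readings, for illustration only.
SAMPLE = """time,bottom_hole_flowing_pressure,temperature
2024-01-01,5.2,61.0
2024-01-02,5.1,60.5
2024-01-03,4.9,60.8
2024-01-04,4.8,60.1
"""
times, values = read_history(SAMPLE)
train, test = holdout_split(times, values)
```

In practice the text would come from a file opened with `open(path, newline="")`; an in-memory string is used here so the sketch runs on its own.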

Description

Intelligent well data model and modeling method

Technical Field

The invention relates to the technical field of data models and modeling, and in particular to an intelligent well data model and a modeling method.

Background

Low-pressure, low-yield wells generally refer to natural gas wells with low bottom-hole fluid pressure and relatively low gas yield. Gas production from such wells is often difficult because the bottom-hole pressure is insufficient to lift the natural gas to the surface. Drainage gas recovery is a technique for low-pressure, low-yield gas wells that aims to raise the bottom-hole pressure and promote the lifting and separation of natural gas. Selecting suitable processes and equipment, such as mechanical pumping drainage, plunger lift drainage, foam drainage, screw pump drainage, and ultrasonic drainage, improves the gas production of low-pressure, low-yield gas wells: these processes can raise the bottom-hole pressure, increase the gas lifting force, and reduce the suppression of gas production by the liquid column, thereby improving gas production efficiency and economic benefit.

In actual use, the equipment configuration of the booster device must be established to ensure that the natural gas can enter the gathering and transportation pipeline network. However, existing systems lack a comprehensive data analysis function, so the booster device is configured as high as possible to guarantee that the natural gas can enter the network, which increases cost. Therefore, to address the problem that an over-specified booster device configuration increases cost, a data model for determining the most economical equipment configuration of the booster device can be designed.
Disclosure of Invention

To overcome the problem that the high equipment configuration of the booster device increases cost, the invention provides an intelligent well data model comprising a data acquisition module and a data analysis module;
the data acquisition module is used for acquiring formation pressure, saturation pressure, reservoir porosity, reservoir permeability, bottom-hole flowing pressure, formation pressure, temperature, flow, concentration, odor, vibration, oil drainage area, pipeline cleaning frequency, and viscosity of the medium in the pipeline of a low-pressure ground production well, and for dividing the data into three categories, wherein the formation pressure, saturation pressure, reservoir porosity, and reservoir permeability constitute gas reservoir energy replenishment capability data; the bottom-hole flowing pressure, formation pressure, temperature, pressure, flow, concentration, odor, vibration, and oil drainage area constitute bottom-hole natural gas release cycle variation data; and the pipeline cleaning frequency and the viscosity of the medium in the pipeline constitute system back-pressure reduction data; the data are converted into feature vectors and input into the data model;
the data analysis module contains a data model, which operates in the following steps:
(1) reading a CSV file containing historical data, and storing the time and the corresponding data in two separate variables;
(2) preprocessing the data, including data cleaning, feature normalization, discrete feature encoding, and continuous feature discretization; the data cleaning comprises missing value handling, outlier handling, and sample imbalance handling; the feature normalization is min-max normalization, which scales values into the range [0, 1] as $x' = \frac{x - \min}{\max - \min}$, where max is the maximum value of the sample, min is the minimum value of the sample, $x'$ is the normalized value, and $x$ is the input value; the discrete feature encoding includes one-hot encoding, numerical encoding, and embedding encoding; the continuous feature discretization converts each continuous feature into a series of 0/1 features that are passed to the model, the discretization modes being equal-frequency, equal-distance, and clustering;
(3) feature selection, feature construction, feature conversion, and feature encoding;
(4) training the model: first, the energy replenishment capability of the gas reservoir is analyzed by inputting the processed gas reservoir energy replenishment capability data into the data model, expressed as $y_1 = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the processed formation pressure, saturation pressure, reservoir porosity, and reservoir permeability, used as the inputs of the neuron, the $w_i$ are the weights corresponding to each input, $b$ is the bias term, $f$ is the activation function, and $y_1$ is the output of the ne