CN-122024917-A - Acidification measure effect prediction method based on multisource feature fusion machine learning model
Abstract
The invention belongs to the technical field of petroleum engineering and relates in particular to a method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model. The method performs feature selection with an integrated feature screening model and predicts the production-increase effect of oil well acidizing measures scientifically and accurately, using a machine learning approach that combines a PSO model with an LCE model. The method comprises the following steps: collecting multisource data; cleaning, preprocessing, and extracting features from the multisource data; selecting features from the multisource data with the integrated feature screening model; verifying the independence and representativeness of the selected feature parameters; constructing, training, and optimizing an LCE-based acidizing measure effect prediction model; optimizing the hyperparameters of the prediction model; and evaluating the model to obtain the acidizing measure effect prediction model based on the multisource feature fusion machine learning model.
Inventors
- JING RUILIN
- CHEN GE
- LI QINGSONG
- YANG SHUO
- TIAN FENG
- GUAN JINGTAO
- HAN MING
- FANG CHAOLIAN
- LI LIAN
Assignees
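- China Petroleum & Chemical Corporation (中国石油化工股份有限公司)
- China Petroleum & Chemical Corporation, Shengli Oilfield Branch (中国石油化工股份有限公司胜利油田分公司)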
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-11-12
Claims (9)
- 1. A method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model, characterized by comprising the following steps: Step 1, collecting the multisource data required by the acidizing measure effect prediction method, wherein the multisource data comprise well condition data, geological parameters, production dynamic data, and measure data; Step 2, cleaning, preprocessing, and extracting features from the multisource data collected in Step 1; Step 3, performing feature selection on the multisource data processed in Step 2 by using an integrated feature screening model, wherein the integrated feature screening model is obtained by integrating a random forest feature importance method, a Pelman correlation analysis method, and an elastic net method; Step 4, performing independence verification and representativeness verification on the feature parameters selected by the integrated feature screening model in Step 3; Step 5, forming a data set from the multisource data verified in Step 4, and splitting the data set to obtain a training set and a validation set; Step 6, optimizing the hyperparameters of the LCE-based acidizing measure effect prediction model by using a particle swarm optimization algorithm; and Step 7, evaluating the LCE-based acidizing measure effect prediction model whose hyperparameters were optimized in Step 6, and obtaining the acidizing measure effect prediction model based on the multisource feature fusion machine learning model once the model evaluation is passed.
- 2. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein the cleaning of the multisource data in step 2 specifically comprises: handling missing values in the multisource data by a combination of deletion and filling; handling outliers in the multisource data with the DBSCAN clustering algorithm; and binarizing the quantitative features in the multisource data.
- 3. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 2, wherein handling the missing values in the multisource data by a combination of deletion and filling specifically comprises: detecting the missing values in each record of the multisource data; and, if the number of missing values exceeds a preset threshold, deleting the entire record; otherwise, predicting the missing values with a random forest algorithm and filling the predicted values into the record as default correct values.
- 4. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein the preprocessing of the multisource data in step 2 specifically comprises: converting the categorical data in the multisource data into numeric form, and normalizing the data having different physical dimensions in the multisource data.
- 5. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 4, wherein, when the categorical data in the multisource data are converted, the Python language is used to add a numeric label to each category; and the normalization of the data having different physical dimensions in the multisource data satisfies: $x' = 2\,\frac{x - x_{\min}}{x_{\max} - x_{\min}} - 1$; where $x'$ is the normalized value, whose range is $[-1, 1]$; $x$ is the original parameter value; $x_{\min}$ is the minimum value of each parameter in the preprocessed stimulation-measure data set; and $x_{\max}$ is the maximum value of each parameter in the preprocessed stimulation-measure data set.
- 6. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein the feature extraction from the multisource data in step 2 specifically comprises: using a CNN to extract features from the image data and sequence data in the multisource data, converting them into one-dimensional feature vectors that the LCE can process.
- 7. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein the integration process of the integrated feature screening model in step 3 specifically comprises: normalizing the feature importances produced by each method in the integrated feature screening model to obtain feature weights, wherein a feature weight reflects the share of a feature's importance in the total importance of all features; the feature weight $w$ is: $w = \frac{I - I_{\min}}{I_{\max} - I_{\min}}$; where $I$ is the importance of the feature, $I_{\min}$ is the minimum of all feature importances, and $I_{\max}$ is the maximum of all feature importances; after normalization, fusing the feature weights of the methods by weighting to obtain a feature importance vector integrating all features; the feature importance vector $F$ integrating all features is: $F = \sum_{i} \alpha_i w_i$; where $\alpha_i$ is the weight of the $i$-th method in the integration, and $w_i$ is the feature weight normalized by the $i$-th method.
- 8. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein, in step 4, the independence of the feature parameters selected by the integrated feature screening model is verified by Spearman correlation analysis, and the representativeness of the feature parameters selected by the integrated feature screening model is verified by an ablation experiment.
- 9. The method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model according to claim 1, wherein optimizing the hyperparameters of the LCE-based acidizing measure effect prediction model in step 6 by using a particle swarm optimization algorithm specifically comprises: selecting the hyperparameters to be optimized, initializing the particle swarm, and setting the maximum number of iterations; training, and calculating the fitness of each particle; if the particle swarm as a whole has not converged and the maximum number of iterations has not been reached, returning to the previous step to continue training; and finally determining the optimal hyperparameter values of the LCE-based acidizing measure effect prediction model.
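The deletion-and-filling rule of claim 3 can be illustrated with a minimal sketch. It rests on several assumptions that are not in the claims: records are plain dictionaries with `None` marking a missing value, the threshold `max_missing` is arbitrary, and the column median stands in for the random forest predictor so that the example stays dependency-free. The function `clean_records`, the optional `predict_missing` hook, and the sample values are all hypothetical.

```python
from statistics import median


def clean_records(records, max_missing=1, predict_missing=None):
    """Drop records with too many gaps, then fill the gaps that remain."""
    # Deletion: discard any record whose missing-value count exceeds the threshold.
    kept = [dict(r) for r in records
            if sum(v is None for v in r.values()) <= max_missing]

    # Stand-in predictor (assumption): median of the observed values per column.
    # The claim uses a random forest here; pass one in through predict_missing.
    medians = {}
    for key in (kept[0] if kept else {}):
        observed = [r[key] for r in kept if r[key] is not None]
        medians[key] = median(observed) if observed else None

    # Filling: replace each remaining gap with a predicted value.
    for r in kept:
        for key, value in r.items():
            if value is None:
                r[key] = predict_missing(r, key) if predict_missing else medians[key]
    return kept


# Made-up sample records, for illustration only.
records = [
    {"porosity": 0.21, "permeability": 120.0, "water_cut": 0.55},
    {"porosity": None, "permeability": 80.0, "water_cut": 0.60},
    {"porosity": None, "permeability": None, "water_cut": 0.70},
    {"porosity": 0.25, "permeability": 100.0, "water_cut": None},
]
cleaned = clean_records(records, max_missing=1)
```

With `max_missing=1`, the third record (two gaps) is deleted and the single gaps in the second and fourth records are filled. A fitted regressor could replace the median by supplying a `predict_missing(record, key)` callable.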
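The preprocessing of claims 4 and 5 (numeric labels for categories, then scaling to the range [-1, 1] from each parameter's minimum and maximum) might look like the following sketch. The function names, the first-appearance labeling order, the handling of constant columns, and the sample values are assumptions made for illustration; the claims specify none of them.

```python
def encode_labels(values):
    """Assign each distinct category an integer label in order of first appearance."""
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
    return [mapping[v] for v in values], mapping


def normalize_to_unit_range(column):
    """Scale a numeric column to [-1, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column is mapped to 0, since min-max scaling
        # is undefined when the maximum equals the minimum.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


# Made-up sample values, for illustration only.
acid_types = ["HCl", "mud acid", "HCl", "organic acid"]
labels, mapping = encode_labels(acid_types)

acid_volumes = [10.0, 20.0, 30.0, 50.0]
scaled = normalize_to_unit_range(acid_volumes)
```

The minimum maps to -1 and the maximum to 1, so parameters recorded in different units end up on a common scale before feature selection.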
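The weighted fusion described in claim 7 can be sketched as follows: each screening method's raw importance scores are min-max normalized into feature weights, and the weights are then combined using one integration weight per method. The raw scores are assumed to be non-negative (for example, absolute correlation or coefficient values). The feature names, the scores, and the method weights below are invented for illustration, and `min_max_weights` and `fuse_importances` are hypothetical helper names.

```python
def min_max_weights(importances):
    """Normalize one method's importance scores to [0, 1]."""
    lo, hi = min(importances), max(importances)
    if hi == lo:
        return [0.0] * len(importances)
    return [(v - lo) / (hi - lo) for v in importances]


def fuse_importances(per_method, method_weights):
    """Weighted sum of each method's normalized feature weights."""
    n_features = len(next(iter(per_method.values())))
    fused = [0.0] * n_features
    for name, raw_scores in per_method.items():
        for j, w in enumerate(min_max_weights(raw_scores)):
            fused[j] += method_weights[name] * w
    return fused


# Made-up scores for three hypothetical features.
features = ["acid_volume", "permeability", "water_cut"]
per_method = {
    "random_forest": [0.5, 0.3, 0.2],
    "correlation": [0.8, 0.4, 0.6],
    "elastic_net": [1.2, 0.0, 0.6],
}
method_weights = {"random_forest": 0.4, "correlation": 0.3, "elastic_net": 0.3}

fused = fuse_importances(per_method, method_weights)
ranking = sorted(features, key=lambda f: fused[features.index(f)], reverse=True)
```

Features would then be kept or dropped according to their position in `ranking` or a cutoff on `fused`; the claims do not state which selection rule is used.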
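The independence verification of claim 8 relies on Spearman correlation, which is the Pearson correlation of the rank-transformed data. A dependency-free sketch is given below. The 0.8 redundancy threshold, the helper names, and the sample data are assumptions; the claims do not state how the correlation values are judged.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant feature is treated as uncorrelated
    return cov / math.sqrt(sxx * syy)


def redundant_pairs(features, threshold=0.8):
    """List feature pairs whose absolute rank correlation exceeds the threshold."""
    names = list(features)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if abs(spearman(features[a], features[b])) > threshold]


# Made-up feature columns, for illustration only.
selected = {
    "acid_volume": [10, 20, 30, 40, 50],
    "injection_rate": [1, 4, 9, 16, 25],
    "water_cut": [0.3, 0.9, 0.1, 0.7, 0.5],
}
pairs = redundant_pairs(selected)
```

Here `injection_rate` increases monotonically with `acid_volume`, so the pair is flagged as not independent, while `water_cut` is only weakly rank-correlated with either.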
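The particle swarm loop of claim 9 (initialize the swarm, evaluate fitness, repeat until convergence or the iteration limit) can be sketched in plain Python. In the claimed method the fitness of a particle would come from training the LCE model with that particle's hyperparameters and scoring it on the validation set; here a simple quadratic, `surrogate_validation_error`, stands in for that step. The inertia and acceleration coefficients, the convergence tolerance, the two hyperparameter ranges, and all function names are assumptions.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, max_iter=100, tol=1e-8,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness over box-bounded parameters with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize positions uniformly inside the bounds, with zero velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(max_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the allowed range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
        # Convergence test (assumption): every personal best is within tol
        # of the global best.
        if max(best_val) - g_val < tol:
            break
    return g_pos, g_val


def surrogate_validation_error(params):
    """Hypothetical stand-in for 'train the LCE model, score on the validation set'."""
    depth, rate = params
    return (depth - 6.0) ** 2 + (10.0 * (rate - 0.1)) ** 2


# Hypothetical search ranges for two hyperparameters.
bounds = [(2.0, 12.0), (0.01, 0.5)]
best_params, best_error = pso_minimize(surrogate_validation_error, bounds)
```

Integer-valued hyperparameters would need rounding inside the fitness function, and each fitness evaluation in practice costs one full model training run, which is why the iteration limit in the claim matters.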
Description
Acidification measure effect prediction method based on multisource feature fusion machine learning model
Technical Field
The invention belongs to the technical field of petroleum engineering and relates in particular to a method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model.
Background
As oilfield development moves into its middle and late stages, most oilfields face problems such as declining production and excessive water cut, and effective stimulation measures are needed to improve the productivity and oil output of the wells. Against this background, acidizing is widely used as a common stimulation technique to increase the productivity of oil and gas wells. Injecting acidic chemicals effectively removes deposits from the wellbore and the reservoir and enlarges the flow channels for oil and gas, which markedly improves the production and recovery of the well. Acidizing is therefore one of the effective ways to counter oilfield aging.
Predicting the effect of acidizing measures is critical, because it helps optimize the job design and estimate the likely production increase, and thereby provides data support for investment decisions. With such predictions, resources can be allocated more effectively, the uncertainty and risk of acidizing operations are reduced, and the economics of the operation improve. Accurate predictions allow engineers to select the most appropriate acidizing parameters, such as the type and concentration of the acid, the volume of acidizing fluid, and the injection rate, to achieve the best stimulation effect. In addition, the predictions can guide on-site operations, reduce the potential environmental impact of acidizing, and help ensure operational safety.
Existing acidizing measure effect prediction usually relies on empirical formulas or simplified physical models, mainly in the form of empirical judgment, historical data comparison, and physical model prediction. However, these methods tend to ignore the complexity and variability of well-site data and fail to make full use of all available information, so the accuracy and reliability of their predictions are limited. Practitioners have also tried other technical means. For example, the patent document with application number CN202310883234.4, titled "Oil well acidizing measure effect prediction method based on transfer learning", belongs to the technical field of petroleum engineering and comprises the following steps: establishing an acidizing numerical simulation model based on a coupled seepage field, temperature field, and chemical field model; simulating the acidizing process numerically and constructing a simulation sample set; constructing an actual sample library from the actual production data of acidized wells before and after acidizing; constructing an acidizing measure effect prediction model based on a BP neural network; pretraining the prediction model on the simulation sample set; feeding the actual sample set into the pretrained prediction model and training it again by transfer learning to obtain the final prediction model; evaluating the final prediction model and outputting a prediction model that evaluates well; and predicting the acidizing measure effect with the well-evaluated prediction model.
However, after further research, the inventors found that the above prior art does not form a complete and mature acidizing measure effect prediction system capable of predicting the production-increase effect after acidizing measures and meeting practitioners' technical requirements for efficient stimulation and real-time monitoring.
Disclosure of Invention
The invention provides a method for predicting the effect of acidizing measures based on a multisource feature fusion machine learning model. The method performs feature selection with an integrated feature screening model and predicts the production-increase effect of oil well acidizing measures scientifically and accurately, using a machine learning approach that combines a PSO model with an LCE model.
In order to solve the above technical problems, the invention adopts the following technical scheme: the acidification measure effect prediction method based on the multisource feature fusion machine learning model comprises the following steps: Step 1, collecting the multisource data required by the acidizing measure effect prediction method, wherein the multisource data comprise well condition data, geological parameters, production dynamic data, and measure data; step 2, cleaning, preprocessing and extracting the characteristi