CN-122019969-A - Ironmaking data sampling method integrating time attenuation and multi-parameter stability
Abstract
The invention belongs to the technical field of blast furnace ironmaking data processing and discloses an ironmaking data sampling method integrating time attenuation and multi-parameter stability. The method comprises: performing dual quality filtering on historical ironmaking data; calculating a furnace condition stability index based on the average of the coefficients of variation of multiple process parameters; combining this index with a time attenuation coefficient to obtain a comprehensive weight; and periodically and dynamically updating the sampling strategy. The method addresses the shortcoming that prior methods do not take the characteristics of the blast furnace ironmaking process into account, improves the representativeness of the sampled data, suits data mining and modeling needs, is simple to implement, and can be used for data preprocessing throughout the whole life cycle of a blast furnace.
Inventors
- LI ZHUANGNIAN
- YE JIANHU
- SHU YOUWU
- LIU YANNAN
- DU SHU
- Gao Longpeng
- LIU YONG
- FAN MENGHUI
- WANG ZHAOHUI
- TANG SHUNBING
Assignees
- Shanxi Taigang Stainless Steel Co., Ltd. (山西太钢不锈钢股份有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-20
Claims (8)
- 1. An ironmaking data sampling method integrating time attenuation and multi-parameter stability, characterized by comprising the following steps: S1, performing quality filtering on historical ironmaking data to remove abnormal values and low-quality data; S2, dividing the filtered data equally into N blocks in ascending time order, wherein N ≥ 5 and all N blocks are retained; S3, calculating a furnace condition stability index for each block based on the average of the coefficients of variation of multiple process parameters; S4, calculating a comprehensive weight for each block by combining a time attenuation coefficient with the stability index; S5, randomly sampling the first N-1 blocks without replacement according to the comprehensive weights; S6, periodically re-executing S1 to S5 to dynamically update the sampled data set.
- 2. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 1, wherein in S1 the quality filtering adopts a dual mechanism in which significant outliers are removed by the IQR and extreme values are compressed by quantile-range screening.
- 3. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 2, wherein the method for removing significant outliers by the IQR is as follows: for each process parameter x, calculating the 25% quantile $Q_1$ and the 75% quantile $Q_3$ to obtain the interquartile range $IQR = Q_3 - Q_1$, retaining data satisfying $Q_1 - \lambda \times IQR \le x \le Q_3 + \lambda \times IQR$, and filtering out significant outliers, wherein λ takes a value in the range 1.5 to 4.5; and the quantile-range screening method is as follows: configuring a unified pair of quantile thresholds, namely a lower quantile q_l and an upper quantile q_h, and retaining data whose parameter values lie within the range [q_l, q_h].
- 4. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 1, wherein the furnace condition stability index in S3 is calculated as follows: S31, calculating the CV of the j-th parameter in the i-th block: $CV_{ij} = \sigma_{ij} / \mu_{ij}$ ($\mu_{ij} > 0$); S32, calculating the global CV of the j-th parameter, $CV_j^{\mathrm{global}}$; S33, calculating the stability index of the j-th parameter in the i-th block: $S_{ij} = \max(0,\ 1 - CV_{ij} / CV_j^{\mathrm{global}})$; S34, weighting by preset weights to obtain the multi-parameter stability index of the i-th block: $I_i = \sum_j w_j S_{ij}$, wherein $w_j \ge 0$ and $\sum_j w_j = 1$.
- 5. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 1, wherein the comprehensive weight in S4 is calculated as $w_i = r^{\Delta t_i} \times \exp(-\beta \cdot I_i)$, where $\Delta t_i = n - i$ is the time interval, $r \in (0, 1)$ is the time decay coefficient, and $\beta > 0$ is the stability influence coefficient.
- 6. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 5, wherein r is preferably 0.7 to 0.95 and β is preferably 0.2 to 1.0.
- 7. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 1, wherein in S5 the number of samples $C_i = \max(N_{\min},\ \lfloor \mathrm{len}(\mathrm{block}_i) \times W_i \rfloor)$ is 5% to 10% of the data amount per block.
- 8. The ironmaking data sampling method integrating time attenuation and multi-parameter stability according to claim 1, wherein the dynamic update period in S6 is adjusted according to how frequently the blast furnace condition changes.
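The dual filtering of claims 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. Only the fence Q1 - λ×IQR ≤ x ≤ Q3 + λ×IQR (λ between 1.5 and 4.5) and the [q_l, q_h] screening come from the claims. The function names, the row layout (a list of dicts), the linear-interpolation quantile, the default values of `q_l` and `q_h`, the parameter name `blast_temp`, and the choice to compute the second-stage quantiles on the output of the first stage are all assumptions made for the example.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def iqr_bounds(values, lam):
    """Fence [Q1 - lam*IQR, Q3 + lam*IQR] from claim 3."""
    s = sorted(values)
    q1, q3 = quantile(s, 0.25), quantile(s, 0.75)
    iqr = q3 - q1
    return q1 - lam * iqr, q3 + lam * iqr


def quantile_bounds(values, q_l, q_h):
    """Range [q_l quantile, q_h quantile] from claim 3."""
    s = sorted(values)
    return quantile(s, q_l), quantile(s, q_h)


def keep_within(rows, bounds):
    """Keep rows whose every bounded parameter lies inside its [lo, hi] range."""
    return [r for r in rows
            if all(lo <= r[p] <= hi for p, (lo, hi) in bounds.items())]


def dual_filter(rows, params, lam=1.5, q_l=0.01, q_h=0.99):
    """Stage 1: IQR fence. Stage 2: quantile-range screening.

    ASSUMPTIONS: default q_l/q_h values are illustrative (the source gives
    none), and stage-2 quantiles are computed on the stage-1 output.
    """
    if not rows:
        return []
    stage1 = keep_within(
        rows, {p: iqr_bounds([r[p] for r in rows], lam) for p in params})
    if not stage1:
        return []
    return keep_within(
        stage1,
        {p: quantile_bounds([r[p] for r in stage1], q_l, q_h) for p in params})


# Hypothetical data: values 1..20 plus one gross outlier.
rows = [{"blast_temp": float(v)} for v in list(range(1, 21)) + [1000]]
clean = dual_filter(rows, ["blast_temp"], lam=1.5, q_l=0.05, q_h=0.95)
print(len(clean))  # 18: 1000 fails the IQR fence, then 1 and 20 fall outside [q_l, q_h]
```

Computing all bounds before filtering means a row is dropped if any one of its parameters falls outside its range; whether the source filters jointly or parameter by parameter is not stated in the claims.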
Description
Ironmaking data sampling method integrating time attenuation and multi-parameter stability
Technical Field
The invention belongs to the technical field of blast furnace ironmaking data processing and relates to the sampling of blast furnace data in ironmaking production. It is suitable for the data preprocessing stage of blast furnace data mining, regression analysis and machine learning model training, and in particular relates to an ironmaking data sampling method integrating time attenuation and multi-parameter stability.
Background
In ironmaking production, blast furnace data must be sampled. Blast furnace ironmaking data are markedly time-dependent and non-stationary, specifically:
(1) Furnace age effect: as the furnace ages, lining erosion and changes in thermal state cause the distributions of process parameters to drift continuously, so older data fit the current furnace condition less well;
(2) Timeliness difference: recent data directly reflect the current furnace condition and have higher reference value, while the reference value of older data decays naturally over time;
(3) Stability difference: historical data from stable periods (such as the normal smelting stage) still have some reference value, whereas data from fluctuating periods (such as raw material changes and equipment overhaul) are abnormally distributed and lose reference value rapidly.
Existing blast furnace ironmaking data sampling methods have the following shortcomings:
(1) Simple random sampling does not distinguish the time order and timeliness of the data, so a large amount of low-value older data is easily included and the sample is insufficiently representative;
(2) Fixed-proportion sampling extracts data from every period at a uniform proportion without considering differences in furnace condition stability, so the modeling result is disturbed when the share of fluctuating-period data is too high;
(3) Generic time-weighted sampling adjusts the weights by time decay alone, without taking blast furnace process characteristics into account or correcting for data stability, and therefore cannot adapt to changes in parameter distribution caused by furnace condition fluctuations.
These shortcomings make it difficult for existing sampling methods to reject redundant or interfering data while preserving representativeness, which ultimately degrades the accuracy of blast furnace data mining (such as furnace temperature prediction and energy consumption analysis) and modeling.
Disclosure of Invention
The invention aims to provide an ironmaking data sampling method integrating time attenuation and multi-parameter stability that combines the time effect with furnace condition stability. The method reflects data timeliness through time partitioning so that recent data carry high weight, calculates a furnace condition stability index based on the coefficient of variation (CV) of multiple process parameters to correct the weights of data from different periods, and outputs a high-quality sample set suited to model training such as linear regression, locally weighted regression and machine learning, thereby improving modeling accuracy.
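The combination of time partitioning, CV-based stability and weighted sampling described above can be sketched in Python as follows, using the formulas as literally stated in claims 4, 5 and 7. This is a hedged illustration, not the patented implementation. The following are assumptions made for the example: the function names and row layout; taking the global CV as the CV over all filtered rows; reading n in Δt_i = n - i as the number of blocks with 1-based block index i; treating N_min as a fraction of the block length; capping the count at the block size; and keeping the most recent block whole.

```python
import math
import random
import statistics


def split_blocks(rows, n_blocks):
    """S2: split time-ordered rows into n_blocks nearly equal consecutive blocks."""
    size, extra = divmod(len(rows), n_blocks)
    blocks, start = [], 0
    for k in range(n_blocks):
        end = start + size + (1 if k < extra else 0)
        blocks.append(rows[start:end])
        start = end
    return blocks


def cv(values):
    """Coefficient of variation sigma/mu; assumes a positive mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def stability_index(block, all_rows, param_weights):
    """S3: sum over parameters of w_j * max(0, 1 - CV_ij / global CV_j).

    ASSUMPTION: the global CV is taken over all rows and is non-zero.
    """
    total = 0.0
    for p, w in param_weights.items():
        cv_global = cv([r[p] for r in all_rows])
        cv_block = cv([r[p] for r in block])
        total += w * max(0.0, 1.0 - cv_block / cv_global)
    return total


def comprehensive_weight(i, n, stability, r=0.9, beta=0.5):
    """S4 as stated in claim 5: r**(n - i) * exp(-beta * I_i)."""
    return r ** (n - i) * math.exp(-beta * stability)


def sample_count(block_len, weight, n_min):
    """Claim 7: max(N_min, floor(len(block) * weight)), capped at block_len."""
    return min(block_len, max(n_min, math.floor(block_len * weight)))


def sample_dataset(rows, param_weights, n_blocks=5, r=0.9, beta=0.5,
                   min_frac=0.05, seed=0):
    """S2 to S5: sample the first N-1 blocks without replacement."""
    rng = random.Random(seed)
    blocks = split_blocks(rows, n_blocks)
    out = []
    for i, block in enumerate(blocks[:-1], start=1):
        s = stability_index(block, rows, param_weights)
        w = comprehensive_weight(i, n_blocks, s, r, beta)
        c = sample_count(len(block), w, math.ceil(min_frac * len(block)))
        out.extend(rng.sample(block, c))  # random.sample draws without replacement
    out.extend(blocks[-1])  # ASSUMPTION: the most recent block is kept whole
    return out
```

A call such as `sample_dataset(rows, {"x": 1.0})` expects `rows` sorted by time and already filtered; the parameter name `x` and the defaults `r=0.9`, `beta=0.5` are illustrative values inside the preferred ranges of claim 6.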
The technical scheme adopted by the invention to achieve this purpose is as follows:
An ironmaking data sampling method integrating time attenuation and multi-parameter stability comprises the following steps:
S1, performing quality filtering on historical ironmaking data to remove abnormal values and low-quality data;
S2, dividing the filtered data equally into N blocks (N ≥ 5) in ascending time order, and retaining all N blocks;
S3, calculating a furnace condition stability index for each block based on the average of the coefficients of variation (CV) of multiple process parameters;
S4, calculating a comprehensive weight for each block by combining a time attenuation coefficient with the stability index;
S5, randomly sampling the first N-1 blocks without replacement according to the comprehensive weights;
S6, periodically re-executing S1 to S5 to dynamically update the sampled data set.
Further, the quality filtering in S1 adopts a dual mechanism: significant outliers are first removed by the IQR, and extreme values are then compressed by quantile-range screening.
Further, the method for removing significant outliers by the IQR is as follows: for each process parameter x, calculating the 25% quantile $Q_1$ and the 75% quantile $Q_3$ to obtain the interquartile range $IQR = Q_3 - Q_1$, retaining data satisfying $Q_1 - \lambda \times IQR \le x \le Q_3 + \lambda \times IQR$, and filterin