CN-122020050-A - Sewage treatment water quality prediction method and system based on multi-source data fusion
Abstract
The invention relates to the technical field of sewage treatment, and in particular to a sewage treatment water quality prediction method and system based on multi-source data fusion. A fusion feature vector is obtained from the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence; its feature sub-vectors of different dimensions map the features of the different types of time-series data, which addresses the difficulty of fusing water quality parameters, equipment parameters and process parameters. In addition, the effluent chemical oxygen demand time sequence is decomposed by signal decomposition into a plurality of low-complexity effluent chemical oxygen demand time sequence components, and these components are used together with the fusion feature vector to predict the effluent quality. This reduces the computational burden of the model while preserving its prediction and generalization capability, allows the prediction model to converge and produce a prediction result, and overcomes the technical bottlenecks of difficult multi-source data fusion, large data volume and high complexity.
Inventors
- SUN YAOHUA
- REN LIPENG
- JI JIANHUA
- QI YONGHAO
- LIU XUEQIAN
Assignees
- Lanzhou Petrochemical University of Vocational Technology (兰州石化职业技术大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (10)
- 1. A sewage treatment water quality prediction method based on multi-source data fusion, characterized by comprising the following steps: acquiring a water quality parameter time sequence, an equipment parameter time sequence and a process parameter time sequence based on a preset time acquisition window; performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain a water quality parameter feature vector, an equipment parameter feature vector and a process parameter feature vector; performing feature dimension alignment on the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector to construct an initial feature matrix; performing feature intersection based on the initial feature matrix to obtain a fusion feature matrix; performing dimension compression and feature integration on the fusion feature matrix to obtain a fusion feature vector; acquiring a current effluent chemical oxygen demand time sequence based on the preset time acquisition window; performing signal decomposition on the current effluent chemical oxygen demand time sequence, so as to obtain a plurality of effluent chemical oxygen demand time sequence components and a decomposition residual term; inputting each effluent chemical oxygen demand time sequence component, together with the fusion feature vector, into a pre-trained water quality prediction model to generate a predicted component time sequence corresponding to each effluent chemical oxygen demand time sequence component; predicting a residual prediction value corresponding to the decomposition residual term based on a weighted moving average method; and linearly reconstructing all the predicted component time sequences and the residual prediction value to generate a future effluent chemical oxygen demand time sequence, thereby completing the sewage treatment water quality prediction.
- 2. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 1, wherein acquiring the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence based on the preset time acquisition window comprises: acquiring the influent ammonia nitrogen, the dissolved oxygen, the water quality pH value and the activated sludge microbial community image at each time node of the preset time acquisition window; converting the activated sludge microbial community image into a gray-level histogram, and determining the gray-level mean and the gray-level standard deviation of the gray-level histogram; synthesizing an influent ammonia nitrogen time sub-sequence, a dissolved oxygen time sub-sequence, a water quality pH value time sub-sequence, a gray-level mean time sub-sequence and a gray-level standard deviation time sub-sequence based on the influent ammonia nitrogen, the dissolved oxygen, the water quality pH value, the gray-level mean and the gray-level standard deviation of all the time nodes; combining the influent ammonia nitrogen time sub-sequence, the dissolved oxygen time sub-sequence, the water quality pH value time sub-sequence, the gray-level mean time sub-sequence and the gray-level standard deviation time sub-sequence into the water quality parameter time sequence; acquiring the aeration equipment current, the aeration equipment power and the aeration equipment air volume at each time node of the preset time acquisition window; synthesizing an aeration equipment current time sub-sequence, an aeration equipment power time sub-sequence and an aeration equipment air volume time sub-sequence based on the aeration equipment current, the aeration equipment power and the aeration equipment air volume of all the time nodes; combining the aeration equipment current time sub-sequence, the aeration equipment power time sub-sequence and the aeration equipment air volume time sub-sequence into the equipment parameter time sequence; acquiring the aeration quantity, the sludge concentration and the reflux ratio at each time node of the preset time acquisition window; synthesizing an aeration quantity time sub-sequence, a sludge concentration time sub-sequence and a reflux ratio time sub-sequence based on the aeration quantity, the sludge concentration and the reflux ratio of all the time nodes; and combining the aeration quantity time sub-sequence, the sludge concentration time sub-sequence and the reflux ratio time sub-sequence into the process parameter time sequence.
- 3. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 2, wherein performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector, comprises: acquiring an influent ammonia nitrogen average value, an influent ammonia nitrogen change rate, an influent ammonia nitrogen maximum value and an influent ammonia nitrogen minimum value based on the influent ammonia nitrogen time sub-sequence; acquiring a dissolved oxygen average value, a dissolved oxygen change rate, a dissolved oxygen maximum value and a dissolved oxygen minimum value based on the dissolved oxygen time sub-sequence; acquiring a water quality pH average value, a water quality pH change rate, a water quality pH maximum value and a water quality pH minimum value based on the water quality pH value time sub-sequence; acquiring a gray-level mean average value, a gray-level mean change rate, a gray-level mean maximum value and a gray-level mean minimum value based on the gray-level mean time sub-sequence; acquiring a gray-level standard deviation average value, a gray-level standard deviation change rate, a gray-level standard deviation maximum value and a gray-level standard deviation minimum value based on the gray-level standard deviation time sub-sequence; and splicing the influent ammonia nitrogen average value, the influent ammonia nitrogen change rate, the influent ammonia nitrogen maximum value, the influent ammonia nitrogen minimum value, the dissolved oxygen average value, the dissolved oxygen change rate, the dissolved oxygen maximum value, the dissolved oxygen minimum value, the water quality pH average value, the water quality pH change rate, the water quality pH maximum value, the water quality pH minimum value, the gray-level mean average value, the gray-level mean change rate, the gray-level mean maximum value, the gray-level mean minimum value, the gray-level standard deviation average value, the gray-level standard deviation change rate, the gray-level standard deviation maximum value and the gray-level standard deviation minimum value into a twenty-dimensional water quality parameter feature vector.
- 4. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 2, wherein performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector, comprises: acquiring an aeration equipment current average value, an aeration equipment current change rate, an aeration equipment current maximum value and an aeration equipment current minimum value based on the aeration equipment current time sub-sequence; acquiring an aeration equipment power average value, an aeration equipment power change rate, an aeration equipment power maximum value and an aeration equipment power minimum value based on the aeration equipment power time sub-sequence; acquiring an aeration equipment air volume average value, an aeration equipment air volume change rate, an aeration equipment air volume maximum value and an aeration equipment air volume minimum value based on the aeration equipment air volume time sub-sequence; and splicing the aeration equipment current average value, the aeration equipment current change rate, the aeration equipment current maximum value, the aeration equipment current minimum value, the aeration equipment power average value, the aeration equipment power change rate, the aeration equipment power maximum value, the aeration equipment power minimum value, the aeration equipment air volume average value, the aeration equipment air volume change rate, the aeration equipment air volume maximum value and the aeration equipment air volume minimum value into a twelve-dimensional equipment parameter feature vector.
- 5. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 2, wherein performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector, comprises: acquiring an aeration quantity average value, an aeration quantity change rate, an aeration quantity maximum value and an aeration quantity minimum value based on the aeration quantity time sub-sequence; acquiring a sludge concentration average value, a sludge concentration change rate, a sludge concentration maximum value and a sludge concentration minimum value based on the sludge concentration time sub-sequence; acquiring a reflux ratio average value, a reflux ratio change rate, a reflux ratio maximum value and a reflux ratio minimum value based on the reflux ratio time sub-sequence; and splicing the aeration quantity average value, the aeration quantity change rate, the aeration quantity maximum value, the aeration quantity minimum value, the sludge concentration average value, the sludge concentration change rate, the sludge concentration maximum value, the sludge concentration minimum value, the reflux ratio average value, the reflux ratio change rate, the reflux ratio maximum value and the reflux ratio minimum value into a twelve-dimensional process parameter feature vector.
- 6. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 1, wherein performing feature intersection based on the initial feature matrix to obtain the fusion feature matrix comprises: mapping the initial feature matrix into a query matrix, a key matrix and a value matrix; computing the product of the query matrix and the transpose of the key matrix to obtain an original similarity matrix, and scaling the original similarity matrix to obtain a self-attention feature matrix; normalizing the self-attention feature matrix to obtain an attention weight matrix; and computing the product of the attention weight matrix and the value matrix to obtain the fusion feature matrix.
- 7. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 1, wherein performing signal decomposition on the current effluent chemical oxygen demand time sequence to obtain the plurality of effluent chemical oxygen demand time sequence components and the decomposition residual term comprises: generating a plurality of Gaussian white noise signals; adding each Gaussian white noise signal to the current effluent chemical oxygen demand time sequence to obtain a plurality of noise functions; performing modal decomposition on each noise function to obtain a first residual term and a plurality of first function components corresponding to each noise function; constructing a first function component sequence corresponding to each noise function based on the plurality of first function components corresponding to that noise function, thereby obtaining a plurality of first function component sequences; for any first function component sequence, determining the sequence number of each first function component in the first function component sequence; for any sequence number, selecting from all the first function component sequences the first function components having that sequence number to construct a second function component sequence, thereby obtaining a plurality of second function component sequences; averaging each second function component sequence to obtain a plurality of second function components, and taking the plurality of second function components as the plurality of effluent chemical oxygen demand time sequence components; and obtaining the average of the plurality of first residual terms, and taking that average as the decomposition residual term.
- 8. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 1, wherein inputting each effluent chemical oxygen demand time sequence component and the fusion feature vector into the pre-trained water quality prediction model to generate the predicted component time sequence corresponding to each effluent chemical oxygen demand time sequence component comprises: acquiring the sample entropy of each effluent chemical oxygen demand time sequence component, and determining, based on that sample entropy, whether the component type of each effluent chemical oxygen demand time sequence component is a high-frequency component or a low-frequency component; for any effluent chemical oxygen demand time sequence component: if the effluent chemical oxygen demand time sequence component is a high-frequency component, inputting the effluent chemical oxygen demand time sequence component and the fusion feature vector into a pre-trained long short-term memory neural network to generate the predicted component time sequence corresponding to the effluent chemical oxygen demand time sequence component; and if the effluent chemical oxygen demand time sequence component is a low-frequency component, inputting the effluent chemical oxygen demand time sequence component and the fusion feature vector into a pre-trained support vector regression model to generate the predicted component time sequence corresponding to the effluent chemical oxygen demand time sequence component.
- 9. The sewage treatment water quality prediction method based on multi-source data fusion according to claim 1, wherein, after linearly reconstructing all the predicted component time sequences to generate the future effluent chemical oxygen demand time sequence and completing the sewage treatment water quality prediction, the method further comprises: acquiring an actual effluent chemical oxygen demand time sequence in a preset time sliding window; calculating, based on the actual effluent chemical oxygen demand time sequence, a prediction deviation data set of the future effluent chemical oxygen demand time sequence in the preset time sliding window; determining the deviation mean and the deviation standard deviation of the prediction deviation data set, and generating a primary deviation early-warning threshold, a secondary deviation early-warning threshold and a tertiary deviation early-warning threshold based on the deviation mean and the deviation standard deviation; and obtaining the prediction deviation of the future effluent chemical oxygen demand, issuing a primary abnormality early warning to a user if the prediction deviation is between the primary deviation early-warning threshold and the secondary deviation early-warning threshold, issuing a secondary abnormality early warning to the user if the prediction deviation is between the secondary deviation early-warning threshold and the tertiary deviation early-warning threshold, and issuing a tertiary abnormality early warning to the user if the prediction deviation exceeds the tertiary deviation early-warning threshold.
- 10. A sewage treatment water quality prediction system based on multi-source data fusion, comprising: a data acquisition module, used for acquiring a water quality parameter time sequence, an equipment parameter time sequence, a process parameter time sequence and a current effluent chemical oxygen demand time sequence based on a preset time acquisition window; a feature extraction module, used for performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain a water quality parameter feature vector, an equipment parameter feature vector and a process parameter feature vector; a feature alignment module, used for performing feature dimension alignment on the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector to construct an initial feature matrix; a feature intersection module, used for performing feature intersection based on the initial feature matrix to obtain a fusion feature matrix; a feature fusion module, used for performing dimension compression and feature integration on the fusion feature matrix to obtain a fusion feature vector; a data decomposition module, used for performing signal decomposition on the current effluent chemical oxygen demand time sequence so as to obtain a plurality of effluent chemical oxygen demand time sequence components and a decomposition residual term; and a water quality prediction module, used for inputting each effluent chemical oxygen demand time sequence component, together with the fusion feature vector, into a pre-trained water quality prediction model to generate a predicted component time sequence corresponding to each effluent chemical oxygen demand time sequence component, predicting a residual prediction value corresponding to the decomposition residual term based on a weighted moving average method, and linearly reconstructing all the predicted component time sequences and the residual prediction value to generate a future effluent chemical oxygen demand time sequence, thereby completing the sewage treatment water quality prediction.
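As a rough illustration of the gray-level statistics recited in claim 2, the following Python sketch computes the mean and standard deviation of a gray-level histogram. It assumes the activated sludge microbial community image has already been converted to an 8-bit grayscale array; the function name `gray_histogram_stats` and the sample image are hypothetical and are not taken from the patent.

```python
import numpy as np

def gray_histogram_stats(image):
    """Return (mean, standard deviation) computed from the gray-level histogram.

    Assumption: `image` is an 8-bit grayscale array with values in 0..255.
    """
    pixels = np.asarray(image, dtype=np.uint8).ravel()
    hist = np.bincount(pixels, minlength=256)      # gray-level histogram
    prob = hist / hist.sum()                       # relative frequency per level
    levels = np.arange(256)
    mean = float(np.sum(levels * prob))
    std = float(np.sqrt(np.sum((levels - mean) ** 2 * prob)))
    return mean, std

# Hypothetical 2x2 image: half the pixels black, half white.
sample_image = np.array([[0, 255], [0, 255]], dtype=np.uint8)
gray_mean, gray_std = gray_histogram_stats(sample_image)
```

Calling the function once per time node would yield the two gray-level sub-sequences that claim 2 combines with the other water quality readings.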
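The statistical feature extraction of claims 3 to 5 (average value, change rate, maximum and minimum of each sub-sequence, spliced into one vector) might be sketched as follows. The claims do not define "change rate"; this sketch assumes it is the average change per sampling interval. The function names and the sample readings are hypothetical.

```python
import numpy as np

def subsequence_features(x):
    """Return [mean, change rate, max, min] for one time sub-sequence."""
    x = np.asarray(x, dtype=float)
    # Assumption: "change rate" = (last - first) / number of sampling intervals.
    change_rate = (x[-1] - x[0]) / (len(x) - 1) if len(x) > 1 else 0.0
    return np.array([x.mean(), change_rate, x.max(), x.min()])

def feature_vector(subsequences):
    """Splice the four statistics of every sub-sequence into one vector."""
    return np.concatenate([subsequence_features(s) for s in subsequences])

# Hypothetical process readings over four time nodes.
aeration = [10.0, 12.0, 11.0, 13.0]
sludge = [3.0, 3.2, 3.1, 3.3]
reflux = [0.5, 0.6, 0.55, 0.65]
process_vec = feature_vector([aeration, sludge, reflux])  # 3 x 4 = 12 values
```

With five water quality sub-sequences the same helper yields the twenty-dimensional vector of claim 3, and with three sub-sequences the twelve-dimensional vectors of claims 4 and 5.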
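The feature intersection of claim 6 follows the usual scaled dot-product self-attention pattern. The NumPy sketch below makes several assumptions that the claim leaves open: the scaling divides by the square root of the key dimension, the normalization is a row-wise softmax, the three feature vectors have already been aligned to a common width of 8, and the projection matrices are random stand-ins for learned weights.

```python
import numpy as np

def self_attention_fusion(X, W_q, W_k, W_v):
    """Map X to query/key/value, then return (fusion matrix, attention weights)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # Original similarity matrix Q K^T, scaled by sqrt(d_k) (assumed scaling).
    scores = (Q @ K.T) / np.sqrt(K.shape[1])
    # Row-wise softmax (assumed normalization), shifted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
# Hypothetical initial feature matrix: one row per data source
# (water quality, equipment, process), aligned to 8 columns.
X = rng.normal(size=(3, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
fused, weights = self_attention_fusion(X, W_q, W_k, W_v)
```

Each row of `weights` indicates how strongly one source attends to the others, which is one way to read the cross-source "intersection" the claim describes.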
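The ensemble procedure of claim 7 (add Gaussian white noise, decompose each noisy copy, then average components that share a sequence number) can be sketched as below. The modal decomposition itself is not implemented here: `standin_decompose` is a placeholder based on moving averages, chosen only because it returns a fixed number of components, and it is not empirical mode decomposition. The trial count, noise level and synthetic series are likewise assumptions.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average with edge padding; `window` should be odd."""
    padded = np.pad(x, window // 2, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")[: len(x)]

def standin_decompose(x, windows=(3, 7)):
    """Placeholder for modal decomposition (NOT true EMD).

    Returns (components, residual) such that the components plus the
    residual add back up to x exactly.
    """
    components, current = [], x
    for w in windows:
        smooth = moving_average(current, w)
        components.append(current - smooth)  # detail removed at this scale
        current = smooth
    return components, current

def ensemble_decompose(x, n_trials=50, noise_std=0.1, seed=0):
    """Average same-numbered components and residuals over noisy copies of x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    all_components, all_residuals = [], []
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        comps, resid = standin_decompose(noisy)
        all_components.append(comps)
        all_residuals.append(resid)
    # Axis 0 indexes the trials, so the mean groups components by sequence number.
    return np.mean(np.array(all_components), axis=0), np.mean(np.array(all_residuals), axis=0)

t = np.linspace(0, 4 * np.pi, 64)
cod = 50 + 5 * np.sin(t) + np.sin(8 * t)  # synthetic effluent COD series
components, residual = ensemble_decompose(cod)
```

Averaging over many noisy copies cancels most of the added noise, so the averaged components plus the averaged residual stay close to the original series.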
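Claim 8 routes each component to a model according to its sample entropy. The sketch below computes sample entropy directly with NumPy and returns a routing label only; it does not build the long short-term memory network or the support vector regression model. The embedding dimension `m = 2`, the tolerance of 0.2 standard deviations, the threshold of 1.0, and the rule that higher entropy means "high-frequency" are all assumptions that the claim does not specify.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of x with embedding dimension m and tolerance r_factor * std."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * x.std()

    def count_matches(length):
        # Use the same n - m template start points for both lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return (np.sum(dist <= r) - len(templates)) / 2  # drop self-matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return np.inf  # no matches: treat as maximally irregular
    return float(-np.log(a / b))

def route_component(component, threshold=1.0):
    """Return 'lstm' for high-entropy components and 'svr' otherwise (assumed rule)."""
    return "lstm" if sample_entropy(component) > threshold else "svr"

rng = np.random.default_rng(0)
noisy = rng.normal(size=200)                      # irregular, noise-like component
smooth = np.sin(np.linspace(0, 4 * np.pi, 200))   # slow, regular component
```

In this toy setup the noise-like series is routed to the `"lstm"` label and the slow sine to `"svr"`; a real deployment would tune the threshold on plant data.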
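The last two steps of claim 1, predicting the residual with a weighted moving average and linearly reconstructing the forecast, might look like the sketch below. The weights `(0.2, 0.3, 0.5)`, the recursive multi-step scheme, and the use of an unweighted sum for the "linear reconstruction" are assumptions; the claim does not fix any of them.

```python
import numpy as np

def weighted_moving_average_forecast(residual, horizon, weights=(0.2, 0.3, 0.5)):
    """Forecast `horizon` steps; the most recent value gets the last weight."""
    history = [float(v) for v in residual]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize so the weights sum to 1
    forecast = []
    for _ in range(horizon):
        nxt = float(np.dot(w, history[-len(w):]))
        forecast.append(nxt)
        history.append(nxt)              # feed the forecast back in (recursive)
    return np.array(forecast)

def reconstruct(predicted_components, residual_forecast):
    """Assumed linear reconstruction: sum of all components plus the residual."""
    return np.sum(predicted_components, axis=0) + residual_forecast

# Hypothetical two-step-ahead example with two predicted components.
residual_forecast = weighted_moving_average_forecast([50.0, 50.0, 50.0], horizon=2)
predicted_components = np.array([[1.0, -1.0], [0.5, 0.5]])
future_cod = reconstruct(predicted_components, residual_forecast)
```

Because the residual is typically the smooth trend left after decomposition, a simple smoother of this kind is a plausible low-cost choice for it.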
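Claim 9 derives three early-warning thresholds from the mean and standard deviation of recent prediction deviations but does not give the formula. The sketch below assumes absolute deviations and thresholds of the form mean plus 1, 2 and 3 standard deviations; both choices, the handling of values exactly on a threshold, and the function names are hypothetical.

```python
import numpy as np

def warning_thresholds(predicted, actual, multipliers=(1.0, 2.0, 3.0)):
    """Return (primary, secondary, tertiary) thresholds from a sliding window.

    Assumption: threshold_k = mean + multiplier_k * std of absolute deviations.
    """
    deviations = np.abs(np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float))
    mu, sigma = deviations.mean(), deviations.std()
    return tuple(float(mu + k * sigma) for k in multipliers)

def warning_level(deviation, thresholds):
    """Return 0 (no warning), 1 (primary), 2 (secondary) or 3 (tertiary)."""
    t1, t2, t3 = thresholds
    if deviation > t3:
        return 3
    if deviation > t2:
        return 2
    if deviation > t1:
        return 1
    return 0

# Hypothetical sliding-window data (mg/L).
predicted = [50.0, 52.0, 48.0, 51.0]
actual = [49.0, 50.0, 49.0, 50.0]
thresholds = warning_thresholds(predicted, actual)
```

Recomputing the thresholds whenever the sliding window moves lets the warning levels track the recent accuracy of the model rather than a fixed limit.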
Description
Sewage treatment water quality prediction method and system based on multi-source data fusion
Technical Field
The invention relates to the technical field of sewage treatment, and in particular to a sewage treatment water quality prediction method and system based on multi-source data fusion.
Background
In the technical field of water quality monitoring, traditional water quality prediction methods rely mainly on historical water quality parameter sequences from a water body and infer future water quality through time-series analysis or regression models, so as to monitor water quality changes in the ecological environment. When water quality prediction is applied to sewage treatment, however, the prior art still predicts using only historical water quality parameters from the water body. The effluent quality of a sewage treatment plant is not determined by the influent quality alone; it is the combined result of the influent quality, the dynamic operating state and parameters of the sewage treatment equipment, the process parameters of the whole treatment system, and other multidimensional factors. The prior-art water quality predictions for sewage treatment are therefore inaccurate, and the prediction deviation becomes larger when the operating conditions of the sewage treatment equipment fluctuate. Even though those skilled in the art have recognized that water quality prediction needs to integrate the parameters of sewage treatment equipment, this remains difficult to achieve at a technical level.
On the one hand, water quality parameters and equipment operating parameters are heterogeneous data with different physical meanings and units, and their time series may follow different patterns of change and different scales, so effective data integration is difficult. On the other hand, existing water quality prediction methods already involve a large number of different water quality monitoring indicators; if further equipment parameters and process parameters are introduced on this basis, the complexity and dimensionality of the model input increase rapidly, forming a data set with a large data volume and high complexity. Water quality data are also nonlinear and exhibit large inertia, long time lags and strong coupling. If such data are fed directly into a neural network prediction model, mutual interference between data of different scales increases the computational burden, and feature redundancy, noise interference and limited model capacity make overfitting, abnormal gradients and similar problems more likely. The result is inaccurate prediction, reduced generalization capability, or even failure to converge, so that no water quality prediction result can be obtained.
Disclosure of Invention
The invention aims to provide a sewage treatment water quality prediction method and system based on multi-source data fusion, so as to overcome the technical bottlenecks of difficult multi-source data fusion, large data volume and high complexity, and to predict the effluent quality of sewage treatment in combination with the operating state and process parameters of the sewage treatment equipment, thereby improving the accuracy and practicability of effluent quality prediction.
In order to achieve the above object, a first aspect of the present invention provides a sewage treatment water quality prediction method based on multi-source data fusion, the method comprising the steps of: acquiring a water quality parameter time sequence, an equipment parameter time sequence and a process parameter time sequence based on a preset time acquisition window; performing feature extraction on the water quality parameter time sequence, the equipment parameter time sequence and the process parameter time sequence respectively, so as to obtain a water quality parameter feature vector, an equipment parameter feature vector and a process parameter feature vector; performing feature dimension alignment on the water quality parameter feature vector, the equipment parameter feature vector and the process parameter feature vector to construct an initial feature matrix; performing feature intersection based on the initial feature matrix to obtain a fusion feature matrix; performing dimension compression and feature integration on the fusion feature matrix to obtain a fusion feature vector; acquiring a current effluent chemical oxygen demand time sequence based on the preset time acquisition window; P