
CN-121980394-A - Water pollution monitoring method, device and storage medium integrating multi-source data

CN121980394A

Abstract

The invention provides a water quality pollution monitoring method, device and storage medium integrating multi-source data, and relates to the technical field of water quality monitoring. The method comprises the key steps of data preprocessing, cross-modal feature fusion, pollution concentration estimation and graph neural network source tracing. It deeply integrates multi-source water quality data with the spatial topology information of the monitoring points to build a fully automated monitoring pipeline. The invention processes data end to end, from data standardization through to locating the pollution emission source. It breaks down the barriers between heterogeneous multi-source data and exploits the complementary value and the spatio-temporal correlations of the different data modalities. This improves the completeness, accuracy and efficiency of water quality pollution monitoring, allows emission source positions and pollution diffusion paths to be located accurately, provides targeted, full-chain technical support for water pollution control and treatment decisions, and addresses the low data utilization and insufficient tracing accuracy of traditional monitoring.

Inventors

  • ZHANG SHAOMEI
  • HUANG YUAN
  • Huang Xinteng
  • HUANG BODANG
  • LING ZHENGXUE
  • WEI JUN
  • Li Qiuyao
  • WU ZHIBIN
  • HUANG NAIZUN
  • HUANG ZENG
  • Yan jiaxing

Assignees

  • 广西壮族自治区生态环境监测中心
  • 碧兴物联科技(深圳)股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-04-09

Claims (8)

  1. A water pollution monitoring method integrating multi-source data, characterized by comprising the following steps: S1, importing multi-source raw water quality data acquired at a plurality of monitoring points, performing spatio-temporal alignment on the raw data, and standardizing the aligned data to obtain a standardized water quality data set; S2, constructing a water quality pollution cross-modal feature fusion model, and performing feature fusion on the standardized water quality data set based on that model to obtain fusion features; S3, importing spatial topology information of the plurality of monitoring points, concatenating the fusion features with the spatial topology information to obtain comprehensive features, constructing a pollution concentration self-adaptive estimation model, estimating from the comprehensive features based on that model, and outputting a pollution concentration estimate for each monitoring point; and S4, constructing a graph structure with the plurality of monitoring points as nodes and water body connectivity as edges, embedding the pollution concentration estimates of the monitoring points into the graph structure as graph node features, constructing a graph neural network traceability model, learning the pollution propagation relationships among the nodes of the graph structure based on that model, outputting by node classification, based on the pollution propagation relationships, the probability that each node is a pollution emission source, and locating the pollution emission source based on those probabilities.
  2. The water pollution monitoring method integrating multi-source data according to claim 1, wherein step S1 further comprises the steps of: S1.1, importing the multi-source raw water quality data acquired at the plurality of monitoring points, and classifying and labeling the imported data by data type and by the monitoring point to which they belong to obtain a classified and labeled data set, wherein the data types comprise real-time sensor monitoring data, satellite remote sensing data, hydrological and meteorological data and historical pollution event data, and each data item is associated with a unique monitoring point identifier, a data type label and an acquisition time and location stamp; S1.2, extracting, from the classified and labeled data set, the time series of each type of data for each monitoring point and, taking a set time interval as the uniform time granularity, adaptively interpolating the data of missing time nodes using the temporal correlation of the time series, to obtain time-synchronized multi-type data subsets for each monitoring point; S1.3, taking the longitude and latitude of each monitoring point as the reference, correcting the positional deviation of low-spatial-resolution data in the multi-type data subsets by spatial nearest-neighbor interpolation to obtain a multi-source data set aligned in both time and space; and S1.4, converting the differently formatted data in the multi-source data set into a uniform structured data table format using a self-adaptive multi-source data format conversion model to obtain the standardized water quality data set, wherein the structured data table format comprises the monitoring point identifier, acquisition time, spatial coordinates and data type label.
  3. The water pollution monitoring method integrating multi-source data according to claim 1, wherein step S2 further comprises the steps of: S2.1, dividing the standardized water quality data set by data modality type to obtain numerical modality data, image modality data and text modality data; S2.2, extracting features from the numerical modality data through a fully connected neural network to obtain numerical features $F_{\mathrm{num}} \in \mathbb{R}^{d_{\mathrm{num}}}$, where $d_{\mathrm{num}}$ is the dimension of the numerical features, i.e. $F_{\mathrm{num}}$ is a real vector of dimension $d_{\mathrm{num}}$; extracting features from the image modality data through a lightweight convolutional neural network to obtain image features $F_{\mathrm{img}} \in \mathbb{R}^{d_{\mathrm{img}}}$, where $d_{\mathrm{img}}$ is the dimension of the image features; and extracting features from the text modality data through a BERT pre-trained model to obtain semantic features $F_{\mathrm{txt}} \in \mathbb{R}^{d_{\mathrm{txt}}}$, where $d_{\mathrm{txt}}$ is the dimension of the text features; S2.3, mapping the numerical features, image features and semantic features to the same dimension to obtain dimension-unified features $\hat{F}_{\mathrm{num}} = W_{\mathrm{num}} F_{\mathrm{num}} + b_{\mathrm{num}}$, $\hat{F}_{\mathrm{img}} = W_{\mathrm{img}} F_{\mathrm{img}} + b_{\mathrm{img}}$ and $\hat{F}_{\mathrm{txt}} = W_{\mathrm{txt}} F_{\mathrm{txt}} + b_{\mathrm{txt}}$, where $W_{\mathrm{num}}$, $W_{\mathrm{img}}$ and $W_{\mathrm{txt}}$ are the mapping matrices adapted to each modality and $b_{\mathrm{num}}$, $b_{\mathrm{img}}$ and $b_{\mathrm{txt}}$ are the corresponding bias terms; S2.4, constructing a modal attention mechanism module and, based on that module, computing attention weights for the dimension-unified features $\hat{F}_{\mathrm{num}}$, $\hat{F}_{\mathrm{img}}$ and $\hat{F}_{\mathrm{txt}}$, wherein the expression of the module is defined in terms of an attention weight parameter matrix, an attention bias term, the Sigmoid activation function, a feature splicing operation over the three dimension-unified features and an attention projection matrix, and yields the attention weights $\alpha_{\mathrm{num}}$, $\alpha_{\mathrm{img}}$ and $\alpha_{\mathrm{txt}}$ corresponding to the numerical features, image features and semantic features; S2.5, performing weighted fusion of the dimension-unified features with their respective attention weights to obtain the final fusion feature $F = \alpha_{\mathrm{num}} \hat{F}_{\mathrm{num}} + \alpha_{\mathrm{img}} \hat{F}_{\mathrm{img}} + \alpha_{\mathrm{txt}} \hat{F}_{\mathrm{txt}}$.
  4. The water pollution monitoring method integrating multi-source data according to claim 1, wherein step S3 comprises the steps of: S3.1, importing spatial topology information of the plurality of monitoring points, converting the spatial topology information, through graph structure modeling, into a relative distance matrix between the monitoring points and a water connectivity adjacency matrix, and obtaining a standardized spatial topology feature set based on these two matrices; S3.2, matching the fusion features one-to-one with the monitoring point identifiers, and concatenating the fusion features of each monitoring point with the corresponding row vector of the standardized spatial topology feature set to obtain the comprehensive features of each monitoring point; S3.3, constructing a pollution concentration self-adaptive estimation model comprising an input layer, a self-adaptive feature adjustment layer, a time-series feature extraction layer and an output layer, wherein: the input layer receives the comprehensive features; the self-adaptive feature adjustment layer dynamically adjusts, through a gating mechanism, the relative weights of the standardized spatial topology features and the fusion features to obtain a self-adaptively adjusted feature vector, the adjustment being expressed in terms of a self-adaptive weight coefficient of the model and a normalization operation applied to each comprehensive feature; the time-series feature extraction layer extracts time-series features from the self-adaptively adjusted feature vector through a pre-trained ELSTM network; and the output layer performs regression fitting on the extracted time-series features through a pre-trained BP network and outputs the pollution concentration estimate for each monitoring point.
  5. The water pollution monitoring method integrating multi-source data according to claim 1, wherein step S4 further comprises the steps of: S4.1, defining a graph structure $G = (V, E)$ based on the spatial distribution and water body connectivity of the plurality of monitoring points, wherein the node set is $V = \{v_1, \dots, v_N\}$, $N$ is the number of monitoring points and each node $v_i$ corresponds to one monitoring point; for the edge set $E$, if a water body connectivity relationship exists between monitoring points $v_i$ and $v_j$, an edge $(v_i, v_j)$ is constructed and its edge weight is calculated from the straight-line distance between the two monitoring points and the average flow velocity of the water body, a larger weight indicating a stronger pollution propagation relevance; taking the pollution concentration estimate of each monitoring point as the core feature, a feature vector is constructed for each node $v_i$ from that estimate together with the longitude, latitude and basin partition code of the monitoring point, and the feature vectors form the node feature matrix; S4.2, constructing a graph neural network traceability model based on a graph attention network, the model comprising a feature aggregation layer, a causal propagation layer and a classification output layer, wherein: the feature aggregation layer performs weighted aggregation of node neighborhood features based on an attention mechanism, calculating the attention coefficient between node $v_i$ and each node $v_j$ in its set of neighborhood nodes from a feature transformation matrix, a feature splicing operation and the transpose of a learnable attention weight vector, and updating the node features from these coefficients through an activation function; the causal propagation layer constructs causal constraints based on the water flow direction and the pollutant diffusion rate and corrects the neighborhood aggregation weights, such that if water flows from neighborhood node $v_j$ to node $v_i$, the feature contribution weight of $v_j$ to $v_i$ is enhanced and the node features are updated over the set of upstream neighborhood nodes of $v_i$ using a causal constraint coefficient with a specified value range; and the classification output layer classifies the final node features through a fully connected network, defined by weight matrices and bias terms, and outputs the probability that node $v_i$ is a pollution emission source; S4.3, inputting the graph structure $G$ into the graph neural network traceability model and outputting the probability that each node is a pollution emission source; if the probability is greater than or equal to a probability threshold, the node is judged to be a suspected pollution emission source, and the specific longitude and latitude coordinates of the emission source are output based on the geographical position information of the monitoring point corresponding to that node.
  6. A water pollution monitoring device integrating multi-source data, applying the water pollution monitoring method integrating multi-source data according to any one of claims 1 to 5, comprising: a data preprocessing module for importing multi-source raw water quality data acquired at a plurality of monitoring points, performing spatio-temporal alignment on the raw data, and standardizing the aligned data to obtain a standardized water quality data set; a cross-modal feature fusion module for constructing a water quality pollution cross-modal feature fusion model and performing feature fusion on the standardized water quality data set based on that model to obtain fusion features; a pollution concentration estimation module for importing spatial topology information of the plurality of monitoring points, concatenating the fusion features with the spatial topology information to obtain comprehensive features, constructing a pollution concentration self-adaptive estimation model, estimating from the comprehensive features based on that model, and outputting a pollution concentration estimate for each monitoring point; and a pollution tracing and positioning module for constructing a graph structure with the plurality of monitoring points as nodes and water body connectivity as edges, embedding the pollution concentration estimates of the monitoring points into the graph structure as graph node features, constructing a graph neural network traceability model, learning the pollution propagation relationships among the nodes of the graph structure based on that model, outputting by node classification, based on the pollution propagation relationships, the probability that each node is a pollution emission source, and locating the pollution emission source based on those probabilities.
  7. A water pollution monitoring device integrating multi-source data, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the water pollution monitoring method integrating multi-source data according to any one of claims 1 to 5.
  8. A computer-readable storage medium storing a computer program, wherein the computer program, when run, controls the device in which the computer-readable storage medium is located to perform the water pollution monitoring method integrating multi-source data according to any one of claims 1 to 5.
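
The time synchronization of step S1.2 in claim 2 can be illustrated with a minimal Python sketch. The claim calls for self-adaptive interpolation based on time-series correlation; plain linear interpolation with NumPy is swapped in here as a simple stand-in. The function name `align_to_grid`, the 60-second granularity and the sample readings are all hypothetical and do not come from the patent.

```python
import numpy as np

def align_to_grid(timestamps, values, start, stop, step):
    """Resample one irregular series onto a uniform time grid.

    timestamps: sample times in seconds (need not be sorted).
    values:     readings of the same length; NaN marks a missing reading.
    Returns (grid, resampled), where grid runs from start to stop in steps of step.
    """
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    keep = ~np.isnan(v)              # drop missing readings before interpolating
    t, v = t[keep], v[keep]
    order = np.argsort(t)            # np.interp needs increasing sample times
    t, v = t[order], v[order]
    grid = np.arange(start, stop + step / 2, step, dtype=float)
    # Linear interpolation (assumption: stands in for the adaptive scheme).
    # Outside the observed range np.interp holds the nearest endpoint value.
    return grid, np.interp(grid, t, v)

# Hypothetical sensor sampled irregularly, with one missing reading,
# aligned to a uniform 60-second granularity.
sensor_t = [0, 50, 130, 240]
sensor_v = [1.0, 2.0, np.nan, 4.0]
grid, aligned = align_to_grid(sensor_t, sensor_v, start=0, stop=240, step=60)
```

Applying the same grid to every data type of a monitoring point would give series that can be joined row by row, which is the property the later fusion steps rely on.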

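The cross-modal fusion of steps S2.3 to S2.5 in claim 3 can be sketched in the same way. The exact expression of the modal attention module is not legible in this record, so the form below is an assumption: a Sigmoid layer over the spliced features, followed by a projection to one score per modality and a softmax normalization. The softmax, all parameter names (`W_a`, `b_a`, `W_p`), the dimensions and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    z = np.exp(x - np.max(x))  # shift for numerical stability
    return z / z.sum()

def fuse_modalities(features, params):
    """features: dict mapping modality name to a 1-D feature vector."""
    names = ("num", "img", "txt")
    # S2.3: map every modality to the same dimension with its own matrix and bias.
    unified = [params["W"][m] @ features[m] + params["b"][m] for m in names]
    # S2.4: splice the unified features, apply the attention layer (Sigmoid),
    # then project to one score per modality.
    spliced = np.concatenate(unified)
    hidden = sigmoid(params["W_a"] @ spliced + params["b_a"])
    alpha = softmax(params["W_p"] @ hidden)  # assumption: softmax normalization
    # S2.5: attention-weighted sum of the unified features.
    fused = sum(a * u for a, u in zip(alpha, unified))
    return fused, alpha

# Hypothetical dimensions and random (untrained) parameters.
rng = np.random.default_rng(0)
dims, d, h = {"num": 4, "img": 6, "txt": 8}, 5, 7
params = {
    "W": {m: rng.normal(size=(d, k)) for m, k in dims.items()},
    "b": {m: np.zeros(d) for m in dims},
    "W_a": rng.normal(size=(h, 3 * d)),
    "b_a": np.zeros(h),
    "W_p": rng.normal(size=(3, h)),
}
features = {m: rng.normal(size=k) for m, k in dims.items()}
fused, alpha = fuse_modalities(features, params)
```

Because the three weights in `alpha` are positive and sum to one, the fused vector stays in the shared feature space regardless of how many modalities carry useful signal for a given monitoring point.
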
Description

Water pollution monitoring method, device and storage medium integrating multi-source data

Technical Field

The invention relates to the technical field of water quality monitoring, and in particular to a water quality pollution monitoring method, device and storage medium integrating multi-source data.

Background

Water environment pollution has become a major challenge for environmental protection worldwide. Accelerating industrialization and urbanization have aggravated the problem through man-made pollution sources such as illegal discharge of industrial wastewater, agricultural non-point source pollution and urban domestic sewage discharge. As a result, the pollutants found in water bodies, such as heavy metals, organic pollutants and pathogenic microorganisms, have become more complex, the balance of aquatic ecosystems is being destroyed, and the sustainable use of water resources and human health are threatened.

Water environment monitoring technology is a core tool for pollution control, but traditional monitoring means have obvious limitations. First, manual sampling and laboratory physicochemical analysis require complicated sample pretreatment and expensive instruments, offer poor timeliness and limited spatial coverage, and have difficulty capturing dynamic changes in pollution and sudden pollution events. Second, each single monitoring technology has inherent weaknesses: biological monitoring is strongly affected by the environment and by the life cycle of the organisms and takes a long time, so it can only serve as a supplementary means; remote sensing can cover large water areas but cannot accurately identify the specific components or quantitative concentrations of pollutants, so it needs support from conventional monitoring. Third, multi-source monitoring data are inconsistent in time and space, heterogeneous in format and measured in different units; traditional techniques cannot fuse such data effectively, which fragments the monitoring information and prevents comprehensive support for pollution source tracing and treatment.

In recent years, technologies such as online monitoring sensors, surface-enhanced Raman scattering (SERS) and the Internet of Things have gradually been applied to water environment monitoring and have brought breakthroughs in real-time performance and sensitivity. However, how to further integrate the advantages of multi-source data, improve monitoring precision and reliability, and cover the full chain from pollution monitoring to accurate location of pollution sources remains a core bottleneck of current technical development. In addition, existing monitoring systems lack intelligent analysis capability and have difficulty mining pollution propagation patterns and trends from monitoring data, so pollution control decisions lack scientific and efficient technical support, and the urgent needs of water resource protection and ecosystem health maintenance cannot be met.

Disclosure of Invention

The technical problem to be solved by the invention is to provide, in view of the shortcomings of the prior art, a water pollution monitoring method, device and storage medium integrating multi-source data.
The technical solution of the invention to the above technical problem is as follows. A water quality pollution monitoring method integrating multi-source data comprises the following steps:
S1, importing multi-source raw water quality data acquired at a plurality of monitoring points, performing spatio-temporal alignment on the raw data, and standardizing the aligned data to obtain a standardized water quality data set;
S2, constructing a water quality pollution cross-modal feature fusion model, and performing feature fusion on the standardized water quality data set based on that model to obtain fusion features;
S3, importing spatial topology information of the plurality of monitoring points, concatenating the fusion features with the spatial topology information to obtain comprehensive features, constructing a pollution concentration self-adaptive estimation model, estimating from the comprehensive features based on that model, and outputting a pollution concentration estimate for each monitoring point; and
S4, constructing a graph structure with the plurality of monitoring points as nodes and water body connectivity as edges, embedding the pollution concentration estimates of the monitoring points into the graph structure as graph node features, const