CN-122022474-A - Intelligent verification method and system for environmental protection annual audit data of enterprise
Abstract
The invention discloses an intelligent verification method and system for enterprise environmental protection annual audit data, belonging to the technical field of artificial intelligence and environmental supervision. The method comprises: S1, uniformly constructing time-series feature vectors; S2, constructing a digital twin model and performing bidirectional calibration; S3, constructing a multi-scale anomaly detection and comprehensive risk scoring system; S4, establishing an adaptive verification strategy based on hierarchical reinforcement learning; and S5, performing agent-based verification, report generation and closed-loop evolution. A personalized dynamic digital twin model is constructed for each enterprise, and the anomaly features produced by microscopic time-series anomaly detection and macroscopic association rule mining are combined through a deep neural network, so that the method and system can accurately identify hidden and systematic data anomalies, detect data falsification, and uncover complex cheating techniques.
Inventors
- MA GUIHUA
- Xue Qiuwang
- Liao Caiping
Assignees
- 广东未来环境监测有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260129
Claims (9)
- 1. An intelligent verification method for enterprise environmental protection annual audit data, characterized by comprising the following steps:
  S1, uniformly constructing a time-series feature vector: acquiring enterprise environmental protection annual audit related data from multiple data sources, and performing data cleaning, time alignment and normalization on the acquired data to construct a time-series feature vector X_t;
  S2, constructing a digital twin model and performing bidirectional calibration: constructing the digital twin model comprises building an environmental protection operation digital twin model HT of the enterprise and its prediction model h_t from the feature vector X_t obtained in S1, and predicting the emission data of the enterprise under different working conditions with the prediction model h_t, all outputs forming a simulated data stream D_sim; the bidirectional calibration comprises cross-verifying and repairing the annual audit original data stream D_ori submitted by the enterprise and the Internet of Things real-time data stream D_real using an attention-based time-series network to obtain corrected data D_cor, and synchronously inputting the corrected data D_cor into the digital twin model to calibrate the digital twin model;
  S3, establishing a multi-scale anomaly detection and comprehensive risk scoring system: the multi-scale anomaly detection comprises analyzing the corrected data D_cor and the simulated data stream D_sim from S2 in parallel, including S31, microscopic time-series anomaly detection, which calculates a dynamic compliance deviation index R_i for the emission data of each class of pollutant i, and S32, macroscopic association rule mining, which constructs a dynamic association graph of the actual data through a graph neural network and identifies hidden contradictions by comparing the structural consistency of the actual-data association graph with the reference graph of the digital twin model; the comprehensive risk scoring comprises inputting the anomaly features output by S31 and S32 into a deep neural network and calculating an anomaly risk score P_t(i) for each pollutant at time t;
  S4, establishing an adaptive verification strategy based on hierarchical reinforcement learning: establishing a verification strategy agent from the comprehensive risk score set and the multi-scale anomaly features output by S3, training the agent with a deep reinforcement learning algorithm, and optimizing the decision strategy of the agent;
  S5, agent verification execution, report generation and closed-loop evolution: generating verification decisions according to the optimized agent decision strategy; generating, from the verification decision results, a visual comprehensive environmental protection report comprising abnormal data points, risk scores, trend analysis and prediction; storing the verification decision results and subsequent real supervision feedback as neuron groups used to update the agent periodically; and injecting the verified data knowledge back into the enterprise digital twin model and the risk scoring system to realize collaborative optimization and autonomous evolution of the model.
- 2. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that the enterprise environmental protection annual audit related data comprise Internet of Things sensor data, data collected from production process databases, automatic monitoring platform data, and data filled in manually by the enterprise; and performing data cleaning, time alignment and normalization on the acquired data comprises: cleaning the acquired data by combining a rule engine and a stream processing framework with a machine learning model, handling missing values, abnormal values and format errors in the data; aligning data streams with different frequencies and delays to standard timestamps through a unified time service and time-window aggregation; applying a hierarchical normalization strategy, chosen according to data type and domain knowledge, to scale all features to a unified numerical range; and finally outputting the standardized time-series feature vector X_t.
- 3. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that the environmental protection operation digital twin model HT is a multi-physics coupled dynamic simulation model whose state vector is defined as HT = (E, Q, U, C), where E is the pollutant time-series vector group, E = (e_1(t), e_2(t), …, e_m(t)), m is the total number of pollutant classes, e_i(t) is the concentration of the i-th class of pollutant at time t, and i is an integer from 1 to m inclusive; Q is the production condition parameter vector group, Q = (q_1(t), q_2(t), …, q_n(t)), n is the total number of parameters, q_j(t) is the value of the j-th parameter at time t, and j is an integer from 1 to n inclusive; U is the pollution control facility operation state vector group, U = (u_1(t), u_2(t), …, u_k(t)), k is the total number of operation indexes, u_z(t) is the value of the z-th operation index at time t, and z is an integer from 1 to k inclusive; and C is the emission constraint vector, which is constant; the predicted emission value at time t is given by h_t(X_t, ω), where X_t is the input time-series feature vector and ω denotes the model parameters; the prediction model h_t adopts a hybrid architecture combining a mechanism model and a data-driven model, with the formula: [formula not reproduced], where h_t(X_t, ω) represents the emission data output by the digital twin model for the output variable at time t; h_physics(X_t) is the mechanism model part, which describes the known data structure and behavior through physical and chemical analysis (material balance, energy conservation and reaction kinetics) of the time-series feature vector X_t; and h_NN(X_t, ω) is the data-driven model part, which uses a temporal convolutional network (TCN) deep learning structure to perform error correction and feature supplementation on the undetermined part of the mechanism model.
- 4. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that cross-verifying and repairing D_ori and D_real using the attention-based time-series network comprises: S21, dynamically calculating, through the attention layer of the attention-based time-series network, the mutual attention weights of the two data streams D_ori and D_real at each time step, the attention weights being used to evaluate the credibility and consistency of the two data streams; S22, fusing and reconstructing the information of the two data streams based on the attention weights, repairing and filling missing, conflicting and obviously abnormal values, and regenerating a more complete and reliable data sequence; and S23, outputting the regenerated data sequence as the corrected data stream D_cor and synchronously injecting D_cor into the digital twin model.
- 5. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that the dynamic compliance deviation index R_i is calculated as: [formula not reproduced], where T is the length of the annual audit period; R_i is the overall dynamic compliance deviation index of the i-th pollutant over the annual audit period T; the formula is expressed in terms of the corrected measured data of the i-th pollutant at time t, the predicted value of the i-th pollutant at time t from the digital twin model, and a first-derivative operator; and ω_1(t) and ω_2(t) are dynamic weight functions for the deviation term and the derivative term respectively, whose values are adjusted according to real-time environmental sensitivity; the pollutant time-series vector group E, the production condition parameter vector group Q and the pollution control facility operation state vector group U are input into a graph neural network, which constructs a dynamic association graph of the (E, Q, U) parameters; the anomaly risk score P_t(i) is calculated as: [formula not reproduced], where x_t(i) is the actual data of the i-th pollutant taken directly from the time-series feature vector X_t; the predicted data of the i-th pollutant are obtained from the prediction model h_t; Graph(Diff) is the association graph difference feature from S32; concat denotes vector concatenation, which splices x_t(i), the corresponding predicted data, R_i and Graph(Diff) into a single higher-dimensional vector; θ_0 and θ_1 are weight parameters of the deep neural network, and α_0 and α_1 are the corresponding bias terms; ReLU(·) is the rectified linear unit activation function; σ(·) is the sigmoid function, which maps the output to the interval [0, 1] so that the final score lies in [0, 1]; and the decision threshold for the anomaly risk score P_t(i) is dynamically adjusted according to the emission standards applicable to each enterprise and regional environmental policies.
- 6. The intelligent verification method for enterprise environmental protection annual audit data according to claim 5, characterized in that ω_1(t) and ω_2(t) are determined as follows: real-time meteorological data and information on surrounding sensitive points are accessed, a pre-trained environmental scoring model outputs the environmental sensitivity coefficient λ_esc(t) at the current moment, and ω_1(t) and ω_2(t) are calculated from λ_esc(t); ω_1(t) is calculated as: [formula not reproduced], where η_base is the reference weight coefficient and λ_esc(t) is the environmental sensitivity coefficient; ω_2(t) is configured as a monotonically decreasing function of ω_1(t), ω_2(t) = f(ω_1(t)), and the function f(·) satisfies [condition not reproduced].
- 7. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that the verification strategy agent takes the comprehensive risk score set and the multi-scale anomaly features as its state space and further comprises an action space {deep audit, formal audit, trigger on-site audit, compliance pass}, that is, the actions by which the agent decides whether the multi-scale anomaly features in the state space are compliant; the total reward function of the agent is: [formula not reproduced], where R_immediate is the immediate reward and R_longterm is the long-term reward; the immediate reward R_immediate is expressed as: [formula not reproduced], where Precision_t denotes precision, Recall_t denotes recall, Cost_t denotes resource consumption, and β, δ and ε are weight coefficients that balance these indices; the long-term reward is expressed as: [formula not reproduced], where ζ is the coefficient controlling the importance of the stability reward and penalty, τ is the intensity coefficient controlling the penalty for missed judgments and misjudgments, Δγ is the strategy stability reward and penalty, FNP is the missed-judgment rate, FPN is the misjudgment rate, and R_longterm is evaluated against a dynamically updated reference truth pool; the reference truth pool is constructed and dynamically updated as follows: before the system goes into operation, historical cases independently verified and agreed upon by several domain experts are collected; after the system goes into operation, cases that the agent decided as deep audit or trigger on-site audit and that were finally confirmed to have problems, and cases that the agent decided as compliance pass but that were reported or caused environmental accidents within a short period, automatically enter the reference truth pool, and the corresponding labels are adjusted after manual arbitration.
- 8. The intelligent verification method for enterprise environmental protection annual audit data according to claim 1, characterized in that training the agent with a deep reinforcement learning algorithm comprises: the training data of the agent come from a simulation training environment formed by digital twin models constructed for a large number of heterogeneous enterprises together with preset complex modeling modes; the simulation training environment is built by collecting the reference digital twin models of hundreds of typical enterprises in the industry and implanting a plurality of preset complex data modeling modes into the system of the simulation training environment; and the modeling modes comprise working-condition correlation modeling, multi-parameter collaborative modeling, and adversarial modeling based on a generative adversarial network.
- 9. An intelligent verification system for enterprise environmental protection annual audit data, characterized in that the system is used to implement the intelligent verification method for enterprise environmental protection annual audit data according to any one of claims 1-8, and comprises a multi-source data fusion preprocessing module S100, a dynamic digital twin model construction and calibration module S200, a multi-scale anomaly detection and comprehensive risk scoring module S300, a hierarchical reinforcement learning strategy optimization module S400, and an agent verification and closed-loop evolution module S500, wherein:
  the multi-source data fusion preprocessing module S100 is used to acquire enterprise environmental protection annual audit related data from multiple data sources, and to perform data cleaning, time alignment and normalization on the acquired data to construct a time-series feature vector X_t;
  the dynamic digital twin model construction and calibration module S200 is used to construct the environmental protection operation digital twin model HT of the enterprise and its prediction model h_t from the obtained feature vector X_t, and to predict the emission data of the enterprise under different working conditions with the prediction model h_t, all outputs forming a simulated data stream D_sim; at the same time, the annual audit original data stream D_ori submitted by the enterprise and the Internet of Things real-time data stream D_real are cross-verified and repaired using an attention-based time-series network to obtain corrected data D_cor, which is synchronously input into the digital twin model to calibrate the digital twin model;
  the multi-scale anomaly detection and comprehensive risk scoring module S300 is used to analyze the corrected data D_cor and the simulated data stream D_sim in parallel, including microscopic time-series anomaly detection, macroscopic association rule mining and comprehensive risk scoring: obtaining a dynamic compliance deviation index R_i for each class of emission data, constructing a dynamic association graph of the actual data through a graph neural network, identifying hidden contradictions by comparing the structural consistency of the actual-data association graph with the reference graph of the digital twin model, and finally inputting the anomaly features output by S31 and S32 into a deep neural network to calculate an anomaly risk score P_t(i) for each pollutant at time t;
  the hierarchical reinforcement learning strategy optimization module S400 establishes a verification strategy agent from the comprehensive risk score set and the multi-scale anomaly features output by module S300, optimizes the total reward function of the agent through interactive training with the environment, trains the agent with a deep reinforcement learning algorithm, and optimizes the decision strategy of the agent;
  the agent verification and closed-loop evolution module S500 generates verification decisions according to the optimized agent decision strategy, generates from the verification decision results a visual comprehensive environmental protection report comprising abnormal data points, risk scores, trend analysis and prediction, stores the verification decision results and subsequent real supervision feedback as neuron groups used to update the agent periodically, and injects the verified data knowledge back into the enterprise digital twin model and the risk scoring system to realize collaborative optimization and autonomous evolution of the whole system.
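As an illustration of the anomaly risk score described in claim 5, the following Python sketch passes the concatenated features through one ReLU hidden layer and a sigmoid output, which is one reading of the roles the claim assigns to θ_0, θ_1, α_0, α_1, ReLU(·) and σ(·). The published text does not reproduce the formula itself, so the two-layer composition, the use of four concatenated inputs, the function name risk_score, the argument names and the example weights are all assumptions made for illustration, not the patented implementation.

```python
import math


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(z):
    """Numerically stable logistic function, output in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def risk_score(x_actual, x_pred, r_i, graph_diff, theta0, alpha0, theta1, alpha1):
    """Assumed form: sigmoid(theta1 . ReLU(theta0 . concat(...) + alpha0) + alpha1).

    x_actual   -- actual value of pollutant i at time t (from X_t)
    x_pred     -- value predicted by the digital twin for pollutant i at time t
    r_i        -- dynamic compliance deviation index of pollutant i
    graph_diff -- list of association-graph difference features (hypothetical layout)
    theta0     -- hidden-layer weights, one row per hidden unit
    alpha0     -- hidden-layer biases, one per hidden unit
    theta1     -- output weights, one per hidden unit
    alpha1     -- output bias (scalar)
    """
    # concat: splice all inputs into a single feature vector
    features = [x_actual, x_pred, r_i] + list(graph_diff)
    if any(len(row) != len(features) for row in theta0):
        raise ValueError("each row of theta0 must match the feature length")
    if not (len(theta0) == len(alpha0) == len(theta1)):
        raise ValueError("theta0, alpha0 and theta1 must agree on hidden size")

    # hidden layer: ReLU(theta0 . features + alpha0)
    pre_activation = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(theta0, alpha0)
    ]
    hidden = relu(pre_activation)

    # output layer: sigmoid(theta1 . hidden + alpha1)
    logit = sum(w * h for w, h in zip(theta1, hidden)) + alpha1
    return sigmoid(logit)


if __name__ == "__main__":
    # Illustrative weights only: one hidden unit that reacts to the gap
    # between the actual and the predicted value.
    score = risk_score(
        x_actual=3.0, x_pred=1.0, r_i=0.5, graph_diff=[0.2],
        theta0=[[1.0, -1.0, 0.0, 0.0]], alpha0=[0.0],
        theta1=[2.0], alpha1=-1.0,
    )
    print(f"risk score: {score:.4f}")
```

Because the output passes through a sigmoid, the score always lies in [0, 1], which is what lets the claim apply a decision threshold that is tuned per enterprise and region. In practice the weights would be learned rather than set by hand.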
Description
Intelligent verification method and system for environmental protection annual audit data of enterprise
Technical Field
The invention relates to the technical field of artificial intelligence and environmental supervision, and in particular to an intelligent verification method and system for enterprise environmental protection annual audit data.
Background
In the fields of environmental protection and pollution control, the efficiency and accuracy of environmental supervision depend on the comprehensive analysis of large volumes of sensor data and enterprise reports. As environmental regulations become stricter and data volumes grow rapidly, traditional manual auditing and supervision can no longer meet increasingly complex regulatory demands. Intelligent and automated means are therefore becoming more important in environmental supervision, particularly for pollutant emission monitoring, environmental compliance verification and risk assessment. Although some data-analysis-based systems already help improve monitoring efficiency, problems remain: the quality of monitoring data is unstable, monitoring equipment data are difficult to reconcile with manually entered data, and data anomalies are hard to detect in time. In addition, traditional evaluation methods are one-dimensional, and because they lack effective fusion of multi-source heterogeneous data and dynamic risk prediction, such systems respond poorly to emergencies and complex behaviors.
Digital twin technology has emerged in recent years in the industrial and environmental fields. Driven by virtualization and real-time data, it builds an accurate virtual model of a physical system and achieves dynamic optimization through continuous updating and feedback. The technology is widely applied in intelligent manufacturing, intelligent transportation, energy management and other fields, but its application to environmental supervision, especially to environmental compliance and risk assessment, is still at an exploratory stage. In particular, applications that fully combine digital twin technology with neural networks, deep learning and reinforcement learning in environmental supervision are very rare.
Disclosure of Invention
In order to overcome the problems in the prior art, the invention provides an intelligent verification method for enterprise environmental protection annual audit data, which comprises the following steps:
S1, uniformly constructing a time-series feature vector: enterprise environmental protection annual audit related data are acquired from multiple data sources, and data cleaning, time alignment and normalization are performed on the acquired data to construct a time-series feature vector X_t. The enterprise environmental protection annual audit related data comprise Internet of Things sensor data, data collected from production process databases, automatic monitoring platform data, and data filled in manually by the enterprise. Data cleaning of the multi-source data is realized by combining a rule engine and a stream processing framework with a machine learning model, which handles missing values, abnormal values and format errors in the data. Time alignment of the multi-source data is realized by aligning data streams with different frequencies and delays to standard timestamps through a unified time service and time-window aggregation. Normalization of the multi-source data is realized by applying a hierarchical normalization strategy, chosen according to data type and domain knowledge, that scales all features to a unified numerical range, and the standardized time-series feature vector X_t is finally output.
S2, constructing a digital twin model and performing bidirectional calibration: constructing the digital twin model comprises building an environmental protection operation digital twin model HT of the enterprise and its prediction model h_t from the feature vector X_t obtained in S1, and predicting the emission data of the enterprise under different working conditions with the prediction model h_t, all outputs forming a simulated data stream D_sim. The bidirectional calibration comprises cross-verifying and repairing the annual audit original data stream D_ori submitted by the enterprise and the Internet of Things real-time data stream D_real using an attention-based time-series network to obtain corrected data D_cor, which is synchronously input into the digital twin model to calibrate the digital twin model. Specifically, HT is a m