CN-121981493-A - Hydropower data safety management and compliance circulation method based on hierarchical classification
Abstract
The invention discloses a hydropower data safety management and compliance circulation method based on hierarchical classification, belonging to the technical field of intelligent operation and maintenance of hydropower equipment and industrial data management. The method manages the multi-source heterogeneous time sequence operation data of hydroelectric equipment under a hierarchical classification system; constructs a supervision dimension reduction network that incorporates health state labels, so that a low-dimensional, safe and sensitive feature subset conforming to the hierarchical classification standard is adaptively extracted from the high-dimensional original data; and inputs the managed low-dimensional features into a time sequence application model according to compliance requirements, supporting compliance applications such as fault prediction and thereby enabling compliant mining and safe circulation of data value.
Inventors
- WANG WEIHAO
- XUE YUANYUAN
- MA WENHUA
- CHEN YANAN
- WANG DANDAN
- HUANG CHENYU
Assignees
- 国家能源集团新疆吉林台水电开发有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260225
Claims (8)
- 1. A hydropower data safety management and compliance circulation method based on hierarchical classification, characterized by comprising the following steps: S1, based on a preset data classification and grading standard, collecting and managing the multi-source heterogeneous time sequence operation data generated during operation of the hydroelectric equipment to form an original high-dimensional data sample set that conforms to the classification and grading standard, and correspondingly generating a time-aligned health state label sequence that represents the security level and application value of the data; S2, taking the original high-dimensional data sample set conforming to the classification and grading standard as input and the corresponding health state label sequence as the management supervision signal, constructing and training a classification-oriented supervision dimension reduction management model; through a network learning mechanism that uses the health state label sequence as the supervision signal, learning the mapping from high-dimensional data to low-dimensional features that meet the safety and compliance requirements, and extracting a low-dimensional safety feature subset that has high value density and high discriminability and meets the classification and grading standard; S3, under a data compliance circulation framework, training a compliance-application-oriented time sequence analysis model using the low-dimensional safety feature subsets and their corresponding time order; for equipment operation data to be circulated or applied, extracting the corresponding low-dimensional safety feature subset with the supervision dimension reduction management model, organizing it into a time sequence according to time order and compliance requirements, and inputting it into the trained time sequence analysis model, which outputs an analysis result within the compliance range based on the time evolution mode of the low-dimensional features.
- 2. The hydropower data safety management and compliance circulation method based on hierarchical classification according to claim 1, wherein step S1 specifically comprises: accessing a state monitoring system of the hydroelectric equipment, synchronously collecting original time sequence signals from a vibration sensor, a swing degree sensor, a pressure sensor, a temperature sensor and an oil monitoring sensor according to the data classification and grading specification, and recording the preset security level and data type label of each original time sequence signal; preprocessing the original time sequence signals in accordance with the data management requirements, the preprocessing comprising time-window-based compliance segmentation, outlier detection and safety handling, compliant interpolation of missing values, and preliminary feature extraction, so as to form multi-dimensional operation data samples of fixed time length under a unified time stamp, ensure that the processing meets the graded protection requirements, and form the original high-dimensional data sample set conforming to the hierarchical classification specification; and labeling the data value and safety state of each data sample in the original high-dimensional data sample set based on the data classification and grading strategy, equipment historical maintenance records, expert diagnosis reports, preset alarm thresholds and operation rules, thereby generating the health state label sequence, which serves as the management supervision signal and represents the value and safety state of the data, wherein the label sequence comprises labels for normal data, data requiring attention, and sensitive data, so as to guide subsequent feature extraction and circulation application.
- 3. The hydropower data safety management and compliance circulation method based on hierarchical classification according to claim 1, wherein constructing and training the classification-oriented supervision dimension reduction management model specifically comprises: the supervision dimension reduction management model is a data sensitivity supervision dimensionality reduction network based on a deep autoencoder, the network comprising an encoder and a decoder; a lightweight data state classifier is embedded at the end of the encoder; a joint loss function of the data sensitivity supervision dimensionality reduction network is designed, the joint loss function being a weighted combination of a data reconstruction loss term and a data state classification loss term, wherein the data state classification loss term is calculated from the difference between the output of the lightweight data state classifier and the real state label from the health state label sequence; and, taking the original high-dimensional data sample set conforming to the hierarchical classification specification as input and the corresponding health state label sequence as the management supervision signal, the parameters of the data sensitivity supervision dimensionality reduction network are trained by minimizing the joint loss function, so that the health state label sequence directly supervises and optimizes the feature encoding process, the encoder learns a nonlinear mapping from high-dimensional data to the low-dimensional feature space that is strongly correlated with data value and safety state, and the original high-dimensional data samples are mapped into a low-dimensional safety feature subset that satisfies both data reconstruction fidelity and high data state discriminability and conforms to the hierarchical classification and minimum necessary principles.
- 4. The method for safety management and compliance flow of hydroelectric data based on hierarchical classification according to claim 3, wherein designing a joint loss function of a data sensitivity supervision dimensionality reduction network comprises: the data reconstruction loss term is configured to form a constraint on the encoder that retains critical reconstruction information by calculating a difference between the reconstructed data output by the decoder and the original high-dimensional data samples; the data state classification loss term is configured to form a constraint on the encoder that extracts data state separable features by calculating a difference between a predicted output of the lightweight data state classifier and a real state label from a sequence of health state labels, relating features to data value/security level; And by adjusting the weighting coefficients of the data reconstruction loss term and the data state classification loss term in the joint loss function, balancing two constraints of information retention and state discrimination, the encoder outputs a low-dimensional security feature subset which simultaneously meets the requirements of data reconstruction fidelity and data state discrimination after training.
- 5. The hydropower data safety management and compliance circulation method based on hierarchical classification according to claim 1, wherein step S3 specifically comprises: the low-dimensional safety feature subset extracted by the supervision dimension reduction management model from the equipment operation data to be circulated or applied is organized into a time sequence according to the acquisition time order and the compliance sequence organization requirements, and is input into the trained time sequence analysis model; the time sequence analysis model extracts and quantifies, through its internal time feature extraction structure, the evolution mode of the low-dimensional features in the time dimension; and, based on the captured evolution mode, the time sequence analysis model synchronously performs multi-task analysis under a preset compliance application scenario, outputting the probability distribution of fault types of the hydroelectric equipment within a future preset period, the estimated risk probability of fault occurrence within the future preset period, and the predicted residual service life from the current moment to functional failure of the hydroelectric equipment.
- 6. The method for safety management and compliance flow of hydroelectric data based on hierarchical classification according to claim 5, wherein the time series analysis model is a multitasking time series neural network based on an attention mechanism, specifically comprising: a temporal feature encoder for receiving and encoding the subset of low-dimensional security features organized chronologically, outputting a temporal encoding feature; a multi-head self-attention module connected after the time feature encoder for capturing long-range dependency of the time sequence encoding features in a time dimension and focusing on a time segment contributing to the right application; the multi-task analysis head is connected behind the multi-head self-attention module and comprises a parallel classification sub-network and a regression sub-network which are respectively used for synchronously outputting probability distribution of fault types, risk probability estimation of fault occurrence and predicted value of residual service life under a compliance framework.
- 7. The hydropower data safety management and compliance circulation method based on hierarchical classification according to claim 1, wherein the method further comprises step S4, specifically comprising: setting corresponding dynamic application early warning thresholds for the fault occurrence risk probability and the residual service life predicted value according to the data security policy, the operation and maintenance rules and a preset risk tolerance; when the risk probability estimate output by the time sequence analysis model continuously exceeds its dynamic early warning threshold, or the residual service life predicted value continuously falls below its dynamic early warning threshold, automatically triggering an application early warning signal of the corresponding level and generating a diagnosis report containing the data application findings and suggestions, wherein the report content conforms to the data desensitization and knowledge output standards; forming a labeled feedback sample from the original high-dimensional data sample that triggered the early warning, the corresponding low-dimensional safety feature subset, the analysis result output by the model, and the actual state conclusion confirmed afterwards by equipment overhaul, and storing the feedback sample in a dedicated model and data management evolution database according to the data classification and grading requirements; and periodically extracting newly added feedback samples from the model and data management evolution database, performing collaborative incremental learning training on the supervision dimension reduction management model and the time sequence analysis model, and evaluating the compliance of both models with the data classification standard, so that the new feedback and new knowledge generated during data application continuously improve model performance and the data management effect while remaining compliant.
- 8. The hydropower data safety management and compliance circulation method based on hierarchical classification according to claim 7, wherein performing collaborative incremental learning training on the supervision dimension reduction management model and the time sequence analysis model specifically comprises: adopting an incremental learning strategy based on experience replay, constructing a mixed training set from the model and data management evolution database, wherein the mixed training set comprises the newly added feedback samples and representative core samples sampled from the historical management data, and data classification and balance are taken into account during sampling; sequentially inputting the samples in the mixed training set into the complete application pipeline formed by connecting the supervision dimension reduction management model and the time sequence analysis model in series, and performing forward calculation; and calculating the loss based on the difference between the forward calculation result and the actual state conclusion, and jointly optimizing and updating the parameters of the two models in the complete application pipeline through a back propagation algorithm.
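The preprocessing chain in claim 2 (outlier handling, missing-value interpolation, time-window segmentation) can be illustrated with a minimal Python sketch. The claims do not name specific algorithms, so the choices here are assumptions: outliers are flagged with a median-absolute-deviation rule, gaps are filled by linear interpolation, and windows have a fixed length. The function names `mask_outliers`, `interpolate_missing` and `segment`, and the parameter `k`, are hypothetical.

```python
import statistics


def mask_outliers(series, k=5.0):
    """Replace values farther than k * MAD from the median with None.

    Assumption: a MAD rule stands in for the unspecified outlier detector.
    """
    observed = [v for v in series if v is not None]
    if not observed:
        return list(series)
    med = statistics.median(observed)
    mad = statistics.median([abs(v - med) for v in observed])
    if mad == 0:
        return list(series)
    return [None if v is not None and abs(v - med) > k * mad else v
            for v in series]


def interpolate_missing(series):
    """Fill None gaps linearly; gaps at the edges copy the nearest value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def segment(series, window, step):
    """Cut the series into fixed-length windows; a trailing partial window is dropped."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, step)]
```

A typical call order would be `segment(interpolate_missing(mask_outliers(raw)), window, step)`, applied per sensor channel before the channels are stacked into one multi-dimensional sample.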
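Claims 3 and 4 describe the joint loss only in words: a weighted sum of a data reconstruction loss term and a data state classification loss term. The sketch below shows one possible form in plain Python. It assumes mean squared error for reconstruction and softmax cross-entropy for classification, neither of which is specified in the claims; the names `joint_loss`, `alpha` and `beta` are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between the input sample and the decoder output."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)


def softmax(logits):
    """Numerically stable softmax over the classifier outputs."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label_index):
    """Negative log-probability assigned to the true state label."""
    return -math.log(softmax(logits)[label_index])


def joint_loss(original, reconstructed, logits, label_index, alpha=1.0, beta=1.0):
    """Weighted sum of the reconstruction and state classification terms.

    alpha and beta are the weighting coefficients that claim 4 adjusts to
    balance information retention against state discrimination.
    """
    return (alpha * mse(original, reconstructed)
            + beta * cross_entropy(logits, label_index))
```

Raising `beta` relative to `alpha` pushes the encoder toward features that separate the state labels; raising `alpha` favors faithful reconstruction.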
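The trigger condition in claim 7 depends on a value staying beyond its threshold "continuously", which the claim does not quantify. As a hedged illustration, the Python sketch below treats "continuously" as "for the most recent `min_consecutive` model outputs"; that parameter, and the names `sustained` and `should_warn`, are assumptions introduced for the example.

```python
def sustained(values, predicate, min_consecutive):
    """Return True if the last `min_consecutive` values all satisfy `predicate`."""
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(values) < min_consecutive:
        return False
    return all(predicate(v) for v in values[-min_consecutive:])


def should_warn(risk_history, rul_history, risk_threshold, rul_threshold,
                min_consecutive=3):
    """Trigger when risk stays above its threshold OR remaining life stays below its threshold."""
    risk_high = sustained(risk_history, lambda p: p > risk_threshold, min_consecutive)
    rul_low = sustained(rul_history, lambda r: r < rul_threshold, min_consecutive)
    return risk_high or rul_low
```

Requiring several consecutive exceedances rather than a single one is one simple way to suppress isolated spikes, which fits the description's concern about frequent false early warnings.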
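Claim 8 builds a mixed training set from newly added feedback samples plus representative core samples drawn from historical data, with class balance considered during sampling. The sketch below is one possible reading in Python. It assumes each sample is a `(features, label)` pair and uses uniform random sampling with a per-class cap as a stand-in for "representative core sample" selection, which the claim does not define; `build_mixed_training_set` and `per_class` are hypothetical names.

```python
import random
from collections import defaultdict


def build_mixed_training_set(new_samples, history, per_class, seed=0):
    """Combine new feedback samples with a class-balanced replay of history.

    new_samples, history: lists of (features, label) pairs.
    per_class: maximum number of historical samples replayed per label.
    """
    rng = random.Random(seed)

    # Group historical samples by label so each class can be sampled separately.
    by_label = defaultdict(list)
    for sample in history:
        by_label[sample[1]].append(sample)

    # Draw up to `per_class` samples from each class (assumed uniform sampling).
    replay = []
    for label in sorted(by_label):
        pool = by_label[label]
        replay.extend(rng.sample(pool, min(per_class, len(pool))))

    # New feedback is always kept in full; shuffle so batches mix old and new.
    mixed = list(new_samples) + replay
    rng.shuffle(mixed)
    return mixed
```

Each sample in the returned list would then be passed through the serial pipeline of the two models for the forward and backward passes that the claim describes.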
Description
Hydropower data safety management and compliance circulation method based on hierarchical classification
Technical Field
The invention relates to the technical field of intelligent operation and maintenance of hydroelectric equipment and industrial data management, in particular to a hydropower data safety management and compliance circulation method based on hierarchical classification.
Background
Key hydropower equipment such as hydroelectric generating sets operates under complex working conditions for long periods. The massive operation data it generates is a valuable asset, but the multi-source, high-dimensional and heterogeneous character of that data also poses challenges for data management, safety and compliant application. With the spread of state monitoring systems, multi-dimensional sensors for vibration, swing degree, pressure, temperature and the like produce massive high-dimensional operation data, which makes data-driven intelligent applications possible and makes data management and compliance circulation particularly important. Current technical routes face serious challenges in practice, mainly at the following three levels:
At the data management level, existing methods have fundamental limitations. The operation data of hydroelectric equipment is high-dimensional, strongly coupled and uneven in value density. Storing, transmitting or applying the original high-dimensional data directly is inefficient and computationally heavy, and is more likely to expose sensitive information, enable data abuse or create compliance risk. The unsupervised dimension reduction methods generally adopted today have optimization targets that are entirely unrelated to the security level and application value of the data, so the processed features may lose the components most sensitive to key states, or fail to distinguish sensitive data from public data effectively, which limits the value of compliant data circulation and application at the source.
At the model construction and application level, existing methods are insufficiently intelligent and weak in interpretability and compliance. Mainstream methods still rely on expert experience for laborious manual feature engineering and data screening, a rigid process that is difficult to standardize and scale. Deep learning models can learn features end to end, but when trained and applied directly on high-dimensional original data they are prone to overfitting, and the data circulation process and its results lack transparency and a compliance basis. Data managers have difficulty understanding how the data is used or whether that use meets the graded protection requirements, which reduces the trust in and operability of data circulation and application. In essence, existing methods cannot deeply integrate the data classification and grading standard, the security policy and the data-driven model at the feature extraction and application stages.
At the engineering and compliance application level, the data circulation and application performance of existing methods struggles to meet the requirements of scenarios such as predictive maintenance. Because of the shortcomings in data processing and model construction, existing methods generally show low data circulation efficiency and insufficient value mining. Perception and early warning of early, slowly developing abnormal states is delayed: such states are identified from the data only after obvious deterioration, so the value of data-based early warning is lost. At the same time, the false alarm rate of data application models is high, and frequent false early warnings or improper data output interfere with normal decisions and introduce potential compliance risks, so these methods are difficult to deploy and deliver little practical effect in releasing data value at real power stations.
Disclosure of Invention
The invention provides a hydropower data safety management and compliance circulation method based on hierarchical classification, which solves the technical problems of existing data management methods, namely feature redundancy, high safety risk, low circulation efficiency and insufficient compliance caused by directly processing high-dimensional, heterogeneous time sequence data, together with the difficulty of effectively extracting features strongly related to data value and safety state, and constructs a data processing and application framework that extracts standard-compliant low-dimensional safety features from high-dimensional data and performs intelligent analysis under the compliance framework based on a