CN-121980483-A - Dynamic pruning method and multi-mode transformer anomaly detection method and system


Abstract

The invention provides a dynamic pruning method and a multi-modal transformer anomaly detection method and system. To address the computational redundancy, undifferentiated treatment of modality quality, and rigid always-on structure of existing schemes, the invention constructs an elastic detection framework comprising multi-modal data quality perception, structural value evaluation, constrained dynamic pruning, and a conditional inference closed loop. Multi-granularity importance measures over channels, layers, attention heads and fusion branches are combined with latency, power consumption and memory costs to form a value-ratio ranking; an enabling mask is generated using joint constrained optimization together with a dual-threshold strategy with random exploration in the middle interval; and a continuously evolving feedback closed loop is formed by combining confidence calibration, precision loss, marginal contribution statistics and risk-adaptive threshold adjustment. The invention thereby addresses high-frequency redundant computation and structural rigidity, keeps detection capability close to that of the full model while markedly reducing computation and energy cost, and improves real-time performance, stability, and detection sensitivity during high-risk periods.
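
As an illustration of the value-ratio ranking mentioned in the abstract, the following minimal Python sketch ranks candidate structural units by importance per unit of normalized cost. It assumes the ratio takes the form importance / (weighted normalized cost + small constant). The class `Unit`, the function names, the cost weights and the normalization references are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical candidate structural unit (channel, layer, head or branch)."""
    name: str
    importance: float  # importance score, assumed already computed
    latency_ms: float  # estimated inference latency
    power_w: float     # estimated power consumption
    memory_mb: float   # estimated memory overhead


# Assumed cost weights (summing to 1) and normalization references.
WEIGHTS = (0.5, 0.3, 0.2)
REFERENCES = (10.0, 5.0, 100.0)


def composite_cost(unit: Unit) -> float:
    """Weighted sum of latency, power and memory, each divided by its reference."""
    w_t, w_p, w_m = WEIGHTS
    r_t, r_p, r_m = REFERENCES
    return (w_t * unit.latency_ms / r_t
            + w_p * unit.power_w / r_p
            + w_m * unit.memory_mb / r_m)


def value_ratio(unit: Unit, eps: float = 1e-8) -> float:
    """Importance per unit of composite cost; eps avoids division by zero."""
    return unit.importance / (composite_cost(unit) + eps)


def rank_by_value(units: list[Unit]) -> list[Unit]:
    """Sort units from highest to lowest value ratio."""
    return sorted(units, key=value_ratio, reverse=True)


if __name__ == "__main__":
    units = [
        Unit("fusion_branch", 0.9, 10.0, 5.0, 100.0),
        Unit("attention_head", 0.9, 2.0, 1.0, 20.0),
        Unit("layer", 0.3, 1.0, 0.5, 10.0),
    ]
    for u in rank_by_value(units):
        print(f"{u.name}: {value_ratio(u):.2f}")
```

Units at the top of the ranking deliver the most importance per unit of cost, so under this assumed form they are the natural candidates to keep enabled when resources are tight.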

Inventors

LONG ZHUO
YU YANG
RAO XIANG
WU GONGPING
MAO JIAONA
HUANG JING
YAO TONG
YANG BO
HUANG TAO
YUAN YIZHE
LIU YAQIN
KOU HAIJUN

Assignees

Changsha University of Science and Technology
State Grid Hubei Electric Power Co., Ltd. Jingzhou Power Supply Company

Dates

Publication Date: 20260505
Application Date: 20260409

Claims (10)

1. A dynamic pruning method, characterized in that the method is used for dynamically pruning a multi-modal detection model of a transformer and comprises the following steps: Step 1, multi-modal data quality perception: encoding each type of modal data of the transformer to obtain an embedded vector representing the modal characteristics, and generating a quality index for each type of current modal data; Step 2, value evaluation: evaluating the value of each candidate structural unit in the multi-modal detection model of the transformer based on the embedded vector of each modality and the quality index, wherein the candidate structural units are pruning candidates that can be independently enabled or disabled and are non-essential structures in the multi-modal detection model of the transformer; and Step 3, policy optimization and dynamic pruning: generating an enabling mask for each candidate structural unit based on its value, constructing an objective function containing resource and precision penalties, dynamically updating the enabling mask in an iterative optimization closed loop until a convergence condition is met, and pruning the multi-modal detection model of the transformer according to the enabling mask.
2. The method of claim 1, wherein in step 3 the enabling mask is generated by introducing a dual-threshold strategy with random exploration in the middle interval, as follows: [formulas omitted in source]; where the weight score of candidate structural unit k is formed from its importance score and its value ratio, both obtained from the value evaluation in step 2, and from the historically observed precision loss incurred when the unit is clipped, using an importance weight, a value-ratio weight and a precision-loss weight respectively; the mask value of candidate structural unit k is determined using a retention threshold, a pruning threshold and an intermediate threshold; the retention probability of candidate structural unit k in the middle interval is given by a Sigmoid activation function with a temperature coefficient that controls the steepness of the probability distribution; a mask value of 1 means that candidate structural unit k is definitely enabled; a mask value of 0 means that it is definitely disabled and clipped; and in the middle interval the mask value is a 0 or 1 generated by random sampling from a Bernoulli distribution function.
3. The dynamic pruning method of claim 2, wherein the objective function containing the resource and precision penalties in step 3 is a total objective function over M, the set of mask values of all candidate structural units, and satisfies: [formula omitted in source]; where T(M) and P(M) are the estimated total latency and power consumption, respectively; and the remaining terms are the cross-entropy loss under the mask set M, the latency-constraint penalty coefficient, the maximum allowable inference latency, the power-constraint penalty coefficient, the upper limit of power consumption, the ReLU function, the precision-loss weight coefficient, and the mask value of candidate structural unit k.
4. The dynamic pruning method of claim 2, further comprising, after step 3, a step 4 of adaptively adjusting the thresholds and/or the importance scores to implement a continuously evolving feedback closed-loop mechanism, specifically: [formulas omitted in source]; where the quantities involved are the prediction result of the complete transformer multi-modal detection model, the inference output of the current transformer multi-modal detection model under the enabling mask, a learning rate, the smoothing coefficient of an exponential moving average and an associated parameter, the resource utilization, the target resource utilization, an adaptive coefficient, and an indicator function denoting that candidate structural unit k is clipped.
5. The method of claim 1, wherein the value of a candidate structural unit in step 2 comprises the value ratio of its importance score to the resource cost it introduces, specifically: Step 2.1, for each candidate structural unit k, calculating an initial importance score: [formula omitted in source]; where the quantities involved are the output activation of candidate structural unit k and its element at position index u, a proxy loss gradient, the magnitude of attention entropy reduction that candidate structural unit k contributes to the attention computation of subsequent layers, a normalization factor, the weights of candidate structural unit k, a gradient sensitivity weight coefficient, a sparsity regularization weight coefficient, an information gain weight coefficient, and the L1 norm; k is the index of the candidate structural unit and u is the position index; Step 2.2, modulating the initial importance score based on quality and risk: [formula omitted in source]; where a monotonic scaling function and a risk modulation index are applied to obtain the modulated importance score from the quality score of the source modality to which candidate structural unit k belongs and the system risk level; Step 2.3, constructing the normalized resource cost and the value ratio: the cost vector of candidate structural unit k is normalized to obtain a composite cost: [formula omitted in source]; and the value ratio is then calculated: [formula omitted in source]; where the normalized weights of the time cost, the power consumption cost and the memory cost are preset hyperparameters satisfying [constraint omitted in source]; a time normalization reference, a power consumption normalization reference and a memory normalization reference are used; the cost components are the inference latency, inference power consumption and memory overhead of candidate structural unit k; and a small positive constant serves as a stabilizing term.
6. The method of claim 1, wherein the quality index of the modal data in step 1 is obtained by weighting some or all of the composite signal-to-noise ratio, data integrity, short-term drift stability and time synchronization deviation to form a raw quality score, and then applying normalization and temperature smoothing to that score.
7. A multi-modal transformer anomaly detection method, characterized by comprising the following steps: acquiring multi-modal data and inputting it into a multi-modal detection model of the transformer to perform anomaly detection; wherein the model structure of the multi-modal detection model of the transformer is determined using the dynamic pruning method of any one of claims 1-6.
8. A system based on the dynamic pruning method according to any one of claims 1-6, characterized by comprising: a quality perception module for multi-modal data quality perception, which encodes each type of modal data of the transformer to obtain an embedded vector representing the modal characteristics and generates a quality index for each type of current modal data; an evaluation module for evaluating the value of each candidate structural unit in the multi-modal detection model of the transformer based on the embedded vector of each modality and the quality index, wherein the candidate structural units are pruning candidates that can be independently enabled or disabled and are non-essential structures in the multi-modal detection model of the transformer; and a pruning module for policy optimization and dynamic pruning, which generates an enabling mask for each candidate structural unit based on its value, constructs an objective function containing resource and precision penalties, dynamically updates the enabling mask in an iterative optimization closed loop until a convergence condition is met, and prunes the multi-modal detection model of the transformer according to the enabling mask.
9. A computer apparatus, comprising: one or more processors; and a memory storing one or more computer programs; wherein the processor invokes a computer program to implement the steps of the dynamic pruning method according to any one of claims 1 to 6 and of the multi-modal transformer anomaly detection method according to claim 7.
10. A computer-readable storage medium storing a computer program, the computer program being invoked by a processor to implement the steps of the dynamic pruning method according to any one of claims 1 to 6 and of the multi-modal transformer anomaly detection method according to claim 7.
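
The dual-threshold mask generation of claim 2 and the penalized objective of claim 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the middle-interval retention probability is assumed to be sigmoid((score - mid_threshold) / temperature), the resource penalties are assumed to be ReLU hinges on latency and power overruns, and the precision-loss term of claim 3 is left out because its form cannot be recovered from the text. All function and parameter names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def enable_mask(scores, keep_thr, prune_thr, mid_thr, temperature=0.1, rng=None):
    """Dual-threshold mask with random exploration in the middle interval.

    score >= keep_thr  -> 1 (definitely enabled)
    score <= prune_thr -> 0 (definitely clipped)
    otherwise          -> Bernoulli sample with an assumed sigmoid probability
    """
    if rng is None:
        rng = random.Random()
    mask = []
    for s in scores:
        if s >= keep_thr:
            mask.append(1)
        elif s <= prune_thr:
            mask.append(0)
        else:
            p_keep = sigmoid((s - mid_thr) / temperature)
            mask.append(1 if rng.random() < p_keep else 0)
    return mask


def relu(x: float) -> float:
    return max(0.0, x)


def penalized_objective(task_loss, latency, power, t_max, p_max,
                        lam_t=1.0, lam_p=1.0):
    """Task loss plus hinge penalties for exceeding latency and power budgets.

    Assumed form; the precision-loss term of the claim is not modeled here.
    """
    return (task_loss
            + lam_t * relu(latency - t_max)
            + lam_p * relu(power - p_max))


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.5]
    mask = enable_mask(scores, keep_thr=0.7, prune_thr=0.3, mid_thr=0.5,
                       rng=random.Random(0))
    print("mask:", mask)
    print("objective:", penalized_objective(0.2, 12.0, 4.0, 10.0, 5.0, 0.1, 0.1))
```

A lower temperature makes the middle-interval decision closer to a hard cut at the intermediate threshold, while a higher temperature spends more samples exploring borderline units.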

Description

Dynamic pruning method and multi-mode transformer anomaly detection method and system

Technical Field

The invention relates to the technical field of power equipment condition monitoring and intelligent diagnosis, and in particular to a dynamic pruning method and a multi-modal transformer anomaly detection method and system.

Background

Instrument transformers (including current transformers, voltage transformers and combined transformers) are the key link between the primary system and the secondary measurement, metering and protection devices of a power system. The accuracy and stability of their output signals directly affect the reliability of relay protection settings, the timeliness and accuracy of fault discrimination, and the fairness of electric energy metering. Once a transformer becomes abnormal, for example through insulation aging, partial discharge, local core saturation, winding loosening, local overheating, poor contact or an inter-turn short circuit, the secondary-side measurement may drift, which can cause protection maloperation or failure to operate and can trigger wider grid faults and outages. Furthermore, the latent evolution of defects increases operating losses, shortens equipment life, and passes erroneous priors to upper-level energy management and condition evaluation systems, amplifying systemic risk. Effective monitoring and fault diagnosis of transformers is therefore necessary.
However, with the growing share of new energy integration and power electronic devices, the volatility and uncertainty of grid operating conditions have increased markedly. Traditional strategies that rely on periodic outage maintenance or offline tests of single parameters can hardly reveal the full-life-cycle health state of a transformer in a timely, continuous and fine-grained manner, so online intelligent anomaly detection is urgently needed. Existing transformer anomaly detection technology has generally evolved from periodic offline testing, to single-parameter online monitoring, to multi-source data fusion and intelligent analysis. The earliest approaches are simple to implement but insufficiently sensitive to multi-physics coupling and early latent faults. Analysis based on mechanisms or equivalent models depends on the accuracy of parameter identification and adapts poorly to complex disturbance environments. Signal processing and time-frequency features (such as wavelets, the short-time Fourier transform and empirical mode decomposition) strengthen the capture of local features, but depend heavily on manual feature engineering and generalize poorly. Data-driven machine learning and deep learning models improve recognition accuracy through automatic feature extraction, yet their robustness remains insufficient under sensor noise, operating-condition drift or missing modalities.
On this basis, multi-modal fusion methods introduce synchronous or quasi-synchronous multi-source operating information from different physical dimensions, and improve complex-fault coverage and discrimination confidence through feature concatenation, attention mechanisms, graph-structure modeling or deep fusion networks; they have become a key direction of research and engineering application in recent years. Although multi-modal detection has advanced in accuracy, coverage and interference immunity over single-modal approaches, it still faces several significant problems in engineering deployment. First, most model structures are fixed in the offline training phase, and full computation is performed on all modalities and all network branches, layers, channels and attention heads during inference; dynamic differences in information contribution across time windows and operating conditions are ignored, resulting in computational redundancy. Second, on-site edge computing resources (embedded processors, industrial PCs, etc.) are limited in computing power, power consumption and latency budget, and can hardly sustain stable low-latency inference for large-scale multi-modal deep models over long periods. Third, electromagnetic interference, sensor aging, time synchronization drift, data jitter and transient packet loss in the actual acquisition link reduce the signal-to-noise ratio, integrity or temporal consistency of some modalities, yet existing fusion strategies treat all modalities as equal in quality by default, so low-quality signals are easily injected directly into the subsequent decision process, causing noise amplification and misjudgment. In addition, common model compression technologies such as pruning, qu