
CN-121033401-B - Robustness improving method and system for target detection in complex power scene


Abstract

The invention discloses a method and a system for improving the robustness of target detection in a complex power scene, relating to the technical field of smart grids. In the embodiments, a purification model and a pre-trained typical target detection model together form a power target detection model, which is trained with enhanced data to which specific interference has been added. During training, the typical target detection model is pre-trained first; its parameters are then fixed, and the adversarial loss it outputs guides the training of the purification model, whose parameters are adjusted accordingly. Combining the purification model with the enhanced data expands the anti-interference boundary of the power target detection model, so that key tasks such as smoke detection and flame identification are judged and decided on cleaner, more trustworthy data, the model focuses accurately on smoke or flame targets, and its robustness under complex interference is improved.
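
The training scheme summarized above (freeze the pre-trained detector, then update only the purification model using the loss measured at the detector's output) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the scalar "images", the linear `detector` with the fixed weight `DETECTOR_WEIGHT`, the two-parameter purifier `w * x + b`, and the squared error standing in for the adversarial loss. The gradient penalty recited in the claims is omitted. It is not the patented implementation.

```python
import random

DETECTOR_WEIGHT = 2.0  # hypothetical frozen detector parameter; never updated


def detector(z):
    """Stand-in for the pre-trained detector: a fixed linear score."""
    return DETECTOR_WEIGHT * z


def train_purifier(samples, steps=300, lr=0.01, noise_std=0.1, seed=0):
    """Fit the purifier p(x) = w*x + b so that the frozen detector scores
    purified noisy inputs like the labels. Only w and b are updated."""
    rng = random.Random(seed)
    w, b = 0.5, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, label in samples:
            noisy = x + rng.gauss(0.0, noise_std)  # interference added to the input
            purified = w * noisy + b               # purifier forward pass
            err = detector(purified) - label       # loss is err**2 at the detector output
            # Chain rule through the frozen detector down to the purifier parameters.
            grad_w += 2.0 * err * DETECTOR_WEIGHT * noisy
            grad_b += 2.0 * err * DETECTOR_WEIGHT
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return w, b


# Toy data: clean scalar inputs whose labels equal the detector's clean score.
clean = [-1.0, -0.5, 0.0, 0.5, 1.0]
samples = [(x, detector(x)) for x in clean]
w, b = train_purifier(samples)
```

On this toy data `w` should settle close to 1 and `b` close to 0, while `DETECTOR_WEIGHT` is never modified, which is the point of the sketch: the detector only supplies the loss signal.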

Inventors

  • CHEN XIAOXIAO
  • ZHU HONGJIANG
  • ZHOU HUIKAI
  • HUANG JIALING
  • CHEN RUIZHI
  • ZHAO YI
  • WEI YUHANG
  • LIU RUOLIN
  • ZHOU PENG
  • HU YUNLONG
  • QI WEIQIANG
  • ZHENG SHIYU
  • QIAN JINGWEI
  • SHEN SIQI
  • ZHANG YEHUA
  • LU XIN

Assignees

  • 国网浙江省电力有限公司信息通信分公司

Dates

Publication Date
2026-05-08
Application Date
2025-10-30

Claims (6)

  1. A method for improving the robustness of target detection in a complex power scene, characterized by comprising the following steps:
     acquiring an image and an image annotation in a preset power scene to form an original scene sample, wherein the image annotation comprises a smoke target annotation and a flame target annotation;
     performing boundary attack and probabilistic diffusion on the original scene sample to obtain an enhanced sample;
     constructing a training sample set from the original scene sample and the enhanced sample;
     constructing a power target detection model comprising a purification model and a pre-trained typical target detection model, wherein the purification model is used for purifying disturbance and noise data of the power target detection model to generate purified data;
     training the purification model with the training sample set, fixing the parameters of the pre-trained typical target detection model during training, and taking the adversarial loss as the loss function of the purification model;
     inputting data of a region to be detected into the trained power target detection model to obtain a smoke target and a flame target of the region to be detected;
     wherein training the purification model with the training sample set, fixing the parameters of the pre-trained typical target detection model during training, and taking the adversarial loss as the loss function of the purification model comprises:
        generating random Gaussian noise and adding it to the training sample set to generate a noise-added training sample;
        randomly transforming the noise-added training sample based on a predefined mask to generate a mask sample;
        when a preset model training termination condition is met, obtaining the power target detection model from the parameters of the purification model and the pre-trained typical target detection model;
        otherwise, fixing the parameters of the pre-trained typical target detection model and iteratively training the parameters of the purification model, taking the loss of the pre-trained typical target detection model as the loss function of the purification model;
     wherein fixing the parameters of the pre-trained typical target detection model and iteratively training the parameters of the purification model, taking the loss of the pre-trained typical target detection model as the loss function of the purification model, comprises:
        fixing the parameters of the pre-trained typical target detection model;
        inputting the mask sample into the purification model, wherein the purification model takes the training sample set as its output target and purifies the mask sample to obtain purified data;
        inputting the purified data into the pre-trained typical target detection model to obtain a detection target result, and obtaining an adversarial loss from the detection target result and the training sample set;
        obtaining the loss function of the purification model from the adversarial loss;
        introducing a gradient regularization term as a gradient penalty, and iteratively adjusting the parameters of the purification model according to the loss function and the gradient penalty;
     wherein performing boundary attack and probabilistic diffusion on the original scene sample to obtain an enhanced sample comprises:
        performing boundary attack on the original scene sample based on a preset intersection-over-union (IoU) index to generate a boundary adversarial sample;
        extracting edge visual features and semantic features of the original scene sample, and performing probabilistic diffusion on the original scene sample according to the edge visual features and the semantic features to generate a mixed sample;
        obtaining the enhanced sample from the boundary adversarial sample and the mixed sample;
     wherein extracting the edge visual features and semantic features of the original scene sample, and performing probabilistic diffusion on the original scene sample according to the edge visual features and the semantic features to generate a mixed sample, comprises:
        extracting the edge visual features of the original scene sample with an edge detection algorithm to generate a visual prior edge map;
        extracting the semantic features of the original scene sample, and constructing text prompts from the semantic features;
        inputting the visual prior edge map and the text prompts into a pre-trained probabilistic diffusion model, and mixing the output of the diffusion model with the original scene sample to obtain an initial mixed sample;
        and calculating the semantic similarity between the initial mixed sample and the text prompts with a CLIP model, and filtering the initial mixed sample according to the semantic similarity to obtain the mixed sample.
  2. The method for improving the robustness of target detection in a complex power scene according to claim 1, wherein performing boundary attack on the original scene sample based on the preset intersection-over-union (IoU) index to generate a boundary adversarial sample comprises: performing boundary attack on the original scene sample to generate an initial boundary adversarial sample whose target boundary is blurred or occluded; calculating the classification loss, IoU loss and disturbance loss of the initial boundary adversarial sample as the boundary attack loss; and screening the initial boundary adversarial sample according to the preset IoU index and the boundary attack loss to obtain the boundary adversarial sample.
  3. The method according to claim 1, wherein the purification model and the pre-trained typical target detection model are both obtained by pre-training on the original scene sample, and the pre-trained typical target detection model comprises an attention module for calculating spatial attention and channel attention.
  4. The method for improving the robustness of target detection in a complex power scene according to claim 1, wherein the pre-trained typical target detection model is a convolutional neural network model with multi-scale bidirectional feature fusion, and during feature fusion a dynamic weight mechanism is used to perform convolution-weighted fusion of features at different scales.
  5. The method for improving the robustness of target detection in a complex power scene according to claim 1, wherein randomly transforming the noise-added training sample based on the predefined mask to generate a mask sample comprises: multiplying the predefined mask with the sample element by element, as a first random transformation algorithm; adding Gaussian noise to the sample and multiplying the predefined mask element by element with the noise-added sample, as a second random transformation algorithm; applying the first random transformation algorithm or the second random transformation algorithm to the same sample a plurality of times, as a third random transformation algorithm; and randomly transforming the noise-added training sample with the first, second or third random transformation algorithm to generate the mask sample.
  6. A system for improving the robustness of target detection in a complex power scene, comprising:
     an original scene sample acquisition module, configured to acquire an image and an image annotation in a preset power scene to form an original scene sample;
     an enhanced sample generation module, configured to perform boundary attack and probabilistic diffusion on the original scene sample to obtain an enhanced sample;
     a training sample set construction module, configured to construct a training sample set from the original scene sample and the enhanced sample;
     a model construction module, configured to construct a power target detection model comprising a purification model and a pre-trained typical target detection model, wherein the purification model is used for purifying disturbance and noise data of the power target detection model to generate purified data, and the pre-trained typical target detection model is used for generating a detection target result and an adversarial loss from the purified data;
     a model training module, configured to train the purification model with the training sample set, fix the parameters of the pre-trained typical target detection model during training, and take the adversarial loss as the loss function of the purification model;
     a model application module, configured to input data of a region to be detected into the trained power target detection model to obtain a smoke target and a flame target of the region to be detected;
     wherein the model training module comprises:
        a noise adding unit, configured to generate random Gaussian noise and add it to the training sample set to generate a noise-added training sample;
        a random masking unit, configured to randomly transform the noise-added training sample based on a predefined mask to generate a mask sample;
        a model parameter determining unit, configured to obtain the power target detection model from the parameters of the purification model and the pre-trained typical target detection model when a preset model training termination condition is met;
        an iterative training unit, configured to fix the parameters of the pre-trained typical target detection model and iteratively train the parameters of the purification model, taking the loss of the pre-trained typical target detection model as the loss function of the purification model;
     wherein the iterative training unit is specifically configured to:
        fix the parameters of the pre-trained typical target detection model;
        input the mask sample into the purification model, wherein the purification model takes the training sample set as its output target and purifies the mask sample to obtain purified data;
        input the purified data into the pre-trained typical target detection model to obtain a detection target result, and obtain an adversarial loss from the detection target result and the training sample set;
        obtain the loss function of the purification model from the adversarial loss;
        introduce a gradient regularization term as a gradient penalty, and iteratively adjust the parameters of the purification model according to the loss function and the gradient penalty;
     wherein the enhanced sample generation module comprises:
        a boundary attack unit, configured to perform boundary attack on the original scene sample based on a preset intersection-over-union (IoU) index to generate a boundary adversarial sample;
        a probabilistic diffusion unit, configured to extract edge visual features and semantic features of the original scene sample, and perform probabilistic diffusion on the original scene sample according to the edge visual features and the semantic features to generate a mixed sample;
        an enhanced sample integration unit, configured to obtain the enhanced sample from the boundary adversarial sample and the mixed sample;
     wherein the probabilistic diffusion unit is specifically configured to:
        extract the edge visual features of the original scene sample with an edge detection algorithm to generate a visual prior edge map;
        extract the semantic features of the original scene sample, and construct text prompts from the semantic features;
        input the visual prior edge map and the text prompts into a pre-trained probabilistic diffusion model, and mix the output of the diffusion model with the original scene sample to obtain an initial mixed sample;
        and calculate the semantic similarity between the initial mixed sample and the text prompts with a CLIP model, and filter the initial mixed sample according to the semantic similarity to obtain the mixed sample.
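
The three random transformations recited in claim 5 (mask only, noise then mask, and repeated application of either) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the sample is a small nested list of floats, the mask is binary, and the names `apply_mask`, `noise_then_mask`, `repeated_transform` and `random_transform`, the noise level `std` and the number of `rounds` are all hypothetical. In the claimed method the input would be a noise-added training sample rather than the clean toy image used here.

```python
import random


def apply_mask(sample, mask):
    """First transformation: element-wise product of mask and sample."""
    return [[m * v for m, v in zip(m_row, s_row)]
            for m_row, s_row in zip(mask, sample)]


def noise_then_mask(sample, mask, rng, std=0.1):
    """Second transformation: add Gaussian noise, then apply the mask."""
    noisy = [[v + rng.gauss(0.0, std) for v in row] for row in sample]
    return apply_mask(noisy, mask)


def repeated_transform(sample, mask, rng, rounds=3, std=0.1):
    """Third transformation: apply the first or second one several times."""
    out = sample
    for _ in range(rounds):
        if rng.random() < 0.5:
            out = apply_mask(out, mask)
        else:
            out = noise_then_mask(out, mask, rng, std)
    return out


def random_transform(sample, mask, rng):
    """Pick one of the three transformations at random to build a mask sample."""
    choice = rng.choice(("first", "second", "third"))
    if choice == "first":
        return apply_mask(sample, mask)
    if choice == "second":
        return noise_then_mask(sample, mask, rng)
    return repeated_transform(sample, mask, rng)


rng = random.Random(0)
image = [[1.0, 2.0], [3.0, 4.0]]  # toy 2x2 "image"
mask = [[1, 0], [0, 1]]           # hypothetical binary mask: 0 hides a pixel
mask_sample = random_transform(image, mask, rng)
```

Whichever transformation is drawn, positions where the mask is 0 come out as zero, so the purification model has to reconstruct them from the surrounding content.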

Description

Robustness improving method and system for target detection in complex power scene

Technical Field

The invention relates to the technical field of smart grids, and in particular to a method and a system for improving the robustness of target detection in a complex power scene.

Background

With the development and digital transformation of the smart grid, large-scale power business systems are gradually being connected to artificial intelligence model services to improve operation and maintenance efficiency and safety. In key application scenarios such as identifying smoke hazards on power transmission lines and detecting fire hazards in substation equipment, existing artificial intelligence models mostly apply general-purpose image enhancement in a branch of the target detection model to improve image quality. However, natural environment conditions are complex and changeable, and image enhancement alone can hardly cope with the specific interference encountered in actual operation. The target detection models currently used in power grids are therefore insufficiently robust in complex power scenes, which hinders meeting the power system's requirements for high reliability and real-time performance.

Disclosure of Invention

The embodiments of the invention aim to provide a method and a system for improving the robustness of target detection in a complex power scene, so that key tasks such as smoke detection and flame identification are judged and decided on cleaner, more trustworthy data, the model focuses accurately on smoke or flame targets, and its robustness under complex interference is improved.
In a first aspect, an embodiment of the present invention provides a method for improving the robustness of target detection in a complex power scene, comprising:
  acquiring an image and an image annotation in a preset power scene to form an original scene sample, wherein the image annotation comprises a smoke target annotation and a flame target annotation;
  performing boundary attack and probabilistic diffusion on the original scene sample to obtain an enhanced sample;
  constructing a training sample set from the original scene sample and the enhanced sample;
  constructing a power target detection model comprising a purification model and a pre-trained typical target detection model, wherein the purification model is used for purifying disturbance and noise data of the power target detection model to generate purified data;
  training the purification model with the training sample set, fixing the parameters of the pre-trained typical target detection model during training, and taking the adversarial loss as the loss function of the purification model;
  and inputting data of a region to be detected into the trained power target detection model to obtain a smoke target and a flame target of the region to be detected.
As an improvement of the above solution, performing boundary attack and probabilistic diffusion on the original scene sample to obtain an enhanced sample comprises:
  performing boundary attack on the original scene sample based on a preset intersection-over-union (IoU) index to generate a boundary adversarial sample;
  extracting edge visual features and semantic features of the original scene sample, and performing probabilistic diffusion on the original scene sample according to the edge visual features and the semantic features to generate a mixed sample;
  and obtaining the enhanced sample from the boundary adversarial sample and the mixed sample.
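
The preset IoU index used by the boundary attack can be made concrete with a short sketch. The `iou` function is the standard intersection-over-union of two axis-aligned boxes. The screening rule in `screen_candidates` is only one plausible reading of claim 2 and is an assumption: the text does not state the comparison direction or how the three losses are combined, so the unweighted sum, the thresholds `iou_index` and `min_attack_loss`, and the dictionary field names are all hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def screen_candidates(candidates, iou_index=0.5, min_attack_loss=1.0):
    """Keep candidates whose predicted box drifts away from the ground truth
    (IoU below the preset index) and whose attack loss is large enough.
    Both criteria and their directions are assumptions, not the patent's rule."""
    kept = []
    for cand in candidates:
        # Assumed combination: unweighted sum of the three losses named in claim 2.
        attack_loss = cand["cls_loss"] + cand["iou_loss"] + cand["disturbance_loss"]
        overlap = iou(cand["pred_box"], cand["gt_box"])
        if overlap < iou_index and attack_loss >= min_attack_loss:
            kept.append(cand)
    return kept


# Two hypothetical initial boundary adversarial samples.
candidates = [
    {"name": "a", "gt_box": (0, 0, 2, 2), "pred_box": (1, 1, 3, 3),
     "cls_loss": 0.8, "iou_loss": 0.9, "disturbance_loss": 0.1},
    {"name": "b", "gt_box": (0, 0, 2, 2), "pred_box": (0, 0, 2, 2),
     "cls_loss": 0.1, "iou_loss": 0.0, "disturbance_loss": 0.1},
]
kept = screen_candidates(candidates)
```

Under these assumed thresholds only candidate `a` survives: its predicted box overlaps the ground truth with an IoU of 1/7 and its summed loss is 1.8, whereas candidate `b` leaves the detection unchanged.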
As an improvement of the above solution, performing boundary attack on the original scene sample based on the preset intersection-over-union (IoU) index to generate a boundary adversarial sample comprises:
  performing boundary attack on the original scene sample to generate an initial boundary adversarial sample whose target boundary is blurred or occluded;
  calculating the classification loss, IoU loss and disturbance loss of the initial boundary adversarial sample as the boundary attack loss;
  and screening the initial boundary adversarial sample according to the preset IoU index and the boundary attack loss to obtain the boundary adversarial sample.
As an improvement of the above solution, extracting the edge visual features and semantic features of the original scene sample, and performing probabilistic diffusion on the original scene sample according to the edge visual features and the semantic features to generate a mixed sample, comprises:
  extracting the edge visual features of the original scene sample with an edge detection algorithm to generate a visual prior edge map;
  extracting the semantic features of the original scene sample, and constructing text prompts from the semantic features;
  inputting the visual prior edge map and the text prompts into a pre-trained probabilistic diffusion model, and