
CN-122023771-A - Target detection method, device, equipment, storage medium and product

CN122023771A

Abstract

The invention discloses a target detection method, device, equipment, storage medium and product. The method comprises: acquiring an image to be detected, and determining a target detection result corresponding to the image to be detected by subjecting the image to lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection. Lossless downsampling reduces the information loss caused by conventional downsampling operations and, particularly for small targets occupying only a few pixels, preserves to the greatest extent the fine-grained information critical to small-target recognition. Bidirectional feature weighted fusion achieves more effective multi-scale feature fusion through an adaptive weight mechanism and bidirectional cross-scale connection paths, enhancing the discrimination precision and perception sensitivity of target features under complex background noise interference. Multi-scale target detection improves the accuracy and reliability of locating and recognizing tiny targets.
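The patent discloses no code, but the "lossless" property of the downsampling step can be illustrated with a minimal NumPy sketch. Assuming a space-to-depth rearrangement (consistent with the pixel-rearrangement language of claim 3, though the exact mechanism is not specified here), every pixel is moved into the channel dimension rather than discarded, unlike strided subsampling:

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange spatial blocks into channels: (C, H, W) -> (C*scale^2, H/scale, W/scale).
    Every input pixel survives the rearrangement, so no information is discarded."""
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(0, 2, 4, 1, 3)                # (C, s, s, H/s, W/s)
    return x.reshape(c * scale * scale, h // scale, w // scale)

img = np.arange(2 * 8 * 8, dtype=np.float32).reshape(2, 8, 8)
down = space_to_depth(img, scale=2)

print(down.shape)              # (8, 4, 4): half the resolution, 4x the channels
print(down.size == img.size)   # True: all pixel values are preserved

# Contrast with strided subsampling, which throws 3 of every 4 pixels away:
strided = img[:, ::2, ::2]
print(strided.size / img.size)  # 0.25
```

The rearranged map has the same number of elements as the input, which is the sense in which the downsampling loses nothing that a subsequent convolution could have used.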

Inventors

  • CUI YANDONG
  • WANG YIDING
  • YOU BO
  • HUANG YANQING
  • GAO SHENGDA
  • WU DI
  • SUN YUEXIN

Assignees

  • State Grid Jiangsu Electric Power Co., Ltd. Xuzhou Power Supply Branch (国网江苏省电力有限公司徐州供电分公司)
  • Jiangsu Xudian Construction Group Co., Ltd. (江苏徐电建设集团有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-01-28

Claims (10)

  1. A target detection method, comprising: acquiring an image to be detected; and determining a target detection result corresponding to the image to be detected based on processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection on the image to be detected.
  2. The method according to claim 1, wherein determining the target detection result corresponding to the image to be detected based on the processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection comprises: performing feature extraction on the image to be detected by a dimension conversion module and a non-stride convolution module to obtain multi-scale first feature maps; performing feature extraction on each first feature map by a semantic feature extraction module and a spatial position feature extraction module to obtain a corresponding semantic feature map and spatial position feature map, and determining a second feature map for each scale from the semantic feature maps and spatial position feature maps through a weighted fusion mechanism; and performing target detection on the second feature map of the corresponding scale with each detection head, and determining the target detection result from the prediction results obtained by detection.
  3. The method according to claim 2, wherein performing feature extraction on the image to be detected by the dimension conversion module and the non-stride convolution module to obtain the multi-scale first feature maps comprises: preprocessing the image to be detected to obtain an initial feature map; for the dimension conversion module and non-stride convolution module at the first level, dividing the initial feature map along the spatial dimensions by the dimension conversion module, rearranging the pixels of each divided sub-feature map along the channel dimension according to a scale factor to generate a rearranged feature map for the first level, and processing the rearranged feature map with the non-stride convolution module to obtain the first feature map of the first level; and for the dimension conversion module and non-stride convolution module at each level other than the first, dividing the first feature map output by the previous level along the spatial dimensions by the dimension conversion module, rearranging the pixels of each divided sub-feature map along the channel dimension according to the scale factor to generate a rearranged feature map for that level, and processing the rearranged feature map with the non-stride convolution module to obtain the first feature map of that level.
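Claim 3's per-level procedure can be sketched in NumPy. The sketch is an assumption about the modules' internals: the dimension conversion module is modeled as a space-to-depth rearrangement with scale factor 2, and the non-stride convolution module as a stride-1 1x1 convolution (a channel-mixing matmul) with random illustrative weights; the channel widths and level count are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def space_to_depth(x, scale=2):
    """Dimension conversion: divide spatially, rearrange pixels into channels."""
    c, h, w = x.shape
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * scale**2, h // scale, w // scale)

def non_stride_conv(x, out_ch):
    """Stride-1 1x1 convolution as a channel-mixing matmul; random weights
    stand in for the learned non-stride convolution module."""
    w = rng.standard_normal((out_ch, x.shape[0])) * 0.1
    return np.einsum('oc,chw->ohw', w, x)

def backbone(x, levels=3, width=16):
    """Hypothetical first-feature-map pyramid: each level halves the resolution
    losslessly via space-to-depth, then mixes the rearranged channels with a
    non-strided convolution, as in claim 3."""
    feats = []
    for _ in range(levels):
        x = non_stride_conv(space_to_depth(x, 2), width)
        feats.append(x)
        width *= 2
    return feats

img = rng.standard_normal((3, 64, 64))   # stands in for the preprocessed initial feature map
pyramid = backbone(img)
print([f.shape for f in pyramid])        # [(16, 32, 32), (32, 16, 16), (64, 8, 8)]
```

Each level consumes the first feature map output by the previous level, so resolution halves and channel width grows down the pyramid without any strided operation discarding pixels.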
  4. The method according to claim 2, wherein performing feature extraction on each first feature map by the semantic feature extraction module and the spatial position feature extraction module to obtain the corresponding semantic feature map and spatial position feature map, and determining the second feature map for each scale from the semantic feature maps and spatial position feature maps through the weighted fusion mechanism, comprises: performing feature extraction on the first feature map of each level by the semantic feature extraction module and the spatial position feature extraction module assigned to that level, to obtain the semantic feature map and spatial position feature map of each level; for the first level, acquiring the fused semantic feature map of the next adjacent level, and performing weighted fusion of it with the semantic feature map, spatial position feature map and first feature map of the first level to obtain the second feature map of the first level, wherein a fused semantic feature map is determined from the semantic feature map of a level and the fused semantic feature map of its next adjacent level; for the last level, acquiring the fused spatial position feature map of the previous adjacent level, and performing weighted fusion of it with the semantic feature map and spatial position feature map of the last level to obtain the second feature map of the last level, wherein a fused spatial position feature map is determined from the spatial position feature map of a level and the fused spatial position feature map of its previous adjacent level; and for each level other than the first and the last, acquiring the fused semantic feature map of the next adjacent level and the fused spatial position feature map of the previous adjacent level, and performing weighted fusion of them with the semantic feature map and spatial position feature map of that level to obtain the second feature map of that level.
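The weighted fusion mechanism of claims 4 and 5 can be illustrated with a BiFPN-style fast normalized fusion, which is an assumption: the patent specifies an adaptive weight mechanism but not its exact form. All shapes, weights and feature maps below are hypothetical stand-ins:

```python
import numpy as np

def weighted_fuse(feats, weights, eps=1e-4):
    """Adaptive weighted fusion: each input feature map gets a learnable
    non-negative weight; weights are normalized to sum to ~1 before summing
    (BiFPN-style fast normalized fusion, assumed here)."""
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, feats))

# Hypothetical middle level: fuse the level's own semantic and spatial-position
# maps with the fused semantic map from the next (coarser) level and the fused
# spatial-position map from the previous (finer) level, all resampled to one shape.
rng = np.random.default_rng(1)
shape = (32, 16, 16)
semantic, spatial, top_down, bottom_up = (rng.standard_normal(shape) for _ in range(4))

fused = weighted_fuse([semantic, spatial, top_down, bottom_up], [1.0, 1.0, 0.5, 0.5])
print(fused.shape)   # (32, 16, 16)

# Claim 5's cross-node residual connection can be sketched as adding the
# original input node's first feature map directly to the output node:
first_feature = rng.standard_normal(shape)
output_node = fused + first_feature
```

The normalized weights keep the fused map on the same scale regardless of how many paths feed a node, which is what lets the bidirectional top-down and bottom-up paths be combined stably.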
  5. The method according to claim 4, wherein each first feature map is input to the corresponding semantic feature extraction module and spatial position feature extraction module through an original input node arranged at the corresponding level, and each second feature map is determined at an output node and output to the detection head arranged at the corresponding level; and for the original input node and the output node at the same level, a direct communication path is established through a cross-node residual connection.
  6. The method according to any one of claims 1 to 5, wherein determining the target detection result corresponding to the image to be detected based on the processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection is performed by a target detection model, and the training of the target detection model comprises: acquiring an initial detection model and a sample training set, wherein the sample training set comprises at least one sample spacer image as input and a corresponding spacer labeling result as its label; inputting the sample spacer image into the initial detection model to obtain a current detection result output by the initial detection model; extracting ground-truth anchor box information from the spacer labeling result and predicted anchor box information from the current detection result, and determining a first loss function value of a bounding box regression loss function from the ground-truth and predicted anchor box information; determining a second loss function value of a classification loss function from the category information in the spacer labeling result and the category information in the current detection result, and determining a target loss function value from the first loss function value and the second loss function value; adjusting the network parameters of the initial detection model by back-propagation according to the target loss function value to obtain an adjusted initial detection model, and returning to the inputting operation with the sample spacer image until a training end condition is reached; and determining the initial detection model obtained when training ends as the target detection model.
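Claim 6 names a bounding box regression loss and a classification loss without specifying them. As an illustrative assumption, the sketch below uses 1 - IoU for regression and binary cross-entropy for classification, combined into the target loss with hypothetical weights:

```python
import numpy as np

def iou_loss(pred_box, true_box):
    """Bounding-box regression loss as 1 - IoU over (x1, y1, x2, y2) boxes
    (an assumed stand-in for the patent's unspecified regression loss)."""
    ix1, iy1 = max(pred_box[0], true_box[0]), max(pred_box[1], true_box[1])
    ix2, iy2 = min(pred_box[2], true_box[2]), min(pred_box[3], true_box[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(pred_box) + area(true_box) - inter
    return 1.0 - inter / union

def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy on a predicted probability p against label y
    (an assumed stand-in for the patent's unspecified classification loss)."""
    p = min(max(p, eps), 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def target_loss(pred_box, true_box, p, y, w_box=1.0, w_cls=1.0):
    """Target loss value = weighted sum of the first (regression) and second
    (classification) loss function values, as in claim 6; weights are hypothetical."""
    return w_box * iou_loss(pred_box, true_box) + w_cls * bce_loss(p, y)

# Predicted anchor box vs ground-truth anchor box, plus a class probability:
loss = target_loss((10, 10, 50, 50), (12, 8, 48, 52), p=0.9, y=1.0)
print(round(loss, 4))
```

In training, this scalar would be back-propagated to adjust the network parameters, and the loop repeated until the training end condition (e.g. a fixed epoch count or converged loss) is reached.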
  7. A target detection apparatus, comprising: an acquisition module configured to acquire an image to be detected; and a detection module configured to determine a target detection result corresponding to the image to be detected based on processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection on the image to be detected.
  8. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the target detection method of any one of claims 1 to 6.
  9. A computer-readable storage medium storing computer instructions for causing a processor to perform the target detection method of any one of claims 1 to 6.
  10. A computer program product comprising a computer program which, when executed by a processor, implements the target detection method of any one of claims 1 to 6.

Description

Target detection method, device, equipment, storage medium and product

Technical Field

The present invention relates to the field of image processing technologies, and in particular to a target detection method, apparatus, device, storage medium and product.

Background

As an important component of the electrical power system, the operational state of the transmission line is critical to the reliability and safety of power delivery. The spacer is a key component of the transmission line that maintains proper spacing between wires to prevent wire collisions or contact. As the power network expands, monitoring of spacers becomes particularly important, because faults such as loosening, sliding or falling off can occur with long-term use and threaten the stable operation of the power system.

In the field of power inspection, traditional target detection methods rely on manually designed feature extraction, such as the scale-invariant feature transform and histograms of oriented gradients, but these methods are of limited effectiveness when facing complex backgrounds, illumination variation, scale variation and viewpoint diversity. With the rise of deep learning, target detection methods based on convolutional neural networks have markedly improved detection precision and speed and have become the mainstream of research. Although existing target detection methods perform well in a variety of applications, many challenges remain in power inspection, particularly for the detection of small objects such as spacers. A spacer is small and simple in shape, occupies only a few pixels in a high-resolution aerial image, and is easily lost to the successive downsampling operations applied as features pass through a deep network.
In addition, complex backgrounds, occlusion by multiple wires and unstable illumination further increase the detection difficulty, so existing target detection methods suffer from a high miss rate and inaccurate positioning when detecting power line spacers.

Disclosure of Invention

The invention provides a target detection method, device, equipment, storage medium and product to address the high miss rate and inaccurate positioning of existing target detection methods when detecting power line spacers.

In a first aspect, an embodiment of the present invention provides a target detection method, including: acquiring an image to be detected; and determining a target detection result corresponding to the image to be detected based on processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection on the image to be detected.

In a second aspect, an embodiment of the present invention provides a target detection apparatus, including: an acquisition module for acquiring the image to be detected; and a detection module for determining a target detection result corresponding to the image to be detected based on processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection on the image to be detected.

In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the target detection method according to any one of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing computer instructions configured to cause a processor to execute the target detection method according to any one of the embodiments of the present invention.

In a fifth aspect, embodiments of the present invention also provide a computer program product comprising a computer program which, when executed by a processor, implements the target detection method according to any one of the embodiments of the present invention.

According to the technical scheme, the target detection result corresponding to the image to be detected is determined based on processing of lossless downsampling, bidirectional feature weighted fusion and multi-scale target detection on the image to be detected. Lossless downsampling reduces the information loss caused by conventional downsampling operations and, particularly for small targets occupying only a few pixels, preserves to the greatest extent the fine-grained information critical to small-target recognition; bidirectional feature weighted fusion achieves more effective multi-scale feature fusion through an adaptive weight mechanism and bidirectional cross-scale connection paths, enhancing the discrimination precision and perception sensitivity of target features under complex background noise interference; and multi-scale target detection improves the accuracy and reliability of locating and recognizing tiny targets.