CN-121982594-A - Target identification and accurate photographing method and system for transmission tower inspection
Abstract
The invention relates to the technical field of electric power inspection, and in particular to a target identification and accurate photographing method and system for transmission tower inspection. The method comprises: acquiring a camera preview frame and extracting multi-scale fusion features; extracting context features through longitudinal and transverse strip pooling; generating a tower-body main structure heat map, a cross arm structure heat map and a wire-line region heat map; gate-weighting the features according to the structure heat maps to obtain structure-guided features; performing target detection to output detection boxes and confidences for six types of targets; correcting each confidence using the response value sampled at the center point of the detection box on the corresponding structure heat map; selecting a key shooting target according to the corrected confidences; generating a gimbal attitude adjustment instruction and zoom parameters; and triggering shooting when multiple consecutive frames satisfy the centering, stability and sharpness conditions. The invention realizes airborne real-time detection and closed-loop control of accurate photographing, and improves the quality and efficiency of inspection photos.
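The strip pooling and gate weighting named in the abstract can be illustrated with a minimal NumPy sketch. The record does not specify the pooling operator, the number of strips, how the context features are fused, or how the gating weight map is derived from the heat maps. Average pooling, additive fusion, and a weight of `1 + alpha * max(heat maps)` are therefore assumptions made only for illustration, and the names `strip_pool`, `fuse_context`, `gate`, `n_strips` and `alpha` are hypothetical.

```python
import numpy as np


def strip_pool(feat, n_strips, axis):
    """Average-pool a (C, H, W) feature map inside n_strips bands.

    axis=2 splits the width into column bands (each band spans the full
    height); axis=1 splits the height into row bands. Each band's mean is
    broadcast back so the output keeps the input shape.
    Assumes n_strips <= feat.shape[axis].
    """
    feat = np.asarray(feat, dtype=float)
    out = np.empty_like(feat)
    edges = np.linspace(0, feat.shape[axis], n_strips + 1).astype(int)
    for lo, hi in zip(edges[:-1], edges[1:]):
        index = [slice(None)] * 3
        index[axis] = slice(lo, hi)
        index = tuple(index)
        # One mean per channel over the whole band.
        out[index] = feat[index].mean(axis=(1, 2), keepdims=True)
    return out


def fuse_context(feat, n_strips):
    """Assumed additive fusion of the original features with both contexts."""
    feat = np.asarray(feat, dtype=float)
    return feat + strip_pool(feat, n_strips, axis=2) + strip_pool(feat, n_strips, axis=1)


def gate(fused, heatmaps, alpha=1.0):
    """Element-wise gate weighting of (C, H, W) features.

    heatmaps has shape (3, H, W) with values in [0, 1]: tower body,
    cross arm, wire-line region. The weight formula is an assumption.
    """
    weight = 1.0 + alpha * np.asarray(heatmaps, dtype=float).max(axis=0)
    return fused * weight[None, :, :]
```

With this choice of weight, locations where every heat map is zero pass through unchanged, and locations with a strong structural response are amplified by at most `1 + alpha`.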
Inventors
- LI FAN
- TONG CHAO
- GAO YANLING
- MEI YUCONG
- YANG YUHANG
- LI CHANGDONG
Assignees
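- Electric Power Research Institute of State Grid Jiangxi Electric Power Co., Ltd. (国网江西省电力有限公司电力科学研究院)
- National Standards Technical Review Center of the State Administration for Market Regulation (国家市场监督管理总局国家标准技术审评中心)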
Dates
- Publication Date
- 20260505
- Application Date
- 20260407
Claims (10)
- 1. A target identification and accurate photographing method for transmission tower inspection, characterized by comprising the following steps: S100, acquiring a camera preview frame during unmanned aerial vehicle inspection, and extracting multi-scale fusion features; S200, performing longitudinal strip pooling and transverse strip pooling on the fusion features respectively to extract longitudinal context features and transverse context features, fusing the longitudinal and transverse context features with the original fusion features, generating, based on the fused features, multi-class structure heat maps comprising a tower-body main structure heat map, a cross arm structure heat map and a wire-line region heat map, and gate-weighting the fused features according to the multi-class structure heat maps to obtain structure-guided features; S300, performing target detection based on the structure-guided features, and outputting detection boxes, categories and confidences for targets comprising the whole tower, cross arms, insulator strings, hardware fitting connection areas, wire clamps and wires; S400, sampling the response value of the center point of each detection box on the structure heat map corresponding to the category of that detection box, and performing a weighted calculation on the confidence and the response value to obtain a structure-corrected confidence; S500, selecting a key shooting target from the detection results according to the structure-corrected confidence, generating a gimbal attitude adjustment instruction according to the offset between the center of the target detection box and the center of the image, and adjusting a camera zoom parameter according to the ratio of the area of the target detection box to the area of the image; and S600, controlling the camera to take a high-resolution photo when multiple consecutive frames simultaneously satisfy a target centering condition, a detection stability condition and an image sharpness condition.
- 2. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S200, the longitudinal strip pooling and the transverse strip pooling comprise: dividing the fusion features into a plurality of longitudinal strips along the longitudinal direction of the image, and pooling the features within each longitudinal strip to obtain the longitudinal context features; and dividing the fusion features into a plurality of transverse strips along the transverse direction of the image, and pooling the features within each transverse strip to obtain the transverse context features.
- 3. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S200, the gate weighting comprises: generating a gating weight map according to the tower-body main structure heat map, the cross arm structure heat map and the wire-line region heat map; and multiplying the gating weight map element by element with the fused features to obtain the structure-guided features.
- 4. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S400, the structure heat map corresponding to each category is as follows: the cross arm category corresponds to the cross arm structure heat map; the wire category corresponds to the wire-line region heat map; the insulator string, hardware fitting connection area and wire clamp categories correspond to the tower-body main structure heat map, the cross arm structure heat map or the wire-line region heat map; and the whole tower category corresponds to the tower-body main structure heat map.
- 5. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S400, the weighted calculation comprises: multiplying the response value by a preset weight coefficient and then adding 1 to obtain a correction factor; and multiplying the confidence by the correction factor to obtain the structure-corrected confidence.
- 6. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S500, selecting the key shooting target comprises: following the priority order of insulator string, wire clamp, hardware fitting connection area, cross arm, whole tower and wire, selecting from the detection results the target with the highest structure-corrected confidence as the key shooting target.
- 7. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein in S600: the target centering condition is that the offset between the center of the target detection box and the center of the image is smaller than a preset centering threshold; the detection stability condition is that the variation in center position and the variation in area of the target detection box across the multiple consecutive frames are both smaller than a preset stability threshold; and the image sharpness condition is that the image gradient magnitude of the target region is larger than a preset sharpness threshold.
- 8. The target identification and accurate photographing method for transmission tower inspection according to claim 1, further comprising, when training the model used for target detection in S300, generating supervision signals for the structure heat maps as follows: generating the supervision signal of the tower-body main structure heat map according to the center line of the annotated whole-tower detection box; generating the supervision signal of the cross arm structure heat map according to the length direction of the annotated cross arm detection box; and generating the supervision signal of the wire-line region heat map according to the region of the annotated wire detection box.
- 9. The target identification and accurate photographing method for transmission tower inspection according to claim 1, wherein the model used for target detection in S300 is trained with structural consistency training constraints, the structural consistency training constraints comprising: a category-structure matching loss, calculated from the difference between a first preset threshold and the response value when the response value of the center point of a predicted detection box on the corresponding structure heat map is lower than the first preset threshold; a loss calculated from the distance between the center point and the region boundary when the center point of a predicted detection box of the cross arm, insulator string, hardware fitting connection area or wire clamp category falls outside the expanded region of the predicted whole-tower detection box; and a component topological relation loss, calculated when a normalized distance, obtained from the distance between the predicted detection boxes of a preset key component pair and the area of the whole-tower detection box, exceeds a second preset threshold; wherein the preset key component pairs comprise at least one of: insulator string and hardware fitting connection area, wire clamp and wire, and insulator string and cross arm.
- 10. A target identification and accurate photographing system for transmission tower inspection, wherein the system is configured to perform the target identification and accurate photographing method for transmission tower inspection according to any one of claims 1 to 9, the system comprising: an image acquisition module, used for acquiring a camera preview frame during unmanned aerial vehicle inspection; a feature extraction module, used for extracting multi-scale fusion features from the preview frame; a tower structure prior module, used for performing longitudinal strip pooling and transverse strip pooling on the fusion features respectively to extract longitudinal context features and transverse context features, fusing the longitudinal and transverse context features with the original fusion features, generating, based on the fused features, multi-class structure heat maps comprising a tower-body main structure heat map, a cross arm structure heat map and a wire-line region heat map, and gate-weighting the fused features according to the multi-class structure heat maps to obtain structure-guided features; a target detection module, used for performing target detection based on the structure-guided features and outputting detection boxes, categories and confidences for targets comprising the whole tower, cross arms, insulator strings, hardware fitting connection areas, wire clamps and wires; a confidence correction module, used for sampling the response value of the center point of each detection box on the structure heat map corresponding to the category of that detection box, and performing a weighted calculation on the confidence and the response value to obtain a structure-corrected confidence; and a photographing control module, used for selecting a key shooting target from the detection results according to the structure-corrected confidence, generating a gimbal attitude adjustment instruction according to the offset between the center of the target detection box and the center of the image, adjusting a camera zoom parameter according to the ratio of the area of the target detection box to the area of the image, and controlling the camera to take a high-resolution photo when multiple consecutive frames simultaneously satisfy the target centering condition, the detection stability condition and the image sharpness condition.
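The confidence correction of claim 5, the priority-based selection of claim 6 and the capture conditions of claim 7 can be sketched in plain Python as follows. This is an illustrative reading, not the patented implementation. The claims give no numeric values, so every threshold and the weight coefficient below are placeholder assumptions. Claim 6 is read here as "take the highest-priority category that has any detection, then the highest corrected confidence within it", which is one possible interpretation. The `Detection` type, the category strings, the normalized coordinates and all function names are hypothetical.

```python
from dataclasses import dataclass
from math import hypot

# Priority order listed in claim 6 (category names are illustrative).
PRIORITY = ["insulator_string", "wire_clamp", "fitting_connection_area",
            "cross_arm", "whole_tower", "wire"]


@dataclass
class Detection:
    category: str
    cx: float          # box center x, normalized to [0, 1]
    cy: float          # box center y, normalized to [0, 1]
    area: float        # box area as a fraction of the image area
    confidence: float  # raw detector confidence
    response: float    # heat-map response sampled at (cx, cy)


def corrected_confidence(det, weight=0.5):
    """Claim 5: confidence * (1 + weight * response). weight is assumed."""
    return det.confidence * (1.0 + weight * det.response)


def select_key_target(detections, weight=0.5):
    """Assumed reading of claim 6: best detection in the first non-empty category."""
    for category in PRIORITY:
        candidates = [d for d in detections if d.category == category]
        if candidates:
            return max(candidates, key=lambda d: corrected_confidence(d, weight))
    return None


def should_capture(track, sharpness, n_frames=5,
                   center_thr=0.05, stable_thr=0.02, sharp_thr=10.0):
    """Claim 7 conditions checked over the last n_frames preview frames.

    track: list of (cx, cy, area) for the key target, newest last.
    sharpness: list of gradient magnitudes of the target region per frame.
    """
    if len(track) < n_frames or len(sharpness) < n_frames:
        return False
    recent = track[-n_frames:]
    centered = all(hypot(cx - 0.5, cy - 0.5) < center_thr for cx, cy, _ in recent)
    stable = all(
        hypot(b[0] - a[0], b[1] - a[1]) < stable_thr and abs(b[2] - a[2]) < stable_thr
        for a, b in zip(recent, recent[1:])
    )
    sharp = all(s > sharp_thr for s in sharpness[-n_frames:])
    return centered and stable and sharp
```

Because the correction factor is `1 + weight * response`, a detection whose center lies on a region with zero structural response keeps its raw confidence, while a well-supported detection is boosted; the correction never lowers a confidence below its raw value.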
Description
Target identification and accurate photographing method and system for transmission tower inspection
Technical Field
The invention relates to the technical field of electric power inspection, and in particular to a target identification and accurate photographing method and system for transmission tower inspection.
Background
With the development of unmanned aerial vehicle (UAV) technology, UAV inspection has been widely used for inspecting transmission lines, towers and power equipment. In the prior art, images are acquired by an onboard camera and targets are identified using deep learning, which lays a foundation for automated and intelligent UAV inspection. However, the prior art has significant shortcomings. General-purpose detection models lack explicit modeling of the vertical and horizontal structure of the tower and of the linear character of the wires, so false detections and missed detections are prominent against complex backgrounds. With limited onboard computing power, it is difficult to achieve both real-time performance and accuracy, and the accuracy of lightweight models is unstable. Without structural consistency constraints, the detection results are difficult to use directly for closed-loop control. The inference stage relies on complex rules, which makes parameter tuning difficult and maintenance costly. Target identification and photographing are separate processes, so composition and focus cannot be adjusted dynamically according to the identification result; this leads to off-center targets, defocus or insufficient sharpness at high zoom, which seriously affects the quality of inspection photos and the efficiency of subsequent defect detection.
Disclosure of Invention
The invention provides a target identification and accurate photographing method and system for transmission tower inspection, and aims to solve the problems of unstable target detection, insufficient airborne real-time performance, and poor imaging quality caused by the separation of identification from photographing control in conventional transmission tower inspection. In order to achieve the above purpose, the present invention provides the following technical solutions: The invention relates to a target identification and accurate photographing method for transmission tower inspection, which comprises the following steps: S100, acquiring a camera preview frame during unmanned aerial vehicle inspection, and extracting multi-scale fusion features; S200, performing longitudinal strip pooling and transverse strip pooling on the fusion features respectively to extract longitudinal context features and transverse context features, fusing the longitudinal and transverse context features with the original fusion features, generating, based on the fused features, multi-class structure heat maps comprising a tower-body main structure heat map, a cross arm structure heat map and a wire-line region heat map, and gate-weighting the fused features according to the multi-class structure heat maps to obtain structure-guided features; S300, performing target detection based on the structure-guided features, and outputting detection boxes, categories and confidences for targets comprising the whole tower, cross arms, insulator strings, hardware fitting connection areas, wire clamps and wires; S400, sampling the response value of the center point of each detection box on the structure heat map corresponding to the category of that detection box, and performing a weighted calculation on the confidence and the response value to obtain a structure-corrected confidence; S500, selecting a key shooting target from the detection results according to the structure-corrected confidence, generating a gimbal attitude adjustment instruction according to the offset between the center of the target detection box and the center of the image, and adjusting a camera zoom parameter according to the ratio of the area of the target detection box to the area of the image; and S600, controlling the camera to take a high-resolution photo when multiple consecutive frames simultaneously satisfy a target centering condition, a detection stability condition and an image sharpness condition.
As a preferred embodiment of the present invention, in S200, the longitudinal strip pooling and the transverse strip pooling comprise: dividing the fusion features into a plurality of longitudinal strips along the longitudinal direction of the image, and pooling the features within each longitudinal strip to obtain the longitudinal context features; and dividing the fusion features into a plurality of transverse strips along the transverse direction of the image, and pooling the features within each transverse strip to obtain transverse context featur