CN-122023808-A - Photovoltaic module image segmentation method combining structure prior and feature fusion

Abstract

The invention provides a photovoltaic module image segmentation method combining structure prior and feature fusion. The method comprises: obtaining a remote sensing image containing a photovoltaic module; extracting from the remote sensing image shallow features containing local spatial details and deep features containing global semantic information; decoding the shallow features and the deep features; during decoding, aggregating the current decoding features along the horizontal direction and the vertical direction respectively, and generating a structure prior weight map based on the two types of aggregation features; applying weighted modulation to the current decoding features through the weight map to obtain enhanced decoding features; and generating a pixel-level segmentation result for the photovoltaic module based on the enhanced decoding features.
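
The direction-wise aggregation and weighted modulation summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `compress_h` and `compress_v` matrices (stand-ins for learned 1x1 channel-compression weights), and the choice to multiply the two directional weights into one map are all hypothetical. The average and max pooling, concatenation, activation, and residual connection follow the wording of claim 3; the channel adjustment and convolutional enhancement steps are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def directional_weights(feat, axis, compress):
    """Aggregate a (C, H, W) feature along one spatial axis.

    axis=2 pools along the width (horizontal direction), giving one
    descriptor per row; axis=1 pools along the height (vertical direction),
    giving one descriptor per column.
    compress: (C, 2C) matrix acting as a 1x1 channel compression
    (hypothetical stand-in for learned weights).
    """
    avg = feat.mean(axis=axis, keepdims=True)        # average pooling
    mx = feat.max(axis=axis, keepdims=True)          # max pooling
    agg = np.concatenate([avg, mx], axis=0)          # (2C, H, 1) or (2C, 1, W)
    mixed = np.einsum("oc,chw->ohw", compress, agg)  # compress 2C -> C channels
    return sigmoid(mixed)                            # activation, values in (0, 1)


def structure_prior_guide(feat, compress_h, compress_v):
    """Return the enhanced feature, same shape as `feat`."""
    w_h = directional_weights(feat, axis=2, compress=compress_h)  # (C, H, 1)
    w_v = directional_weights(feat, axis=1, compress=compress_v)  # (C, 1, W)
    # Assumption: the two directional weights are combined by a broadcast
    # product to form the (C, H, W) structure prior weight map.
    weight_map = w_h * w_v
    # Element-wise weighted modulation plus a residual connection.
    return feat + feat * weight_map


rng = np.random.default_rng(0)
C, H, W = 4, 6, 8
feat = rng.standard_normal((C, H, W))
compress_h = rng.standard_normal((C, 2 * C)) * 0.1
compress_v = rng.standard_normal((C, 2 * C)) * 0.1
enhanced = structure_prior_guide(feat, compress_h, compress_v)
print(enhanced.shape)
```

Because each row descriptor and each column descriptor is shared across a whole line of pixels, the weight map varies along rows and columns, which is one plausible way to favor the axis-aligned, grid-like layout of photovoltaic arrays.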

Inventors

ZHANG JIE
CHEN HANSHUO
DONG YUHANG
CHENG SHUYING
LIN PEIJIE

Assignees

Fuzhou University

Dates

Publication Date: 20260512
Application Date: 20260212

Claims (10)

1. A photovoltaic module image segmentation method, characterized by comprising the following steps: acquiring a remote sensing image containing a photovoltaic module; extracting shallow features containing local spatial details and deep features containing global semantic information from the remote sensing image; decoding the shallow features and the deep features, and performing a structure prior guiding operation during decoding, namely: aggregating the current decoding features along the horizontal direction and the vertical direction respectively to generate horizontal aggregation features and vertical aggregation features, generating a structure prior weight map based on the two types of aggregation features, and applying weighted modulation to the current decoding features through the weight map to obtain enhanced decoding features; and generating a pixel-level classification result for the photovoltaic module based on the enhanced decoding features.
2. The photovoltaic module image segmentation method, characterized in that the shallow features are extracted by a convolutional neural network; at least one layer of high-level features is input into a Transformer module that models long-range dependencies to obtain the deep features; and the shallow features are passed to the decoding stage through skip connections and fused with the corresponding decoding features to serve as the input of the structure prior guiding operation.
3. The photovoltaic module image segmentation method, characterized by specifically comprising the steps of: adjusting the channel dimension of the input fusion features; constructing feature maps for the horizontal direction and the vertical direction respectively; applying average pooling and max pooling to the horizontal feature map along the horizontal dimension, and average pooling and max pooling to the vertical feature map along the vertical dimension, to obtain four types of directional features; concatenating the two types of features of the same direction into an aggregation feature; applying channel compression, convolutional enhancement, and activation to the aggregation features to generate the structure prior weight map; fusing the weight map with the input fusion features by element-wise weighting; and outputting the enhanced decoding features through a residual connection.
4. The photovoltaic module image segmentation method according to claim 2, wherein, before the structure prior guiding operation, the shallow features delivered by the skip connection and the features of the corresponding decoding stage are adaptively fused through a learnable weight parameter, so as to balance local details and global semantic information.
5. The photovoltaic module image segmentation method according to claim 1, further comprising a step of performing feature enhancement processing on the enhanced decoding features, the feature enhancement processing including at least one of channel attention processing, spatial attention processing, and multi-scale convolution branch processing.
6. The photovoltaic module image segmentation method according to claim 2, wherein the multi-head self-attention unit in the Transformer module is integrated with a channel attention mechanism and a multi-scale feature extraction module, and features of different scales are extracted and fused through multi-branch convolution.
7. The photovoltaic module image segmentation method according to claim 5, wherein the multi-scale convolution branch processing is implemented by depthwise dilated convolution, and different dilation rates are set according to the typical scales of the photovoltaic module.
8. The photovoltaic module image segmentation method according to claim 1, wherein the step of generating the pixel-level segmentation result comprises: applying depthwise separable convolution to the enhanced decoding features, restoring the original spatial resolution of the remote sensing image through an interpolation upsampling operation, and outputting the segmentation result.
9. The photovoltaic module image segmentation method according to claim 1, further comprising a network training step of: computing the loss between the segmentation result and the ground-truth annotation using a joint loss function composed of cross-entropy loss and Dice loss, and training the segmentation network while dynamically adjusting the learning rate through a strategy combining linear warm-up and cosine annealing.
10. A photovoltaic module image segmentation system, comprising: a feature extraction module configured to acquire a remote sensing image containing a photovoltaic module and extract shallow spatial detail features and deep global semantic features of the remote sensing image; a decoding and modulation module configured to decode the shallow features and the deep features, and to perform a structure prior guiding operation during decoding to generate enhanced decoding features, wherein the structure prior guiding operation comprises feature aggregation along the horizontal and vertical directions, structure prior weight map generation, and weighted feature modulation; and a segmentation generation module configured to generate a pixel-level segmentation result for the photovoltaic module based on the enhanced decoding features.
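
The training step in claim 9 can be illustrated with a short, dependency-free Python sketch. It assumes binary segmentation (photovoltaic versus background) on flattened per-pixel probabilities. The claim names the two loss terms and the two schedule phases but not their details, so the smoothing constant, the `dice_weight` factor, the per-step schedule, and all function names here are assumptions for illustration.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2*|P∩G| + s) / (|P| + |G| + s)."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def joint_loss(probs, labels, dice_weight=1.0):
    """Cross-entropy plus weighted Dice (the weighting is an assumption)."""
    return cross_entropy(probs, labels) + dice_weight * dice_loss(probs, labels)


def lr_at(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warm-up to base_lr, then cosine annealing down to min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


labels = [1, 0, 1, 0]
print(joint_loss([0.9, 0.1, 0.8, 0.2], labels))
print([lr_at(s, total_steps=100, warmup_steps=10, base_lr=1e-3) for s in (0, 9, 10, 55, 100)])
```

Cross-entropy scores each pixel independently, while Dice scores region overlap and is less sensitive to the imbalance between photovoltaic and background pixels, which is a common reason for pairing the two.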

Description

Photovoltaic module image segmentation method combining structure prior and feature fusion

Technical Field

The invention belongs to the technical field of remote sensing image processing and computer vision, and particularly relates to a photovoltaic module image segmentation method combining structure prior and feature fusion.

Background

In the prior art, semantic segmentation methods based on deep learning are widely applied to target extraction in remote sensing scenes. Such methods typically employ an encoder-decoder architecture, which obtains high-level semantic features through downsampling and gradually restores spatial resolution in the decoding stage to output pixel-level classification results. However, in complex photovoltaic (PV) scenes, existing methods still suffer from the following drawbacks.

First, photovoltaic modules in remote sensing images often exhibit pronounced multi-scale variation and dense arrangement. Meanwhile, under the influence of installation mode, surface material, weather, and illumination conditions, their spectrum and texture are highly variable and easily confused with similar ground objects such as roads, roofs, water bodies, and shadows, causing missed and false detections. Conventional convolutional neural network (CNN) encoders, such as those used in U-Net and the DeepLab series, excel at extracting local texture and edge features, but their inherently local receptive fields limit their ability to model long-range contextual dependencies. As a result, the model struggles to distinguish ground objects with similar spectral characteristics but different spatial layouts (e.g., a contiguous expanse of photovoltaic panels versus a large cement pavement or a roof of a particular material), and has difficulty understanding the scene layout from a global perspective so as to suppress background interference.
In the pixel-level segmentation task, deep features have strong semantic expressiveness but easily lose small targets and edge details during successive downsampling and nonlinear transformations, whereas shallow features retain more texture and edge information but lack semantic representation capability. Existing feature fusion strategies, such as simple skip connections or feature map concatenation, often struggle to balance high-level semantics and low-level details. Such simple superposition of information fails to account for the essential differences and complementary relationships between features of different levels in semantic granularity and spatial detail, and the fusion process lacks an effective guiding or modulation mechanism. Consequently, when resolution is restored during decoding, features are easily confused: the boundaries of photovoltaic modules become blurred, jagged artifacts appear, or adjacent targets merge or break apart in densely arranged areas, seriously degrading the structural integrity and geometric accuracy of the segmentation result.

Furthermore, although some improved methods introduce attention mechanisms (e.g., channel attention, spatial attention) to enhance feature expression, or use dilated convolution to enlarge the receptive field, these are mostly generic designs that are not optimized for the regular rectangular or stripe-like geometry specific to photovoltaic modules.
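
The fusion limitation described above can be made concrete. The sketch below contrasts plain concatenation, which stacks shallow and decoder features with no control over their relative contribution, with a blend governed by a single learnable scalar, in the spirit of the learnable weight parameter in claim 4. The sigmoid gating, the scalar `alpha`, and both function names are assumptions for illustration; the source does not specify the form of the learnable weight.

```python
import numpy as np


def concat_fusion(shallow, decoded):
    """Plain skip-connection fusion: stack along the channel axis.

    Both inputs have shape (C, H, W); the result has 2C channels and
    carries no explicit weighting between the two sources.
    """
    return np.concatenate([shallow, decoded], axis=0)


def gated_fusion(shallow, decoded, alpha):
    """Blend two same-shape features with one learnable scalar.

    alpha is the raw parameter (hypothetical); a sigmoid maps it to a
    mixing weight g in (0, 1). g near 1 favors shallow detail, g near 0
    favors decoder semantics.
    """
    g = 1.0 / (1.0 + np.exp(-alpha))
    return g * shallow + (1.0 - g) * decoded


rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 6, 8))
decoded = rng.standard_normal((4, 6, 8))
print(concat_fusion(shallow, decoded).shape)            # channel count doubles
print(gated_fusion(shallow, decoded, alpha=0.0).shape)  # channel count preserved
```

In training, `alpha` would be updated by gradient descent together with the other network weights, so the balance between local detail and global semantics is learned rather than fixed.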
The strong directionality (horizontal or vertical arrangement) and regular grid-like layout of photovoltaic arrays in remote sensing images constitute important prior knowledge. Existing general-purpose segmentation models lack a mechanism for effectively exploiting this strong structural prior, so their ability to capture and preserve the regular boundaries of photovoltaic panels against complex backgrounds is insufficient, and their robustness in scenes with partial occlusion, uneven illumination, or irregular arrangement needs improvement.

In summary, when segmenting photovoltaic modules in complex remote sensing scenes, the prior art remains limited in global context modeling, in effective fusion and guidance of multi-level features, and in the use of target-specific geometric priors, which restricts further improvement of segmentation accuracy and practicality.

Disclosure of Invention

In view of the defects and shortcomings of the prior art, the invention provides a photovoltaic module image segmentation method combining structure prior and feature fusion, which aims to solve problems such as low segmentation accuracy and poor structural integrity caused by multi-scale variation, severe background interference, and loss of edge detail of photovoltaic modules in complex remote sensing scenes. The method is based on an encoder-decoder