CN-121999332-A - Underwater occluded target segmentation method based on multi-scale feature fusion
Abstract
The invention provides an underwater occluded target segmentation method based on multi-scale feature fusion. An underwater image to be processed is input into an encoder-decoder segmentation network, and a multi-scale feature fusion module is set for each scale, taking the encoding feature of that scale as the low-level feature and the adjacent deeper feature as the high-level feature; the two features may have different spatial resolutions. In the fusion module, global average pooling is applied to the low-level and high-level features at their respective original spatial scales; the pooled results are added and passed through a nonlinear mapping and Sigmoid normalization to obtain the low-level and high-level weights; the two features are weighted channel by channel and pixel by pixel according to these weights; up-sampling is then performed inside the fusion module to match the output scale; and the weighted low-level and high-level features are added element by element to complete the aggregation, yielding the fusion features of each scale. In the decoding stage, step-by-step up-sampling starts from the bottleneck feature, with parallel aggregation and convolution reforming performed with the fusion feature of the corresponding scale at each step, so that spatial details and boundaries are gradually recovered, and the pixel-level segmentation result of the underwater occluded target is finally output.
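The fusion module described above can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the form of the "nonlinear mapping" is not given in this section, so a small two-layer mapping with a ReLU (`w_hidden`, `w_low`, `w_high`, all hypothetical names) stands in for it, both branches are assumed to have the same channel count, and the spatially smaller branch is brought to the larger size by nearest-neighbour up-sampling so that the element-wise addition is well defined.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(low, high, w_hidden, w_low, w_high):
    """Fuse a low-level map (C, H, W) with a deeper map (C, H/s, W/s).

    w_hidden, w_low and w_high are hypothetical dense weights standing in
    for the unspecified "nonlinear mapping".
    """
    # Global average pooling of each branch at its own resolution -> (C,)
    guide = low.mean(axis=(1, 2)) + high.mean(axis=(1, 2))
    # Assumed nonlinear mapping: linear layer + ReLU, then one linear
    # layer per branch followed by Sigmoid normalization.
    hidden = np.maximum(w_hidden @ guide, 0.0)
    a_low = sigmoid(w_low @ hidden)    # low-level weights, shape (C,)
    a_high = sigmoid(w_high @ hidden)  # high-level weights, shape (C,)
    # Broadcasting applies each channel weight at every pixel.
    low_w = low * a_low[:, None, None]
    high_w = high * a_high[:, None, None]
    # Assumption: up-sample the spatially smaller branch to the larger size.
    factor = low.shape[1] // high.shape[1]
    fused = low_w + upsample_nearest(high_w, factor)
    return fused, a_low, a_high

rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))    # current-scale encoding feature
high = rng.standard_normal((4, 4, 4))   # adjacent deeper feature
fused, a_low, a_high = fuse(
    low, high,
    rng.standard_normal((2, 4)),        # w_hidden
    rng.standard_normal((4, 2)),        # w_low
    rng.standard_normal((4, 2)),        # w_high
)
print(fused.shape)  # (4, 8, 8)
```

Because the pooled results are spatially constant, the sketch keeps the guide as a per-channel vector and lets broadcasting do the pixel-by-pixel weighting.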
Inventors
- LIU MOUYUAN
- WANG RUIXUE
- CHEN RUYI
- CHEN ZHUO
- ZHAN CHUNLEI
- ZHAO XINYU
- CHENG BIN
Assignees
- Dalian Maritime University (大连海事大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2025-12-30
Claims (7)
- 1. An underwater occluded target segmentation method based on multi-scale feature fusion, characterized by comprising the following steps: acquiring an underwater image to be processed and inputting the image into an encoder-decoder segmentation network, wherein the encoder-decoder segmentation network comprises four levels of encoding blocks and four levels of decoding blocks, and Haar wavelet downsampling is applied between adjacent encoding blocks to obtain the encoding features of the corresponding scales and the bottleneck feature; constructing a corresponding multi-scale feature fusion module for the encoding feature of each scale, taking the encoding feature of each scale as the low-level feature and the adjacent deeper feature as the high-level feature, wherein the two features enter the fusion at their respective original spatial resolutions; in the multi-scale feature fusion module, applying global average pooling to the low-level feature and the high-level feature at their respective original spatial scales, restoring the pooled results by broadcasting at that scale and adding them pixel by pixel to form a fusion guide map, and generating the low-level weight and the high-level weight through nonlinear mapping and Sigmoid normalization; weighting the corresponding features channel by channel and pixel by pixel according to the low-level weight and the high-level weight, up-sampling the low-level weighted features before aggregation so as to match the output spatial size of the high-level branch, and performing element-by-element addition to obtain the fusion feature of each scale; in the decoding stage, performing step-by-step up-sampling starting from the bottleneck feature, with parallel aggregation and convolution reforming performed with the fusion feature of the corresponding scale at each step, so that spatial details and boundaries are gradually recovered; and finally outputting the pixel-level segmentation result of the underwater occluded target.
- 2. The underwater occluded target segmentation method based on multi-scale feature fusion according to claim 1, wherein the encoder-decoder segmentation network comprises four levels of encoding blocks and four levels of decoding blocks, and Haar wavelet downsampling is applied between adjacent encoding blocks so as to sequentially obtain the encoding features of the first to fourth scales and the deepest bottleneck feature B.
- 3. The underwater occluded target segmentation method based on multi-scale feature fusion according to claim 1, wherein a multi-scale feature fusion module is constructed for the feature of each scale, the encoding feature of that scale serving as the low-level feature and the adjacent deeper feature serving as the high-level feature, wherein no spatial size alignment or resampling is performed prior to fusion, and only the low-level weighted features are up-sampled at aggregation to match the output scale.
- 4. The underwater occluded target segmentation method based on multi-scale feature fusion according to claim 1, wherein the multi-scale feature fusion module applies global average pooling to the low-level features and the high-level features at their respective original spatial scales, the pooled results are used to construct a fusion guide signal, and the low-level weights and the high-level weights, with values in the range [0, 1], are generated through nonlinear mapping and Sigmoid normalization.
- 5. The underwater occluded target segmentation method based on multi-scale feature fusion according to claim 1, wherein the corresponding features are weighted channel by channel and pixel by pixel according to the low-level weight and the high-level weight, and, after the low-level weighted features are up-sampled, aggregation with the high-level weighted features is completed by element-by-element addition or 1 x 1 convolution to obtain the fusion feature of each scale.
- 6. The underwater occluded target segmentation method based on multi-scale feature fusion according to claim 1, wherein the decoding stage starts from the bottleneck feature and up-samples step by step by a factor of 2, and wherein decoding block 4, decoding block 3, decoding block 2 and decoding block 1 perform parallel aggregation and convolution reforming with the corresponding fusion features D_4, D_3, D_2 and D_1 respectively, to gradually restore spatial details and boundaries.
- 7. The method for segmenting an underwater occlusion target based on multi-scale feature fusion according to claim 1, wherein the pixel level segmentation result is generated by a prediction head, and the prediction head applies 1 x 1 convolution to the final decoded output and obtains a pixel level probability map of the target and the background through Sigmoid.
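The Haar wavelet downsampling of claim 2 and the prediction head of claim 7 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the claimed network: `haar_downsample` applies one level of the orthonormal 2-D Haar transform and stacks the four sub-bands along the channel axis, while any channel reduction that follows it is not specified in this section and is left out; `prediction_head` uses hypothetical parameters `weight` and `bias` for the 1 x 1 convolution.

```python
import numpy as np

def haar_downsample(x):
    """One-level 2-D Haar transform of a (C, H, W) array with even H and W.

    Returns a (4C, H/2, W/2) array: the average sub-band followed by the
    three detail sub-bands, each at half the spatial resolution.
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of every 2 x 2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    average = (a + b + c + d) / 2.0
    diff_cols = (a - b + c - d) / 2.0  # left column minus right column
    diff_rows = (a + b - c - d) / 2.0  # top row minus bottom row
    diagonal = (a - b - c + d) / 2.0
    return np.concatenate([average, diff_cols, diff_rows, diagonal], axis=0)

def prediction_head(x, weight, bias):
    """1 x 1 convolution followed by Sigmoid.

    x: (C, H, W); weight: (O, C); bias: (O,). A 1 x 1 convolution is a
    per-pixel linear map over the channel axis.
    """
    logits = np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]
    return 1.0 / (1.0 + np.exp(-logits))

x = np.random.default_rng(0).standard_normal((3, 8, 8))
y = haar_downsample(x)
print(y.shape)  # (12, 4, 4)
prob = prediction_head(x, np.full((1, 3), 0.5), np.zeros(1))
print(prob.shape)  # (1, 8, 8)
```

Unlike strided convolution or pooling, the transform is invertible: the four sub-bands together keep all of the information in the input, which is one common motivation for wavelet-based downsampling.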
Description
Underwater occluded target segmentation method based on multi-scale feature fusion
Technical Field
The invention relates to the technical field of underwater occluded target segmentation, and in particular to an underwater occluded target segmentation method based on multi-scale feature fusion.
Background
In an underwater environment, targets are often occluded to varying degrees by suspended matter, aquatic vegetation, seafloor relief or operating equipment, so that appearance information is missing and structural continuity is damaged, which markedly increases the difficulty of segmentation. The local loss, transmission attenuation and multipath scattering caused by occlusion weaken boundary signals and destroy texture details, so that traditional single-scale feature extraction struggles to capture fine-grained edges and cross-region contextual correlations at the same time, and is prone to missed segmentation, mis-segmentation and blurred boundaries. A multi-scale feature fusion and perception mechanism therefore needs to be built for occlusion scenes, using hierarchical feature interaction, cross-scale alignment and attention weighting to improve the model's ability to discriminate and recover the structural integrity of targets under different degrees of occlusion, and thereby achieve continuous reconstruction of occluded regions and accurate boundary localization. Existing underwater occluded target segmentation methods mostly rely on single-scale convolutional representations, single-domain enhancement or thresholding paradigms, and model cross-scale context and occlusion dependencies insufficiently. They remain reasonably stable when contrast is sufficient or occlusion is slight, but struggle to achieve continuous reconstruction and fine boundary localization of occluded structures in scenes with strong scattering, low signal-to-noise ratio and significant background interference.
Further, at the inference stage, such methods struggle to balance fine-grained edge fidelity against global semantic consistency, and are prone to problems such as adhesion between target and background, unclosed holes and fragmented false edges, which reduce overall segmentation quality and robustness. To address this, an underwater occluded target segmentation technique based on multi-scale feature fusion is adopted to strengthen and recover the boundary details of underwater occluded targets, thereby improving adaptability to complex occlusion structures and reducing interference with the accurate segmentation of underwater targets.
Disclosure of Invention
In view of the above technical problems, the invention provides an underwater occluded target segmentation method based on multi-scale feature fusion, which mainly uses a four-level encoder-decoder network and a multi-scale feature fusion mechanism to perform cross-scale alignment and weighted fusion for underwater occluded targets. A corresponding multi-scale feature fusion module is set for the encoding feature of each scale, and the adjacent deeper feature is introduced as a parallel branch. Global average pooling is applied to the two features at their respective original spatial scales; the pooled results are added and passed through a nonlinear mapping and Sigmoid normalization to generate the low-level weight and the high-level weight; and the two features are weighted channel by channel and pixel by pixel accordingly. In the aggregation step, the low-level weighted features are up-sampled to match the output scale and added element by element to the high-level weighted features to obtain the fusion feature of each scale. In the decoding stage, step-by-step up-sampling starts from the bottleneck feature, with parallel aggregation and convolution reforming performed with the multi-scale fusion features, so that spatial details and boundaries are gradually recovered and the pixel-level segmentation result of the underwater occluded target is output.
An underwater occluded target segmentation method based on multi-scale feature fusion comprises the following steps:
S1, inputting an underwater image to be processed into an encoder-decoder segmentation network formed by four levels of encoding blocks and four levels of decoding blocks, applying Haar wavelet downsampling between adjacent encoding blocks to sequentially obtain the encoding features of the first to fourth scales, and extracting the deepest bottleneck feature B;
S2, constructing a multi-scale feature fusion module for each scale, taking the encoding feature of the current scale as the low-level feature and the adjacent deeper feature (the bottleneck feature B when k=4) as the high-level feature, with the two features keeping their respective native spatial resolutions as a paired input in