
CN-117315239-B - Image segmentation method based on improved U2-Net network

CN117315239B

Abstract

An image segmentation method based on an improved U2-Net network comprises the following steps: 1) preprocessing the images; 2) applying adaptive histogram equalization to the images; 3) performing data augmentation; 4) obtaining output 1; 5) obtaining output 2; 6) obtaining output 3; 7) obtaining output 4; 8) obtaining output 5; 9) obtaining output 6 and saliency probability map 6; 10) obtaining output 7 and saliency probability map 5; 11) obtaining output 8 and saliency probability map 4; 12) obtaining output 9 and saliency probability map 3; 13) obtaining output 10 and saliency probability map 2; 14) obtaining saliency probability map 1. Saliency probability maps 1 to 6 are passed through a convolution layer and a Sigmoid activation function and mapped to pixel values between 0 and 1, representing the probability that each pixel belongs to the foreground, namely the object, from which the segmentation map is obtained. The method reduces the required manual labor and material resources, reduces the computational cost, and improves the segmentation accuracy.
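The final step above, mapping each saliency probability map to per-pixel foreground probabilities with a Sigmoid and then to a segmentation map, can be sketched in NumPy. The 0.5 threshold is an assumption for illustration; this excerpt does not state how the probability map is binarized:

```python
import numpy as np

def sigmoid(x):
    # Map raw network logits to probabilities in (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def saliency_to_mask(logits, threshold=0.5):
    """Turn a single-channel logit map into a saliency probability map
    and a binary segmentation map, as the abstract describes.

    The threshold value is a hypothetical choice, not from the patent."""
    prob = sigmoid(logits)                       # per-pixel foreground probability
    mask = (prob >= threshold).astype(np.uint8)  # 1 = foreground (object), 0 = background
    return prob, mask

logits = np.array([[-2.0, 0.0],
                   [ 1.0, 3.0]])
prob, mask = saliency_to_mask(logits)
```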

Inventors

  • WANG ZIMIN
  • ZHOU YUE

Assignees

  • 桂林电子科技大学 (Guilin University of Electronic Technology)

Dates

Publication Date
2026-05-05
Application Date
2023-08-01

Claims (1)

  1. An image segmentation method based on an improved U2-Net network, characterized by comprising the following steps:
  1) acquiring a data set, dividing it into a training set and a test set, preprocessing all images in the data set, and reducing image noise with a Gaussian filter;
  2) performing adaptive histogram equalization on the images from step 1), increasing the contrast between the deformity region and other regions so as to reduce the influence of irrelevant regions on segmentation of the target region;
  3) performing data augmentation, namely applying rotation and flipping operations to the images from step 2), wherein rotation is set to 30 degrees and 60 degrees both clockwise and counterclockwise, and flipping is set to horizontal flipping and vertical flipping;
  4) passing the training-set images as model input to encoder En_1 of the RSU-7 layer, where the convolution blocks in the RSU-7 encoder form dense connection module 1 with 7 convolution layers in total, the internal skip connections combine long and short connections to integrate and extract image features, and the result of the RSU-7 encoder is passed to a channel attention module and recorded as output 1;
  5) feeding output 1 into encoder En_2 of the RSU-6 layer, where the convolution blocks of the RSU-6 encoder form dense connection module 2 with 6 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is reduced, and the result of the RSU-6 encoder is fed into a channel attention module and recorded as output 2;
  6) feeding output 2 into encoder En_3 of the RSU-5 layer, where the convolution blocks of the RSU-5 encoder form dense connection module 3 with 5 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is reduced, and the result of the RSU-5 encoder is fed into a channel attention module and recorded as output 3;
  7) feeding output 3 into encoder En_4 of the RSU-4 layer, where the convolution blocks of the RSU-4 encoder form dense connection module 4 with 4 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is reduced, and the result of the RSU-4 encoder is fed into a channel attention module and recorded as output 4;
  8) feeding output 4 into encoder En_5 of the RSU-4F layer, where the skip connections of the RSU-4F convolution blocks keep the original U2-Net design unchanged, and the result of the RSU-4F encoder is fed into a channel attention module and recorded as output 5;
  9) feeding output 5 into encoder En_6 of the RSU-4F layer, where the skip connections of the RSU-4F convolution blocks keep the original U2-Net design unchanged, and the result of the RSU-4F encoder is fed into a channel attention module to obtain output 6 and saliency probability map 6;
  10) feeding output 6 into decoder De_5 of the RSU-4F layer, where the skip connections of the RSU-4F convolution blocks keep the original U2-Net design unchanged, and the result of the RSU-4F decoder is fed into a channel attention module to obtain output 7 and saliency probability map 5;
  11) feeding output 7 into decoder De_4 of the RSU-4 layer, where the convolution blocks of the RSU-4 decoder form dense connection module 4 with 4 convolution layers in total, the internal skip connections combine long and short connections, image features are extracted and the spatial scale is enlarged, and the result of the RSU-4 decoder is fed into a channel attention module to obtain output 8 and saliency probability map 4;
  12) feeding output 8 into decoder De_3 of the RSU-5 layer, where the convolution blocks of the RSU-5 decoder form dense connection module 3 with 5 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is enlarged, and the result of the RSU-5 decoder is fed into a channel attention module to obtain output 9 and saliency probability map 3;
  13) feeding output 9 into decoder De_2 of the RSU-6 layer, where the convolution blocks of the RSU-6 decoder form dense connection module 2 with 6 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is enlarged, and the result of the RSU-6 decoder is fed into a channel attention module to obtain output 10 and saliency probability map 2;
  14) feeding output 10 into decoder De_1 of the RSU-7 layer, where the convolution blocks of the RSU-7 decoder form dense connection module 1 with 7 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is enlarged, and saliency probability map 1 is obtained;
  15) passing saliency probability maps 1 to 6 through a convolution layer and a Sigmoid activation function, mapping them to pixel values between 0 and 1 that represent the probability that each pixel belongs to the foreground, namely the object, thereby obtaining the segmentation map.
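The claim routes every encoder and decoder result through a "channel attention module" but does not spell out its internals in this excerpt. The sketch below assumes a squeeze-and-excitation (SE) style design with hypothetical weight matrices `w1`/`w2` and a channel-reduction ratio of 4; it is an illustration of the general mechanism, not the patent's exact module:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention over a (C, H, W) feature map.

    Squeeze: global average pooling gives one statistic per channel.
    Excite: a two-layer bottleneck maps it to per-channel weights in (0, 1),
    which rescale the feature map so important channels are emphasized."""
    squeeze = feat.mean(axis=(1, 2))            # (C,) per-channel descriptor
    excite = sigmoid(w2 @ relu(w1 @ squeeze))   # (C,) attention weights in (0, 1)
    return feat * excite[:, None, None]         # reweight each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8))   # bottleneck: 8 -> 2 channels (hypothetical ratio 4)
w2 = rng.standard_normal((8, 2))   # expansion: 2 -> 8 channels
out = channel_attention(feat, w1, w2)
```

Because every attention weight lies strictly in (0, 1), the module can only attenuate channels, never amplify them, which matches the stated goal of suppressing unimportant features.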

Description

Image segmentation method based on improved U2-Net network

Technical Field

The invention relates to image segmentation technology, in particular to an image segmentation method based on an improved U2-Net network.

Background

In 2015, Ronneberger et al. proposed the U-Net network, which consists of three parts: an encoder, a decoder, and skip connections. U-Net performs well on medical image segmentation tasks and has therefore been widely applied. To meet the practical demands of different tasks, researchers have proposed a series of improvements to U-Net. For example, Oktay et al. proposed Attention U-Net, which designs an attention gating mechanism (Attention Gate, AG) and replaces the skip connections of U-Net with AG modules, effectively suppressing the network's learning of irrelevant regions while directing its attention to task-relevant regions. Zhou et al. proposed U-Net++, which builds a nested structure and dense skip connections on top of U-Net, addressing the differing network-depth requirements of various application scenarios while allowing the decoder to better fuse multi-scale information. Huang et al. proposed U-Net 3+, which combines semantic information from feature maps at different scales through full-scale skip connections while learning hierarchical representations from the aggregated full-scale feature maps via deep supervision. In 2020, the U2-Net network proposed by Xuebin Qin et al. achieved significant results in tasks such as salient object detection, image segmentation, and natural scene segmentation, becoming one of the important reference models in the field of image segmentation. However, the model still suffers from inaccurate boundary prediction, large memory consumption, and limited information richness.
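The dense skip-connection idea referenced above, as used in U-Net++ and in the dense connection modules this invention introduces, can be illustrated with a minimal NumPy sketch: each layer receives the channel-wise concatenation of the block input and all previous layer outputs. The toy layers and the growth rate of 2 channels are illustrative assumptions, not the patent's actual module:

```python
import numpy as np

def dense_block(x, layers):
    """DenseNet-style dense connectivity over (C, H, W) feature maps.

    Each layer sees the concatenation of the input and every earlier
    layer's output, so features from all depths are reused directly."""
    features = [x]
    for layer in layers:
        inp = np.concatenate(features, axis=0)  # stack along the channel axis
        features.append(layer(inp))
    return np.concatenate(features, axis=0)

# Toy stand-in layers: average the input channels and emit a fixed
# "growth rate" of 2 new channels (a real module would convolve here).
def make_layer(growth=2):
    return lambda inp: np.tile(inp.mean(axis=0, keepdims=True), (growth, 1, 1))

x = np.ones((4, 8, 8))
out = dense_block(x, [make_layer(), make_layer(), make_layer()])
# Output channels grow as 4 + 3 * 2 = 10.
```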
Therefore, the present model builds on the U2-Net network and introduces dense connection modules to capture features at different levels and obtain receptive fields of different sizes, improving segmentation accuracy. A channel attention mechanism module is also introduced to assign a different weight to each feature, so that important features are emphasized, unimportant features are suppressed, and the computational cost is reduced.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provides an image segmentation method based on an improved U2-Net network. The method reduces the required manual labor and material resources, reduces the computational cost, and improves the segmentation accuracy. The technical scheme for achieving the aim of the invention is as follows: an image segmentation method based on an improved U2-Net network comprises the following steps: 1) acquiring a data set, dividing it into a training set and a test set, preprocessing all images in the data set, and reducing image noise with a Gaussian filter; 2) performing adaptive histogram equalization on the images from step 1), increasing the contrast between the deformity region and other regions so as to reduce the influence of irrelevant regions on segmentation of the target region; 3) performing data augmentation, namely applying rotation and flipping operations to the images from step 2), wherein rotation is set to 30 degrees and 60 degrees both clockwise and counterclockwise, and flipping is set to horizontal flipping and vertical flipping; 4) passing the training-set images as model input to encoder En_1 of the RSU-7 layer, where the convolution blocks in the RSU-7 encoder form dense connection module 1 with 7 convolution layers in total, the internal skip connections combine long and short connections to integrate and extract image features, and the result of the RSU-7 encoder is passed to a channel attention module and recorded as output 1; 5) feeding output 1 into encoder En_2 of the RSU-6 layer, where the convolution blocks of the RSU-6 encoder form dense connection module 2 with 6 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is reduced, and the result of the RSU-6 encoder is fed into a channel attention module and recorded as output 2; 6) feeding output 2 into encoder En_3 of the RSU-5 layer, where the convolution blocks of the RSU-5 encoder form dense connection module 3 with 5 convolution layers in total, the internal skip connections combine long and short connections, image features are further extracted and the spatial scale is reduced