CN-121788843-B - GH4169 alloy grain segmentation method based on joint attention mechanism
Abstract
The invention discloses a GH4169 alloy grain segmentation method based on a joint attention mechanism. The method comprises the steps of: collecting microstructure images of GH4169 alloy after corrosion treatment; dividing the microstructure images into a training set, a verification set and a test set after preprocessing; constructing a GH4169 alloy grain segmentation network based on an encoding-decoding framework of Transformer and ConvNeXt; inputting the training set and verification set images into the network for training; inputting the test set images into the trained network to realize grain boundary division in the microstructure images; and counting grain sizes and intercept points based on the grain boundary division result. An encoding-decoding structure fusing the Transformer and a CNN is adopted: feature complementation is realized through the global context modeling capability of the Swin Transformer and the local feature extraction advantage of the CNN, and the precision of grain segmentation in complex microstructures is remarkably improved.
Inventors
- DENG BOWEN
- LI KE
- YANG CHAO
- WANG HU
- CHEN PENGYUAN
Assignees
- 西部超导材料科技股份有限公司 (Western Superconducting Technologies Co., Ltd.)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2026-03-04
Claims (3)
- 1. The GH4169 alloy grain segmentation method based on the joint attention mechanism is characterized by comprising the following steps:
Step 1, collecting microstructure images of GH4169 alloy after corrosion treatment to form a data set, and dividing the data set into a training set, a verification set and a test set after preprocessing;
Step 2, constructing a GH4169 alloy grain segmentation network based on a Transformer and ConvNeXt encoding-decoding architecture, specifically:
Step 2.1, the Transformer-based encoder module of the GH4169 alloy grain segmentation network consists of two parts and is used for extracting shallow features and deep semantic features from the microstructure image, specifically: in the first part, low-level local structural features of the image are extracted through Patch Partition and Patch Embedding operations; Patch Partition divides the input microstructure image into image blocks of size 4×4, Patch Embedding is then performed on the image blocks with a convolution layer of kernel size 4×4, stride 4 and padding 0, a LayerNorm layer is connected afterwards, and an initial feature map with 96 channels is output; the second part is a hierarchical feature encoding module consisting of four consecutive stages, Stage 1 to Stage 4, which contain 2, 2, 6 and 2 Transformer Blocks respectively; the processing flow is as follows:
Stage 1: the initial feature map output by the first part is input into Stage 1, features are extracted by 2 Transformer Blocks to capture local geometric features and preliminary texture-gradient information in the microstructure image, and a 64×64 feature map F1 is output with the resolution unchanged;
Stage 2: the feature map F1 is first spatially downsampled by a Patch Merging layer using a window of stride 2, merging 2×2 adjacent image blocks so that the length and width of the feature map are halved to 32×32 while the channel number is doubled to 192; the result is then fed into 2 Transformer Blocks to extract context-related features of the grain regions, and the feature map F2 is output;
Stage 3: after Patch Merging downsampling, the size becomes 16×16 and the channel number increases to 384; 6 stacked Transformer Blocks extract abstract semantic features, and the feature map F3 is output;
Stage 4: Patch Merging downsampling is performed again to obtain a size of 8×8 with 768 channels; global deep semantic features are extracted by 2 Transformer Blocks, and the feature map F4 is output;
Step 2.2, the ConvNeXt-based decoder module of the GH4169 alloy grain segmentation network is used to gradually recover the spatial resolution and fuse the multi-scale features extracted by the encoder, so as to realize boundary segmentation of the microstructure image, specifically:
Step 2.21, taking the feature map F4 output by Stage 4 and the feature map F3 output by Stage 3 as initial inputs, F4 is upsampled by a factor of 2 with bilinear interpolation to restore the spatial size to 16×16, the two maps are spliced along the channel dimension, and the spliced feature map is sent into a CSDA module to obtain a reinforced feature map;
Step 2.22, the feature map reinforced by the CSDA module is further input into a ConvNeXt Block, which outputs a fused feature map;
Step 2.23, the decoder upsamples the fused feature map again by a factor of 2, increasing its size to 32×32, splices it with the feature map F2 output by Stage 2 along the channel dimension, and sends the spliced feature map successively into a CSDA module and a ConvNeXt Block for processing;
Step 2.24, the decoder upsamples the feature map obtained in Step 2.23 to 64×64, splices it with the feature map F1 output by Stage 1, and sends the spliced feature map successively into a CSDA module and a ConvNeXt Block for processing, thereby obtaining a decoding feature map with high-resolution structural details and stable semantic expression;
the processing procedure of the CSDA module is as follows: the input feature map is F with C_in channels and spatial size H×W, and the output feature map is F_out with C_out channels, where H and W are the height and width of the feature map, C_in is the channel number of the input feature map, and C_out is the channel number of the output feature map;
the input feature map F is fed into the channel attention module, where global average pooling and global max pooling are performed over the spatial dimensions respectively, obtaining channel description vectors of the feature map under different feature compression modes; the two vectors are each input into a convolution layer with shared weights for feature transformation, the two output vectors are then fused by element-wise addition, and normalized channel attention weights are generated through a Sigmoid activation function:
F_avg^c = δ(Conv(AvgPool_s(F)))  (1)
F_max^c = δ(Conv(MaxPool_s(F)))  (2)
F_c = σ(F_avg^c + F_max^c) ⊗ F  (3)
wherein F_avg^c denotes the output of the spatially adaptive average pooling branch; F_max^c denotes the output of the spatially adaptive max pooling branch; F_c denotes the output feature map of the channel attention module; Conv denotes a convolution operation; δ denotes a leaky rectified linear unit operation; σ denotes the Sigmoid function; AvgPool_s denotes spatially adaptive average pooling with output spatial size 1; and MaxPool_s denotes spatially adaptive max pooling with output spatial size 1;
next, the input feature map F is fed into the spatial attention module, where max pooling and average pooling are performed along the channel dimension respectively; the two spatial maps are spliced in the channel dimension and input into a convolution layer, a spatial attention map is output through a Sigmoid activation function, and the spatial attention map is multiplied element-wise with the input feature map along the spatial dimensions to realize feature re-weighting in the spatial dimension, as shown in formulas (4) to (6):
F_avg^s = AvgPool_c(F)  (4)
F_max^s = MaxPool_c(F)  (5)
F_s = σ(Conv([F_avg^s; F_max^s])) ⊗ F  (6)
wherein F_avg^s denotes the output of the channel-adaptive average pooling branch; F_max^s denotes the output of the channel-adaptive max pooling branch; F_s denotes the output feature map of the spatial attention module; AvgPool_c denotes channel-adaptive average pooling with output channel number 1; and MaxPool_c denotes channel-adaptive max pooling with output channel number 1;
a random inactivation (dropout) mechanism is added to the outputs of the channel attention module and the spatial attention module, and the intermediate feature map F_m is output as shown in formula (7):
F_m = [Drop_ch(F_c); Drop_sp(F_s)]  (7)
wherein [·;·] denotes the channel-dimension splicing operation; Drop_ch denotes a channel-wise random discard operation with probability 30%; and Drop_sp denotes a spatial random discard operation with probability 30%;
the splicing in formula (7) expands the channel dimension of the intermediate feature map F_m to 2 times; a channel rearrangement (shuffle) mechanism is then introduced, the feature map is restored to the target channel number through a convolution, and a residual connection is introduced, finally enhancing the feature expression capacity, as shown in formula (8):
F_out = Conv(Shuffle(F_m)) ⊕ F  (8)
wherein Shuffle denotes the channel rearrangement operation and ⊕ denotes the element-wise addition of the residual connection;
the ConvNeXt Block first applies a large-kernel depthwise separable convolution to the input feature map, then normalizes the feature map; the normalized feature map passes successively through a channel expansion-and-compression path formed by point-wise convolutions with a GELU activation function, and the input feature map and the transformed output feature map are fused through a residual connection to obtain the fused feature map;
Step 2.25, the spatial resolution of the decoding feature map is restored to be consistent with the original input image through a single 4× upsampling operation, channel compression is performed through a 1×1 convolution, and finally a pixel-level grain segmentation probability map is output to realize segmentation of grain regions and boundaries;
Step 3, inputting the training set images and verification set images into the GH4169 alloy grain segmentation network for training, and inputting the test set images into the trained GH4169 alloy grain segmentation network to realize grain boundary division in the microstructure images;
and Step 4, counting the grain size and intercept points based on the grain boundary division result.
- 2. The GH4169 alloy grain segmentation method based on the joint attention mechanism as in claim 1, wherein in Step 1 the preprocessing process comprises manually labeling the microstructure images and expanding the manually labeled microstructure images, wherein any one or more of random rotation, random scaling, random cropping, random brightness adjustment, random contrast enhancement and random noise addition are adopted during expansion.
- 3. The GH4169 alloy grain segmentation method based on the joint attention mechanism as in claim 1, wherein in Step 3, during training, the verification set images are used to evaluate the performance of the GH4169 alloy grain segmentation network in real time after each training round, and the specific training strategy is as follows: a mixed loss function is adopted, formed by weighting a binary cross-entropy loss and a Dice loss, with the mathematical expression
L = λ1 · L_BCE + λ2 · L_Dice
wherein L_BCE is used to measure the accuracy of pixel classification, and L_Dice is used to alleviate the imbalance between positive and negative samples and optimize the overlap of boundaries; for the optimizer and hyperparameters, an AdamW optimizer is adopted to update the parameters with a set initial learning rate, and a cosine annealing strategy is adopted to dynamically adjust the learning rate; for the termination condition, the maximum number of iteration rounds is set to 200, training is terminated early if the loss value on the verification set does not decrease for 20 consecutive rounds, and the weights with the minimum loss on the verification set are loaded as the final GH4169 alloy grain segmentation network.
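The stage dimensions recited in claim 1 follow a simple schedule: a 4×4 patch embedding divides the input side length by 4 and sets 96 channels, and each Patch Merging step halves the side length while doubling the channels. The short sketch below checks that arithmetic; the 256×256 input size is an assumption consistent with the 64×64 Stage 1 map (the claims fix only the per-stage sizes, not the input size).

```python
def encoder_shapes(input_hw=256, embed_dim=96, num_stages=4):
    # Patch Partition + Patch Embedding: 4x4 patches -> side / 4, embed_dim channels.
    side, ch = input_hw // 4, embed_dim
    shapes = [(side, ch)]          # Stage 1 keeps this resolution
    for _ in range(num_stages - 1):
        side //= 2                 # Patch Merging: 2x2 adjacent blocks merged
        ch *= 2                    # channel count doubled
        shapes.append((side, ch))
    return shapes

print(encoder_shapes())  # [(64, 96), (32, 192), (16, 384), (8, 768)]
```

The output reproduces the claimed schedule: 64×64/96, 32×32/192, 16×16/384 and 8×8/768 for Stages 1 through 4.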
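As a rough illustration of the joint attention described for the CSDA module in claim 1, the NumPy sketch below implements the channel branch of formulas (1)–(3) and the spatial branch of formulas (4)–(6). It is a simplified stand-in, not the claimed implementation: the shared-weight convolution is modeled as a shared matrix, the convolution over the concatenated 2-channel spatial map is replaced by a mean, and the dropout/shuffle/residual steps of formulas (7)–(8) are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, alpha=0.01):
    # Leaky rectified linear unit (the delta operation in Eqs. (1)-(2)).
    return np.where(x > 0, x, alpha * x)

def channel_attention(feat, w_shared):
    """Channel branch, Eqs. (1)-(3): spatial avg/max pooling produces two
    channel descriptors, a shared transform (stand-in for the shared-weight
    convolution) maps each, and the Sigmoid of their sum re-weights channels."""
    avg = feat.mean(axis=(1, 2))                      # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2))                        # global max pooling -> (C,)
    weights = sigmoid(leaky_relu(w_shared @ avg) + leaky_relu(w_shared @ mx))
    return feat * weights[:, None, None]              # channel re-weighting

def spatial_attention(feat):
    """Spatial branch, Eqs. (4)-(6): channel-wise avg/max pooling, then a
    Sigmoid attention map multiplied element-wise over spatial positions
    (the convolution over the 2-channel map is simplified to a mean)."""
    avg = feat.mean(axis=0)                           # (H, W)
    mx = feat.max(axis=0)                             # (H, W)
    attn = sigmoid(0.5 * (avg + mx))                  # simplified Conv+Sigmoid
    return feat * attn[None, :, :]                    # spatial re-weighting
```

Since both branches multiply the input by weights in (0, 1), each output map has the same shape as the input and is bounded above by it for non-negative features.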
Description
GH4169 alloy grain segmentation method based on joint attention mechanism
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a GH4169 alloy grain segmentation method based on a joint attention mechanism.
Background
Accurate information on grain structure distribution is of great significance for performance evaluation, process optimization and service life prediction of superalloy materials. Semantic segmentation of grains in microscopic images is an important component of material microstructure analysis; its main purpose is to extract grain-related features from high-resolution metallographic images, realize semantic classification of all pixels in the image, and thereby complete the precise segmentation and description of grain regions. However, grain structures in superalloys often exhibit complex morphology, large size differences, fuzzy boundaries and tightly connected grains, which significantly increases the technical difficulty of the segmentation task. During superalloy preparation, grain morphology varies widely and may range from fine equiaxed crystals to long columnar crystals; this complexity makes it difficult for general segmentation algorithms to accurately identify each grain. In terms of grain size, the sizes within the same image can differ by several times or even tens of times: small grains are easily missed or misjudged during segmentation, while detail features of large grains may be lost due to insufficient resolution. Meanwhile, boundary regions are blurred by mutual extrusion during grain growth, and tightly connected grains are even more difficult to distinguish.
In particular, when microstructures such as twins, subgrain boundaries or second-phase particles are present, it is often difficult to accurately determine grain boundary positions. Twin structures are similar to normal grain boundaries in gray scale and texture features and differ only in orientation, which poses a great challenge for segmentation algorithms based on visual features; subgrain boundaries are usually thin and of low contrast and are difficult to present clearly in an image; the presence of second-phase particles can interfere with a segmentation model's judgment of grain boundaries, and their variable position and morphology further increase the complexity of segmentation. In addition, factors such as uneven corrosion, changing illumination conditions, low imaging contrast and background inclusions during sample preparation further interfere with grain boundary identification and extraction, posing a serious challenge to high-precision automatic segmentation. Uneven corrosion can cause excessive gray-scale differences within the same grain; changing illumination conditions cause local brightness variations in the image; low imaging contrast blurs grain boundary details; and background inclusions can be misjudged as grains or grain boundaries. These problems greatly affect the accuracy and reliability of segmentation.
Disclosure of Invention
The invention aims to provide a GH4169 alloy grain segmentation method based on a joint attention mechanism, which solves the problems of poor grain segmentation precision and low efficiency of existing machine learning methods.
The technical scheme adopted by the invention is a GH4169 alloy grain segmentation method based on a joint attention mechanism, implemented specifically according to the following steps:
Step 1, collecting microstructure images of GH4169 alloy after corrosion treatment to form a data set, and dividing the data set into a training set, a verification set and a test set after preprocessing;
Step 2, constructing a GH4169 alloy grain segmentation network based on a Transformer and ConvNeXt encoding-decoding architecture;
Step 3, inputting the training set images and verification set images into the GH4169 alloy grain segmentation network for training, and inputting the test set images into the trained GH4169 alloy grain segmentation network to realize grain boundary division in the microstructure images;
and Step 4, counting the grain size and intercept points based on the grain boundary division result.
The present invention is further characterized in that, in Step 1, the preprocessing process comprises manually labeling the microstructure images and expanding the manually labeled microstructure images, wherein any one or more of random rotation, random scaling, random cropping, random brightness adjustment, random contrast enhancement and random noise addition are adopted during expansion.
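The mixed training loss described in claim 3 (binary cross-entropy weighted with Dice loss) can be sketched in a few lines of NumPy. The equal 0.5/0.5 weighting and the Dice smoothing constant below are illustrative assumptions; the claim does not fix their values.

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    # Pixel-wise binary cross-entropy: measures pixel classification accuracy.
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def dice_loss(p, y, smooth=1.0):
    # Dice loss: mitigates positive/negative sample imbalance and
    # rewards overlap between predicted and labeled boundary regions.
    inter = np.sum(p * y)
    return float(1 - (2 * inter + smooth) / (np.sum(p) + np.sum(y) + smooth))

def mixed_loss(p, y, w_bce=0.5, w_dice=0.5):
    # Weighted combination L = w_bce * L_BCE + w_dice * L_Dice.
    return w_bce * bce_loss(p, y) + w_dice * dice_loss(p, y)
```

For a perfect prediction the combined loss approaches 0, while inverted predictions are penalized by both terms, which is the behavior the early-stopping criterion in claim 3 monitors on the verification set.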