
CN-122023813-A - Medical ultrasonic image segmentation method based on fusion of geometric guidance and concept perception

CN 122023813 A

Abstract

The invention relates to the field of image processing and provides a medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception. A geometry-guided text generation module automatically generates high-quality medical text descriptions from connected component analysis and multidimensional geometric analysis, realizing the transition from text-free to text-guided segmentation and addressing the limited quantity, difficult preparation and high cost of high-quality text descriptions. A concept-aware cross-modal fusion module addresses insufficient cross-modal fusion through multi-scale spatial alignment and medical knowledge graph modeling: by explicitly modeling medical prior knowledge, domain expertise guides both the fusion process and the segmentation decision. An adaptive segmentation decoding module dynamically selects the optimal strategy according to feature complexity, improving efficiency while preserving accuracy.

Inventors

  • WAN HUAN
  • XU WUJIAN
  • KANG ZIYANG
  • WEI XIN

Assignees

  • Jiangxi Normal University (江西师范大学)

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. A medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception, characterized by comprising the following steps: collecting a target medical ultrasound image and preprocessing it, and inputting the preprocessed target ultrasound image into a medical ultrasound image segmentation model, wherein the model comprises a geometry-guided text generation module, a multi-modal feature encoding module, a concept-aware cross-modal fusion module and an adaptive segmentation decoding module; performing text generation with the geometry-guided text generation module to obtain a medical ultrasound text description, wherein the text generation comprises geometric feature extraction and adaptive classification; inputting the medical ultrasound text description and the target ultrasound image into the multi-modal feature encoding module to obtain ultrasound text features and ultrasound image features, wherein the multi-modal feature encoding module comprises a text encoder and an image encoder; inputting the ultrasound text features and ultrasound image features into the concept-aware cross-modal fusion module, performing feature enhancement with a multi-scale cross-modal attention mechanism to obtain multi-scale enhanced image features, and performing feature fusion with a concept-aware fusion mechanism to obtain concept-aware fusion features, wherein the multi-scale cross-modal attention mechanism is based on an adaptive attention mechanism and the concept-aware fusion mechanism is based on medical concept representations; and performing decoding with the adaptive segmentation decoding module to obtain the final segmentation result, wherein the adaptive segmentation decoding module is based on adaptive upsampling, feature pyramid refinement and context-aware prediction.
  2. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 1, wherein generating text with the geometry-guided text generation module to obtain a medical ultrasound text description specifically comprises: performing connected component analysis on the segmentation mask of the target medical ultrasound image to identify the geometric characteristics of the target region (the claim's formulas are not reproduced in this text), wherein the symbols denote: a single connected component; the area feature; the roundness feature; the perimeter of the component's outline; the aspect ratio feature; the minor-axis and major-axis lengths of the component's ellipse fit; the smoothness feature; the approximate and original contour points of the connected component; the compactness feature; the convex hull area of the component; the position feature; the mapping function; and the abscissa and ordinate of the centroid; performing target area calculation and adaptive classification according to the geometric features (formulas likewise not reproduced), wherein the symbols denote: the target area; the total area; the target area size ratio; the labels very small, small, moderate, large and very large for the five size classes; and the 20th, 40th, 60th and 80th percentile thresholds of the area ratio over the dataset; and generating a geometry-guided medical ultrasound text description according to the geometric features and the adaptive classification result, and performing medical-data fine-tuning of the BLIP model.
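The source does not reproduce the claim's formulas, but the quantities it glosses (roundness from area and perimeter, aspect ratio from the ellipse axes, compactness from the convex hull, percentile-based size classes) match standard shape descriptors. A minimal pure-Python sketch under that assumption; function names and thresholds are illustrative, not taken from the patent:

```python
import math

def roundness(area: float, perimeter: float) -> float:
    # Standard roundness of a connected component: 4*pi*A / P^2 (1.0 for a circle).
    return 4.0 * math.pi * area / (perimeter ** 2)

def aspect_ratio(minor_axis: float, major_axis: float) -> float:
    # Minor over major axis of the fitted ellipse; always <= 1.0.
    return minor_axis / major_axis

def compactness(area: float, convex_hull_area: float) -> float:
    # Solidity-style compactness: component area over its convex-hull area.
    return area / convex_hull_area

def size_class(target_area: float, total_area: float, thresholds) -> str:
    # thresholds = (p20, p40, p60, p80): percentiles of the area ratio
    # computed over the dataset, as described in the claim.
    ratio = target_area / total_area
    names = ["very small", "small", "moderate", "large"]
    for name, t in zip(names, thresholds):
        if ratio < t:
            return name
    return "very large"
```

Roundness is 1.0 for a perfect circle and falls toward 0 for elongated or ragged components; in practice the four percentile thresholds would be estimated once over the training dataset's area ratios.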
  3. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 2, wherein generating the geometry-guided medical ultrasound text description from the geometric features and adaptive classification results and fine-tuning the BLIP model on medical data specifically comprises: using the geometry-guided medical ultrasound text description as a supervisory signal, fine-tuning the BLIP model with a LoRA parameter fine-tuning strategy and jointly training on a plurality of medical datasets to obtain the fine-tuned BLIP model (the formulas are not reproduced in this text), wherein the symbols denote: the input text; the fine-tuning objective; the total number of training samples; the sample index; the conditional probability of generating the target output text; the output text of the BLIP model; and the i-th input image.
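The fine-tuning objective itself is not reproduced in the source, but its glossary (total number of training samples, sample index, conditional probability of the target output text given the input image) is consistent with a standard conditional language-modeling loss. The formula below is therefore a sketch under that assumption, not the patent's exact expression:

```latex
\mathcal{L}(\theta) \;=\; -\frac{1}{N}\sum_{i=1}^{N} \log p_{\theta}\!\left(T_i \mid I_i\right)
```

where N is the number of training samples, T_i the geometry-guided target description for the i-th sample, I_i the i-th input image, and theta the LoRA-adapted parameters of the BLIP model.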
  4. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 1, wherein inputting the medical ultrasound text description and the target ultrasound image into the multi-modal feature encoding module to obtain the ultrasound text features and ultrasound image features specifically comprises: the multi-modal feature encoding module comprises an image encoder and a text encoder; the image encoder extracts the ultrasound image features based on a pyramid structure (the formulas are not reproduced in this text), wherein the symbols denote: the ultrasound image features; the image encoder; the layer index; and the target ultrasound image; the text encoder extracts the ultrasound text features (formulas likewise not reproduced), wherein the symbols denote: the ultrasound text features; the pooling operation; the text encoder; the tokenization; and the medical ultrasound text description.
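As a toy illustration of the two encoders the claim describes, the sketch below builds an image feature pyramid by repeated 2x2 average pooling and pools a sequence of token embeddings into one text feature. This is pure Python for clarity; the actual encoders (a pyramid backbone and a pretrained text encoder) are of course far richer:

```python
def downsample2x(img):
    # One pyramid level: 2x2 average pooling over a 2D grid (even dims assumed).
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]

def image_pyramid(img, levels=3):
    # Multi-layer image features: each level halves the spatial resolution.
    feats = [img]
    for _ in range(levels - 1):
        feats.append(downsample2x(feats[-1]))
    return feats

def mean_pool(token_embeddings):
    # Pool a variable-length sequence of token vectors into one text feature.
    dim, n = len(token_embeddings[0]), len(token_embeddings)
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]
```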
  5. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 1, wherein performing feature enhancement with the multi-scale cross-modal attention mechanism to obtain multi-scale enhanced image features specifically comprises: projecting the ultrasound image features and ultrasound text features into the same hidden-dimension space (the formulas are not reproduced in this text), wherein the symbols denote: the ultrasound image features; the ultrasound text features; the projected ultrasound image features; the projected ultrasound text features; the activation function; the BatchNorm normalization; the convolution; the regularization; the LayerNorm normalization; and the linear transformation; adding a learnable 2D positional encoding to the projected ultrasound image features (formulas not reproduced), wherein the symbols denote: the position-encoded ultrasound image features; the learnable 2D positional encoding; and the feature height and width of the layer; computing the image-text cross-modal correlation with an adaptive attention mechanism (formulas not reproduced), wherein the symbols denote: the query, key and value features and the flattening operation; performing spatial-awareness enhancement with a spatial modulation mechanism (formulas not reproduced), wherein the symbols denote: the spatial weight; the sigmoid activation function; the convolution; the global average pooling; the broadcast multiplication; and the spatially enhanced image features; and performing multi-scale feature fusion to obtain the multi-scale enhanced image features (formulas not reproduced), wherein the symbols denote: the multi-scale enhanced image features and the concatenation operation.
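The two core operations just described reduce to scaled dot-product attention, where image positions query text tokens, followed by a sigmoid spatial gate broadcast-multiplied onto the attended features. The sketch below is a hedged illustration in pure Python; the patented mechanism is multi-scale and adds learned projections, positional encodings and normalization omitted here:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    # Each query (an image position) attends over the text tokens (keys/values).
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def spatial_gate(features, weights):
    # Spatial modulation: sigmoid per-position weights, broadcast-multiplied.
    sig = [1.0 / (1.0 + math.exp(-w)) for w in weights]
    return [[s * f for f in feat] for s, feat in zip(sig, features)]
```

With identical keys the attention weights are uniform, so the output is the mean of the value vectors, a quick sanity check on the mechanism.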
  6. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 1, wherein performing feature fusion with the concept-aware fusion mechanism to obtain concept-aware fusion features specifically comprises: acquiring the medical concept attention weights (the formulas are not reproduced in this text), wherein the symbols denote: the medical concept attention weights; the softmax function; a 100-dimensional linear transformation; the activation function; a 256-dimensional linear transformation; the LayerNorm normalization; and the ultrasound text features; in the graph convolution operation, each concept relation type corresponds to a unique learnable adjacency matrix so that features propagate through matrix multiplication and linear transformation; acquiring the relation weights with a dynamic relation attention mechanism (formulas not reproduced), wherein the symbols denote: the relation weights; a 5-dimensional linear transformation; and a 64-dimensional linear transformation; combining the relation weights by weighting to obtain the adjacency matrix, which is dynamically adjusted according to the content of the medical ultrasound text description, the corresponding concept relation types being activated by that description (formulas not reproduced), wherein the symbols denote: the adjacency matrix; the concept relation type; and the learnable adjacency matrix corresponding to each concept relation type; performing concept embedding enhancement with the medical knowledge graph and modeling concept relations through graph convolution (formulas not reproduced), wherein the symbols denote: the concept relation graph; the concept index; the medical concept attention weight of each concept; and the enhanced embedding of each concept; and flattening the ultrasound image features into sequence form, computing concept-guided image features through a multi-head attention mechanism, and fusing image-text-concept multi-modal information through an adaptive gating mechanism to obtain the concept-aware fusion features (formulas not reproduced), wherein the symbols denote: the flattened image features; the flattening operation; the image processing; the input image features; the concept-guided query features; the key and value features; the unsqueeze adding a size-1 dimension at index 1; the multi-head attention-enhanced image features; the multi-head attention weights; the multi-head attention mechanism; the global image features; the global average pooling; the global text features; the linear transformation; the image, text and concept fusion features; the softmax function; the fusion gate; the global fusion features; the concept-aware fusion features; the processed original input image features; the reshaped spatial attention; the expansion operation; and the feature height and width.
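The relation-weighted adjacency and graph-convolution steps of the claim can be sketched in pure Python as follows. The learnable matrices, linear transforms and nonlinearities of the actual module are replaced by fixed toy values here, purely as an assumption for illustration:

```python
def combine_adjacency(relation_weights, relation_adjacency):
    # A = sum_r w_r * A_r: mix the per-relation-type adjacency matrices
    # with weights predicted from the text (dynamic relation attention).
    n = len(relation_adjacency[0])
    A = [[0.0] * n for _ in range(n)]
    for w, Ar in zip(relation_weights, relation_adjacency):
        for i in range(n):
            for j in range(n):
                A[i][j] += w * Ar[i][j]
    return A

def graph_conv(A, X):
    # One propagation step over concept embeddings: H = A @ X
    # (the learnable linear transformation is omitted for brevity).
    n, d = len(X), len(X[0])
    return [[sum(A[i][k] * X[k][j] for k in range(n)) for j in range(d)]
            for i in range(n)]
```

Setting one relation weight to 1 and the rest to 0 recovers that relation's adjacency matrix, which is how text content "activates" a concept relation type.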
  7. The medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to claim 1, wherein performing decoding with the adaptive segmentation decoding module to obtain the final segmentation result specifically comprises: the adaptive segmentation decoding module is based on adaptive upsampling, feature pyramid refinement and context-aware prediction; the adaptive upsampling computes a feature complexity distribution with a lightweight network (the formulas are not reproduced in this text), wherein the symbols denote: the feature complexity distribution; the sigmoid activation function; the convolution; the activation function; the global average pooling; and the decoded input features; a path is selected according to the feature complexity, complex features being routed to a complex path and simple features to a simple path, and the outputs of the two paths are adaptively weighted, fused and refined to obtain adaptive upsampling refinement features (formulas not reproduced), wherein the symbols denote: the complex-path output features; the simple-path output features; the transposed convolution; the upsampling stride adjustment; the padding operation; the scaling operation; the bilinear interpolation; the adaptively weighted fusion features; the adaptive upsampling refinement features; the BatchNorm normalization; and the convolution; the feature pyramid refinement is based on three-scale feature refinement (formulas not reproduced), wherein the symbols denote: the first-, second- and third-scale refinement features; the convolution; the input fusion features; the two-fold adaptive upsampling; and the eight-fold adaptive upsampling; the context-aware prediction is based on context information and local detail features (formulas not reproduced), wherein the symbols denote: the context features; the local features; the fusion of context and local features; the auxiliary supervision outputs; and the final segmentation result.
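The complexity-gated decoding idea reduces to a scalar gate plus a convex combination of the two paths' outputs. A hedged pure-Python sketch; the real module computes a spatial complexity map and uses transposed convolution and bilinear interpolation rather than the toy paths assumed here:

```python
import math

def complexity_score(feature_map):
    # Lightweight gate: sigmoid of the global average activation,
    # standing in for the claim's global-pool + conv complexity estimator.
    flat = [v for row in feature_map for v in row]
    mean = sum(flat) / len(flat)
    return 1.0 / (1.0 + math.exp(-mean))

def adaptive_fuse(complex_out, simple_out, c):
    # Adaptive weighted fusion: the complex path dominates where the
    # complexity score c is high, the cheap simple path where it is low.
    return [[c * a + (1.0 - c) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(complex_out, simple_out)]
```

This is what lets the decoder keep detail at complex lesion boundaries while spending little computation on uniform background regions.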
  8. A medical ultrasound image segmentation system based on the fusion of geometric guidance and concept perception, comprising: a preprocessing module for acquiring and preprocessing a target medical ultrasound image and inputting the preprocessed target ultrasound image into a medical ultrasound image segmentation model, wherein the model comprises a geometry-guided text generation module, a multi-modal feature encoding module, a concept-aware cross-modal fusion module and an adaptive segmentation decoding module; the geometry-guided text generation module for performing text generation to obtain medical ultrasound text descriptions, wherein the text generation comprises geometric feature extraction and adaptive classification; the multi-modal feature encoding module for acquiring ultrasound text features and ultrasound image features, comprising a text encoder and an image encoder; the concept-aware cross-modal fusion module for performing feature enhancement with a multi-scale cross-modal attention mechanism to obtain multi-scale enhanced image features and performing feature fusion with the concept-aware fusion mechanism to obtain concept-aware fusion features, wherein the multi-scale cross-modal attention mechanism is based on an adaptive attention mechanism and the concept-aware fusion mechanism is based on medical concept representations; and the adaptive segmentation decoding module for performing decoding to obtain the final segmentation result, based on adaptive upsampling, feature pyramid refinement and context-aware prediction.
  9. A storage medium storing one or more programs which, when executed by a processor, implement the medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception as claimed in any one of claims 1 to 7.
  10. A computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program stored in the memory, implements the medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception according to any one of claims 1 to 7.

Description

Medical ultrasonic image segmentation method based on fusion of geometric guidance and concept perception

Technical Field

The invention relates to the field of image processing, and in particular to a medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception.

Background

Medical ultrasound imaging plays a key role in precision medicine and clinical decision-making. Radiology staff must manually segment abnormal regions from ultrasound images; this is not only time-consuming, but the quality of the segmentation also depends heavily on the operator's experience, so an automated ultrasound image segmentation model is an urgent need. In clinical practice, image features and text descriptions are usually combined for comprehensive judgment, which underlines the value of multi-modal information fusion. Vision-text segmentation methods have therefore been developed, aiming to guide visual feature learning by introducing text descriptions so as to improve segmentation precision. However, existing vision-text models face a fundamental challenge in medical image segmentation: unlike natural image datasets, medical datasets rarely provide a high-quality text description for every image, since the expertise and complexity of medical images make large-scale text annotation extremely difficult and costly. Although medical-specific models alleviate the problem to some extent, their performance at automatically generating high-quality text descriptions remains limited. More importantly, prior methods have significant defects in cross-modal fusion: (1) simple feature concatenation or attention mechanisms cannot effectively integrate the deep semantic information of images and texts; (2) the lack of modeling of medical prior knowledge means the fusion and segmentation processes cannot fully exploit medical domain expertise. Therefore, how to design a medical ultrasound image segmentation method that avoids the impact of missing text annotations in medical datasets and improves segmentation accuracy is a problem to be solved urgently.

Disclosure of Invention

On this basis, a geometry-guided text generation module automatically generates high-quality medical text descriptions from connected component analysis and multidimensional geometric analysis, realizing the transition from text-free to text-guided segmentation and addressing the limited quantity, difficult preparation and high cost of high-quality text descriptions. A concept-aware cross-modal fusion module addresses insufficient cross-modal fusion through multi-scale spatial alignment and medical knowledge graph modeling, so that explicit modeling of medical prior knowledge lets domain expertise guide the fusion process and segmentation decisions. An adaptive segmentation decoding module dynamically selects the optimal strategy according to feature complexity, improving efficiency while ensuring accuracy: it retains finer detail at the boundaries of complex lesions and markedly reduces computation in uniform background regions, thereby improving both the accuracy and the efficiency of medical ultrasound image segmentation.
The invention provides a medical ultrasound image segmentation method based on the fusion of geometric guidance and concept perception, comprising the following steps: collecting a target medical ultrasound image and preprocessing it, and inputting the preprocessed target ultrasound image into a medical ultrasound image segmentation model, wherein the model comprises a geometry-guided text generation module, a multi-modal feature encoding module, a concept-aware cross-modal fusion module and an adaptive segmentation decoding module; performing text generation with the geometry-guided text generation module to obtain a medical ultrasound text description, wherein the text generation comprises geometric feature extraction and adaptive classification; inputting the medical ultrasound text description and the target ultrasound image into the multi-modal feature encoding module to obtain ultrasound text features and ultrasound image features, wherein the multi-modal feature encoding module comprises a text encoder and an image encoder; inputting the ultrasound text features and ultrasound image features into the concept-aware cross-modal fusion module, performing feature enhancement with a multi-scale cross-modal attention mechanism to obtain multi-scale enhanced image features, and performing feature fusion according to the concept perception