CN-121998928-A - Three-dimensional lung nodule target detection method and system based on neural network


Abstract

The application discloses a neural-network-based three-dimensional lung nodule target detection method and system. Three-dimensional CT images of lung nodules are acquired and preprocessed to obtain a training set and a testing set; a target detection model, eFATE-Net, is constructed, trained and evaluated; and a three-dimensional CT image to be detected is preprocessed and input into the trained eFATE-Net to perform target detection. A false positive reduction module is introduced into eFATE-Net to reduce false positive detections of lung nodules while maintaining high detection sensitivity. Based on the fine-grained spatial features and coarse-grained semantic features of the lung nodules, global dependencies are modeled through a self-attention mechanism to reconstruct and strengthen the deep semantic features of the lung nodules, further reducing false positives. A gated attention mechanism is introduced to improve the selection of key features, enhance the model's response to the key features of lung nodules, and improve lung nodule detection performance.

Inventors

GUO JINHONG
ZOU YUANYUAN
CHENG JIE
Lei Qinyao
HAN JINGYI
SHI LU

Assignees

Qilu Hospital of Shandong University (山东大学齐鲁医院)

Dates

Publication Date: 20260508
Application Date: 20260120

Claims (10)

1. A neural-network-based three-dimensional lung nodule target detection method, characterized by comprising the following steps: acquiring three-dimensional CT images of lung nodules and preprocessing the images to obtain a training set and a testing set; constructing a target detection model eFATE-Net based on a hybrid 3D-CNN and Transformer architecture, and training and evaluating it using the training set and the testing set; and preprocessing a three-dimensional CT image to be detected, and inputting the preprocessed three-dimensional CT image into the trained eFATE-Net to perform target detection.
2. The neural network-based three-dimensional lung nodule target detection method of claim 1, wherein acquiring three-dimensional CT images of lung nodules and preprocessing the images to obtain a training set and a testing set comprises: acquiring three-dimensional CT images of lung nodules from a public dataset; eliminating CT images with a nodule diameter smaller than a preset value according to the follow-up of the slices and the spacing between the slices; and labeling the remaining CT images, dividing the labeled dataset into a plurality of mutually non-overlapping subsets, and dividing the dataset into a training set and a testing set according to a preset proportion.
3. The neural network-based three-dimensional lung nodule target detection method according to claim 1, wherein the eFATE-Net comprises a feature extraction module and a false positive reduction module, wherein after a three-dimensional CT image is input into the feature extraction module, the feature extraction module extracts main features of the three-dimensional CT image and outputs preliminary prediction probability and position of a candidate nodule, and the false positive reduction module is used for receiving output of the feature extraction module to reclassify the candidate nodule so as to distinguish a true positive nodule from a false positive nodule.
4. The neural network-based three-dimensional lung nodule target detection method according to claim 3, wherein the feature extraction module uses a U-shaped encoder-decoder as its basic framework and recovers the spatial information lost through downsampling by means of multi-scale skip connections; the encoder receives the preprocessed three-dimensional CT image, which first enters a convolution operation module formed by two groups of basic convolution units for primary feature extraction, with the first dimension reduction achieved by controlling the convolution stride, and then passes through four repeated basic residual convolution modules for feature extraction and downsampling, each basic residual convolution module being formed by two groups of basic convolution units with a residual connection, until the bottleneck layer outputs the downsampled feature map of the encoder; a gated attention module is introduced between each layer of feature maps of the encoder and the decoder to dynamically adjust channel weights, and a multi-head attention module is introduced between the encoder and the decoder to establish spatial attention relations between features in the high-level semantic space; finally, the output of the encoder is input to a region proposal head, which outputs preliminary prediction probabilities and locations of the nodule candidates on the fused multi-scale features.
5. The neural network-based three-dimensional lung nodule target detection method of claim 4, wherein the gated attention module comprises a global channel content embedding module, a channel normalization module, and a gated adaptive activation; the global channel content embedding module receives the feature map $x \in \mathbb{R}^{C \times D \times H \times W}$ and calculates, for each channel, the $\ell_2$ norm, which is used for extracting the global information of each channel feature map, and a trainable parameter $\alpha_c$ is used to control the weight of each channel; the feature vector after the global channel content is embedded is defined as:

$$s_c = \alpha_c \lVert x_c \rVert_2 = \alpha_c \left( \sum_{i=1}^{D} \sum_{j=1}^{H} \sum_{k=1}^{W} \left( x_c^{i,j,k} \right)^2 + \epsilon \right)^{1/2}$$

wherein $C$, $D$, $H$, $W$ represent the number of channels, depth, height and width of the feature map, $c$ represents the index of the channel, $x_c$ is the feature map of the $c$-th channel, $x_c^{i,j,k}$ represents the value of the $c$-th channel feature map at position $(i, j, k)$, $\epsilon$ is an extremely small constant that prevents numerical instability, and $\lVert \cdot \rVert_2$ represents the $\ell_2$ norm; the channel normalization module then performs channel normalization on the feature vector after the global content is embedded:

$$\hat{s}_c = \frac{\sqrt{C}\, s_c}{\lVert s \rVert_2} = \frac{\sqrt{C}\, s_c}{\left( \sum_{c=1}^{C} s_c^2 + \epsilon \right)^{1/2}}$$

wherein $s = [s_1, \dots, s_C]$ is the global feature vector before normalization, $s_c$ represents the value corresponding to the $c$-th channel, and the scalar $\sqrt{C}$ normalizes the scale of $\hat{s}_c$ so that $\hat{s}_c$ does not become too small when the number of channels $C$ is large; finally, the gated adaptive activation selectively adjusts the activation intensity of the different channel feature maps, and the attention weights of the different channel features are obtained through the learnable gating weight $\gamma_c$ and gating bias $\beta_c$, via competition and cooperation among different features during training:

$$\hat{x}_c = x_c \left[ 1 + \tanh\left( \gamma_c \hat{s}_c + \beta_c \right) \right]$$

wherein $\hat{x}_c$ represents the attention-modulated output feature map of the $c$-th channel and $\tanh$ is the hyperbolic tangent function; when the gating weight $\gamma_c$ is positively activated, the importance of the channel's feature map is amplified, and otherwise it is suppressed.
6. The neural network-based three-dimensional lung nodule target detection method according to claim 4 or 5, wherein a multi-scale skip connection module is introduced between the encoder and the decoder to enrich the multi-scale information of the concatenated feature map, and the multi-scale skip connection module applies 3D convolution, upsampling and downsampling operations respectively to feature maps of different scales to align and fuse context information from adjacent scales, so as to generate a multi-scale fused feature map with stronger expressive capability.
7. The neural network-based three-dimensional lung nodule target detection method of claim 4, wherein the false positive reduction module takes the 4-fold downsampled feature block of the encoder as input, and extracts fine-grained spatial features and deep semantic features along two parallel branches respectively; the first branch uses 3D ResNet50 as its basic model framework, positive sample pairs are constructed through the data augmentation operations of rotation, horizontal flipping and histogram intensity shift, negative sample pairs are constructed by random sampling, and self-supervised training is completed by maximizing the feature similarity between positive sample pairs and minimizing the feature similarity between negative sample pairs; the multi-scale features output by the two branches are further integrated by the feature fusion module, and the confidence of the candidate nodules is re-scored by the re-scoring head so as to identify false positive nodules more accurately.
8. The neural network-based three-dimensional lung nodule target detection method of claim 7, wherein the feature fusion module comprises two processing branches, a first processing branch receiving the deep semantic features $F_d$ output by the first branch and a second processing branch receiving the shallow fine-grained features $F_s$ output by the second branch; $F_d$ and $F_s$ are respectively reduced to 256 dimensions and embedded with learned positions to obtain $\tilde{F}_d$ and $\tilde{F}_s$; the global dependency relationship among the features is calculated through self-attention so as to reconstruct the semantic features of the lung nodule image, and the calculation formula is as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{Q K^{\top}}{\sqrt{d}} \right) V$$

wherein $Q$, $K$ and $V$ are the 3 attention matrices obtained by combining the features $\tilde{F}_d$ and $\tilde{F}_s$, and $d$ is the vector dimension, used as a scaling factor to smooth the output of softmax.
9. The neural network-based three-dimensional lung nodule target detection method of claim 7, wherein the region proposal head is composed of a plurality of 3D convolution layers of convolution kernel size 1 x 1 for predicting the confidence score and location regression parameters of the nodule candidates; the re-scoring head is composed of two parallel fully connected layers, takes the fused features of the candidate nodules as input, and re-scores the candidate nodules so as to reduce false positive detections; different loss functions are used to train the region proposal head and the re-scoring head; the loss function of the region proposal head, $L_{RPN}$, includes a classification loss $L_{cls}$ and a regression loss $L_{reg}$, and is as follows:

$$L_{RPN} = \frac{1}{N_{cls}} \sum_{i} L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_{i} L_{reg}(t_i, t_i^*)$$

wherein $i$ is the index of the anchor box, $N_{cls}$ and $N_{reg}$ are the total numbers of anchor boxes for classification and regression respectively, $L_{cls}$ is a weighted cross-entropy loss used to balance the imbalance between positive and negative samples, $L_{reg}$ is the regression loss measuring the geometric error between the prediction box and the ground-truth box, the parameter $\lambda$ balances the contributions of the classification and regression losses, $p_i$ represents the probability predicted by the model that a nodule candidate is present in the $i$-th anchor box, $p_i^*$ is the corresponding ground-truth label, and $t_i$ and $t_i^*$ are the relative offsets of the prediction box and the ground-truth box with respect to the $i$-th anchor box, respectively defined as:

$$t_i = \left( \frac{x - x_a}{d_a}, \frac{y - y_a}{d_a}, \frac{z - z_a}{d_a}, \log\frac{d}{d_a} \right), \qquad t_i^* = \left( \frac{x^* - x_a}{d_a}, \frac{y^* - y_a}{d_a}, \frac{z^* - z_a}{d_a}, \log\frac{d^*}{d_a} \right)$$

wherein $(x, y, z, d)$ represent the spatial location and the side length of the nodule prediction box respectively, $(x^*, y^*, z^*, d^*)$ represent the spatial location and the side length of the nodule ground-truth box respectively, and $(x_a, y_a, z_a, d_a)$ are the position and size parameters of the anchor box; the loss function of the re-scoring head, $L_{rescore}$, is defined as:

$$L_{rescore} = \frac{1}{M_{cls}} \sum_{j} L_{cls}(p_j, p_j^*) + \lambda \frac{1}{M_{reg}} \sum_{j} L_{reg}(t_j, t_j^*)$$

wherein $j$ is the index of the nodule candidate, $M_{cls}$ and $M_{reg}$ are the numbers of nodule candidates, $p_j$ represents the probability predicted by the model that a nodule candidate is present in the $j$-th proposal box, $p_j^*$ is the corresponding ground-truth label, and $t_j$ and $t_j^*$ represent the relative offsets of the prediction box and the ground-truth box with respect to the $j$-th proposal box; in the training process of the region proposal head, hard samples are selected as training samples to guide the model to focus on the training samples that are more difficult to distinguish, and in the training process of the re-scoring head, a partially random sample training strategy is introduced to enrich the diversity of the training samples of the false positive reduction module so as to increase the generalization capability of the re-scoring head under different sample distributions.
10. A neural network-based three-dimensional lung nodule target detection system, comprising: an acquisition module, used for acquiring three-dimensional CT images of lung nodules and preprocessing the images to obtain a training set and a testing set; a model construction module, used for constructing a target detection model eFATE-Net based on a hybrid 3D-CNN and Transformer architecture, and training and evaluating it using the training set and the testing set; and a target detection module, used for preprocessing a three-dimensional CT image to be detected and inputting the preprocessed three-dimensional CT image into the trained eFATE-Net to perform target detection.
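
As a non-limiting illustration of the gated attention module recited in claim 5, the three stages named there (global channel content embedding, channel normalization, and gated adaptive activation) may be sketched in NumPy as follows. The sketch assumes an L2 norm for the embedding and a tanh gate of the form 1 + tanh(gamma * s_hat + beta); the function name, the parameter names `alpha`, `gamma`, `beta`, and the value of `eps` are assumptions made for illustration and are not taken from the application.

```python
import numpy as np

def gated_channel_attention(x, alpha, gamma, beta, eps=1e-5):
    """Apply a channel gate to a feature map x of shape (C, D, H, W).

    alpha, gamma, beta are per-channel vectors of shape (C,).
    All names and the eps value are illustrative assumptions.
    """
    num_channels = x.shape[0]
    # Global channel content embedding: weighted L2 norm of each channel.
    s = alpha * np.sqrt(np.sum(x ** 2, axis=(1, 2, 3)) + eps)
    # Channel normalization: rescale by sqrt(C) over the norm across channels.
    s_hat = np.sqrt(num_channels) * s / np.sqrt(np.sum(s ** 2) + eps)
    # Gated adaptive activation: the gate lies in (0, 2); 1 means unchanged.
    gate = 1.0 + np.tanh(gamma * s_hat + beta)
    return x * gate[:, None, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8, 8))

# With zero gating weight and bias the module reduces to the identity.
y_identity = gated_channel_attention(x, np.ones(4), np.zeros(4), np.zeros(4))

# With positive gating weights every channel is amplified (gate > 1).
y_gated = gated_channel_attention(x, np.ones(4), np.ones(4), np.zeros(4))
```

Initializing the gating weight and bias to zero makes the module start as an identity mapping, so in this sketch it could be added between encoder and decoder feature maps without changing their initial outputs.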

Description

Three-dimensional lung nodule target detection method and system based on neural network

Technical Field

The application relates to the technical field of image processing, in particular to a three-dimensional lung nodule target detection method and system based on a neural network.

Background

Lung cancer is one of the malignant tumors with the highest morbidity and mortality worldwide. Its high mortality is mainly caused by delayed diagnosis, so timely detection of and intervention in early lung cancer are important for improving patient survival. Early lung cancer typically presents as round or irregularly shaped pulmonary nodules. Computed tomography (CT) is an effective means of lung nodule detection and lung cancer screening, but analyzing the CT results of a single patient often requires a highly experienced physician to spend a significant amount of time reviewing hundreds of images, and the presence of small nodules and morphologically irregular nodules further increases the difficulty of detection. A computer-aided detection system may be used to mark candidate suspicious nodules automatically and quickly, assisting the physician and further improving the screening efficiency and detection rate of pulmonary nodules. With the development of deep learning, target detection algorithms based on convolutional neural networks (CNNs) have been effectively applied to the lung nodule detection task. In recent years, foundation models trained by self-supervised learning on large amounts of unlabeled data have achieved great success in natural language processing and computer vision, and can be effectively applied to various downstream tasks.
In the lung nodule target detection task, traditional target detection algorithms generally use a neural network based on a two-dimensional convolution structure: the slices of the original CT image are processed layer by layer, and the features or detection results are then fused or stacked to generate a three-dimensional target detection output. Mainstream algorithm frameworks can be divided into two-stage target detection models, represented by the R-CNN series, and one-stage target detection models, represented by the YOLO series. Most current lung nodule target detection methods still rely on CNN architectures trained from scratch and lack a scheme for fully utilizing the deep semantic information and general image features contained in foundation models, which makes it difficult for subsequent false positive nodule screening to acquire sufficiently rich contextual priors. Therefore, how to effectively introduce the feature mining capability of medical image foundation models into a lung nodule target detection system has become a key challenge for improving the overall performance of the system and reducing the false alarm rate.

Disclosure of Invention

In order to solve the above technical problems, the application provides the following technical solutions:

In a first aspect, an embodiment of the present application provides a neural-network-based three-dimensional lung nodule target detection method, including: acquiring three-dimensional CT images of lung nodules and preprocessing the images to obtain a training set and a testing set; constructing a target detection model eFATE-Net based on a hybrid 3D-CNN and Transformer architecture, and training and evaluating it using the training set and the testing set; and preprocessing a three-dimensional CT image to be detected, and inputting the preprocessed three-dimensional CT image into the trained eFATE-Net to perform target detection.
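
As a non-limiting illustration of the first step above, the division of labeled scans into a training set and a testing set may be sketched as follows. The sketch assumes the split is made at the scan level by dealing shuffled scan identifiers into disjoint subsets and reserving some subsets for testing; the function names, the number of subsets, the 8:2 proportion and the scan identifiers are hypothetical and are not taken from the application.

```python
import random

def split_into_subsets(scan_ids, num_subsets, seed=0):
    """Shuffle scan IDs and deal them into mutually non-overlapping subsets."""
    ids = list(scan_ids)
    random.Random(seed).shuffle(ids)
    # Every ID lands in exactly one subset, so the subsets cannot overlap.
    return [ids[i::num_subsets] for i in range(num_subsets)]

def train_test_from_subsets(subsets, num_test_subsets):
    """Reserve the first num_test_subsets subsets for testing."""
    test = [s for subset in subsets[:num_test_subsets] for s in subset]
    train = [s for subset in subsets[num_test_subsets:] for s in subset]
    return train, test

# Hypothetical scan identifiers standing in for labeled CT volumes.
scan_ids = [f"scan_{i:03d}" for i in range(50)]
subsets = split_into_subsets(scan_ids, num_subsets=10)
train_ids, test_ids = train_test_from_subsets(subsets, num_test_subsets=2)
```

Splitting by scan rather than by nodule keeps all nodules of one patient volume on the same side of the split, which is one way to avoid leakage between the two sets.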
In one possible implementation, acquiring three-dimensional CT images of lung nodules and preprocessing the images to obtain a training set and a testing set includes: acquiring three-dimensional CT images of lung nodules from a public dataset; eliminating CT images with a nodule diameter smaller than a preset value according to the follow-up of the slices and the spacing between the slices; and labeling the remaining CT images, dividing the labeled dataset into a plurality of mutually non-overlapping subsets, and dividing the dataset into a training set and a testing set according to a preset proportion.

In one possible implementation, the eFATE-Net comprises a feature extraction module and a false positive reduction module, wherein after a three-dimensional CT image is input into the feature extraction module, the feature extraction module extracts the main features of the three-dimensional CT image and outputs the preliminary prediction probability and position of each candidate nodule, and the false positive reduction module receives the output of the feature extraction module and re-scores the candidate nodules so as to distinguish true positive nodules from false positive nodules.

In one possible implementation manner, the feature extraction module uses a U-shaped encoder-decoder as