CN-115861998-B - Training method, device and equipment applied to pre-training model of three-dimensional image

Abstract

The embodiments of the application disclose a training method, device and equipment for a pre-training model applied to three-dimensional images, belonging to the technical field of artificial intelligence. The method repeats the following steps until a preset condition is reached: determining a first feature vector of a mask image to be processed based on a first model to be trained; determining a second feature vector of the three-dimensional image to be processed; obtaining third feature vectors of other three-dimensional images; and updating parameters of the first model to be trained according to the first feature vector, the second feature vector and the third feature vectors. After the preset condition is reached, the first model to be trained is used to extract feature maps of three-dimensional images. The first feature vector and the second feature vector of the same three-dimensional image to be processed form a positive sample pair, the first feature vector of the three-dimensional image to be processed and the third feature vector of each other three-dimensional image form a negative sample pair, and the parameters of the first model to be trained are updated from these pairs. The feature maps of three-dimensional images extracted by the resulting model are accurate.

Inventors

  • LI ANWEI

Assignees

  • 广州视源电子科技股份有限公司
  • 广州视源人工智能创新研究院有限公司

Dates

Publication Date
20260512
Application Date
20210923

Claims (20)

  1. A training method for a pre-training model applied to a three-dimensional image, the method comprising: acquiring a three-dimensional image to be processed, a mask image to be processed corresponding to the three-dimensional image to be processed, and at least one other three-dimensional image different from the three-dimensional image to be processed, and repeating the following steps until a preset condition is reached: determining a first feature vector of the mask image to be processed and a second feature vector of the three-dimensional image to be processed based on a first model to be trained, and acquiring a third feature vector of each other three-dimensional image, wherein the third feature vector characterizes global features of the other three-dimensional image; updating parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector; wherein the first model to be trained, after the preset condition is reached, is used for extracting a feature map of a three-dimensional image to be analyzed; and wherein determining the first feature vector of the mask image to be processed based on the first model to be trained comprises: determining a feature map of the mask image to be processed based on the first model to be trained, and determining a feature map of a local image corresponding to the three-dimensional image to be processed based on the first model to be trained, wherein the mask image to be processed is obtained according to the three-dimensional image to be processed and the local image, and the size of the mask image to be processed is the same as that of the three-dimensional image to be processed; performing mapping transformation processing on the feature map of the mask image to be processed based on a first multi-layer perceptron network model to obtain a first sub-vector, and performing mapping transformation processing on the feature map of the local image based on a second multi-layer perceptron network model to obtain a second sub-vector; and determining the first feature vector according to the first sub-vector and the second sub-vector.
  2. The method of claim 1, wherein updating parameters of the first model to be trained based on the first feature vector, the second feature vector, and each of the third feature vectors comprises: determining a first calculation parameter according to the first feature vector, the second feature vector and a preset hyperparameter coefficient, and determining a second calculation parameter according to the first feature vector, each third feature vector and the preset hyperparameter coefficient; establishing a decision function according to the first calculation parameter and the second calculation parameter; and performing gradient back-propagation on the decision function so as to update parameters of the first model to be trained.
  3. The method of claim 1, wherein the second feature vector is obtained by processing the three-dimensional image to be processed based on a second model to be trained, and wherein, after updating parameters of the first model to be trained based on the first feature vector, the second feature vector, and each of the third feature vectors, the method further comprises: updating parameters of the second model to be trained according to the updated first model to be trained; wherein the second model to be trained, after the preset condition is reached, is used for extracting a feature map of the three-dimensional image to be analyzed.
  4. The method of claim 3, wherein updating parameters of the second model to be trained based on the updated first model to be trained comprises: updating the parameters of the second model to be trained according to the updated first model to be trained, the original parameters of the second model to be trained and a preset update momentum.
  5. The method of claim 1, wherein acquiring a three-dimensional image to be processed and a mask image to be processed corresponding to the three-dimensional image to be processed comprises: acquiring the three-dimensional image to be processed, and preprocessing the three-dimensional image to be processed to obtain a local image corresponding to the three-dimensional image to be processed; and determining the mask image to be processed based on the three-dimensional image to be processed and the local image, wherein the size of the mask image to be processed is the same as the size of the three-dimensional image to be processed.
  6. The method of claim 5, wherein preprocessing the three-dimensional image to be processed to obtain a local image corresponding to the three-dimensional image to be processed comprises: performing segmentation processing on the three-dimensional image to be processed to obtain a plurality of segmented images; determining the variance of each segmented image, and removing segmented images whose variance is smaller than a preset threshold to obtain filtered segmented images; and randomly selecting one of the filtered segmented images as the local image.
  7. The method of claim 6, wherein a size of the first feature vector is equal to or greater than a total number of the plurality of segmented images.
  8. The method of claim 1, wherein determining the first feature vector from the first sub-vector and the second sub-vector comprises: performing restoration mapping projective transformation processing on the first sub-vector and the second sub-vector based on a third multi-layer perceptron network model to obtain the first feature vector; or performing feature stitching processing on the first sub-vector and the second sub-vector to obtain the first feature vector.
  9. The method according to any one of claims 1-8, wherein determining a second feature vector of the three-dimensional image to be processed comprises: performing feature extraction processing on the three-dimensional image to be processed based on a second model to be trained to obtain a second feature map; and performing mapping transformation processing on the second feature map based on a fourth multi-layer perceptron network model to obtain the second feature vector.
  10. The method according to any one of claims 1-8, wherein the first model to be trained is a convolutional network model or a deep learning model, and the second model to be trained for processing the three-dimensional image to be processed is a convolutional network model or a deep learning model.
  11. A method for extracting a feature map of a three-dimensional image, the method comprising: acquiring a three-dimensional image to be analyzed, and inputting the three-dimensional image to be analyzed into a feature extraction model to obtain a feature map of the three-dimensional image to be analyzed, wherein the feature extraction model is the first model to be trained after the preset condition is reached in the method according to any one of claims 1-10; and determining a recognition result of the three-dimensional image to be analyzed according to the feature map of the three-dimensional image to be analyzed.
  12. The method of claim 11, wherein determining the recognition result of the three-dimensional image to be analyzed according to the feature map of the three-dimensional image to be analyzed comprises: determining a feature vector of the three-dimensional image to be analyzed according to the feature map of the three-dimensional image to be analyzed; and inputting the feature vector of the three-dimensional image to be analyzed into a preset recognition model to obtain the recognition result of the three-dimensional image to be analyzed; wherein the recognition result is any one of the category of the three-dimensional image to be analyzed, the characteristic of the three-dimensional image to be analyzed, and the structure segmentation result of the three-dimensional image to be analyzed.
  13. A training device for a pre-training model applied to a three-dimensional image, the device comprising: a first acquisition unit, configured to acquire a three-dimensional image to be processed and a mask image to be processed corresponding to the three-dimensional image to be processed; a second acquisition unit, configured to acquire at least one other three-dimensional image different from the three-dimensional image to be processed; an execution unit, configured to repeatedly invoke the following units until a preset condition is reached: a first determining unit, configured to determine a first feature vector of the mask image to be processed based on a first model to be trained; a second determining unit, configured to determine a second feature vector of the three-dimensional image to be processed; a third acquisition unit, configured to acquire a third feature vector of each of the other three-dimensional images, wherein the third feature vector characterizes global features of the other three-dimensional image; and a first updating unit, configured to update parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector; wherein the first model to be trained, after the preset condition is reached, is used for extracting a feature map of a three-dimensional image to be analyzed; and wherein the first determining unit comprises: a third determining module, configured to determine a feature map of the mask image to be processed based on the first model to be trained, and to determine a feature map of a local image corresponding to the three-dimensional image to be processed based on the first model to be trained, wherein the mask image to be processed is obtained according to the three-dimensional image to be processed and the local image, and the size of the mask image to be processed is the same as that of the three-dimensional image to be processed; a third processing module, configured to perform mapping transformation processing on the feature map of the mask image to be processed based on a first multi-layer perceptron network model to obtain a first sub-vector, and to perform mapping transformation processing on the feature map of the local image based on a second multi-layer perceptron network model to obtain a second sub-vector; and a fourth determining module, configured to determine the first feature vector according to the first sub-vector and the second sub-vector.
  14. The device of claim 13, wherein the first updating unit comprises: a first determining module, configured to determine a first calculation parameter according to the first feature vector, the second feature vector and a preset hyperparameter coefficient, and to determine a second calculation parameter according to the first feature vector, each third feature vector and the preset hyperparameter coefficient; an establishing module, configured to establish a decision function according to the first calculation parameter and the second calculation parameter; and an updating module, configured to perform gradient back-propagation on the decision function so as to update the parameters of the first model to be trained.
  15. The device of claim 13, wherein the second feature vector is obtained by processing the three-dimensional image to be processed based on a second model to be trained, the device further comprising: a second updating unit, configured to update the parameters of the second model to be trained according to the updated first model to be trained, after the first updating unit updates the parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector; wherein the second model to be trained, after the preset condition is reached, is used for extracting a feature map of the three-dimensional image to be analyzed.
  16. The device of claim 15, wherein the second updating unit is specifically configured to: update the parameters of the second model to be trained according to the updated first model to be trained, the original parameters of the second model to be trained and a preset update momentum.
  17. The device of claim 13, wherein the first acquisition unit comprises: an acquisition module, configured to acquire the three-dimensional image to be processed and to preprocess the three-dimensional image to be processed to obtain a local image corresponding to the three-dimensional image to be processed; and a second determining module, configured to determine the mask image to be processed based on the three-dimensional image to be processed and the local image, wherein the size of the mask image to be processed is the same as that of the three-dimensional image to be processed.
  18. The device of claim 17, wherein the acquisition module is specifically configured to: perform segmentation processing on the three-dimensional image to be processed to obtain a plurality of segmented images; determine the variance of each segmented image, and remove segmented images whose variance is smaller than a preset threshold to obtain filtered segmented images; and randomly select one of the filtered segmented images as the local image.
  19. The device of claim 18, wherein a size of the first feature vector is equal to or greater than a total number of the plurality of segmented images.
  20. The device of claim 13, wherein the fourth determining module is specifically configured to: perform restoration mapping projective transformation processing on the first sub-vector and the second sub-vector based on a third multi-layer perceptron network model to obtain the first feature vector; or perform feature stitching processing on the first sub-vector and the second sub-vector to obtain the first feature vector.
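
The preprocessing recited in claims 5 and 6 (segment the volume, drop low-variance segments, pick one at random as the local image, build a mask image of the same size) can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the non-overlapping cubic blocks, the nested-list volume, and in particular the rule of overwriting the chosen block with a constant fill value are all assumptions made for illustration. The claims only state that the mask image is derived from the three-dimensional image and the local image and keeps the original size.

```python
import random
from statistics import pvariance

def split_blocks(volume, block):
    """Yield (origin, voxels) for non-overlapping cubic blocks of edge `block` (assumed layout)."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    for z in range(0, d - block + 1, block):
        for y in range(0, h - block + 1, block):
            for x in range(0, w - block + 1, block):
                voxels = [volume[z + i][y + j][x + k]
                          for i in range(block)
                          for j in range(block)
                          for k in range(block)]
                yield (z, y, x), voxels

def pick_local_block(volume, block, threshold, rng):
    """Remove blocks whose variance is below `threshold`, then pick one survivor at random."""
    kept = [origin for origin, voxels in split_blocks(volume, block)
            if pvariance(voxels) >= threshold]
    if not kept:
        raise ValueError("no block passed the variance filter")
    return rng.choice(kept)

def make_mask_image(volume, origin, block, fill=0.0):
    """Copy `volume` and overwrite the chosen block with `fill` (hypothetical masking rule)."""
    z0, y0, x0 = origin
    masked = [[row[:] for row in plane] for plane in volume]
    for i in range(block):
        for j in range(block):
            for k in range(block):
                masked[z0 + i][y0 + j][x0 + k] = fill
    return masked

# Toy 4x4x4 volume: only the block at origin (0, 0, 0) has varying intensities.
volume = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
volume[0][0][0] = 1.0
volume[1][1][1] = 2.0
origin = pick_local_block(volume, block=2, threshold=0.01, rng=random.Random(0))
mask_image = make_mask_image(volume, origin, block=2)
```

The variance filter keeps the random choice away from near-constant regions such as empty background, and the mask image keeps the full spatial extent of the input.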

Description

Training method, device and equipment applied to pre-training model of three-dimensional image

Technical Field

The application relates to the technical field of artificial intelligence, and in particular to a training method, device and equipment for a pre-training model applied to three-dimensional images.

Background

With the development of artificial intelligence technology, three-dimensional images can be recognized. For example, three-dimensional medical images are recognized to obtain the image information, recognition results and the like that they contain.

In the prior art, three-dimensional images can be processed in a self-supervised manner to obtain a trained self-supervised model, which can be used to determine the feature map of a three-dimensional image. The three-dimensional images used in training are obtained by applying different data augmentations to each three-dimensional image. The subsequent recognition of a three-dimensional image is then carried out based on its feature map.

However, three-dimensional images of different objects under the same target object are very similar, and the images obtained by augmenting each three-dimensional image remain very similar. When each three-dimensional image is merely augmented in different ways and the augmented images are fed directly into self-supervised learning, a good self-supervised model cannot be obtained. The feature maps produced by the resulting model are inaccurate and cannot be used in the subsequent image recognition process.

Disclosure of Invention

The embodiments of the application provide a training method, device and equipment for a pre-training model applied to three-dimensional images. They address the problem that, in existing self-supervised training, a good self-supervised model cannot be obtained from augmented three-dimensional images, so that the feature maps produced by the resulting model are inaccurate. The technical scheme is as follows:

In a first aspect, an embodiment of the present application provides a training method for a pre-training model applied to a three-dimensional image, the method including: acquiring a three-dimensional image to be processed, a mask image to be processed corresponding to the three-dimensional image to be processed, and at least one other three-dimensional image different from the three-dimensional image to be processed, and repeating the following steps until a preset condition is reached: determining a first feature vector of the mask image to be processed and a second feature vector of the three-dimensional image to be processed based on a first model to be trained, and acquiring a third feature vector of each other three-dimensional image, wherein the third feature vector characterizes global features of the other three-dimensional image; and updating parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector. The first model to be trained, after the preset condition is reached, is used for extracting the feature map of a three-dimensional image to be analyzed.
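
As an illustration of how the first, second and third feature vectors could drive the parameter update, the sketch below computes a contrastive loss in which (first, second) is the positive pair and (first, third) are the negative pairs, as the abstract describes. The InfoNCE-style formula, cosine similarity, the `temperature` argument (standing in for the preset hyperparameter coefficient of claim 2) and the function names are assumptions; this excerpt does not give the exact decision function. The `momentum_update` helper likewise shows only one common reading of the momentum-based update of the second model in claims 3 and 4.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(first, second, thirds, temperature=0.07):
    """Assumed InfoNCE-style loss: low when `first` matches `second` and differs from `thirds`."""
    # Positive term from the (first, second) pair.
    pos = math.exp(cosine(first, second) / temperature)
    # Negative terms from each (first, third) pair.
    neg = sum(math.exp(cosine(first, t) / temperature) for t in thirds)
    return -math.log(pos / (pos + neg))

def momentum_update(second_params, first_params, momentum=0.99):
    """Assumed rule: move each second-model parameter slowly toward the first model's value."""
    return [momentum * s + (1.0 - momentum) * f
            for s, f in zip(second_params, first_params)]

# A matching positive pair gives a small loss; a mismatched one gives a large loss.
low = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
high = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a real training loop the loss would be computed on tensors in an automatic-differentiation framework so that gradients can flow back to the first model; plain Python lists are used here only to keep the arithmetic visible.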
In a possible implementation, updating parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector includes: determining a first calculation parameter according to the first feature vector, the second feature vector and a preset hyperparameter coefficient, and determining a second calculation parameter according to the first feature vector, each third feature vector and the preset hyperparameter coefficient; establishing a decision function according to the first calculation parameter and the second calculation parameter; and performing gradient back-propagation on the decision function so as to update parameters of the first model to be trained.

In a possible implementation, the second feature vector is obtained by processing the three-dimensional image to be processed based on a second model to be trained, and after updating parameters of the first model to be trained according to the first feature vector, the second feature vector and each third feature vector, the method further includes: updating parameters of the second model to be trained according to the updated first model to be trained. The second model to be trained, after the preset condition is reached, is used for extracting a feature map of the three-dimensional image to be analyzed.

In a possible implementation manner, updating parameters