JP-7856586-B2 - Method for generating a trained model, machine learning system, and program

Inventors

  • 工藤 彰

Assignees

  • 富士フイルム株式会社

Dates

Publication Date
2026-05-11
Application Date
2022-01-21
Priority Date
2021-01-27

Claims (13)

  1. A method for generating a trained model that converts the domain of an input medical image and outputs a generated image of a different domain, using a learning model having the structure of a generative adversarial network that includes: a first generator, configured using a first convolutional neural network, that accepts input of a medical image of a first domain and outputs a first generated image of a second domain different from the first domain; and a first discriminator, configured using a second convolutional neural network, that accepts input of data including first image data, which is either the first generated image generated by the first generator or a medical image of the second domain included in the training data, and coordinate information indicating, for each of the plurality of unit elements constituting the first image data, coordinate values in a human body coordinate system, which is an absolute coordinate system defined based on the anatomical positions of parts of the human body, and that determines the authenticity of the input image; the method comprising, by a computer: acquiring a plurality of training data sets including medical images of the first domain and medical images of the second domain; and performing a learning process that trains the first generator and the first discriminator adversarially based on the plurality of training data. A method for generating a trained model.
  2. When the first generated image is input to the first discriminator, the coordinate information corresponding to the first generated image indicates the coordinate values of the human body coordinate system determined according to the positions of each of the plurality of unit elements constituting the medical image of the first domain that was the source image input to the first generator when the first generated image was generated. A method for generating a trained model according to claim 1.
  3. The first image data is three-dimensional data; the coordinate information includes x-coordinate information, y-coordinate information, and z-coordinate information specifying the position in three-dimensional space of each voxel, which is the unit element; and the x-coordinate information, y-coordinate information, and z-coordinate information are treated as channels, combined with the channels of the first image data or of a feature map of the first image data, and provided to the first discriminator. A method for generating a trained model according to claim 1 or 2.
  4. Each medical image used as training data is associated with coordinate information indicating the position of each unit element within the image. A method for generating a trained model according to any one of claims 1 to 3.
  5. The method includes the computer generating, for each medical image used as training data, coordinate information corresponding to each of the unit elements in the image. A method for generating a trained model according to claim 4.
  6. The second convolutional neural network includes an input layer, an intermediate layer, and an output layer, the intermediate layer including a plurality of layers that perform convolution processing, and the coordinate information is input to one of the layers of the intermediate layer of the second convolutional neural network. A method for generating a trained model according to any one of claims 1 to 5.
  7. The learning model further includes: a second generator, configured using a third convolutional neural network, that accepts input of a medical image of the second domain and outputs a second generated image of the first domain; and a second discriminator, configured using a fourth convolutional neural network, that accepts input of data including second image data, which is either the second generated image generated by the second generator or a medical image of the first domain included in the training data, and coordinate information of the human body coordinate system indicating coordinate values corresponding to the positions of each of the plurality of unit elements constituting the second image data, and that determines the authenticity of the input image; and the learning process includes a process of training the second generator and the second discriminator adversarially. A method for generating a trained model according to any one of claims 1 to 6.
  8. When the second generated image is input to the second discriminator, the coordinate information corresponding to the second generated image indicates the coordinate values of the human body coordinate system determined according to the positions of each of the plurality of unit elements constituting the medical image of the second domain that was the source image input to the second generator when the second generated image was generated. A method for generating a trained model according to claim 7.
  9. The method includes the computer: calculating a first reconstruction loss of the conversion process through the first generator and the second generator in that order, based on a first reconstructed image output from the second generator when the first generated image of the second domain output from the first generator is input to the second generator; and calculating a second reconstruction loss of the conversion process through the second generator and the first generator in that order, based on a second reconstructed image output from the first generator when the second generated image of the first domain output from the second generator is input to the first generator. A method for generating a trained model according to claim 7 or 8.
  10. The medical image of the first domain is a first-modality image captured using a first modality, which is a medical imaging device; the medical image of the second domain is a second-modality image captured using a second modality, which is a type of medical imaging device different from the first modality; and the learning model is trained to receive the first-modality image as input and generate a pseudo second-modality image having the features of an image captured using the second modality. A method for generating a trained model according to any one of claims 1 to 9.
  11. A machine learning system that performs training of a learning model that converts the domain of an input medical image and generates a generated image of a different domain, the system comprising: at least one first processor; and at least one first storage device storing a program executed by the at least one first processor; wherein the learning model has the structure of a generative adversarial network that includes: a first generator, configured using a first convolutional neural network, that accepts input of a medical image of a first domain and outputs a first generated image of a second domain different from the first domain; and a first discriminator, configured using a second convolutional neural network, that accepts input of data including first image data, which is either the first generated image generated by the first generator or a medical image of the second domain included in the training data, and coordinate information indicating, for each of the plurality of unit elements constituting the first image data, coordinate values in a human body coordinate system, which is an absolute coordinate system defined based on the anatomical positions of parts of the human body, and that determines the authenticity of the input image; and wherein the at least one first processor, by executing instructions of the program, acquires a plurality of training data sets including medical images of the first domain and medical images of the second domain, and performs a learning process that trains the first generator and the first discriminator adversarially based on the plurality of training data. Machine learning system.
  12. A program that causes a computer to execute a process of training a learning model that converts the domain of an input medical image and generates a generated image of a different domain, wherein the learning model has the structure of a generative adversarial network that includes: a first generator, configured using a first convolutional neural network, that accepts input of a medical image of a first domain and outputs a first generated image of a second domain different from the first domain; and a first discriminator, configured using a second convolutional neural network, that accepts input of data including first image data, which is either the first generated image generated by the first generator or a medical image of the second domain included in the training data, and coordinate information indicating, for each of the plurality of unit elements constituting the first image data, coordinate values in a human body coordinate system, which is an absolute coordinate system defined based on the anatomical positions of parts of the human body, and that determines the authenticity of the input image; the program causing the computer to: acquire a plurality of training data sets including medical images of the first domain and medical images of the second domain; and execute a learning process that trains the first generator and the first discriminator adversarially based on the plurality of training data. Program.
  13. A non-transitory computer-readable recording medium on which the program according to claim 12 is recorded.
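Claims 1 and 3 describe appending human-body-coordinate channels to the discriminator's input, in the spirit of the CoordConv technique of Non-Patent Document 2. As a rough illustration only, and not the patented implementation, the following NumPy sketch builds x/y/z coordinate channels for a three-dimensional volume and concatenates them with the image channels. The `origin` and `spacing` parameters are hypothetical stand-ins for however the body coordinate system is anchored to anatomical landmarks; the patent does not specify these values.

```python
import numpy as np

def make_coord_channels(shape, origin=(0.0, 0.0, 0.0), spacing=(1.0, 1.0, 1.0)):
    """Build x/y/z coordinate channels for a (D, H, W) volume.

    Coordinates are expressed in an absolute body frame anchored at an
    anatomical landmark (`origin`), so the same anatomy maps to similar
    coordinate values across patients. `origin` and `spacing` here are
    illustrative assumptions, not values from the patent.
    """
    z, y, x = [np.arange(n, dtype=np.float32) for n in shape]
    zz, yy, xx = np.meshgrid(z, y, x, indexing="ij")  # each (D, H, W)
    coords = np.stack([xx, yy, zz])                   # (3, D, H, W)
    coords *= np.asarray(spacing, np.float32).reshape(3, 1, 1, 1)
    coords += np.asarray(origin, np.float32).reshape(3, 1, 1, 1)
    return coords

def with_coord_channels(volume, origin=(0.0, 0.0, 0.0), spacing=(1.0, 1.0, 1.0)):
    """Concatenate x/y/z coordinate channels onto a (C, D, H, W) volume,
    yielding the (C+3, D, H, W) discriminator input of claim 3."""
    coords = make_coord_channels(volume.shape[1:], origin, spacing)
    return np.concatenate([volume, coords], axis=0)

# Toy usage: a 1-channel 4x4x4 volume with made-up origin/spacing.
vol = np.zeros((1, 4, 4, 4), dtype=np.float32)
d_in = with_coord_channels(vol, origin=(-10.0, -10.0, -20.0), spacing=(2.0, 2.0, 5.0))
print(d_in.shape)  # (4, 4, 4, 4)
```

Because the channels encode absolute position rather than within-crop indices, the discriminator can, per the claims, judge authenticity in a way that is consistent with where each voxel sits in the body.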

Description

This invention relates to a method for generating a trained model, a machine learning system, a program, and a medical image processing device, and more particularly to machine learning and image processing techniques for handling medical images.

In the medical field, image diagnosis is performed using medical images acquired by various modalities such as CT (Computed Tomography) and MRI (Magnetic Resonance Imaging) devices. In recent years, the development of artificial intelligence (AI) that uses deep learning to extract organs and other body parts from medical images, detect disease regions, and classify diseases has been progressing. Patent Document 1 describes a diagnostic support system that uses AI to extract organ regions from medical images. Patent Document 2 describes a machine learning method for obtaining a learning model that generates MR (Magnetic Resonance) estimated images by estimating MR images from CT images. Non-Patent Document 1 describes a method for generating T2-weighted MRI images from CT images using machine learning. Non-Patent Document 2 proposes a method for incorporating positional information into a convolutional neural network by adding channels representing the coordinate information of each pixel in the image. Non-Patent Document 3 discloses a technique that uses a network combining two generative adversarial network (GAN) configurations to learn the interconversion of images of heterogeneous domains from domain-specific datasets, without using paired images as training data.

Patent Document 1: Japanese Patent Publication No. 2019-149094
Patent Document 2: Japanese Patent Publication No. 2020-54579
Non-Patent Document 1: Cheng-Bin Jin, Hakil Kim, Mingjie Liu, Wonmo Jung, Seongu Joo, Eunsik Park, Young Saem Ahn, In Ho Han, Jae Il Lee, Xuenan Cui, "Deep CT to MR Synthesis Using Paired and Unpaired Data", Sensors 2019, 19(10), 2361
Non-Patent Document 2: Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev, Jason Yosinski, "An intriguing failing of convolutional neural networks and the CoordConv solution", arXiv:1807.03247
Non-Patent Document 3: Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", arXiv:1703.10593

Figure 1 is an explanatory diagram illustrating the challenges in modality conversion of medical images.
Figure 2 shows examples of MR images included in an MR dataset and CT images included in a CT dataset.
Figure 3 shows an example of an image converted from MR to CT.
Figure 4 is a conceptual diagram showing an overview of the processing in the machine learning system according to the first embodiment.
Figure 5 is an explanatory diagram of the human body coordinate system applied in the first embodiment.
Figure 6 shows an example of coordinate information added to an image.
Figure 7 is a functional block diagram showing an example configuration of the machine learning system according to the first embodiment.
Figure 8 is a functional block diagram showing an example configuration of the learning data generation unit.
Figure 9 shows an example of a pseudo-MR image generated by a trained model trained through a learning process using the machine learning system according to the first embodiment.
Figure 10 is a functional block diagram showing an example configuration of a machine learning system according to the second embodiment.
Figure 11 is a schematic diagram showing the processing flow for CT input in the machine learning system according to the second embodiment.
Figure 12 is a schematic diagram showing the processing flow for MR input in the machine learning system according to the second embodiment.
Figure 13 is a block diagram showing an example configuration of an information processing device applied to a machine learning system.
Figure 14 is a block diagram showing an example configuration of a medical image processing device to which a trained model generated through a learning process using a machine learning system is applied.
Figure 15 is a block diagram showing an example of a computer hardware configuration.

Preferred embodiments of the present invention will be described below with reference to the attached drawings.

[Explanation of medical images]

Typical examples of devices used to acquire medical images include CT scanners and MRI scanners. In these modalities, the basic approach is to obtain three-dimensional data showing the three-dimensional form of an object by sequentially acquiring two-dimensional slice images. In this specification, the term "three-dimensional data" includes the concept of a collection of sequentially acquired two-dimensional slice images and is synonymous with a three-dimensional image. The term "image" includes the meaning of image data. A collection of sequential two-dimensional slice images is sometimes called a "two-dimensional image sequence" or "two-dimensional image series." The term "two-dimensional image" includes the concept of a two-dimensional slice image extracted from three-dimensional data.