US-12620465-B2 - Learning apparatus, learning method, trained model, and program
Abstract
The learning apparatus includes a processor (129), a memory (114), and a learning model (126). The processor (129) performs processing of inputting a pseudo simple X-ray image (204), which is generated by projecting an X-ray CT image (202), to the learning model (126), processing of generating a second interpretation report (208) with respect to the pseudo simple X-ray image (204) by converting a first interpretation report (206), processing of acquiring an error between an estimation report (210) with respect to the pseudo simple X-ray image (204) output by the learning model (126) on the basis of the input pseudo simple X-ray image (204), and the second interpretation report (208), and processing of training the learning model (126) by using the error.
Inventors
- Yuta HIASA
Assignees
- FUJIFILM CORPORATION
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-07-23
- Priority Date: 2021-01-26
Claims (15)
- 1. A learning apparatus comprising: a processor; a memory that stores a training data set of an X-ray CT image having three dimensional information and a first interpretation report with respect to the X-ray CT image; and a learning model that generates an interpretation report from a simple X-ray image having two dimensional information, wherein the processor performs: processing of acquiring a pseudo simple X-ray image, which is generated by projecting the X-ray CT image; processing of generating a second interpretation report with respect to the pseudo simple X-ray image by converting the first interpretation report; processing of inputting the pseudo simple X-ray image to the learning model and acquiring an estimation report with respect to the pseudo simple X-ray image; processing of acquiring an error between the estimation report and the second interpretation report; and processing of training the learning model by using the error.
- 2. The learning apparatus according to claim 1, wherein in the processing of generating the second interpretation report, the second interpretation report is generated from the first interpretation report by converting an organ label included in the first interpretation report into an organ label of the second interpretation report.
- 3. The learning apparatus according to claim 1, wherein in the processing of generating the second interpretation report, the second interpretation report is generated from the first interpretation report by converting a disease label included in the first interpretation report into a disease label of the second interpretation report.
- 4. The learning apparatus according to claim 1, wherein in the processing of generating the second interpretation report, a first knowledge graph corresponding to the first interpretation report is converted into a second knowledge graph corresponding to the second interpretation report, and the second interpretation report is generated on the basis of the conversion.
- 5. The learning apparatus according to claim 1, wherein in a case in which the memory stores the X-ray CT image obtained by imaging a subject in a first posture and the learning model generates an interpretation report from the simple X-ray image obtained by imaging the subject in a second posture, in the processing of inputting the pseudo simple X-ray image, the pseudo simple X-ray image in the second posture is generated from the X-ray CT image in the first posture, and the pseudo simple X-ray image in the second posture is input to the learning model.
- 6. The learning apparatus according to claim 1, wherein in the processing of inputting the pseudo simple X-ray image, the pseudo simple X-ray image projected in a first direction and the pseudo simple X-ray image projected in a second direction are generated from the X-ray CT image, and the pseudo simple X-ray image projected in the first direction and the pseudo simple X-ray image projected in the second direction are input to the learning model.
- 7. The learning apparatus according to claim 1, wherein the memory stores an additional training data set of the simple X-ray image and a disease label of the simple X-ray image, and in the processing of acquiring the error, an error between the estimation report with respect to the pseudo simple X-ray image output by the learning model with reference to the disease label, and the second interpretation report is acquired.
- 8. The learning apparatus according to claim 1, wherein the memory stores an additional training data set of the simple X-ray image and a third interpretation report with respect to the simple X-ray image, and in the processing of acquiring the error, the error between the estimation report with respect to the pseudo simple X-ray image output by the learning model on the basis of the input pseudo simple X-ray image, and the second interpretation report and an error between an estimation report with respect to the simple X-ray image output by the learning model on the basis of the input simple X-ray image, and the third interpretation report are acquired.
- 9. The learning apparatus according to claim 1, wherein the processor generates the second interpretation report by converting a text stated in the first interpretation report on a basis of a predefined conversion list.
- 10. A learning method in which a processor trains a learning model, which generates an interpretation report from a simple X-ray image having two dimensional information, by using a training data set of an X-ray CT image having three dimensional information and a first interpretation report with respect to the X-ray CT image stored in a memory, the learning method comprising: acquiring a pseudo simple X-ray image, which is generated by projecting the X-ray CT image; generating a second interpretation report with respect to the pseudo simple X-ray image by converting the first interpretation report; inputting the pseudo simple X-ray image to the learning model and acquiring an estimation report with respect to the pseudo simple X-ray image; acquiring an error between the estimation report and the second interpretation report; and training the learning model by using the error.
- 11. The learning method according to claim 10, wherein the second interpretation report is generated from the first interpretation report by converting an organ label included in the first interpretation report into an organ label of the second interpretation report.
- 12. The learning method according to claim 10, wherein the second interpretation report is generated from the first interpretation report by converting a disease label included in the first interpretation report into a disease label of the second interpretation report.
- 13. The learning method according to claim 10, wherein a first knowledge graph corresponding to the first interpretation report is converted into a second knowledge graph corresponding to the second interpretation report, and the second interpretation report is generated on the basis of the conversion.
- 14. A non-transitory, computer-readable tangible recording medium on which a program for causing, when read by a computer, a processor of the computer to execute the learning method according to claim 10 is recorded.
- 15. A trained model trained by the learning method according to claim 10.
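The steps recited in claims 1, 6, 9, and 10 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the projection is taken to be a mean of CT intensities along one axis (the claims say only "projecting"), the conversion list is a plain dictionary with made-up entries, reports are reduced to lists of finding strings, and the learning model is a toy logistic regression whose error is a binary cross-entropy. All names (`project_ct`, `CONVERSION_LIST`, `LABELS`, `convert_report`, `report_to_target`, `train_step`) are hypothetical.

```python
import numpy as np

# Hypothetical conversion list (claim 9); the entries are illustrative only.
# None marks a CT finding assumed not to be reportable on a simple X-ray image.
CONVERSION_LIST = {
    "nodule in right upper lobe": "nodular shadow in right upper lung field",
    "pleural effusion": "blunting of costophrenic angle",
    "small liver cyst": None,
}

# Hypothetical label vocabulary of the second (simple X-ray) report.
LABELS = [
    "nodular shadow in right upper lung field",
    "blunting of costophrenic angle",
]


def project_ct(volume, axis):
    """Pseudo simple X-ray image: mean intensity along one axis (assumed projection)."""
    return volume.mean(axis=axis)


def convert_report(first_report):
    """Build the second interpretation report from the first via the conversion list."""
    converted = (CONVERSION_LIST.get(finding) for finding in first_report)
    return [text for text in converted if text is not None]


def report_to_target(report):
    """Encode a report as a multi-hot vector over LABELS."""
    return np.array([1.0 if label in report else 0.0 for label in LABELS])


def train_step(weights, features, target, lr=0.1):
    """One gradient step of a toy logistic model; returns new weights and the error."""
    probs = 1.0 / (1.0 + np.exp(-(weights @ features)))  # estimation report
    eps = 1e-12
    error = -np.mean(target * np.log(probs + eps)
                     + (1.0 - target) * np.log(1.0 - probs + eps))
    grad = np.outer(probs - target, features) / target.size
    return weights - lr * grad, error


# Toy training pair: a random volume stands in for an X-ray CT image.
rng = np.random.default_rng(0)
ct_volume = rng.random((8, 8, 8))
first_report = ["nodule in right upper lobe", "small liver cyst"]

# Two projection directions, both fed to the model (claim 6).
frontal = project_ct(ct_volume, axis=1)
lateral = project_ct(ct_volume, axis=2)
features = np.concatenate([frontal.ravel(), lateral.ravel()])

second_report = convert_report(first_report)
target = report_to_target(second_report)

weights = np.zeros((len(LABELS), features.size))
errors = []
for _ in range(50):
    weights, error = train_step(weights, features, target)
    errors.append(error)
```

In this sketch the error recorded at each step is computed before the weight update, so `errors` traces how the gap between the estimation report and the second interpretation report shrinks over training. A real system would replace the logistic model with a report-generating network and the dictionary with whichever conversion (label, knowledge graph, or text) the embodiment uses.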
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application is a Continuation of PCT International Application No. PCT/JP2022/001350 filed on Jan. 17, 2022 claiming priority under 35 U.S.C. § 119(a) to Japanese Patent Application No. 2021-010381 filed on Jan. 26, 2021. Each of the above applications is hereby expressly incorporated by reference, in its entirety, into the present application.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a learning apparatus, a learning method, a trained model, and a program, and particularly to a learning apparatus, a learning method, a trained model, and a program that perform learning regarding output of an interpretation report.
2. Description of the Related Art
In the related art, a disease or the like has been interpreted from a simple X-ray image by doctors or others, and interpretation results have been compiled into an interpretation report. However, interpretation of the simple X-ray image is not easy, even for a doctor, and the accuracy of the interpretation report may be low. Here, the simple X-ray image is a two dimensional image obtained by emitting X-rays and capturing a shadow on a plane.
In recent years, a trained model, which has been trained to output an interpretation report with respect to an input of a simple X-ray image using a machine learning technique, has been proposed. For example, Yuan, Jianbo, et al., “Automatic radiology report generation based on multi-view image fusion and medical concept enrichment.”, MICCAI, 2019 and Li, Christy Y., et al. “Knowledge-driven encode, retrieve, paraphrase for medical image report generation.”, AAAI, 2019 describe a technique related to machine learning in which a chest X-ray image (simple X-ray image) is input and an interpretation report is output.
SUMMARY OF THE INVENTION
Here, in the techniques described in Yuan, Jianbo, et al., “Automatic radiology report generation based on multi-view image fusion and medical concept enrichment.”, MICCAI, 2019 and Li, Christy Y., et al. “Knowledge-driven encode, retrieve, paraphrase for medical image report generation.”, AAAI, 2019, a simple X-ray image having two dimensional information and an interpretation report thereof are used as training data. As described above, creation of the interpretation report of the simple X-ray image is not easy, even for a doctor, and the accuracy of the interpretation report may be low. One of the reasons for this is that since organs or the like originally having a three dimensional shape are shown as a two dimensional image in the simple X-ray image, the organs may be shown in an overlapping manner, or the original shape of the organs may be difficult to grasp. In addition, a trained model that has been trained by using such an interpretation report with low accuracy may not be able to output an interpretation report with high accuracy.
The present invention has been made in view of such circumstances, and an object of the present invention is to provide a learning apparatus, a learning method, and a program for generating a trained model that outputs an interpretation report with high accuracy by using high-quality training data with high accuracy, and a trained model trained by the learning method.
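The loss of three dimensional information noted above, where structures overlap once they are shown as a two dimensional image, can be illustrated with a toy example. The sketch assumes a simple sum along the depth axis as a stand-in for X-ray projection; the function `project` and the toy volumes are hypothetical and are not taken from the specification.

```python
import numpy as np


def project(volume, axis=0):
    """Sum voxel values along one axis (a simplified stand-in for X-ray projection)."""
    return volume.sum(axis=axis)


# Two toy volumes indexed as (depth, row, column).
front = np.zeros((4, 4, 4))
back = np.zeros((4, 4, 4))
front[0, 1:3, 1:3] = 1.0  # a small block near the front of the volume
back[3, 1:3, 1:3] = 1.0   # the same block placed near the back

# The volumes differ, yet their projections along the depth axis are identical.
same_volume = np.array_equal(front, back)
same_image = np.array_equal(project(front), project(back))

# Two structures at different depths add up in the overlapping pixels.
both = front + back
overlap_value = project(both)[1, 1]
```

Here `same_volume` is false while `same_image` is true, so the depth of the structure cannot be recovered from the projected image alone, and `overlap_value` shows two separate structures merging into one brighter region. This is the kind of ambiguity that makes a report written from the original volume a more reliable training target than one written from the projection.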
According to an aspect of the present invention for achieving the object, there is provided a learning apparatus comprising a processor, a memory that stores a training data set of an X-ray CT image having three dimensional information and a first interpretation report with respect to the X-ray CT image, and a learning model that generates an interpretation report from a simple X-ray image having two dimensional information, in which the processor performs processing of inputting a pseudo simple X-ray image, which is generated by projecting the X-ray CT image, to the learning model, processing of generating a second interpretation report with respect to the pseudo simple X-ray image by converting the first interpretation report, processing of acquiring an error between an estimation report with respect to the pseudo simple X-ray image output by the learning model on the basis of the input pseudo simple X-ray image, and the second interpretation report, and processing of training the learning model by using the error. According to the aspect, the pseudo simple X-ray image and the second interpretation report with respect to the pseudo simple X-ray image are generated from the training data set of the X-ray CT image having three dimensional information and the first interpretation report with respect to the X-ray CT image, and learning is performed by using the pseudo simple X-ray image and the second interpretation report. Accordingly, in the aspect, since learning is performed using the pseudo X-ray image and the second interpretation report based on the X-ray CT image having a large amount of information and the first interpretation report, learning can be performed such that an interpretation r