JP-2026075281-A - Risk prediction device, risk prediction method, and program

Abstract

[Problem] To predict risk with high accuracy. [Solution] In the risk prediction device, the acquisition means acquires data of multiple different modalities for a single subject. The completion means acquires, from a storage unit, relationship information indicating the relationships between the probability distribution data of the respective modalities. If data for at least one of the multiple modalities is missing, the completion means generates probability distribution data for the missing modality based on the relationship information. The encoder converts the data of each modality into probability distribution data representing a probability distribution in a latent space. The integration unit integrates the probability distribution data of the modalities to generate integrated probability distribution data. The predictor predicts the risk based on the integrated probability distribution data. Using the risk prediction device to predict disease risk can support decision-making about the subject's lifestyle habits. [Selected Drawing] Figure 6

Inventors

  • 黄 晨暉
  • 我田 健介
  • 二瓶 史行

Assignees

  • 日本電気株式会社

Dates

Publication Date: 2026-05-08
Application Date: 2024-10-22

Claims (10)

  1. A risk prediction device comprising: an acquisition means that acquires data of multiple different modalities for a single subject; a storage unit that stores relationship information indicating the relationships between the probability distribution data of the respective modalities; a completion means that, when data of at least one of the multiple modalities is missing, generates probability distribution data for the missing modality based on the relationship information; an encoder that converts the data of each modality into probability distribution data representing a probability distribution in a latent space; an integration unit that integrates the probability distribution data of the modalities to generate integrated probability distribution data; and a predictor that predicts risk based on the integrated probability distribution data.
  2. The risk prediction device according to claim 1, wherein the relationship information includes the covariance between the probability distribution data of the respective modalities.
  3. The risk prediction device according to claim 2, wherein the completion means generates the probability distribution data of the missing modality based on random numbers and the relationship information.
  4. The risk prediction device according to claim 1, comprising a relationship information generation means that generates the relationship information based on data from multiple modalities of multiple different subjects.
  5. The risk prediction device according to claim 1, wherein the probability distribution data includes a mean and a standard deviation, and the relationship information includes the covariance of the means of the modalities and the covariance of the standard deviations of the modalities.
  6. The risk prediction device according to claim 4, comprising a learning means for optimizing the encoder, the integration unit, and the predictor based on a first loss indicating the similarity between the probability distribution corresponding to each modality and a predetermined reference distribution, and a second loss indicating the error between the prediction result by the predictor and a pre-prepared true value.
  7. The risk prediction device according to claim 6, wherein the relationship information generation means generates the relationship information using the encoder, the integration unit, and the predictor after optimization by the learning means.
  8. The risk prediction device according to claim 1, wherein the predictor predicts the disease risk of a subject based on data from multiple modalities related to the subject's health, using a pre-trained machine learning model.
  9. A risk prediction method performed by a computer, the method comprising: acquiring data of multiple different modalities for a single subject; when data of at least one of the multiple modalities is missing, acquiring, from a storage unit, relationship information indicating the relationships between the probability distribution data of the respective modalities, and generating probability distribution data for the missing modality based on the relationship information; converting the data of each modality into probability distribution data representing a probability distribution in a latent space; integrating the probability distribution data of the modalities to generate integrated probability distribution data; and predicting risk based on the integrated probability distribution data.
  10. A program that causes a computer to execute processing comprising: acquiring data of multiple different modalities for a single subject; when data of at least one of the multiple modalities is missing, acquiring, from a storage unit, relationship information indicating the relationships between the probability distribution data of the respective modalities, and generating probability distribution data for the missing modality based on the relationship information; converting the data of each modality into probability distribution data representing a probability distribution in a latent space; integrating the probability distribution data of the modalities to generate integrated probability distribution data; and predicting risk based on the integrated probability distribution data.
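
Claims 2, 3, and 5 describe completing a missing modality from covariance information and random numbers, without fixing a formula. The following is a minimal sketch of one way this could work, not the method of the disclosure. It assumes that each modality has a scalar latent mean, that these means are jointly Gaussian across subjects, and that the stored relationship information consists of a population mean vector and covariance matrix. The function name `complete_missing` and all parameter names are hypothetical.

```python
import numpy as np

def complete_missing(mu_obs, obs_idx, miss_idx, pop_mean, pop_cov, rng):
    """Sample latent means for missing modalities given the observed ones.

    Assumption (not from the source): the per-modality latent means are
    jointly Gaussian with mean `pop_mean` and covariance `pop_cov`.
    """
    s_oo = pop_cov[np.ix_(obs_idx, obs_idx)]    # observed x observed block
    s_mo = pop_cov[np.ix_(miss_idx, obs_idx)]   # missing x observed block
    s_mm = pop_cov[np.ix_(miss_idx, miss_idx)]  # missing x missing block

    # gain = s_mo @ inv(s_oo), computed with a linear solve (s_oo is symmetric).
    gain = np.linalg.solve(s_oo, s_mo.T).T

    # Conditional Gaussian of the missing block given the observed block.
    cond_mean = pop_mean[miss_idx] + gain @ (mu_obs - pop_mean[obs_idx])
    cond_cov = s_mm - gain @ s_mo.T
    cond_cov = (cond_cov + cond_cov.T) / 2.0    # remove numerical asymmetry

    # The random draw plays the role of the "random numbers" in claim 3.
    sample = rng.multivariate_normal(cond_mean, cond_cov)
    return sample, cond_mean, cond_cov
```

Under the same assumption, `pop_mean` and `pop_cov` could be estimated from the latent means of many subjects (claim 4), for example with `latents.mean(axis=0)` and `np.cov(latents, rowvar=False)`. The same conditioning step could be applied to the standard deviations mentioned in claim 5.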

Description

This disclosure relates to risk prediction.

Techniques for predicting disease risk using machine learning models are known. For example, Patent Document 1 describes a multimodal machine learning model that predicts the progression of dementia using multiple types of input data.

Patent Document 1: International Publication WO2023/276976

The drawings are, in order:

  • A diagram showing the overall configuration of the risk prediction device.
  • A block diagram showing the hardware configuration of the risk prediction device.
  • A block diagram showing the functional configuration of a learning device for a risk prediction model.
  • A diagram illustrating expert relationship information.
  • A flowchart of the learning process.
  • A block diagram showing the functional configuration of the risk prediction device.
  • A diagram illustrating how a missing modality is completed.
  • A diagram illustrating how a missing modality is completed.
  • A flowchart of the risk prediction process.
  • A block diagram showing another functional configuration of the risk prediction device.
  • A flowchart of another risk prediction process.

Preferred embodiments of this disclosure will be described below with reference to the drawings.

<First Embodiment>

[Overall structure]

Figure 1 shows the overall configuration of the risk prediction device according to this disclosure. The risk prediction device 100 predicts the disease risk of a subject based on data related to the subject's health. Specifically, the risk prediction device 100 receives multimodal data, that is, data of multiple different modalities. A modality is a method or means of representing information, and multimodal data is data in different formats, such as text, images, audio, and sensor data.
In this embodiment, the multimodal data includes various data obtained from health checkups, such as the subject's height, weight, gender, blood pressure, BMI (Body Mass Index), body fat percentage, triglyceride levels, smoking status and amount, and alcohol consumption status and amount.

As shown in Figure 1, the risk prediction device 100 receives data of multiple different modalities (in this example, data D1 to D4). The risk prediction device 100 converts the input data of each modality into a probability distribution in the latent space and generates an integrated probability distribution (also called the "combined probability distribution" or "latent representation z"). The risk prediction device 100 then predicts and outputs the disease risk based on the integrated probability distribution.

During learning, the risk prediction device 100 learns to minimize the error between the disease risk value predicted from the integrated probability distribution and the true disease risk value prepared in advance as training data. At the same time, the risk prediction device 100 learns to bring the integrated probability distribution close to a predetermined reference distribution (e.g., a normal distribution). When learning is complete, the risk prediction device 100 stores relationship information, which indicates the relationships between the probability distribution data obtained during learning, in a storage unit such as a memory.

When predicting risk, on the other hand, the risk prediction device 100 predicts the disease risk of the subject based on multimodal data related to the subject's health.
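
The two learning objectives described above (a small prediction error, and an integrated probability distribution close to a reference distribution) can be sketched as a single training loss. This is an illustrative sketch only. It assumes a diagonal Gaussian latent distribution, a standard normal reference distribution, mean squared error as the prediction error, and a weighting factor `beta`. None of these choices is stated in the source beyond "e.g., a normal distribution", and the function names are hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

def total_loss(mu, sigma, predicted_risk, true_risk, beta=1.0):
    """Combine the distribution term and the prediction-error term.

    mu, sigma: mean and standard deviation of the integrated distribution.
    beta: assumed weight balancing the two terms (hypothetical parameter).
    """
    # Term pulling the integrated distribution toward the reference.
    first_loss = kl_to_standard_normal(mu, sigma)
    # Term penalizing the gap between predicted and true risk (assumed MSE).
    second_loss = np.mean((predicted_risk - true_risk) ** 2)
    return second_loss + beta * first_loss
```

With `mu` at zero and `sigma` at one, the distribution term vanishes and only the prediction error remains, which makes the role of each term easy to check in isolation.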
If data is missing for some of the multiple modalities, the risk prediction device 100 uses the relationship information between probability distribution data stored in the storage unit to generate probability distribution data for the modality whose data is missing (hereinafter also referred to as the "missing modality"), thereby completing the missing modality. The risk prediction device 100 then predicts the subject's disease risk using the data of the multiple modalities, including the completed modality. This enables the risk prediction device 100 to predict disease risk with high accuracy even if data for some modalities is missing.

The risk prediction device 100 can be suitably applied to the medical and healthcare fields. For example, the risk prediction device 100 can be used to predict the risk of lifestyle-related diseases based on data obtained from regular health checkups.

[Hardware Configuration]

Figure 2 is a block diagram showing the hardware configuration of the risk prediction device 100. As shown in the figure, the risk prediction device 100 comprises a processor 11, an interface (IF) 12, a ROM (Read Only Memory) 13, a RAM (Random Access Memory) 14, a database (DB) 15, and a storage medium 16. The components are connected to one another, for example, via a bus 18. The processor 11 is a computer such as a CPU (Central Processing Unit), and controls the entire risk prediction device 100 by executing a program prepared in advance. Specifically, the processor 11 can be a CPU, GPU (Graphics Processing Unit), D