CN-121709125-B - Multimode medical recommendation method and related device based on double-layer gating mechanism
Abstract
The invention discloses a multi-modal medical recommendation method and a related device based on a double-layer gating mechanism, and relates to the technical field of artificial intelligence in medicine. The method comprises: obtaining text data, numerical data and category data of a patient and preprocessing them to obtain a data set; and inputting the data set into a multi-task learning model for training until the preset number of training rounds is reached or a preset condition is met, thereby obtaining a trained multi-task learning model. The multi-task learning model extracts features from the text data, the numerical data and the category data through a text feature branch, a numerical feature branch and a category feature branch, respectively, to obtain text features, numerical features and category features; inputs the text features, numerical features and category features together into a double-layer gating network for modality-level gating and feature-level gating to obtain a final feature representation; and inputs the final feature representation into a multi-task head network to obtain classification prediction results for a plurality of tasks. The invention can provide accurate personalized medical recommendations for patients.
Inventors
- WU YUANYUAN
- HUANG WUYANG
- HUANG MENGXING
- FENG ZIKAI
Assignees
- Hainan University (海南大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20260211
Claims (8)
- 1. A multi-modal medical recommendation method based on a double-layer gating mechanism, characterized by comprising the following steps: obtaining multi-modal medical data of a patient and preprocessing the multi-modal medical data to obtain a data set, wherein the multi-modal medical data comprise text data, numerical data and category data, and dividing the data set into a training set and a test set according to a preset proportion; inputting the data set into a preset multi-task learning model based on a double-layer gating mechanism for iterative training until the preset number of training rounds is reached or a preset condition is met, to obtain a trained multi-task learning model; and inputting patient data to be predicted into the trained multi-task learning model for prediction to obtain a personalized medical recommendation scheme for the patient, wherein the personalized medical recommendation scheme comprises hypertension risk prediction, BMI classification, diet suggestion generation, exercise suggestion generation, rehabilitation suggestion generation and health state assessment; wherein the multi-task learning model comprises a text feature branch, a numerical feature branch, a category feature branch, a double-layer gating network and a multi-task head network, and the processing of the multi-task learning model comprises performing feature extraction on the text data, the numerical data and the category data through the text feature branch, the numerical feature branch and the category feature branch, respectively, to obtain text features, numerical features and category features; wherein the processing of the double-layer gating network comprises: modal gating, in which the text features, the numerical features and the category features are multiplied element-wise by their corresponding modal gating coefficients, which are based on the text features, the numerical features and the category features, and are then concatenated to obtain a fused feature; and feature gating, which introduces channel attention and in which, after global average pooling of the input fused feature, a feature gating coefficient between 0 and 1 is assigned to each dimension of the fused feature through two fully connected layers and a Sigmoid function; wherein training the multi-task learning model comprises: determining evaluation metrics, the evaluation metrics comprising accuracy, F1 score and area under the curve (AUC); introducing a layer-wise learning rate strategy; iteratively training the multi-task learning model based on the double-layer gating mechanism using the training set; calculating a loss value based on a loss function; dynamically adjusting the learning rate according to the loss value through an AdamW optimizer and cosine annealing learning rate scheduling; using mixed-precision training to improve training efficiency and reduce GPU memory usage; using gradient clipping to prevent gradient explosion; and evaluating the performance of the trained model on the test set according to the evaluation metrics; wherein the loss function is given by a formula defined over the following quantities: the final total loss of the model for a batch; the dynamic confidence scaling factor, in which confidence is the average confidence of the predictions for the current batch and β is an adjustment hyperparameter; N, the total number of samples in the current batch; C, the total number of categories of the task; the balance weight for category c; the focal loss weighting factor for category c; the probability predicted by the model that sample i belongs to category c; the focal loss adjusting parameter; and an indicator function.
- 2. The multi-modal medical recommendation method based on a double-layer gating mechanism according to claim 1, wherein the preprocessing comprises the steps of: the text data comprise diagnosis descriptions and symptom descriptions, and the text data are normalized using the BERT-Base-Chinese tokenizer, with special characters and redundant spaces removed; the numerical data comprise age, BMI, systolic blood pressure, diastolic blood pressure, blood glucose, heart rate, body temperature, number of medications, number of chronic diseases, number of surgeries and risk score, and the numerical data undergo missing-value imputation and Z-score standardization.
- 3. The multi-modal medical recommendation method based on a double-layer gating mechanism according to claim 1, wherein performing feature extraction on the text data, the numerical data and the category data through the text feature branch, the numerical feature branch and the category feature branch, respectively, to obtain the text features, the numerical features and the category features comprises: extracting features from the text data using a BERT encoder, and passing the extracted features sequentially through a linear transformation layer, a batch normalization layer, a rectified linear unit (ReLU) activation function and a dropout layer to obtain the text features; processing the numerical data with a multi-layer perceptron that incorporates a lightweight multi-head attention mechanism to obtain the numerical features; and mapping the category data to numeric indices using a label encoder, converting the numeric indices into dense vectors through an embedding layer, and passing the dense vectors sequentially through a linear transformation layer, a batch normalization layer, a ReLU activation function and a dropout layer to obtain the category features.
- 4. The method of claim 1, wherein the multi-task head network comprises 6 independent 2-layer MLP classification heads.
- 5. The multi-modal medical recommendation method based on a double-layer gating mechanism according to claim 1, further comprising a data augmentation operation, wherein the data augmentation operation comprises: applying random deletion and random replacement to the text features; applying Gaussian perturbation to the numerical features; and applying a random masking strategy to the category features, in which the category features are masked as an unknown category with a preset probability.
- 6. A multi-modal medical recommendation system based on a double-layer gating mechanism, comprising: an acquisition module, configured to obtain multi-modal medical data of a patient and preprocess the multi-modal medical data to obtain a data set, wherein the multi-modal medical data comprise text data, numerical data and category data, and to divide the data set into a training set and a test set according to a preset proportion; a training module, configured to input the data set into a preset multi-task learning model based on a double-layer gating mechanism for iterative training until the preset number of training rounds is reached or a preset condition is met, to obtain a trained multi-task learning model, wherein the multi-task learning model comprises a text feature branch, a numerical feature branch, a category feature branch, a double-layer gating network and a multi-task head network, and feature extraction is performed on the text data, the numerical data and the category data through the text feature branch, the numerical feature branch and the category feature branch, respectively, to obtain text features, numerical features and category features; wherein the processing of the double-layer gating network comprises: modal gating, in which a modal gating coefficient conditioned on the task is assigned to each of the text features, the numerical features and the category features through a Softmax function, and the text features, the numerical features and the category features are multiplied element-wise by their corresponding modal gating coefficients and then concatenated to obtain a fused feature; and feature gating, which introduces channel attention and in which, after global average pooling of the input fused feature, a feature gating coefficient between 0 and 1 is assigned to each dimension of the fused feature through two fully connected layers and a Sigmoid function; wherein training the multi-task learning model comprises: determining evaluation metrics, the evaluation metrics comprising accuracy, F1 score and area under the curve (AUC); introducing a layer-wise learning rate strategy; iteratively training the multi-task learning model based on the double-layer gating mechanism using the training set; calculating a loss value based on a loss function; dynamically adjusting the learning rate according to the loss value through an AdamW optimizer and cosine annealing learning rate scheduling; using mixed-precision training to improve training efficiency and reduce GPU memory usage; using gradient clipping to prevent gradient explosion; and evaluating the performance of the trained model on the test set according to the evaluation metrics; wherein the loss function is given by a formula defined over the following quantities: the final total loss of the model for a batch; the dynamic confidence scaling factor, in which confidence is the average confidence of the predictions for the current batch and β is an adjustment hyperparameter; N, the total number of samples in the current batch; C, the total number of categories of the task; the balance weight for category c; the focal loss weighting factor for category c; the probability predicted by the model that sample i belongs to category c; the focal loss adjusting parameter; and an indicator function; and a prediction module, configured to input patient data to be predicted into the trained multi-task learning model for prediction to obtain a personalized medical recommendation scheme for the patient, wherein the personalized medical recommendation scheme comprises hypertension risk prediction, BMI classification, diet suggestion generation, exercise suggestion generation, rehabilitation suggestion generation and health state assessment.
- 7. A computer device comprising a memory for storing a computer program, and a processor for implementing the method according to any one of claims 1 to 5 when the computer program is executed.
- 8. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the method according to any of claims 1 to 5.
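The loss recited in claims 1 and 6 combines per-category balance weights, focal loss weighting and a dynamic confidence scaling factor, but the formula itself is not reproduced in this text. The following Python sketch is one plausible reading, not the claimed formula. It assumes that the balance weight and the focal loss weighting factor multiply, that batch confidence is the mean of each sample's highest predicted probability, and that the scaling factor takes the hypothetical form 1 + beta * (1 - confidence). The function name and all default values are illustrative.

```python
import math


def focal_loss_with_confidence_scale(probs, labels, class_weight, alpha,
                                     gamma=2.0, beta=0.5):
    """Class-balanced focal loss for one batch, scaled by batch confidence.

    probs: N rows, each holding C predicted class probabilities.
    labels: N true class indices.
    class_weight, alpha: per-category balance and focal weighting factors.
    """
    n = len(probs)
    # Assumed definition: confidence = mean of each sample's top probability.
    confidence = sum(max(row) for row in probs) / n
    # Hypothetical scaling form; the source text does not give the expression.
    scale = 1.0 + beta * (1.0 - confidence)
    total = 0.0
    for row, y in zip(probs, labels):
        # The indicator function keeps only the true-category term.
        p = min(max(row[y], 1e-12), 1.0)
        total += -class_weight[y] * alpha[y] * (1.0 - p) ** gamma * math.log(p)
    return scale * total / n


probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
labels = [0, 1, 1]
uniform = [1.0, 1.0]
loss = focal_loss_with_confidence_scale(probs, labels, uniform, uniform)
# Tripling the balance weight of category 1 raises the loss on this batch.
upweighted = focal_loss_with_confidence_scale(probs, labels, [1.0, 3.0], uniform)
```

With gamma = 0, beta = 0 and uniform weights the sketch reduces to ordinary cross-entropy, which is a convenient sanity check on any concrete implementation.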
Description
Multimode medical recommendation method and related device based on double-layer gating mechanism
Technical Field
The invention relates to the technical field of artificial intelligence in medicine, in particular to a multi-modal medical recommendation method and a related device based on a double-layer gating mechanism.
Background
In the field of intelligent healthcare, personalized medical recommendation is a key technology for improving the quality of medical services and optimizing the allocation of medical resources. Traditional medical recommendation systems rely mainly on rule engines and expert knowledge bases, and suffer from limited recommendation accuracy, insufficient personalization and difficulty in processing complex multi-modal medical data. With the development of deep learning, neural-network-based medical recommendation systems have gradually emerged, but existing methods still face challenges in multi-task learning, data imbalance and feature fusion.
Existing multi-modal medical data fusion methods mainly apply the gating mechanism as a single-layer strategy, that is, information is screened at only one granularity (the modality level or the feature level). Such methods have the following limitations:
(1) Limitation of modality-level single-layer gating: existing methods can only perform global weight distribution across modalities, assigning relative importance weights to the different modalities (such as text, numerical values and categories) through a Softmax function (normalized exponential function). They cannot perform fine-grained screening of the fused feature vector, so noisy dimensions and useful features take part equally in subsequent computation, which degrades model performance.
(2) Limitation of feature-level single-layer gating: existing methods can only select at the dimension level, assigning a weight between 0 and 1 to each dimension of the fused feature vector through a Sigmoid function (logistic function). They lack awareness of the overall importance of each input modality, so key modality information may be diluted in the fusion stage, which reduces model performance.
(3) Limitation of single-granularity gating: in a multi-task scenario, a single-granularity gating strategy can hardly satisfy the following requirements at the same time: (a) different tasks depend on the modalities to different degrees (for example, the "hypertension risk" task depends more on numerical features, while the "diet suggestion" task depends more on text descriptions); (b) within the same task, different feature dimensions differ in importance (for example, key indicators such as blood pressure and blood glucose should receive higher weights); and (c) noise suppression must be balanced against retention of key signals (modality noise and feature noise need to be suppressed simultaneously).
(4) Other technical problems: medical data are severely imbalanced, and minority-class samples are difficult to learn; an effective multi-modal feature fusion mechanism is lacking, so multi-modal information such as text descriptions and numerical indicators is hard to exploit fully; recommendation results lack interpretability, making it hard to earn the trust of doctors and patients; and tasks interfere with one another in multi-task learning, because different medical prediction tasks interact in complex ways.
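The two gating granularities described in limitations (1) and (2) can be sketched side by side. The Python below is a minimal, dependency-free illustration for a single patient record, under stated assumptions: the modality logits come from one linear map over the concatenated features (the names `w_modal`, `w1`, `w2` and all sizes are hypothetical), a ReLU sits between the two fully connected layers of the feature gate, and the global average pooling step mentioned in the claims is omitted because each feature here is already a flat vector. Applying the feature gate to the output of the modal gate gives the coarse-to-fine chaining on which a double-layer mechanism relies.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(x, weight):
    """Multiply vector x (length n) by weight (n rows, m columns)."""
    cols = len(weight[0])
    return [sum(xi * row[j] for xi, row in zip(x, weight)) for j in range(cols)]


def modal_gate(text, num, cat, w_modal):
    """Coarse gate: Softmax yields one weight per modality, summing to 1."""
    weights = softmax(linear(text + num + cat, w_modal))
    fused = []
    for feats, w in zip((text, num, cat), weights):
        fused.extend(f * w for f in feats)  # scale the modality, then concatenate
    return fused, weights


def feature_gate(fused, w1, w2):
    """Fine gate: two linear layers and a Sigmoid give one weight per dimension."""
    hidden = [max(h, 0.0) for h in linear(fused, w1)]  # ReLU is an assumption
    gates = [sigmoid(g) for g in linear(hidden, w2)]
    return [f * g for f, g in zip(fused, gates)], gates


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
D, H = 4, 3  # hypothetical per-modality feature size and hidden size
text, num, cat = ([rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(3))
w_modal = random_matrix(3 * D, 3, rng)
w1 = random_matrix(3 * D, H, rng)
w2 = random_matrix(H, 3 * D, rng)

fused, modal_weights = modal_gate(text, num, cat, w_modal)
final, feature_gates = feature_gate(fused, w1, w2)
```

The modal weights compete with one another because Softmax normalizes them to sum to 1, whereas each feature gate is an independent value in (0, 1); that difference is why the two gates screen information at different granularities.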
Disclosure of Invention
In order to solve the above technical problems, the invention provides a multi-modal medical recommendation method and a related device based on a double-layer gating mechanism, which dynamically fuses text, numerical and category features through a two-stage collaborative screening mechanism (the first layer is modality-level Softmax gating, which distributes global importance across modalities; the second layer is feature-level Sigmoid gating, which performs fine-grained, dimension-by-dimension screening of the fused features). This achieves hierarchical information fusion from coarse to fine granularity, effectively addresses the limitations of single-layer gating in multi-modal multi-task scenarios, and provides accurate personalized medical recommendations for patients.
To achieve the above purpose, the technical scheme of the invention is as follows: a multi-modal medical recommendation method based on a double-layer gating mechanism comprises the following steps: obtaining multi-modal medical data of a patient and preprocessing the multi-modal medical data to obtain a data set, wherein the multi-modal medical data comprise text data, numerical data and category data, and dividing the data set into a training set and a test set according to a preset proportion; inputting the data set into a preset multi-task learning model based on a double-layer gatin