CN-122020056-A - Model determination method for medical examination report processing


Abstract

The application relates to the technical field of medical examination report processing, and in particular to a model determination method for medical examination report processing. The method comprises: obtaining a first feature vector of the preset keywords corresponding to a comprehensive keyword; obtaining the similarity between the first feature vector and the feature vector of each historical task in a preset meta-knowledge base; selecting from the meta-knowledge base the k historical tasks most similar to the current task and constructing a similar task set from them; obtaining from the meta-knowledge base the historical performance data of each candidate model on the similar task set; deriving the expected performance of each candidate model on the current task from that historical performance data; and determining the optimal model for the current task according to the expected performance of each candidate model in the candidate model set. The application thereby enables a suitable model to be selected for a specific medical examination report processing and analysis task.

Inventors

LU CHENGYE
LIU JING
ZHAO XIANGYU
HUANG XIAOJUN
LIU LIHONG
LIU LIYU
ZHANG JUN
CHEN BAILU
HU LIANG

Assignees

Peking University People's Hospital (北京大学人民医院)
Life Singularity (Beijing) Technology Co., Ltd. (生命奇点(北京)科技有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-04

Claims (8)

1. A model determination method for medical examination report processing, the method comprising the steps of: S100, acquiring a first feature vector F1 of the preset keywords corresponding to a comprehensive keyword, wherein F1 comprises the number of the preset keywords corresponding to the comprehensive keyword, data type distribution information of those preset keywords, and the number of corresponding training samples, and a preset keyword is a keyword whose corresponding value is to be extracted from a medical examination report; S200, obtaining the similarity between F1 and the feature vector of each historical task in a preset meta-knowledge base, wherein the meta-knowledge base comprises the feature vectors of a plurality of historical tasks and, for each historical task, the performance data of each candidate model in a candidate model set; S300, based on the similarity, selecting from the meta-knowledge base the k historical tasks most similar to the current task and constructing a similar task set from them, wherein k is a preset selection number and the current task is to predict the value corresponding to the comprehensive keyword from the values of the preset keywords corresponding to the comprehensive keyword; S400, acquiring historical performance data of each candidate model on the similar task set from the meta-knowledge base, and obtaining the expected performance of each candidate model on the current task based on that historical performance data, wherein the expected performance at least comprises an expected accuracy and an expected calculation cost; S500, determining the optimal model corresponding to the current task according to the expected performance of each candidate model in the candidate model set on the current task.
2. The model determination method for medical examination report processing according to claim 1, wherein obtaining the expected performance of each candidate model on the current task based on the historical performance data of each candidate model on the similar task set comprises: S410, obtaining the similarity between each similar task in the similar task set and the current task; S420, normalizing the similarity between each similar task in the similar task set and the current task, and taking the normalized result as the weight of the historical performance data of the corresponding similar task, wherein the weights of the historical performance data of the similar tasks in the similar task set sum to 1; S430, multiplying the accuracy of any candidate model on each similar task by the corresponding weight and summing the products to obtain the expected accuracy of the candidate model on the current task; S440, multiplying the calculation cost of any candidate model on each similar task by the corresponding weight and summing the products to obtain the expected calculation cost of the candidate model on the current task.
3. The model determination method for medical examination report processing according to claim 1, wherein S500 comprises: S510, acquiring a priority value of each candidate model in the candidate model set according to the expected performance of each candidate model in the candidate model set for the current task, wherein the priority value of any candidate model is positively correlated with the expected accuracy of the candidate model for the current task, and the priority value of any candidate model is negatively correlated with the expected calculation cost of the candidate model for the current task; and S520, determining the candidate model with the highest priority value as the optimal model corresponding to the current task.
4. The model determination method for medical examination report processing according to claim 3, wherein the priority value of the i-th candidate model in the candidate model set is $u_i = \alpha \times A' + \beta \times (1 - C')$, wherein $A'$ is the value obtained by normalizing $A$, $A$ is the expected accuracy of the i-th candidate model in the candidate model set for the current task, $C'$ is the value obtained by normalizing $C$, $C$ is the expected calculation cost of the i-th candidate model in the candidate model set for the current task, $\alpha$ is a preset weight of the expected accuracy, $\alpha > 0$, $\beta$ is a preset weight of the expected calculation cost, $\beta > 0$, and $\alpha + \beta = 1$; $i$ ranges from 1 to $n$, and $n$ is the number of candidate models in the candidate model set.
5. The model determination method for medical examination report processing according to claim 1, wherein the data type distribution information of the preset keywords corresponding to the comprehensive keyword is the proportion of preset keywords of each data type among the preset keywords corresponding to the comprehensive keyword.
6. The model determination method for medical examination report processing according to claim 1, wherein the data types include a numeric type, a category type, and a text type.
7. The model determination method for medical examination report processing according to claim 1, wherein the calculation cost includes at least one of the storage space occupied by the model and the memory occupied by the model during inference.
8. The model determination method for medical examination report processing according to claim 1, wherein the similarity is obtained using a cosine similarity algorithm.
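The weighting and scoring recited in claims 2 to 4 can be illustrated with a minimal Python sketch. The claims do not state how the similarities, A, or C are normalized, so the sketch assumes division by the sum for the similarity weights and min-max scaling across the candidate model set for A' and C'. The function names, the dictionary layout, and the default weights alpha = 0.7 and beta = 0.3 are hypothetical and chosen only for illustration.

```python
def expected_performance(similarities, history):
    """Sketch of S410-S440.

    similarities: {task_id: similarity of that similar task to the current task}
    history:      {model_name: {task_id: (accuracy, calculation_cost)}}
    Returns       {model_name: (expected_accuracy, expected_cost)}
    """
    total = sum(similarities.values())
    if total <= 0:
        raise ValueError("similarities must have a positive sum")
    # Assumption: normalize by dividing by the sum, so the weights add up to 1.
    weights = {task: s / total for task, s in similarities.items()}

    expected = {}
    for model, per_task in history.items():
        acc = sum(weights[t] * per_task[t][0] for t in weights)
        cost = sum(weights[t] * per_task[t][1] for t in weights)
        expected[model] = (acc, cost)
    return expected


def min_max(values):
    """Assumed normalization for A' and C': min-max scaling to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_model(expected, alpha=0.7, beta=0.3):
    """Sketch of S510-S520: u_i = alpha * A' + beta * (1 - C')."""
    models = list(expected)
    acc_n = min_max([expected[m][0] for m in models])
    cost_n = min_max([expected[m][1] for m in models])
    priority = {
        m: alpha * a + beta * (1.0 - c)
        for m, a, c in zip(models, acc_n, cost_n)
    }
    best = max(priority, key=priority.get)
    return best, priority
```

With two similar tasks and two candidate models, a larger alpha favors the more accurate model and a larger beta favors the cheaper one, which matches the positive and negative correlations required by claim 3.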

Description

Model determination method for medical examination report processing

Technical Field

The invention relates to the technical field of medical examination report processing, and in particular to a model determination method for medical examination report processing.

Background

In modern clinical diagnosis, treatment, and medical research, automatic analysis of medical examination reports with machine learning models has become a key technology for assisting doctors' decisions and improving the efficiency of diagnosis and treatment. One of the core tasks of such analysis is to predict or generate a value for a specific comprehensive keyword, which is not raw data in the report but a higher-order medical index that a model derives from multiple raw data items in the report. Medical examination report processing tasks differ in complexity, and different models differ in inference accuracy and calculation cost on the same task, so how to select a suitable model for a specific medical examination report processing and analysis task is a problem that urgently needs to be solved.

Disclosure of Invention

The object of the invention is to provide a model determination method for medical examination report processing that selects a suitable model for a specific medical examination report processing and analysis task.
According to the present invention, there is provided a model determination method for medical examination report processing, the method comprising the steps of:

S100, acquiring a first feature vector F1 of the preset keywords corresponding to a comprehensive keyword, wherein F1 comprises the number of the preset keywords corresponding to the comprehensive keyword, data type distribution information of those preset keywords, and the number of corresponding training samples; a preset keyword is a keyword whose corresponding value is to be extracted from a medical examination report, and the comprehensive keyword is a keyword whose corresponding value is to be predicted from the values of the preset keywords.

S200, obtaining the similarity between F1 and the feature vector of each historical task in a preset meta-knowledge base, wherein the meta-knowledge base comprises the feature vectors of a plurality of historical tasks and, for each historical task, the performance data of each candidate model in the candidate model set.

S300, based on the similarity, selecting from the meta-knowledge base the k historical tasks most similar to the current task and constructing a similar task set from them, wherein k is a preset selection number and the current task is to predict the value corresponding to the comprehensive keyword from the values of the preset keywords corresponding to the comprehensive keyword.
S400, acquiring historical performance data of each candidate model on the similar task set from the meta-knowledge base, and obtaining the expected performance of each candidate model on the current task based on that historical performance data, wherein the expected performance at least comprises an expected accuracy and an expected calculation cost.

S500, determining the optimal model corresponding to the current task according to the expected performance of each candidate model in the candidate model set on the current task.

Compared with the prior art, the invention has at least the following beneficial effects:

The invention obtains the similarity between the first feature vector of the preset keywords corresponding to the comprehensive keyword and the feature vectors of the historical tasks in the meta-knowledge base, identifies the historical tasks similar to the current task, and makes its recommendation based on how the candidate models in the candidate model set performed on those similar historical tasks. Moreover, because the recommendation only requires computing task similarity and does not require actually running every candidate model on the current task, model recommendation completes quickly and efficiently.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.

FIG. 1 is a flow chart of a model determination method for medical examination report processing provided by an embodiment of the present invention.
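The front half of the flow, steps S100 to S300 of the disclosure, could be sketched roughly as follows. This is an illustration under stated assumptions, not the applicant's implementation: the ordering of the feature vector entries, the layout of the meta-knowledge base as a dictionary with a "features" key, and all function names are hypothetical. The data type labels follow claim 6 and the cosine similarity follows claim 8. The source does not say whether features are rescaled before comparison, so none is applied here.

```python
import math

DATA_TYPES = ("numeric", "category", "text")  # data types named in claim 6


def build_feature_vector(keyword_types, n_training_samples):
    """S100 sketch: keyword count, share of each data type, sample count.

    keyword_types: one data type label per preset keyword (hypothetical input).
    """
    n = len(keyword_types)
    if n == 0:
        raise ValueError("at least one preset keyword is required")
    shares = [keyword_types.count(t) / n for t in DATA_TYPES]
    return [float(n), *shares, float(n_training_samples)]


def cosine_similarity(u, v):
    """S200 sketch: cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def top_k_similar(f1, meta_kb, k):
    """S300 sketch: return the k (task_id, similarity) pairs with highest similarity.

    meta_kb: {task_id: {"features": [...]}} (hypothetical layout).
    """
    scored = [
        (task_id, cosine_similarity(f1, entry["features"]))
        for task_id, entry in meta_kb.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because the raw training-sample count is typically much larger than the other entries, it dominates an unscaled cosine similarity; a practical system would likely rescale the features first, but that choice is outside what the source specifies.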
Detailed Description

The following description of the embodiments of the present invention will be made c