
CN-121979986-A - Training method and device for medical question-answering model, electronic equipment, storage medium and program product

Abstract

The invention discloses a training method and device for a medical question-answering model, together with electronic equipment, a storage medium and a program product. The training method comprises: obtaining an initial medical question-answering model to be trained; performing a first stage of model parameter adjustment on the initial medical question-answering model based on medical theory question-answer pairs to obtain a first medical question-answering model; performing a second stage of model parameter adjustment on the first medical question-answering model based on medical record sample data to obtain a second medical question-answering model; and performing a third stage of model parameter adjustment on the second medical question-answering model based on a medical question-answer sample set to obtain a third medical question-answering model, wherein the medical question-answer sample set comprises medical questions together with positive answer information and negative answer information for those questions. This realizes multi-stage training of the medical question-answering model and improves its accuracy.
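The staged ordering described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Model` dictionary, `train_in_stages`, `_bump` and the `STAGES` list are hypothetical names, and each stage only increments a counter instead of computing a real loss and gradient update.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is just a dict of named numbers; a real LLM is far larger.
Model = Dict[str, float]
# A stage pairs a label with a function that returns an adjusted copy of the model.
Stage = Tuple[str, Callable[[Model], Model]]


def train_in_stages(initial_model: Model, stages: List[Stage]) -> Tuple[Model, List[str]]:
    """Apply each stage in order, feeding the output of one stage into the next."""
    model = dict(initial_model)  # copy so the initial model is left untouched
    completed: List[str] = []
    for name, adjust in stages:
        model = adjust(model)
        completed.append(name)
    return model, completed


def _bump(key: str) -> Callable[[Model], Model]:
    """Hypothetical stand-in for a parameter adjustment: increments one counter."""
    def adjust(model: Model) -> Model:
        updated = dict(model)
        updated[key] = updated.get(key, 0.0) + 1.0
        return updated
    return adjust


# The three stages in the order the abstract gives them.
STAGES: List[Stage] = [
    ("stage 1: medical theory question-answer pairs", _bump("theory_updates")),
    ("stage 2: medical record sample data", _bump("record_updates")),
    ("stage 3: positive/negative answer samples", _bump("preference_updates")),
]

initial = {"theory_updates": 0.0}
third_model, completed = train_in_stages(initial, STAGES)
print(completed)
print(third_model)
```

The point of the sketch is only the data flow: each stage consumes the model produced by the previous stage, in contrast to the single-stage approach in which all training data are mixed.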

Inventors

  • JIANG SHENG
  • WANG LANYE
  • ZHAO DINGYI

Assignees

  • 微医云(杭州)控股有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-16

Claims (10)

  1. A method of training a medical question-answering model, comprising: acquiring an initial medical question-answering model to be trained; performing a first stage of model parameter adjustment on the initial medical question-answering model based on a medical theory question-answer pair to obtain a first medical question-answering model; performing a second stage of model parameter adjustment on the first medical question-answering model based on medical record sample data to obtain a second medical question-answering model; and performing a third stage of model parameter adjustment on the second medical question-answering model based on a medical question-answer sample set to obtain a third medical question-answering model, wherein the medical question-answer sample set comprises medical questions and positive answer information and negative answer information of the medical questions.
  2. The method according to claim 1, wherein performing the first stage of model parameter adjustment on the initial medical question-answering model based on the medical theory question-answer pair to obtain the first medical question-answering model comprises: acquiring an actual entity and an actual entity attribute in the medical theory question-answer pair, wherein the actual entity attribute comprises an entity position and an entity type; inputting the medical theory question-answer pair into the initial medical question-answering model to obtain a predicted entity and a predicted entity attribute produced by the initial medical question-answering model while processing the medical theory question-answer pair; and generating a first loss function based on the actual entity, the actual entity attribute, the predicted entity and the predicted entity attribute, and performing the first stage of model parameter adjustment on the initial medical question-answering model to obtain the first medical question-answering model.
  3. The method according to claim 2, wherein the method further comprises: identifying a weight value corresponding to each actual entity based on preset entity weight configuration information; and generating a second loss function based on the actual entity, the actual entity attribute, the predicted entity attribute and the weight value corresponding to the actual entity, and performing the first stage of model parameter adjustment on the initial medical question-answering model to obtain the first medical question-answering model.
  4. The method of claim 3, wherein the entity types include medical entities and non-medical entities, the weight value of a medical entity being greater than the weight value of a non-medical entity; and wherein the entity type is determined by: matching the actual entity against a medical entity set to obtain a matching result, and determining the entity type based on the matching result.
  5. The method according to claim 1, wherein performing the second stage of model parameter adjustment on the first medical question-answering model based on the medical record sample data to obtain the second medical question-answering model comprises: acquiring historical medical record data and theoretical analysis link data in the medical record sample data; inputting the historical medical record data into the first medical question-answering model to obtain training analysis link data output by the first medical question-answering model; and generating a third loss function based on the theoretical analysis link data and the training analysis link data, and performing the second stage of model parameter adjustment on the first medical question-answering model to obtain the second medical question-answering model.
  6. The method according to claim 1, wherein performing the third stage of model parameter adjustment on the second medical question-answering model based on the medical question-answer sample set to obtain the third medical question-answering model comprises: inputting the medical question-answer sample group into the second medical question-answering model to obtain a selection result of the second medical question-answering model between the positive answer information and the negative answer information in the medical question-answer sample group; generating a fourth loss function based on the selection result; and performing the third stage of model parameter adjustment on the second medical question-answering model based on the fourth loss function to obtain the third medical question-answering model.
  7. A training device for a medical question-answering model, comprising: an initial medical question-answering model acquisition module, used for acquiring an initial medical question-answering model to be trained; a first-stage training module, used for performing a first stage of model parameter adjustment on the initial medical question-answering model based on a medical theory question-answer pair to obtain a first medical question-answering model; a second-stage training module, used for performing a second stage of model parameter adjustment on the first medical question-answering model based on medical record sample data to obtain a second medical question-answering model; and a third-stage training module, used for performing a third stage of model parameter adjustment on the second medical question-answering model based on a medical question-answer sample set to obtain a third medical question-answering model, wherein the medical question-answer sample set comprises medical questions and positive answer information and negative answer information of the medical questions.
  8. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the training method of the medical question-answering model of any one of claims 1-6.
  9. A computer-readable storage medium storing computer instructions which, when executed, cause a processor to perform the method of training the medical question-answering model of any one of claims 1-6.
  10. A computer program product, characterized in that it comprises a computer program which, when executed by a processor, implements the training method of a medical question-answering model according to any one of claims 1-6.
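Claims 3, 4 and 6 describe loss terms without giving formulas, so the following is only one plausible reading, sketched under explicit assumptions. The weighted negative log-likelihood in `weighted_entity_loss` and the pairwise logistic loss in `preference_loss` are swapped-in standard techniques, not forms stated in the claims; the entity set, the weight values and all function names are hypothetical.

```python
import math
from typing import Dict, List, Set

# Hypothetical medical entity set and preset weight configuration (claim 4 only
# requires that medical entities weigh more than non-medical ones).
MEDICAL_ENTITIES: Set[str] = {"hypertension", "aspirin"}
ENTITY_WEIGHTS: Dict[str, float] = {"medical": 2.0, "non_medical": 1.0}


def entity_type(entity: str) -> str:
    """Determine the entity type by matching against the medical entity set."""
    return "medical" if entity.lower() in MEDICAL_ENTITIES else "non_medical"


def weighted_entity_loss(actual_entities: List[str], predicted_probs: List[float]) -> float:
    """Weighted negative log-likelihood (assumed form of the entity loss).

    predicted_probs[i] is the probability the model assigned to the i-th
    actual entity; errors on medical entities are weighted more heavily.
    """
    if len(actual_entities) != len(predicted_probs):
        raise ValueError("one probability is required per actual entity")
    total = 0.0
    weight_sum = 0.0
    for entity, prob in zip(actual_entities, predicted_probs):
        weight = ENTITY_WEIGHTS[entity_type(entity)]
        total += -weight * math.log(max(prob, 1e-12))  # clamp to avoid log(0)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def preference_loss(score_positive: float, score_negative: float) -> float:
    """Pairwise logistic loss (assumed form): log(1 + exp(-(s_pos - s_neg))).

    Small when the positive answer scores above the negative answer.
    """
    margin = score_positive - score_negative
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # Equivalent rearrangement that avoids overflow for large negative margins.
    return -margin + math.log1p(math.exp(margin))
```

With these assumed weights, a low probability on `"aspirin"` raises the loss twice as much as the same low probability on a non-medical entity, and `preference_loss` decreases as the model scores the positive answer further above the negative one.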

Description

Training method and device for medical question-answering model, electronic equipment, storage medium and program product

Technical Field

The present invention relates to the field of medical technology, and in particular to a training method, apparatus, electronic device, storage medium, and program product for a medical question-answering model.

Background

In the medical field, large language models (LLMs) based on the Transformer architecture play an important role in medical question-answering scenarios. How an LLM is trained therefore has an important impact on its accuracy.

In the prior art, multi-source heterogeneous data, such as basic-theory training data and clinical-case training data, are mixed together, and the LLM is then trained in a single stage. In single-stage training the LLM cannot be trained in a targeted way for the different types of training data, so the LLM suffers from low accuracy.

Disclosure of Invention

The invention provides a training method, a device, electronic equipment, a storage medium and a program product for a medical question-answering model, so as to realize multi-stage training of the medical question-answering model.

According to an aspect of the present invention, there is provided a training method of a medical question-answering model, including: acquiring an initial medical question-answering model to be trained; performing a first stage of model parameter adjustment on the initial medical question-answering model based on the medical theory question-answer pair to obtain a first medical question-answering model; performing a second stage of model parameter adjustment on the first medical question-answering model based on the medical record sample data to obtain a second medical question-answering model; and performing a third stage of model parameter adjustment on the second medical question-answering model based on the medical question-answer sample set to obtain a third medical question-answering model, wherein the medical question-answer sample set comprises medical questions and positive answer information and negative answer information of the medical questions.

According to another aspect of the present invention, there is provided a training apparatus of a medical question-answering model, including: an initial medical question-answering model acquisition module, used for acquiring an initial medical question-answering model to be trained; a first-stage training module, used for performing a first stage of model parameter adjustment on the initial medical question-answering model based on the medical theory question-answer pair to obtain a first medical question-answering model; a second-stage training module, used for performing a second stage of model parameter adjustment on the first medical question-answering model based on the medical record sample data to obtain a second medical question-answering model; and a third-stage training module, used for performing a third stage of model parameter adjustment on the second medical question-answering model based on the medical question-answer sample set to obtain a third medical question-answering model, wherein the medical question-answer sample set comprises medical questions and positive answer information and negative answer information of the medical questions.

According to another aspect of the present invention, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the training method of the medical question-answering model of any one of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the training method of a medical question-answering model provided by any one of the embodiments of the present invention.

According to another aspect of the invention, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the training method of a medical question-answering model provided by any one of the embodiments of the invention.

According to this technical solution, acquiring an initial medical question-answering model to be trained provides a model foundation for subsequent analysis and processing. Performing the first stage of model parameter adjustment on the initial medical question-answering model based on the medical theory question-answer pair yields the first medical question-answering model; this realizes the first stage of training and improves the accuracy of the first medical question-answering model in the medical entity recognition dimension. A second-stage model parameter adjustment is carried out on the first medical question-a