CN-121983303-A - Artificial intelligence-based generative artificial pre-examination system and method


Abstract

The invention relates to the technical field of intelligent pre-examination and discloses a generative artificial pre-examination system and method based on artificial intelligence. The system comprises a multi-modal information acquisition module, a collaborative reasoning analysis module and a decision output module. The multi-modal information acquisition module acquires text, facial image and voice information of a patient. The collaborative reasoning analysis module performs feature extraction and weighted fusion on the text, facial image and voice information to form a multi-modal feature representation, then inputs the multi-modal feature representation into a generative large language model for reasoning and outputs a preliminary triage scheme with a scheme confidence. The decision output module generates inquiry prompt information when the scheme confidence is insufficient, and performs reasoning again according to supplementary information fed back by a user to output the final triage scheme. The method corresponds to the system. Through multi-modal information fusion and intelligent interactive verification, the invention assists medical staff in completing pre-examination triage rapidly and accurately, and can improve the efficiency and reliability of triage work.

Inventors

QIAO PANPAN
Yang Haomei
LU MEI

Assignees

Guangzhou Women and Children's Medical Center, Guangzhou Medical University (广州医科大学附属妇女儿童医疗中心)

Dates

Publication Date: 2026-05-05
Application Date: 2025-12-26

Claims (10)

1. A generative artificial pre-examination system based on artificial intelligence, characterized by comprising: a multi-modal information acquisition module configured to acquire text chief-complaint information, facial image information and voice information of a patient; a collaborative reasoning analysis module connected to the multi-modal information acquisition module and comprising: an information feature extraction unit configured to perform feature extraction on the text chief-complaint information, the facial image information and the voice information to obtain a text feature vector, a facial feature vector and a voice feature vector, respectively; and a multi-modal fusion reasoning unit configured to perform weighted fusion on the text feature vector, the facial feature vector and the voice feature vector to form a unified multi-modal feature representation, and to input the multi-modal feature representation into a pre-trained generative large language model, wherein the generative large language model performs collaborative reasoning based on the multi-modal feature representation and outputs a preliminary triage scheme comprising a triage department, an emergency level and a treatment suggestion, together with a scheme comprehensive confidence corresponding to the preliminary triage scheme; and a decision output module connected to the collaborative reasoning analysis module and configured to: when the scheme comprehensive confidence of the preliminary triage scheme is lower than a confidence threshold, generate at least one piece of inquiry prompt information based on the multi-modal feature representation and return it to the user; receive supplementary information fed back by the user in response to the inquiry prompt information; and trigger the collaborative reasoning analysis module to reason again based on the supplementary information and the multi-modal feature representation, so as to generate and output a final triage scheme.
2. The generative artificial pre-examination system based on artificial intelligence according to claim 1, wherein the information feature extraction unit comprises a text feature extraction subunit, a facial feature extraction subunit and a voice feature extraction subunit arranged in parallel, wherein: the facial feature extraction subunit adopts an attention-based few-shot learning neural network and is configured to receive the facial image information, extract a facial action unit code sequence and encode it into the facial feature vector; the text feature extraction subunit adopts a clinical text embedding model and is configured to receive the text chief-complaint information, extract semantic features and encode them into the text feature vector; and the voice feature extraction subunit adopts an audio spectral feature extraction network and is configured to receive the voice information, extract acoustic features and encode them into the voice feature vector.
3. The generative artificial pre-examination system based on artificial intelligence according to claim 1, wherein the weighted fusion comprises: invoking a weight distribution model, wherein the weight distribution model dynamically calculates the weight coefficients corresponding to the text feature vector, the facial feature vector and the voice feature vector based on a semantic integrity score of the text chief-complaint information, a key-point clarity score of the facial image information and a signal-to-noise ratio score of the voice information; and performing a weighted summation of the text feature vector, the facial feature vector and the voice feature vector according to the weight coefficients to generate the multi-modal feature representation.
4. The generative artificial pre-examination system based on artificial intelligence according to claim 1, further comprising a retrieval-augmented generation module, the retrieval-augmented generation module comprising: a pediatric medical knowledge base storing a structured pediatric disease-symptom graph and triage rules; and a knowledge retrieval module connected to the pediatric medical knowledge base and to the generative large language model and configured to retrieve relevant medical knowledge fragments from the pediatric medical knowledge base, using the text feature vector and the facial feature vector as the query basis, before the generative large language model performs collaborative reasoning; wherein the generative large language model is further configured to reason with the medical knowledge fragments as augmented context, in combination with the multi-modal feature representation, to generate the preliminary triage scheme.
5. The generative artificial pre-examination system based on artificial intelligence according to claim 4, wherein the construction of the pediatric medical knowledge base comprises: extracting symptom entities, disease entities and the logical relationships between the entities from standard pediatric diagnosis and treatment guidelines and historical electronic medical records, and constructing a pediatric symptom-disease knowledge graph; and converting the entities, relationships and associated text descriptions in the pediatric symptom-disease knowledge graph into high-dimensional vectors, and storing the high-dimensional vectors in a vector database for similarity retrieval by the retrieval augmentation unit.
6. The generative artificial pre-examination system based on artificial intelligence according to claim 4, wherein the knowledge retrieval module is further configured to: take the text feature vector and the facial feature vector as initial query vectors and perform a first round of retrieval from the pediatric medical knowledge base to obtain an initial medical knowledge fragment set; input the initial medical knowledge fragment set together with the multi-modal feature representation into the generative large language model, which evaluates the content relevance of each fragment in the initial medical knowledge fragment set and generates one or more deepened query vectors based on the fragments whose content relevance is below a preset relevance threshold; and take the deepened query vectors as a new query basis and perform a second round of retrieval from the pediatric medical knowledge base to obtain a supplementary medical knowledge fragment set; wherein the medical knowledge fragments ultimately used by the generative large language model for reasoning comprise the initial medical knowledge fragment set and the supplementary medical knowledge fragment set.
7. The generative artificial pre-examination system based on artificial intelligence according to claim 1, wherein the generation of the inquiry prompt information comprises: performing a feature dimension reliability analysis on the multi-modal feature representation, and identifying feature dimensions whose reliability is lower than a preset dimension threshold or dimension combinations that are in logical conflict; and constructing, based on the feature dimensions whose reliability is lower than the preset dimension threshold or the dimension combinations in logical conflict, question content for clarifying or supplementing the information of those dimensions, and taking the question content as the inquiry prompt information.
8. The generative artificial pre-examination system based on artificial intelligence according to claim 1, further comprising a continuous learning optimization module configured to: collect the input data of the multi-modal information acquisition module, the final triage scheme output by the system, and the final diagnosis and treatment result of the patient recorded in the hospital information system; when the final triage scheme is inconsistent with the final diagnosis and treatment result, form a training sample from the corresponding input data and the final diagnosis and treatment result and add it to an incremental training set; and use the incremental training set to periodically fine-tune the parameters of the generative large language model by a parameter-efficient fine-tuning method.
9. The generative artificial pre-examination system based on artificial intelligence according to claim 1, wherein the multi-modal information acquisition module is further configured to: when the patient is identified as a child, provide a graphical interface for selecting the child's physical symptoms to assist the guardian in describing the chief complaint, and activate a preset dedicated audio processing channel tailored to the characteristics of children's voices, so as to optimize noise reduction and feature extraction for a child's crying and unclear pronunciation.
10. A generative artificial pre-examination method based on artificial intelligence, characterized by comprising the following steps: acquiring text chief-complaint information, facial image information and voice information of a patient; a collaborative reasoning analysis step of performing feature extraction on the text chief-complaint information, the facial image information and the voice information to obtain a text feature vector, a facial feature vector and a voice feature vector, respectively, and performing weighted fusion on the text feature vector, the facial feature vector and the voice feature vector to form a unified multi-modal feature representation; and a decision output step of judging whether the scheme comprehensive confidence of the preliminary triage scheme is lower than a confidence threshold and, if so, generating at least one piece of inquiry prompt information based on the multi-modal feature representation and returning it to a user, receiving supplementary information fed back by the user in response to the inquiry prompt information, and reasoning again based on the supplementary information and the multi-modal feature representation to generate and output a final triage scheme.
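
As a non-limiting illustration of the weighted fusion recited in claim 3, the following Python sketch computes weight coefficients from the three quality scores and forms the weighted sum of the three feature vectors. The claims do not specify the form of the weight distribution model; the softmax mapping used here, the function names `softmax` and `fuse_modalities`, and all example values are assumptions made for illustration only. The sketch also assumes the three feature vectors have already been projected to a common dimension.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw quality scores to positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(
    text_vec: Sequence[float],
    face_vec: Sequence[float],
    voice_vec: Sequence[float],
    semantic_integrity: float,
    keypoint_clarity: float,
    snr_score: float,
) -> Tuple[List[float], List[float]]:
    """Return (fused representation, weight coefficients).

    Assumption: a softmax over the three quality scores stands in for
    the weight distribution model, whose form the claims leave open.
    """
    vectors = [text_vec, face_vec, voice_vec]
    dim = len(text_vec)
    if any(len(v) != dim for v in vectors):
        raise ValueError("feature vectors must share one dimension")
    weights = softmax([semantic_integrity, keypoint_clarity, snr_score])
    # Weighted summation, element by element.
    fused = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return fused, weights


# Hypothetical example: equal quality scores yield equal weight coefficients.
fused, weights = fuse_modalities(
    [3.0, 0.0], [0.0, 3.0], [0.0, 0.0],
    semantic_integrity=0.5, keypoint_clarity=0.5, snr_score=0.5,
)
```

Under this assumed mapping, a modality with a higher quality score (for example, a clear facial image alongside a noisy recording) contributes more to the fused representation, which is the behavior the dynamic weighting of claim 3 describes.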

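The decision output logic of claims 1 and 10 may likewise be sketched, in a non-limiting way, as a confidence-gated control flow. The generative large language model, the inquiry prompt generator and the user interaction are represented by caller-supplied functions; the names `TriagePlan` and `decide`, the threshold value 0.8 and the stub behavior are hypothetical. The claims do not state what happens when the confidence is not below the threshold; the sketch assumes the preliminary triage scheme is then output as the final one.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class TriagePlan:
    department: str
    emergency_level: int
    advice: str
    confidence: float  # scheme comprehensive confidence, assumed in [0, 1]


def decide(
    features: Sequence[float],
    infer: Callable[[Sequence[float], List[str]], TriagePlan],
    make_prompts: Callable[[Sequence[float]], List[str]],
    collect_answers: Callable[[List[str]], List[str]],
    threshold: float = 0.8,  # hypothetical confidence threshold
) -> TriagePlan:
    """Return the final triage plan after at most one round of inquiry."""
    preliminary = infer(features, [])
    if preliminary.confidence >= threshold:
        # Assumption: a sufficiently confident preliminary plan is final.
        return preliminary
    prompts = make_prompts(features)          # inquiry prompt information
    supplementary = collect_answers(prompts)  # feedback from the user
    return infer(features, supplementary)     # reason again


# Hypothetical stubs standing in for the model and the user.
def stub_infer(features: Sequence[float], supplementary: List[str]) -> TriagePlan:
    if supplementary:
        return TriagePlan("pediatric emergency", 2, "prompt assessment", 0.9)
    return TriagePlan("pediatrics", 3, "routine queue", 0.5)


def stub_prompts(features: Sequence[float]) -> List[str]:
    return ["How long has the fever lasted?"]


def stub_answers(prompts: List[str]) -> List[str]:
    return ["two days" for _ in prompts]


final = decide([0.1, 0.2], stub_infer, stub_prompts, stub_answers)
```

With these stubs the first inference falls below the threshold, so one inquiry round is triggered and the second inference supplies the final plan.
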
Description

Artificial intelligence-based generative artificial pre-examination system and method

Technical Field

The invention relates to the technical field of intelligent pre-examination, and in particular to a generative artificial pre-examination system and method based on artificial intelligence.

Background

Currently, pre-examination triage in hospital emergency departments depends heavily on nurses' judgment from experience. Faced with growing patient volume and complex conditions, nurses must make grading decisions in a short time from scattered information such as a patient's chief complaint, facial expression and physical signs. The work pressure is heavy, and triage accuracy is easily affected by personal experience, communication ability and fatigue. Automatic triage systems designed for adults are difficult to apply directly to children, because children often cannot describe their symptoms accurately and show illness in distinctive ways (such as specific facial expressions and crying) that existing systems lack effective means to recognize and interpret. In addition, the prior art focuses on processing a single information source (such as the text chief complaint) and fails to integrate and jointly analyze multi-dimensional information such as the patient's facial expression and voice intonation, so the basis for triage is incomplete and the decision support is limited. Therefore, there is a need for an automatic pre-examination triage solution that intelligently fuses multi-modal information, adapts to pediatric characteristics and supports effective interactive verification with medical staff, so as to assist nurses and improve triage efficiency and accuracy.
Disclosure of Invention

The invention aims to provide a generative artificial pre-examination system and method based on artificial intelligence, so as to solve the technical problems described in the Background. To achieve this purpose, the invention discloses the following technical solutions.

In a first aspect, the invention discloses a generative artificial pre-examination system based on artificial intelligence, comprising: a multi-modal information acquisition module configured to acquire text chief-complaint information, facial image information and voice information of a patient; a collaborative reasoning analysis module connected to the multi-modal information acquisition module and comprising: an information feature extraction unit configured to perform feature extraction on the text chief-complaint information, the facial image information and the voice information to obtain a text feature vector, a facial feature vector and a voice feature vector, respectively; and a multi-modal fusion reasoning unit configured to perform weighted fusion on the text feature vector, the facial feature vector and the voice feature vector to form a unified multi-modal feature representation, and to input the multi-modal feature representation into a pre-trained generative large language model, wherein the generative large language model performs collaborative reasoning based on the multi-modal feature representation and outputs a preliminary triage scheme comprising a triage department, an emergency level and a treatment suggestion, together with a scheme comprehensive confidence corresponding to the preliminary triage scheme; and a decision output module connected to the collaborative reasoning analysis module and configured to: when the scheme comprehensive confidence of the preliminary triage scheme is lower than a confidence threshold, generate at least one piece of inquiry prompt information based on the multi-modal feature representation and return it to the user; receive supplementary information fed back by the user in response to the inquiry prompt information; and trigger the collaborative reasoning analysis module to reason again based on the supplementary information and the multi-modal feature representation, so as to generate and output a final triage scheme.

Optionally, the information feature extraction unit comprises a text feature extraction subunit, a facial feature extraction subunit and a voice feature extraction subunit arranged in parallel, wherein: the facial feature extraction subunit adopts an attention-based few-shot learning neural network and is configured to receive the facial image information, extract a facial action unit code sequence and encode it into the facial feature vector; the text feature extraction subunit adopts a clinical text embedding model and is configured to receive the text chief-complaint information, extract semantic features and encode them into the text feature vector; the voice feature extraction subunit adopts an audio spectr