CN-122025111-A - Medical multi-mode data fusion analysis method and system based on cross-mode alignment
Abstract
The application provides a medical multi-modal data fusion analysis method and system based on cross-modal alignment. It relates to the field of artificial intelligence and addresses technical problems in the prior art such as uneven data quality, semantic alignment deviation, and inflexible fusion weight distribution. The method comprises the steps of: obtaining multi-modal medical data and carrying out quality grading evaluation to obtain a modality quality score for each modality; performing semantic alignment on the quality-graded multi-modal data with a dual-stage alignment algorithm of intra-modality feature enhancement and cross-modality semantic calibration to obtain a semantic alignment result; performing feature fusion on the semantic alignment result with a dynamic weight model based on knowledge graph relevance, clinical guideline priority, and the modality quality scores; generating a structured diagnosis report from the fusion result with an artificial-intelligence-generated-content (AIGC) model; carrying out a clinical compliance check and sensitivity analysis; and feeding the check and analysis results back to the feature fusion step for optimization.
Inventors
- ZHAO RAN
- HU HONGXIANG
- WANG JIEPING
- HU ZHIYUAN
Assignees
- 杭州象限数智科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (10)
- 1. A medical multi-modal data fusion analysis method based on cross-modal alignment, characterized by comprising the following steps: acquiring multi-modal medical data and carrying out quality grading evaluation to obtain the modality quality score of each modality, wherein the multi-modal medical data comprises at least medical image data, electronic medical record text data and physiological signal time-series data; carrying out semantic alignment on the quality-graded multi-modal data by adopting a dual-stage alignment algorithm of intra-modality feature enhancement and cross-modality semantic calibration to obtain a semantic alignment result; carrying out feature fusion on the semantic alignment result by adopting a dynamic weight model based on the knowledge graph association degree, the clinical guideline priority and the modality quality scores; and generating a structured diagnosis report from the fusion result through an artificial-intelligence-generated-content (AIGC) model, carrying out a clinical compliance check and sensitivity analysis, and feeding the check and analysis results back to the feature fusion step for optimization.
- 2. The cross-modal alignment based medical multi-modal data fusion analysis method of claim 1, wherein the acquiring multi-modal medical data and performing quality grading assessment comprises: for medical image data, calculating spatial coverage as the ratio of the number of pixels actually covered by the focus area to the total number of pixels of the whole medical image, calculating a first information density score with an information entropy algorithm, and weighting and summing the spatial coverage and the first information density score to obtain a comprehensive score for the image data; for electronic medical record text data, calculating a first text index as the ratio of the number of filled core fields to the total number of core fields, calculating a second text index as the ratio of the number of terms in the medical record that match preset medical standard terms to the total number of related terms in the medical record, calculating a third text index as the ratio of the number of fields whose format conforms to the HL7 medical data exchange standard to the total number of fields in the medical record, calculating a second information density score with an information entropy algorithm, and weighting and summing the first text index, the second text index, the third text index and the second information density score to obtain a comprehensive score for the text data; for physiological signal time-series data, calculating a time-series continuity score as the ratio of the continuous signal duration to the total signal acquisition duration within the acquisition period, calculating a third information density score with an information entropy algorithm, and weighting and summing the time-series continuity score and the third information density score to obtain a comprehensive score for the time-series data; and grading the quality of each modality's data according to its comprehensive score against grade thresholds, and setting a differentiated post-processing strategy for each grade.
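The entropy-plus-ratio scoring described in claim 2 can be sketched in a few lines. This is an illustrative assumption, not the patent's implementation: the bin count, the weights `w`, and the grade thresholds (0.7 / 0.4) are invented for demonstration only.

```python
# Sketch of the claim-2 quality grading: a normalized Shannon-entropy
# "information density" score is combined with a modality-specific coverage
# ratio by weighted summation, then mapped to a grade by fixed thresholds.
import math
from collections import Counter

def entropy_score(values, bins=16):
    """Normalized Shannon entropy of a discretized signal, in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    idx = [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]
    counts = Counter(idx)
    n = len(values)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h / math.log2(bins)  # divide by maximum possible entropy

def image_score(lesion_pixels, total_pixels, pixel_values, w=(0.5, 0.5)):
    """Weighted sum of spatial coverage and the first information density score."""
    coverage = lesion_pixels / total_pixels
    density = entropy_score(pixel_values)
    return w[0] * coverage + w[1] * density

def grade(score, thresholds=(0.7, 0.4)):
    """Map a comprehensive score to a quality grade (thresholds are assumed)."""
    return "high" if score >= thresholds[0] else "medium" if score >= thresholds[1] else "low"

s = image_score(1200, 10000, [0.1, 0.3, 0.5, 0.7, 0.9, 0.2, 0.8, 0.4])
print(grade(s))  # prints 'medium' for this toy input
```

The text and time-series scores would follow the same pattern, swapping in the field-completeness and continuity ratios respectively.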
- 3. The cross-modal alignment-based medical multi-modal data fusion analysis method as claimed in claim 1, wherein the dual-stage alignment algorithm comprises an intra-modality feature enhancement stage and a cross-modality semantic calibration stage, wherein: the intra-modality feature enhancement stage receives the quality-graded medical image data, electronic medical record text data and physiological signal time-series data, and enhances the features of each modality through a pre-trained neural network model to obtain enhanced feature blocks for each modality; and the cross-modality semantic calibration stage performs clinical semantic association mapping and consistency verification on the enhanced feature blocks in combination with medical knowledge graph embeddings, strengthening the semantic associations among different modalities to obtain a unified semantic vector conforming to clinical logic.
- 4. The cross-modal alignment-based medical multi-modal data fusion analysis method as claimed in claim 3, wherein the cross-modal semantic calibration stage comprises a medical memory generation layer, a cross-modal attention layer and a semantic consistency check layer, wherein: the medical memory generation layer receives a medical knowledge graph and medical pairing data, and learns clinical association rules through the generator of a generative adversarial network (GAN) to obtain a three-dimensional medical memory matrix, whose dimensions index modality type, standard term and disease name; the matrix represents the association strength between modal features and medical entities and provides medical constraints for semantic mapping; the medical pairing data are clinical pairing triplets of image data, electronic medical record text data and physiological signal time-series data, each triplet carrying a diagnosis association label confirmed by clinical experts; the cross-modal attention layer receives the enhanced feature blocks of all modalities and the three-dimensional medical memory matrix, constructs the key and value vectors of an attention mechanism from the enhanced feature blocks, constructs the query vector from the embeddings of the target disease candidate set entities in the medical knowledge graph, and performs cross-modal semantic mapping with the three-dimensional medical memory matrix as the constraint on the attention weights to obtain a preliminary unified semantic vector; and the semantic consistency check layer receives the preliminary unified semantic vector and the ideal alignment vector of each modality, calculates semantic consistency scores and performs dynamic calibration to obtain the final unified semantic vector, i.e., the semantic alignment result.
- 5. The cross-modal alignment-based medical multi-modal data fusion analysis method of claim 4, wherein the medical knowledge graph construction process comprises: collecting medical data sources, including National Comprehensive Cancer Network (NCCN) clinical guidelines, Chinese clinical diagnosis and treatment guidelines, medical journal literature, clinical diagnosis and treatment data and expert consensus; extracting multiple types of entities and the clinical association relations among them from the medical data sources, wherein the entities comprise diseases, symptoms, examination indicators, image features, medicines and physiological signals; uniformly labeling the entities with preset medical standard terms, ICD-10 disease codes and LOINC test indicator codes, and assigning grades to the clinical association relations according to clinical evidence, wherein the clinical evidence reflects the amount of clinical research data supporting the association relation and the degree of expert consensus; and storing the entities and clinical association relations in a graph database to obtain the medical knowledge graph.
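A minimal sketch of the claim-5 storage step, under the assumption of a plain dictionary standing in for the graph database the patent specifies; the example entities and codes (ICD-10 C34, LOINC 2039-6) are illustrative, not taken from the patent.

```python
# Toy knowledge-graph store: entities carry standard codes (ICD-10 / LOINC),
# and each clinical association relation carries an evidence grade reflecting
# supporting clinical research volume and expert consensus.
graph = {"entities": {}, "relations": []}

def add_entity(name, etype, code=None):
    graph["entities"][name] = {"type": etype, "code": code}

def add_relation(head, tail, rel, evidence_grade):
    """Store a graded clinical association relation between two entities."""
    graph["relations"].append(
        {"head": head, "tail": tail, "rel": rel, "grade": evidence_grade})

add_entity("lung adenocarcinoma", "disease", code="ICD-10 C34")
add_entity("CEA", "examination_indicator", code="LOINC 2039-6")
add_relation("lung adenocarcinoma", "CEA", "elevated_in", evidence_grade=3)
```

A production system would replace the dictionary with a graph database query layer, but the entity/relation schema carries over unchanged.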
- 6. The cross-modal alignment-based medical multi-modal data fusion analysis method of claim 5, wherein the ideal alignment vector of each modality is obtained based on the entities and clinical association relations in the medical knowledge graph, comprising: extracting entities related to the target disease from the medical knowledge graph to form a core associated entity set; binding the core associated entity set to the corresponding modalities to form a modality-entity mapping table; taking the enhanced features of each modality obtained from historical multi-modal medical data as the basic feature vectors of each modality; pre-training the medical knowledge graph with the TransE algorithm to obtain the knowledge graph embedding vector of each entity in the core associated entity set; extracting the association relation grade between the core associated entity set and the target disease, and converting the grades into training weights; constructing a loss function L = Σ_{i=1}^{n} w_i · ‖f_i − e_i‖², wherein f_i is the modal basic feature vector bound to the ith core entity, e_i is the knowledge graph embedding vector of the ith core entity, n is the number of entities in the core associated entity set, and w_i is the training weight of the ith core entity; inputting the modal basic feature vectors, the knowledge graph embedding vectors and the training weights into a modal feature alignment training network built from fully connected layers, minimizing the loss function by gradient descent, and aligning the modal basic feature vectors to obtain the alignment vector a_i of each core entity; and calculating the weight ratio of each core entity as α_i = w_i / Σ_{j=1}^{n} w_j, then weighting and fusing the alignment vectors of all core entities under the same modality to obtain the ideal alignment vector A_i of modality i.
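One way to read the claim-6 training step is as a weighted least-squares alignment. The sketch below is a non-authoritative stand-in for the fully connected alignment network: it learns a single linear map W by gradient descent on a weighted loss Σ_i w_i·‖W·f_i − e_i‖², then fuses the aligned vectors with weight ratios α_i = w_i / Σ_j w_j. The dimensionality, learning rate, and step count are arbitrary demonstration choices.

```python
def train_alignment(feats, embeds, weights, lr=0.05, steps=500):
    """Gradient descent on W minimizing sum_i w_i * ||W f_i - e_i||^2."""
    d = len(feats[0])
    W = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]  # identity init
    for _ in range(steps):
        grad = [[0.0] * d for _ in range(d)]
        for f, e, w in zip(feats, embeds, weights):
            Wf = [sum(W[r][c] * f[c] for c in range(d)) for r in range(d)]
            for r in range(d):
                for c in range(d):
                    grad[r][c] += 2 * w * (Wf[r] - e[r]) * f[c]
        for r in range(d):
            for c in range(d):
                W[r][c] -= lr * grad[r][c]
    return W

def ideal_alignment(aligned, weights):
    """Weighted fusion: alpha_i = w_i / sum_j w_j, then sum_i alpha_i * a_i."""
    total = sum(weights)
    d = len(aligned[0])
    return [sum(w / total * a[k] for a, w in zip(aligned, weights))
            for k in range(d)]

# Two core entities in a 2-D toy space; higher-graded entities pull harder.
W = train_alignment([[1.0, 0.0], [0.0, 1.0]],
                    [[0.5, 0.5], [0.2, 0.8]], weights=[2.0, 1.0])
```

With enough steps the learned map reproduces each knowledge-graph embedding, mirroring the claim's "alignment training" of the modal basic feature vectors.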
- 7. The cross-modal alignment-based medical multi-modal data fusion analysis method of claim 4, wherein the calculation flow of the cross-modal attention layer is as follows: computing the base attention weight A = softmax(Q·Kᵀ / √d), wherein Q is the query vector built from the embeddings of the target disease candidate set entities, K is the key vector spliced from the enhanced feature blocks of each modality, and d is the feature dimension; adjusting the weights based on the base attention weight and the three-dimensional medical memory matrix M as A′ = normalize(A ⊙ M), wherein A′ is the attention weight matrix after medical constraint and ⊙ denotes element-wise multiplication; and computing the preliminary unified semantic vector F = A′·V based on the medically constrained attention weight matrix, wherein V is the value vector spliced from the enhanced feature blocks of each modality.
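The attention layer of claim 7 is standard scaled dot-product attention with an extra reweighting step. In this sketch the constraint is applied as element-wise multiplication by the memory matrix followed by renormalization; that specific form is an assumption, since the patent defines the exact adjustment only in its (unreproduced) formulas.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def constrained_attention(Q, K, V, M):
    """Scaled dot-product attention whose weights are reshaped by a medical
    memory matrix M and renormalized, then applied to the value vectors."""
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))      # base attention weights
    A = A * M                               # medical-constraint reweighting
    A = A / A.sum(axis=-1, keepdims=True)   # renormalize rows to sum to 1
    return A @ V                            # preliminary unified semantic vector

# One disease-entity query attending over two modality feature blocks.
Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 2.0], [3.0, 4.0]])
```

Setting a memory-matrix entry to zero forbids a modality/entity pairing outright, which is how the constraint injects medical knowledge into the mapping.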
- 8. The cross-modal alignment based medical multi-modal data fusion analysis method of claim 4, wherein the semantic consistency score S_i is calculated as S_i = ⟨F, A_i⟩ / (‖F‖ · ‖A_i‖), wherein F is the preliminary unified semantic vector, A_i is the ideal alignment vector of modality i, ⟨·,·⟩ denotes the vector inner product, and ‖·‖ denotes the vector norm.
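The claim-8 score is cosine similarity between the preliminary unified semantic vector and each modality's ideal alignment vector, which is a one-liner:

```python
import math

def semantic_consistency(f, a):
    """Cosine similarity: inner product of f and a divided by the product of
    their norms, as described for the claim-8 semantic consistency score."""
    dot = sum(x * y for x, y in zip(f, a))
    return dot / (math.sqrt(sum(x * x for x in f)) *
                  math.sqrt(sum(y * y for y in a)))
```

A score near 1 indicates the unified vector agrees with that modality's ideal alignment; low scores trigger the dynamic calibration step.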
- 9. The cross-modal alignment-based medical multi-modal data fusion analysis method according to claim 1, wherein the feature fusion of the semantic alignment result by using a dynamic weight model comprises: obtaining the unified semantic vector F corresponding to the semantic alignment result, and simultaneously extracting a medical knowledge graph relevance vector R, a clinical guideline priority vector P and a modality quality score vector Q, wherein R is the clinical association strength of each modality with the target disease extracted from the medical knowledge graph, and P is the guideline-recommended priority of each modality, set based on diagnosis and treatment guidelines; setting an initial weight vector w₀ and computing the final modal weight w from the dynamic weight formula combining w₀, R, P and Q; performing feature fusion over the attention network with a fusion gating mechanism, computing the gating coefficient G and outputting the fused feature vector, wherein K_r denotes the knowledge graph inference vector obtained by weighted aggregation of the knowledge graph embedding vectors of the core entities according to the clinical association strength between the target disease and the core associated entities in the medical knowledge graph, g_i is the association relation grade between the ith core entity and the target disease, and Attn(·) denotes an attention operation.
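A hedged sketch of the claim-9 dynamic fusion. The multiplicative blend of initial weights with R, P and Q, the softmax normalization, and the sigmoid gate are all assumptions standing in for the dynamic weight and gating formulas the patent defines in figures not reproduced here.

```python
import math

def dynamic_weights(w0, R, P, Q):
    """Blend initial weights with relevance, guideline priority and quality,
    then softmax-normalize into final modal weights that sum to 1."""
    raw = [w * (r + p + q) for w, r, p, q in zip(w0, R, P, Q)]
    exps = [math.exp(x) for x in raw]
    s = sum(exps)
    return [e / s for e in exps]

def gated_fusion(features, weights, gate_logit):
    """Weighted sum of per-modality feature vectors, scaled by a sigmoid
    gating coefficient G."""
    g = 1.0 / (1.0 + math.exp(-gate_logit))   # gating coefficient G
    d = len(features[0])
    fused = [sum(w * f[k] for f, w in zip(features, weights)) for k in range(d)]
    return [g * x for x in fused]
```

Because the weights are recomputed per case from R, P and Q, a modality whose quality score drops (or whose guideline priority is low for the suspected disease) automatically contributes less to the fused vector.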
- 10. A medical multi-modal data fusion analysis system based on cross-modal alignment, characterized by comprising a quality grading module, a semantic alignment module, a feature fusion module and a report generation module, wherein: the quality grading module acquires multi-modal medical data and carries out quality grading evaluation to obtain the modality quality score of each modality, wherein the multi-modal medical data comprises at least medical image data, electronic medical record text data and physiological signal time-series data; the semantic alignment module carries out semantic alignment on the quality-graded multi-modal data by adopting a dual-stage alignment algorithm of intra-modality feature enhancement and cross-modality semantic calibration to obtain a semantic alignment result; the feature fusion module carries out feature fusion on the semantic alignment result by adopting a dynamic weight model based on the knowledge graph association degree, the clinical guideline priority and the modality quality scores; and the report generation module generates a structured diagnosis report from the fusion result through the artificial-intelligence-generated-content (AIGC) model, carries out a clinical compliance check and sensitivity analysis, and feeds the check and analysis results back to the feature fusion step for optimization.
Description
Medical multi-modal data fusion analysis method and system based on cross-modal alignment
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a medical multi-modal data fusion analysis method and system based on cross-modal alignment.
Background
With the rapid development of medical informatization, massive multi-modal medical data are generated in the daily diagnosis and treatment processes of medical institutions; such data have important value for clinical diagnosis, treatment planning, disease prediction and the like. However, current fusion methods for medical multi-modal data have significant shortcomings. First, because medical data sources are diverse and their formats differ, traditional methods lack a unified quality evaluation standard and cannot effectively identify and distinguish data of different quality grades, so high-quality and low-quality data are treated equally during fusion, seriously affecting the reliability of the final analysis result. Second, when processing data of different modalities, the prior art often adopts simple feature splicing or shallow fusion strategies that struggle to establish deep semantic associations, leaving obvious semantic gaps between the information of different modalities and affecting the accuracy of clinical diagnosis. In addition, most existing multi-modal fusion methods adopt a static weight distribution strategy that cannot be dynamically adjusted according to specific disease types, data quality changes and clinical guideline requirements, so the fusion result lacks clinical adaptability.
Disclosure of Invention
The application provides a medical multi-modal data fusion analysis method and system based on cross-modal alignment, which solve the technical problems of uneven data quality, semantic alignment deviation and inflexible fusion weight distribution in the prior art. To achieve this, the application adopts the following technical scheme. In a first aspect, a medical multi-modal data fusion analysis method based on cross-modal alignment is provided, comprising: acquiring multi-modal medical data and carrying out quality grading evaluation to obtain the modality quality score of each modality, wherein the multi-modal medical data comprises at least medical image data, electronic medical record text data and physiological signal time-series data; carrying out semantic alignment on the quality-graded multi-modal data by adopting a dual-stage alignment algorithm of intra-modality feature enhancement and cross-modality semantic calibration to obtain a semantic alignment result; carrying out feature fusion on the semantic alignment result by adopting a dynamic weight model based on the knowledge graph association degree, the clinical guideline priority and the modality quality scores; and generating a structured diagnosis report from the fusion result through an artificial-intelligence-generated-content (AIGC) model, carrying out a clinical compliance check and sensitivity analysis, and feeding the check and analysis results back to the feature fusion step for optimization.
Based on this technical scheme, the cross-modal alignment-based medical multi-modal data fusion analysis method provided by the application first automatically identifies and quantifies the quality differences of each modality's data through quality grading evaluation of the multi-modal medical data, ensuring the reliability and consistency of the input data and laying a solid foundation for subsequent processing. Second, the dual-stage alignment algorithm of intra-modality feature enhancement and cross-modality semantic calibration effectively bridges the semantic gap among multi-source heterogeneous data, strengthens the relevance among different modalities, and improves semantic alignment precision. Third, the dynamic weight model based on knowledge graph relevance, clinical guideline priority and modality quality scores realizes intelligent fusion with multi-factor coordination, so that the fusion process both conforms to the medical knowledge system and adapts to specific clinical scenarios, improving the scientific soundness of decision making. Finally, generating a structured diagnosis report through the AIGC model, combined with clinical compliance verification and sensitivity analysis, ensures the compliance and robustness of the output, while the feedback optimization mechanism forms a closed-loop system that continuously improves overall performance. Overall, the method realizes full-flow optimization of multi-modal data from acquisition to output, providing an efficient and reliable solution for intelligent medical applications. Further, the acquiring multi-modal medical data