CN-122019718-A - Multi-mode large model knowledge retrieval and question answering method and device for fault diagnosis of intelligent operation and maintenance teaching equipment
Abstract
The application discloses a multi-modal large model knowledge retrieval and question answering method and device for fault diagnosis of intelligent operation and maintenance teaching equipment. The method comprises: obtaining multi-modal query data of a user; generating a comprehensive query vector and a keyword set according to the multi-modal query data; obtaining, from a vector database, multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set; and inputting these knowledge fragments together with the multi-modal query data of the user into a trained large model, thereby obtaining comprehensive teaching reply information. By fusing multidimensional information such as text, images, sound and data, the application overcomes the limitation of traditional text-only fault description. Fault root cause localization accuracy is raised to more than 90 percent (based on testing with a specific equipment data set), far exceeding traditional keyword retrieval (about 65 percent) and question answering with a single large model (about 75 percent).
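The retrieval and prompt assembly steps summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: the hybrid score (cosine similarity plus keyword overlap), the in-memory `store`, the sample fragments and all function names are hypothetical and are not specified by the application, and the call to the large model itself is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, keywords, store, top_k=2):
    """Rank fragments by cosine similarity plus keyword overlap (assumed scoring mix)."""
    def score(frag):
        overlap = len(keywords & frag["keywords"]) / len(keywords) if keywords else 0.0
        return cosine(query_vec, frag["vector"]) + overlap
    return sorted(store, key=score, reverse=True)[:top_k]


def build_prompt(question, fragments):
    """Combine the retrieved fragments and the user question into one model input."""
    context = "\n".join(f"- {f['text']}" for f in fragments)
    return f"Known material:\n{context}\n\nQuestion: {question}"


# Made-up knowledge fragments, for illustration only.
store = [
    {"text": "Power indicator blinking on the central control unit.",
     "vector": [0.9, 0.1], "keywords": {"power", "indicator"}},
    {"text": "No video signal on the classroom display.",
     "vector": [0.1, 0.9], "keywords": {"video", "signal"}},
]

top = retrieve([1.0, 0.0], {"power", "indicator"}, store, top_k=1)
print(build_prompt("Why is the power indicator blinking?", top))
```

In a real system the query vector would come from the multi-modal encoders and the prompt would be passed to the trained large model; both are outside the scope of this sketch.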
Inventors
- ZHAO JIAN
Assignees
- 北京竞业达数码科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. A multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment, characterized by comprising the following steps: acquiring a preset vector database; acquiring multi-modal query data of a user; generating a comprehensive query vector and a keyword set according to the multi-modal query data of the user; acquiring, from the vector database, multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set; and inputting the multi-modal knowledge fragments highly relevant to the current fault situation and the multi-modal query data of the user into a trained large model, thereby obtaining comprehensive teaching reply information.
- 2. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 1, wherein the acquiring, from the vector database, the multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set comprises: generating a fault situation effective dimension set and a per-dimension confidence score table according to the comprehensive query vector and the keyword set; generating a contextualized weighted query vector and a single-modality normalized weight table according to the comprehensive query vector and the per-dimension confidence score table; performing semantic-keyword bidirectional expansion retrieval and confidence filtering on the vector database according to the contextualized weighted query vector and the keyword set, thereby obtaining a bidirectional expansion retrieval effective set; performing cross-modal knowledge fragment association strength quantization and complementarity evaluation according to the bidirectional expansion retrieval effective set and the vector database, thereby generating a high-association cross-modal knowledge group set; performing context-priority-guided conflict reconciliation according to the high-association cross-modal knowledge group set and the per-dimension confidence score table, thereby obtaining a conflict-free core knowledge group set; and performing teaching suitability multidimensional weighted ranking according to the conflict-free core knowledge group set and the keyword set, thereby generating a ranked high-quality multi-modal knowledge fragment set as the multi-modal knowledge fragments highly relevant to the current fault situation.
- 3. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 2, wherein the generating the fault situation effective dimension set and the per-dimension confidence score table according to the comprehensive query vector and the keyword set comprises: using the keyword set K as a search condition, recalling the Top-100 related knowledge fragments from the vector database and extracting their feature vectors and associated text descriptions; extracting semantic features from the associated text descriptions with a BiLSTM model and generating a candidate context dimension set in combination with the TF-IDF algorithm; acquiring a dimension domain weight set and a dimension history hit rate set; and generating the fault situation effective dimension set and the per-dimension confidence score table according to the feature vectors, the dimension domain weight set, the dimension history hit rate set, the comprehensive query vector, the keyword set K and the vector database.
- 4. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 3, wherein the generating the fault situation effective dimension set and the per-dimension confidence score table according to the feature vectors, the candidate context dimension set, the comprehensive query vector, the keyword set and the vector database comprises: acquiring, for each of the Top-100 related knowledge fragments, a semantic similarity factor, a dimension text importance factor and a keyword-dimension association factor; and calculating the confidence of each candidate context dimension by the following formula [formula not reproduced]; wherein the symbols of the formula denote, respectively: the confidence of each candidate context dimension; the comprehensive query vector; the feature vector; the weight coefficient of the j-th candidate context dimension in the operation and maintenance field; the j-th candidate context dimension; the i-th text description; the keyword set K; and the hit rate of the j-th candidate dimension in historical fault diagnosis cases.
- 5. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 4, wherein the multi-modal query data of the user comprises a user query text, a user-uploaded fault image and user-uploaded abnormal audio; the generating the comprehensive query vector and the keyword set according to the multi-modal query data of the user further comprises: encoding the user query text, the user-uploaded fault image and the user-uploaded abnormal audio respectively, thereby obtaining a text feature vector, an image feature vector and an abnormal audio feature vector; the generating the contextualized weighted query vector and the single-modality normalized weight table according to the comprehensive query vector and the per-dimension confidence score table comprises: determining text integrity from the text feature vector and the keyword set; determining image sharpness from the image feature vector and the keyword set; determining the audio signal-to-noise ratio from the abnormal audio feature vector and the keyword set; generating a modal quality vector according to the text integrity, the image sharpness and the audio signal-to-noise ratio; acquiring a preset modality-dimension association mapping table; defining a modality identifier set m ∈ {t, i, a}, wherein t represents text, i represents image and a represents audio; performing the following for each modality m: counting, based on the modality-dimension association mapping table, the number of dimensions in the fault situation effective dimension set that are strongly associated with the modality; and calculating the context adaptation coefficient of the modality [formula not reproduced], which uses the total number of dimensions in the fault situation effective dimension set and has the value range [0, 1]; integrating the context adaptation coefficients of all modalities to generate a context adaptation vector composed of the context adaptation coefficients of the text modality, the image modality and the audio modality; for each modality m, calculating a modality-dimension semantic association mean: traversing each dimension in the fault situation effective dimension set and calculating, with a cosine similarity algorithm, the semantic similarity between the single-modality feature vector and the dimension feature vector, the value range being [-1, 1]; calculating a weighted semantic association mean in combination with the dimension confidence [formula not reproduced]; calculating the un-normalized weight of each modality according to the weighted semantic association mean [formula not reproduced]; normalizing the un-normalized weights to obtain the final normalized weight of each modality; and fusing the text feature vector, the image feature vector and the abnormal audio feature vector based on the final normalized weight of each modality, thereby generating the contextualized weighted query vector and the single-modality normalized weight table.
- 6. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 5, wherein the performing semantic-keyword bidirectional expansion retrieval and confidence filtering on the vector database according to the contextualized weighted query vector and the keyword set, thereby obtaining a bidirectional expansion retrieval effective set, comprises: acquiring a trained WordNet semantic network; performing semantic expansion on the contextualized weighted query vector through the trained WordNet semantic network, thereby obtaining a semantic expansion vector set; acquiring a trained BERT model; inputting each keyword in the keyword set into the trained BERT model, calculating the semantic similarity between the keyword and all terms in an operation and maintenance field term base, and selecting the top 5 terms by similarity as the expansion related words of the keyword; calculating the semantic similarity between each expansion related word and the original keyword with a cosine similarity algorithm and taking the similarity value as the weight of the related word, the weight value range being [0, 1]; integrating all original keywords, expansion related words and corresponding weights to form an expanded keyword set; taking all vectors in the semantic expansion vector set as search conditions, performing cosine similarity matching in the vector database, and recalling the knowledge fragments whose similarity with any expansion vector is greater than or equal to 0.6 as a candidate knowledge fragment set, each fragment in the set comprising three attributes, namely a feature vector, an original resource and a similarity value with each expansion vector; calculating a comprehensive confidence for each fragment in the candidate knowledge fragment set by the following formula [formula not reproduced]; wherein the symbols of the formula denote, respectively: the comprehensive confidence of any knowledge fragment s in the candidate knowledge fragment set; the contextualized weighted query vector; the feature vector of the fragment s; the weight of the x-th layer expansion vector, which decreases as the expansion layer number x increases; a preset fixed value of 0.95; the expanded keyword set; the weight of the expanded keyword k; the distance penalty coefficient, preset to the fixed value 0.15; the fault situation effective dimension set; and the number of elements in the fault situation effective dimension set; and selecting, according to the calculated comprehensive confidences, the fragments exceeding a comprehensive confidence threshold to form the bidirectional expansion retrieval effective set.
- 7. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 6, wherein the performing cross-modal knowledge fragment association strength quantization and complementarity evaluation according to the bidirectional expansion retrieval effective set and the vector database, thereby generating a high-association cross-modal knowledge group set, comprises: acquiring a preset fault topic classification dictionary, wherein the fault topic classification dictionary comprises topic labels such as fault types, equipment parts and processing flows; traversing each knowledge fragment s in the bidirectional expansion retrieval effective set, extracting the core information of the fragment, and assigning each fragment one core fault topic label according to the labels in the fault topic classification dictionary; grouping the knowledge fragments with the same core fault topic label together to form a topic group set, where p is the number of topic groups; for each topic group, collecting all cross-modal knowledge fragments contained in the group, where q is the number of fragments in the topic group; defining the total number of modalities as 3, namely the text modality, the image modality and the audio modality; for each topic group, counting the number of distinct modality types actually contained in the group and calculating the modal coverage [formula and value range not reproduced]; calculating the modal complementarity coefficient Comp(g) of the topic group g by the following formula [formula not reproduced]; attaching the modal complementarity coefficient to each topic group, thereby obtaining a topic group set G with modal complementarity coefficients; for each topic group g, performing the following calculations: calculating the arithmetic mean of the comprehensive confidences of all fragments in the group; traversing every pair of different fragments in the group and calculating the association factor of the two fragments; and calculating the comprehensive association strength of the topic group g; and setting an association strength threshold and selecting the topic groups whose comprehensive association strength is greater than the association strength threshold to form the high-association cross-modal knowledge group set.
- 8. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 7, wherein the performing context-priority-guided conflict reconciliation according to the high-association cross-modal knowledge group set and the per-dimension confidence score table, thereby obtaining a conflict-free core knowledge group set, comprises: acquiring an operation and maintenance field fault probability library, wherein the library comprises the occurrence probabilities of different fault root cause types; extracting, from the fault situation effective dimension set, the dimensions whose confidence is greater than or equal to 0.85 as core dimensions to form a core dimension set; extracting the core fault root cause of each topic group in the high-association cross-modal knowledge group set, calculating the semantic association degree between each topic group and each dimension in the core dimension set, and binding each topic group to its corresponding core dimension; comparing the fault root causes of all topic groups under the same core dimension and, if the root cause statements are inconsistent, classifying the groups as conflict groups to form a conflict group set, the conflict-free groups forming a temporary core group set; generating priority weights for the core dimensions associated with the conflict groups by normalizing their confidences, so that the weights sum to 1; calculating, for each conflict group in the conflict group set, the topic group-dimension semantic matching degree, the fault occurrence probability, the intra-group modal quality mean and the intra-group redundancy coefficient mean; generating a comprehensive reconciliation score for each conflict group according to the topic group-dimension semantic matching degree, the fault occurrence probability, the intra-group modal quality mean and the intra-group redundancy coefficient mean; and sorting the groups in the conflict group set in descending order of comprehensive reconciliation score, thereby obtaining the conflict-free core knowledge group set.
- 9. The multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment according to claim 8, wherein the sorting the groups in the conflict group set in descending order of comprehensive reconciliation score, thereby obtaining the conflict-free core knowledge group set, comprises: presetting a score difference threshold; and if the difference between the comprehensive reconciliation score of the highest-scoring group and that of the second-highest-scoring group is greater than or equal to the preset score difference threshold, retaining the highest-scoring group, removing the other conflict groups, and adding the retained group to the conflict-free core knowledge group set.
- 10. A multi-modal large model knowledge retrieval and question answering device for fault diagnosis of intelligent operation and maintenance teaching equipment, characterized by comprising: a vector database acquisition module for acquiring a preset vector database; a multi-modal query data acquisition module for acquiring multi-modal query data of a user; a comprehensive query vector and keyword set acquisition module for generating a comprehensive query vector and a keyword set according to the multi-modal query data of the user; a multi-modal knowledge fragment acquisition module for acquiring, from the vector database, multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set; and a comprehensive teaching reply information acquisition module for inputting the multi-modal knowledge fragments highly relevant to the current fault situation and the multi-modal query data of the user into a trained large model, thereby obtaining comprehensive teaching reply information.
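Several steps in claims 6, 8 and 9 are stated with explicit numeric rules and can be sketched in Python as follows. This is an illustrative sketch, not the claimed implementation: the function names, data layout and sample values are hypothetical. Only the 0.6 recall threshold, the 0.85 core dimension threshold, the sum-to-1 normalization and the score difference rule come from the claims. The claims do not say what happens when the score difference is below the threshold, so the sketch returns `None` in that case.

```python
import math

SIM_THRESHOLD = 0.6     # recall threshold stated in claim 6
CORE_CONFIDENCE = 0.85  # core dimension threshold stated in claim 8


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_candidates(expansion_vectors, fragments):
    """Keep fragments whose similarity to ANY expansion vector is >= 0.6."""
    recalled = []
    for frag_id, vec in fragments.items():
        sims = [cosine(q, vec) for q in expansion_vectors]
        if sims and max(sims) >= SIM_THRESHOLD:
            recalled.append((frag_id, sims))  # keep per-vector similarities too
    return recalled


def core_dimension_weights(confidence_table):
    """Select dimensions with confidence >= 0.85 and normalize them to sum to 1."""
    core = {d: c for d, c in confidence_table.items() if c >= CORE_CONFIDENCE}
    total = sum(core.values())
    return {d: c / total for d, c in core.items()} if total else {}


def resolve_conflict(scores, min_gap):
    """Keep the top-scoring group only if it leads the runner-up by >= min_gap."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1 or ranked[0][1] - ranked[1][1] >= min_gap:
        return ranked[0][0]
    return None  # behavior below the threshold is not specified in the claims


# Hypothetical sample data.
fragments = {"manual": [1.0, 0.0], "case": [0.0, 1.0], "photo": [0.8, 0.6]}
print([fid for fid, _ in recall_candidates([[1.0, 0.0]], fragments)])
print(core_dimension_weights({"power": 0.9, "network": 0.9, "display": 0.5}))
print(resolve_conflict({"g1": 0.82, "g2": 0.61}, min_gap=0.1))
```

Returning the per-vector similarity list alongside each recalled fragment mirrors the claim's statement that each candidate fragment carries its similarity value with each expansion vector.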
Description
Multi-mode large model knowledge retrieval and question answering method and device for fault diagnosis of intelligent operation and maintenance teaching equipment
Technical Field
The application relates to the technical field of data processing, and in particular to a multi-modal large model knowledge retrieval and question answering method and device for fault diagnosis of intelligent operation and maintenance teaching equipment.
Background
The prior art has the following disadvantages:
Low teaching efficiency: in smart classroom operation and maintenance, when teaching equipment (classroom lectern central control, servers and the like) fails, it is difficult to quickly locate the relevant knowledge in massive unstructured equipment manuals, drawings and historical maintenance records; learning depends on one-way instruction from teachers, and the learning curve is steep.
Non-intuitive diagnosis: traditional methods mainly rely on text to describe faults, whereas equipment state is often reflected by multidimensional information such as indicator lights, abnormal sounds and vibration; plain-text question answering cannot make effective use of this multi-modal information, which limits diagnostic accuracy.
Knowledge islands: knowledge such as equipment documents, sensor data and maintenance cases is scattered across different systems and formats, forming information islands, so students cannot perform associated queries or comprehensive analysis.
Disclosure of Invention
The invention aims to provide a multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment, so as to solve at least one of the technical problems described above.
The invention provides a multi-modal large model knowledge retrieval and question answering method for fault diagnosis of intelligent operation and maintenance teaching equipment, which comprises the following steps: acquiring a preset vector database; acquiring multi-modal query data of a user; generating a comprehensive query vector and a keyword set according to the multi-modal query data of the user; acquiring, from the vector database, multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set; and inputting the multi-modal knowledge fragments highly relevant to the current fault situation and the multi-modal query data of the user into a trained large model, thereby obtaining comprehensive teaching reply information.
Optionally, the acquiring, from the vector database, the multi-modal knowledge fragments highly relevant to the current fault situation according to the comprehensive query vector and the keyword set includes: generating a fault situation effective dimension set and a per-dimension confidence score table according to the comprehensive query vector and the keyword set; generating a contextualized weighted query vector and a single-modality normalized weight table according to the comprehensive query vector and the per-dimension confidence score table; performing semantic-keyword bidirectional expansion retrieval and confidence filtering on the vector database according to the contextualized weighted query vector and the keyword set, thereby obtaining a bidirectional expansion retrieval effective set; performing cross-modal knowledge fragment association strength quantization and complementarity evaluation according to the bidirectional expansion retrieval effective set and the vector database, thereby generating a high-association cross-modal knowledge group set; performing context-priority-guided conflict reconciliation according to the high-association cross-modal knowledge group set and the per-dimension confidence score table, thereby obtaining a conflict-free core knowledge group set; and performing teaching suitability multidimensional weighted ranking according to the conflict-free core knowledge group set and the keyword set, thereby generating a ranked high-quality multi-modal knowledge fragment set as the multi-modal knowledge fragments highly relevant to the current fault situation.
Optionally, the generating the fault situation effective dimension set and the per-dimension confidence score table according to the comprehensive query vector and the keyword set includes: using the keyword set K as a search condition, recalling the Top-100 related knowledge fragments from the vector database and extracting their feature vectors and associated text descriptions; extracting semantic features from the associated text descriptions with a BiLSTM model and generating a candidate context dimension set in combination with the TF-IDF algorithm; generating a dimension domain weight set and a dimension history hit rate set from the candidate context dimension set; according to the feature vectors, the dimension domain weight set, dimens