CN-122019603-A - Pressurized water reactor system equipment fault information acquisition method based on large language model


Abstract

A method for acquiring equipment fault information of a pressurized water reactor system based on a large language model, relating to the field of fault tree analysis. The method comprises: preprocessing text materials into text blocks to form a text block set in which each block does not exceed a preset length; encoding the text blocks with a deep neural network embedding model to obtain a semantic vector database for fast retrieval; constructing a query vector from an equipment name or a query question and locating the text fragment set most relevant to the query; inserting the query and the retrieved text fragment set into a preset prompt template to generate a complete input context; and submitting the input context to a large language model, which outputs a structured result comprising failure modes, failure mechanisms, and traceability information. The method is suitable for efficient acquisition and knowledge management of equipment fault information in complex systems such as pressurized water reactors.
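The retrieval and prompt-assembly steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: `toy_embed` is a bag-of-words stand-in for the deep neural network embedding model, the in-memory list stands in for the vector database, and the function names, the template wording, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter

def toy_embed(text: str) -> Counter:
    """Stand-in for the embedding model (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Score every chunk against the query and keep the top_k most similar."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical prompt template: combines the query with numbered fragments
# so that the model's answer can be traced back to its sources.
PROMPT_TEMPLATE = (
    "Query: {query}\n"
    "Retrieved fragments:\n{fragments}\n"
    "List the failure modes and failure mechanisms, citing fragment numbers."
)

def build_prompt(query: str, hits: list[tuple[float, str]]) -> str:
    """Fill the template with the query and the retrieved fragments."""
    fragments = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(hits, 1))
    return PROMPT_TEMPLATE.format(query=query, fragments=fragments)

# Made-up sample fragments, for illustration only.
chunks = [
    "The main coolant pump seal leaked after bearing wear.",
    "Steam generator tube rupture was caused by stress corrosion cracking.",
    "The control room lighting was replaced.",
]
hits = retrieve("coolant pump seal", chunks, top_k=1)
prompt = build_prompt("coolant pump seal", hits)
print(prompt)
```

In a real system the resulting prompt would be submitted to a large language model; that call is omitted here because the source does not name a specific model or API.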

Inventors

DING MING
HAO XIAOTIAN
YANG YONGYONG
CAO XIAXIN
GUO ZEHUA
MENG ZHAOMING
HE YU
WANG YANKAI

Assignees

Harbin Engineering University (哈尔滨工程大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-23

Claims (10)

1. A method for acquiring equipment fault information of a pressurized water reactor system based on a large language model, characterized by comprising the following steps: performing text block preprocessing on text data from design manuals, operation and maintenance records, accident analyses, and operation logs of the pressurized water reactor system to form a text block set in which each text block does not exceed a preset length; inputting the text block set into a deep neural network embedding model to generate corresponding high-dimensional semantic vectors, and storing the high-dimensional semantic vectors in a vector database to obtain a semantic vector database for fast retrieval; constructing a query vector from an equipment name or a query question, computing its cosine similarity with the high-dimensional semantic vectors in the semantic vector database, and locating the text fragment set most relevant to the query; inserting the query and the retrieved text fragment set into a preset prompt template to generate a complete input context; and submitting the input context to a large language model and outputting a structured result containing failure modes, failure mechanisms, and traceability information.
2. The method for acquiring equipment fault information of a pressurized water reactor system based on a large language model according to claim 1, wherein the text block preprocessing segments the text at sentence boundaries, and the maximum length of each text block does not exceed five hundred characters.
3. The method for acquiring equipment fault information of a pressurized water reactor system based on a large language model according to claim 1, wherein the deep neural network embedding model semantically encodes each text block and maps it to a high-dimensional semantic vector of fixed dimension.
4. The method for acquiring equipment fault information of a pressurized water reactor system based on a large language model according to claim 1, wherein the semantic vector database optimizes the storage and indexing of large-scale vector data and supports efficient similarity retrieval.
5. The method for acquiring equipment fault information of a pressurized water reactor system based on a large language model according to claim 1, wherein the semantic retrieval uses cosine similarity as the measure of similarity between the query vector and the text vectors.
6. The method for acquiring equipment fault information of a pressurized water reactor system based on a large language model according to claim 1, wherein the prompt template presets a format for combining the query content with the retrieved fragments, so as to generate a complete context that meets the input requirements of the large language model.
7. A device for acquiring equipment fault information of a pressurized water reactor system based on a large language model, characterized by comprising: a module for performing text block preprocessing on text data from design manuals, operation and maintenance records, accident analyses, and operation logs of the pressurized water reactor system to form a text block set in which each text block does not exceed a preset length; a module for inputting the text block set into a deep neural network embedding model to generate corresponding high-dimensional semantic vectors, and storing the high-dimensional semantic vectors in a vector database to obtain a semantic vector database for fast retrieval; a module for constructing a query vector from an equipment name or a query question, computing its cosine similarity with the high-dimensional semantic vectors in the semantic vector database, and locating the text fragment set most relevant to the query; a module for inserting the query and the retrieved text fragment set into a preset prompt template to generate a complete input context; and a module for submitting the input context to a large language model and outputting a structured result containing failure modes, failure mechanisms, and traceability information.
8. A computer storage medium for storing a computer program, characterized in that when the computer program is read by a computer, the computer performs the method of claim 1.
9. A computer comprising a processor and a storage medium, characterized in that when the processor reads a computer program stored in the storage medium, the computer performs the method of claim 1.
10. A computer program product, being a computer program, characterized in that the method of claim 1 is implemented when the computer program is executed.
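As an illustration of the sentence-boundary segmentation recited in claim 2, the following Python sketch packs whole sentences into text blocks of at most 500 characters. It is one possible reading, not the claimed implementation: the function names, the punctuation set treated as sentence boundaries, the single-space joiner, and the hard split of a sentence that is itself longer than the limit are all assumptions.

```python
import re

def split_sentences(text: str) -> list[str]:
    """Split at sentence boundaries (assumed: Western and Chinese end marks)."""
    # Western marks must be followed by whitespace, so "3.5" is not split;
    # Chinese full-width marks split immediately after the mark.
    parts = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]

def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole sentences into blocks of at most max_chars."""
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        # Assumption: a single sentence longer than the limit is cut hard.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate      # the sentence still fits in this block
        else:
            chunks.append(current)   # close the block, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks

# Made-up sample text with a small limit so the split is visible.
sample = "Pump seal leaked. Bearing wore out. Valve stuck open."
blocks = chunk_text(sample, max_chars=40)
print(blocks)
```

Joining sentences with a space suits English text; for Chinese source documents an empty joiner would likely be more appropriate.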

Description

Pressurized water reactor system equipment fault information acquisition method based on large language model

Technical Field

The invention relates to the field of fault tree analysis, and in particular to a method for acquiring equipment fault information of a pressurized water reactor system based on a large language model.

Background

In high-risk industrial fields such as nuclear energy systems, aerospace, and the chemical industry, the acquisition and management of equipment fault information is a foundation for fault tree analysis and reliability evaluation. The prior art mainly relies on the following approaches.

On the one hand, manual review and accumulated experience remain common. Researchers or operation and maintenance engineers manually extract and consolidate equipment failure modes and mechanisms by consulting design manuals, operation records, accident analysis reports, and similar documents, combined with expert experience. Although this approach is feasible for small-scale or single-equipment systems, in settings such as pressurized water reactors, with many equipment types and complex system coupling, manual processing is extremely inefficient and highly prone to omissions and subjective bias.

On the other hand, automated information extraction methods based on keyword search and rule matching have appeared in recent years. For example, some studies have used libraries of technical terms and regular-expression matching rules to quickly search operation and maintenance records and accident reports and preliminarily locate text segments that may be relevant to equipment failures. Such methods improve retrieval speed to some extent, but because of the diversity of natural language expression and the semantic complexity of the specialized domain, they struggle to cover all useful information, and the retrieval results contain a high proportion of redundant and irrelevant content, so the subsequent analysis workload remains large.

With the development of artificial intelligence, researchers have explored deep learning models for semantic analysis of industrial text, such as text classification and event extraction based on convolutional neural networks (CNNs) or recurrent neural networks (RNNs). However, these methods have limited ability to understand context and to associate knowledge across documents, and they struggle to meet the need for deep knowledge mining in a multi-source heterogeneous information environment such as a pressurized water reactor system. In particular, when associating multidimensional data such as operation logs, accident cases, and design constraints, traditional methods lack logical reasoning and knowledge traceability and cannot automatically construct a complete system of equipment failure modes.

In summary, the prior art has the following defects: manual methods are inefficient, automated methods have insufficient semantic understanding, and the ability to integrate and reason over multi-source knowledge is lacking, making it difficult to efficiently convert unstructured text into structured equipment failure modes.
Disclosure of Invention

To address the defects of the prior art, namely that manual methods are inefficient, automated methods have insufficient semantic understanding, the ability to integrate and reason over multi-source knowledge is lacking, and it is difficult to efficiently convert unstructured text into structured equipment failure modes, the invention provides the following technical solution.

A method for acquiring equipment fault information of a pressurized water reactor system based on a large language model comprises the following steps:

performing text block preprocessing on text data from design manuals, operation and maintenance records, accident analyses, and operation logs of the pressurized water reactor system to form a text block set in which each text block does not exceed a preset length;

inputting the text block set into a deep neural network embedding model to generate corresponding high-dimensional semantic vectors, and storing the high-dimensional semantic vectors in a vector database to obtain a semantic vector database for fast retrieval;

constructing a query vector from an equipment name or a query question, computing its cosine similarity with the high-dimensional semantic vectors in the semantic vector database, and locating the text fragment set most relevant to the query;

inserting the query and the retrieved text fragment set into a preset prompt template to generate a complete input context;

and submitting the input context to a large language model and outputting a structured result containing failure modes, failure mechanisms, and traceability information.

Further, there is provided a preferred embodiment