
CN-121981085-A - Engineering technology standard intelligent identification and verification method based on multi-source information fusion


Abstract

The invention discloses an intelligent identification and verification method for engineering technical standards based on multi-source information fusion. The method receives engineering documents submitted by users and cleans them to obtain the plain text content extracted from the original documents; identifies potential engineering technical standards through multiple cooperating paths; performs intelligent deduplication, comparison and omission analysis on the identified standard lists to obtain a comprehensive list of engineering technical standards to be checked; verifies the validity of each standard to be checked over the network; and integrates all network verification information into a structured compliance inspection report. The method overcomes the shortcomings of traditional engineering technical standard retrieval and solves the technical problems of low efficiency, frequent omissions and poor accuracy in manual verification, which arise because technical standard information is scattered, textually heterogeneous and frequently updated during engineering scheme design, calculation and compliance review.
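The cleaning stage and one of the identification paths (the pattern scan of the cleaned text, described in the claims as the second path) can be illustrated with a minimal sketch. The patent does not specify a cleaning rule set or a number format, so everything here is an assumption: the function names `clean_text` and `extract_standard_numbers`, the control-character list, and the pattern "capital-letter prefix, optional /T, serial number, dash, four-digit year". The sample strings only exercise the pattern and say nothing about any standard's status.

```python
import re

# Assumed number format: 2-4 capital letters, optional "/T", a serial
# number, a dash (ASCII hyphen, en dash or em dash), and a 4-digit year.
STANDARD_NUMBER = re.compile(
    r"(?<![A-Za-z])([A-Z]{2,4}(?:/T)?)\s*(\d+(?:\.\d+)?)\s*[-\u2013\u2014]\s*(\d{4})(?!\d)"
)


def clean_text(raw: str) -> str:
    """Basic cleaning: drop control characters, collapse whitespace runs."""
    no_control = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", raw)
    return re.sub(r"\s+", " ", no_control).strip()


def extract_standard_numbers(text: str) -> list[str]:
    """Return the distinct standard numbers found, in order of first appearance."""
    found: list[str] = []
    for prefix, serial, year in STANDARD_NUMBER.findall(text):
        number = f"{prefix} {serial}-{year}"  # one canonical spelling
        if number not in found:
            found.append(number)
    return found


if __name__ == "__main__":
    raw = "Design per GB  50010\u20142010\x0c and\tGB/T50081-2019;\n see also GB 50010-2010."
    print(extract_standard_numbers(clean_text(raw)))
```

A pure pattern scan finds only references that carry a recognizable number. That limitation is presumably why the method pairs it with language-model-based identification rather than relying on it alone.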

Inventors

  • ZHANG BO
  • WEI TONG
  • JIANG XINJIAN
  • HAN LIN
  • YANG SHAOWU
  • TAN XINGYU

Assignees

  • 中国二十冶集团有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-05

Claims (2)

  1. An intelligent identification and verification method for engineering technical standards based on multi-source information fusion, characterized by comprising the following steps:
     Step one, engineering document acquisition and preprocessing: receiving the engineering document submitted by a user, reading all text content of the engineering document, and performing basic cleaning, including removal of irrelevant format codes and blank characters, to obtain the plain text content extracted from the original engineering document;
     Step two, multi-path collaborative intelligent identification of potential engineering technical standards:
     a first path, interactive question-answer identification, wherein an intelligent question-answering module based on a large language model identifies the names and numbers of all engineering technical standards, specifications or atlases explicitly referenced in the plain text content, and generates a preliminary list A of referenced engineering technical standards;
     a second path, automatic parsing and extraction over the full text, wherein a text analysis tool scans the plain text content, searches for all text fragments that match engineering technical standard names, numbers or similar formats, and generates a list B of engineering technical standards extracted from the plain text content;
     a third path, intelligent reasoning and supplementary suggestion, wherein, based on the engineering technology involved in the plain text content, the intelligent question-answering module reasons over a knowledge base, identifies key engineering technical standards that are not mentioned in the text, and generates a list C of suggested supplementary engineering technical standards;
     Step three, intelligent deduplication, comparison and omission analysis, wherein the three lists are intelligently compared, comprising:
     standardized processing, namely unifying the format of all engineering technical standard numbers and eliminating engineering technical standards with repeated numbers;
     semantic comparison, namely judging whether engineering technical standard names in different lists have the same core meaning, and eliminating engineering technical standards with the same core meaning;
     cross analysis, namely paying special attention to the difference set between list B and list A to form a set D of suspected omitted engineering technical standards;
     and obtaining a comprehensive list E of engineering technical standards to be checked, which is merged, deduplicated and marked with the suspected omitted items;
     Step four, automatic networked validity verification: automatically accessing one or more authoritative standard database websites and acquiring official status information for each engineering technical standard in list E, the official status information including whether the standard is currently valid, its latest version number, its release date, and whether it has been replaced;
     Generating a structured inspection report: integrating all networked verification information into an inspection report, the inspection report comprising all discovered engineering technical standards and their source paths, the official validity status of each standard, prominent warnings for abolished or non-current engineering technical standards, and special warnings for the suspected omitted engineering technical standards, thereby finally obtaining the structured compliance inspection report.
  2. The intelligent identification and verification method for engineering technical standards based on multi-source information fusion according to claim 1, wherein the engineering document submitted by the user in step one is in Word, PDF or plain text format.
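The list fusion of step three in claim 1 can be sketched as follows. This is an illustrative sketch and not the patented implementation: the names `normalize` and `fuse`, the record layout, and the number format are assumptions, and the sample numbers are placeholder data. The semantic comparison of standard names is left out, since it would need a language model or a similar component. The difference set is taken as "in B but not in A", which is one reading of the claim wording.

```python
import re


def normalize(number: str) -> str:
    """Unify case, dash variants and spacing, e.g. 'gb50010-2010' -> 'GB 50010-2010'."""
    s = re.sub(r"[\u2013\u2014]", "-", number.upper())  # en/em dash -> hyphen
    s = re.sub(r"\s+", "", s)
    match = re.fullmatch(r"([A-Z]+(?:/T)?)(\d+(?:\.\d+)?)-(\d{4})", s)
    if not match:
        return s  # unknown format: keep the compacted string as the key
    return f"{match.group(1)} {match.group(2)}-{match.group(3)}"


def fuse(list_a: list[str], list_b: list[str], list_c: list[str]) -> list[dict]:
    """Merge lists A, B and C into list E, flagging items of set D (in B, not in A)."""
    norm_a = {normalize(n) for n in list_a}
    norm_b = {normalize(n) for n in list_b}
    suspected_missing = norm_b - norm_a  # set D

    merged: dict[str, dict] = {}
    for source, items in (("A", list_a), ("B", list_b), ("C", list_c)):
        for item in items:
            key = normalize(item)
            entry = merged.setdefault(
                key,
                {"number": key, "sources": [], "suspected_missing": key in suspected_missing},
            )
            if source not in entry["sources"]:
                entry["sources"].append(source)  # record each source path once
    return list(merged.values())


if __name__ == "__main__":
    list_e = fuse(["GB 50010-2010"], ["gb50010\u20142010", "GB/T 50081-2019"], ["JGJ 3-2010"])
    for row in list_e:
        print(row)
```

Keeping the source paths on each entry of list E lets the later report state where every standard was found, as the report step of claim 1 requires.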

Description

Engineering technology standard intelligent identification and verification method based on multi-source information fusion

Technical Field

The invention relates to the field of information technology, and in particular to an intelligent identification and verification method for engineering technical standards based on multi-source information fusion.

Background

In engineering construction and design, technical standards and specifications are the fundamental basis for ensuring engineering safety, quality and compliance. However, standard information typically exhibits three characteristics: dispersed sources (it is spread across regulatory documents, design manuals, past projects and so on), heterogeneous text (a standard may appear in a document under its full name, an abbreviation, its number, or a colloquial description), and dynamic updates (national and industry standards may be revoked, replaced or revised over time). Because of these characteristics, systematically identifying and aggregating all standards applicable to a design scheme or calculation book, and verifying their completeness and validity, has long been a complex technical problem.

At present, the industry relies mainly on two approaches. The first is fully manual checking: engineering technicians extract standards from documents by experience and then query and verify them one by one in an authoritative database or website. This approach is time-consuming and labor-intensive and depends heavily on personal experience, so oversights or delayed information updates easily lead to risks such as omitted standards and outdated versions. The second is assistance from a single automated tool, for example a simple keyword search or a web crawler that queries a particular data source.
The second approach can improve efficiency locally, but it usually lacks an intelligent understanding of heterogeneous text and cannot associate differently worded references to the same standard at the semantic level. It is also typically an isolated, single-point solution that fails to integrate the three key links of "standard identification (where to find the standards)", "list fusion (how to deduplicate and fill gaps)" and "validity verification (whether the latest version is in force)" into a closed-loop, systematic technical process, so the comprehensiveness and reliability of the result cannot be guaranteed.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an intelligent identification and verification method for engineering technical standards based on multi-source information fusion, which overcomes the shortcomings of traditional engineering technical standard retrieval and solves the technical problems of low efficiency, frequent omissions and poor accuracy in manual verification caused by scattered technical standard information, heterogeneous text and frequent updates during engineering scheme design, calculation and compliance review.
To solve the above technical problem, the intelligent identification and verification method for engineering technical standards based on multi-source information fusion comprises the following steps:
Step one, engineering document acquisition and preprocessing: receiving the engineering document submitted by a user, reading all text content of the engineering document, and performing basic cleaning, including removal of irrelevant format codes and blank characters, to obtain the plain text content extracted from the original engineering document;
Step two, multi-path collaborative intelligent identification of potential engineering technical standards:
a first path, interactive question-answer identification, wherein an intelligent question-answering module based on a large language model identifies the names and numbers of all engineering technical standards, specifications or atlases explicitly referenced in the plain text content, and generates a preliminary list A of referenced engineering technical standards;
a second path, automatic parsing and extraction over the full text, wherein a text analysis tool scans the plain text content, searches for all text fragments that match engineering technical standard names, numbers or similar formats, and generates a list B of engineering technical standards extracted from the plain text content;
a third path, intelligent reasoning and supplementary suggestion, wherein, based on the engineering technology involved in the plain text content, the intelligent question-answering module reasons over a knowledge base, identifies key engineering technical standards that are not mentioned in the text, and generates a list C of suggested supplementary engineering technical standards;
Step three, intelligent deduplication, comparison and omission analysis, wherein the three lists are intelligently compared, comprising the following: Standardized