
CN-121980055-A - Method, device, equipment and medium for structured extraction of industrial drawing


Abstract

The application discloses a method, a device, equipment and a medium for structured extraction of industrial drawings, relating to the technical field of drawing processing. The method comprises: obtaining an industrial drawing file; parsing the industrial drawing file to obtain candidate fields and corresponding evidence source information, wherein the evidence source information comprises evidence text and coordinate positions in the drawing; calculating an evidence coverage rate for each candidate field according to the evidence source information; judging whether the evidence coverage rate is greater than or equal to a coverage rate threshold; if so, searching a standard knowledge base according to the candidate field to obtain rule fragments; compiling the rule fragments into a constraint set; and, if the candidate field satisfies all constraints in the constraint set, generating structured data from that candidate field. The method can accurately convert industrial engineering drawings into structured data.

Inventors

  • ZHANG DIANXING
  • LI SHENG
  • LI GANG
  • LI JINGWEI
  • Lu Jiaye

Assignees

  • 北京神州数码云计算有限公司 (Beijing Digital China Cloud Computing Co., Ltd.)

Dates

Publication Date
2026-05-05
Application Date
2026-04-08

Claims (10)

  1. A method for structured extraction of an industrial drawing, the method comprising: acquiring an industrial drawing file; parsing the industrial drawing file to obtain candidate fields and corresponding evidence source information, wherein the evidence source information comprises evidence text and coordinate positions in the drawing; calculating an evidence coverage rate for each candidate field according to the evidence source information; judging whether the evidence coverage rate is greater than or equal to a coverage rate threshold; if the evidence coverage rate is greater than or equal to the coverage rate threshold, searching a standard knowledge base according to the candidate field to obtain rule fragments; compiling the rule fragments into a constraint set; and if the candidate field satisfies all constraints in the constraint set, generating structured data from the candidate field that satisfies all constraints in the constraint set.
  2. The method according to claim 1, wherein the method further comprises: if the evidence coverage rate is less than the coverage rate threshold, locating a corresponding first area in the drawing according to the candidate field, and re-parsing the first area to obtain an updated candidate field and corresponding updated evidence source information; if the candidate field fails to satisfy any one of the constraints in the constraint set, locating a corresponding second area in the drawing according to the unsatisfied constraint, and re-parsing the second area to obtain an updated candidate field and corresponding updated evidence source information.
  3. The method of claim 1, wherein said calculating an evidence coverage rate for each candidate field according to said evidence source information comprises: obtaining the evidence coverage rate according to the j-th token in the candidate field, the candidate evidence unit in the evidence source information that matches the j-th token, the spatial distance on the drawing plane between the j-th token and its corresponding candidate evidence unit, and the weight coefficient of the j-th token.
  4. The method of claim 1, wherein the coverage rate threshold is obtained by: acquiring a base coverage rate threshold, a drawing page quality score, a current parser confidence and a field risk level coefficient; and determining the coverage rate threshold according to the base coverage rate threshold, the drawing page quality score, the current parser confidence and the field risk level coefficient.
  5. The method of claim 4, wherein the drawing page quality score is obtained by: acquiring the blurriness, noise intensity, compression artifact degree and effective text area ratio of a drawing page; and obtaining the drawing page quality score according to the blurriness, the noise intensity, the compression artifact degree and the effective text area ratio.
  6. The method of claim 4, wherein the current parser confidence is obtained by: obtaining the text matching probability, region localization intersection-over-union, structural integrity score and parsing consistency score output by the parser; and obtaining the current parser confidence according to the text matching probability, the region localization intersection-over-union, the structural integrity score and the parsing consistency score.
  7. The method of claim 4, wherein the field risk level coefficient is obtained by: acquiring a field type, an engineering importance level, a standard constraint strength and a historical error frequency; and obtaining the field risk level coefficient according to the field type, the engineering importance level, the standard constraint strength and the historical error frequency.
  8. An apparatus for structured extraction of an industrial drawing, the apparatus comprising: an acquisition module, used for acquiring an industrial drawing file; a processing module, used for parsing the industrial drawing file to obtain candidate fields and corresponding evidence source information, wherein the evidence source information comprises evidence text and coordinate positions in the drawing; and a judging module, used for judging whether the evidence coverage rate is greater than or equal to a coverage rate threshold, searching a standard knowledge base according to the candidate field to obtain rule fragments if the evidence coverage rate is greater than or equal to the coverage rate threshold, compiling the rule fragments into a constraint set, and generating structured data from the candidate field that satisfies all constraints in the constraint set if the candidate field satisfies all constraints in the constraint set.
  9. A computing device comprising a memory and a processor, wherein one or more computer programs are stored in the memory, the one or more computer programs comprising instructions which, when executed by the processor, cause the computing device to perform the method of any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used for storing a computer program, the computer program being used for performing the method of any one of claims 1 to 7.
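
Claims 3 to 6 name the inputs of the evidence coverage rate, the coverage rate threshold, the page quality score and the parser confidence, but the text reproduced here does not give their formulas. The following Python sketch is only an illustration of how such quantities could be combined. Everything in it is an assumption rather than the claimed method: the exponential distance decay and its scale `sigma`, the plain averages, the multiplicative threshold adjustment with `alpha` and `beta`, the normalisation of all scores to [0, 1], and the treatment of the field risk level coefficient as a plain input number.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TokenEvidence:
    """One token of a candidate field and its matched evidence unit (hypothetical type)."""
    weight: float                     # weight coefficient of the token
    matched: bool                     # True if an evidence unit matched the token
    distance: Optional[float] = None  # distance to that unit on the drawing plane


def evidence_coverage(tokens: Sequence[TokenEvidence], sigma: float = 50.0) -> float:
    """Assumed form: sum_j w_j * m_j * exp(-d_j / sigma) / sum_j w_j, in [0, 1]."""
    total = sum(t.weight for t in tokens)
    if total <= 0:
        return 0.0
    supported = sum(
        t.weight * math.exp(-(t.distance or 0.0) / sigma)
        for t in tokens
        if t.matched
    )
    return supported / total


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def page_quality(blur: float, noise: float, artifact: float, text_area_ratio: float) -> float:
    """Assumed: average of the inverted defect measures and the text area ratio."""
    return _mean([1.0 - blur, 1.0 - noise, 1.0 - artifact, text_area_ratio])


def parser_confidence(match_prob: float, iou: float, integrity: float, consistency: float) -> float:
    """Assumed: plain average of the four parser outputs."""
    return _mean([match_prob, iou, integrity, consistency])


def coverage_threshold(base: float, quality: float, confidence: float,
                       field_risk: float, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Assumed: demand more evidence on poor pages, unsure parsers and risky fields."""
    raw = base * (1.0 + alpha * (1.0 - quality) + beta * (1.0 - confidence)) * field_risk
    return min(1.0, max(0.0, raw))
```

Under these assumptions a field passes the gate of claim 1 when `evidence_coverage(...) >= coverage_threshold(...)`. Clamping the threshold to [0, 1] keeps it comparable with the coverage rate.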

Description

Method, device, equipment and medium for structured extraction of industrial drawing

Technical Field

The application relates to the technical field of drawing processing, and in particular to a method, a device, equipment and a medium for structured extraction of industrial drawings.

Background

Industrial engineering drawings are the technical documents used for product design, manufacture and inspection, and their structured extraction and compliance verification are key links in intelligent manufacturing and engineering data management. As the digital transformation of enterprises deepens, massive numbers of historical and newly added drawings exist as scanned PDFs, hybrid PDFs and similar formats. Efficiently and accurately converting the key fields contained in these drawings, such as title block information, parts lists, technical requirements and geometric tolerances, into auditable structured data has become a major bottleneck in turning engineering data into assets.

Current industrial drawing parsing technology mainly adopts the following schemes. The first is a parsing pipeline based on optical character recognition (OCR) and rule templates, which extracts fields through character recognition and preset rules. The second is an OCR-free end-to-end document understanding model, which generates structured output directly from the image. The third is an intelligent parsing method based on a multimodal large model, which uses the semantic understanding capability of a visual language model to complete field recognition and information extraction. The fourth is standard compliance verification that relies on manual review, in which engineers check and correct the parsing results according to experience.

However, the prior art cannot accurately convert industrial engineering drawings into structured data.
Disclosure of Invention

The application provides a method, a device, equipment and a medium for structured extraction of an industrial drawing, which can accurately convert industrial engineering drawings into structured data.

To achieve the above purpose, the application adopts the following technical solutions.

In a first aspect, the present application provides a method for structured extraction of an industrial drawing, the method comprising: acquiring an industrial drawing file; parsing the industrial drawing file to obtain candidate fields and corresponding evidence source information, wherein the evidence source information comprises evidence text and coordinate positions in the drawing; calculating an evidence coverage rate for each candidate field according to the evidence source information; judging whether the evidence coverage rate is greater than or equal to a coverage rate threshold; if the evidence coverage rate is greater than or equal to the coverage rate threshold, searching a standard knowledge base according to the candidate field to obtain rule fragments; compiling the rule fragments into a constraint set; and if the candidate field satisfies all constraints in the constraint set, generating structured data from the candidate field that satisfies all constraints in the constraint set.

Optionally, the method further comprises: if the evidence coverage rate is less than the coverage rate threshold, locating a corresponding first area in the drawing according to the candidate field, and re-parsing the first area to obtain an updated candidate field and corresponding updated evidence source information; if the candidate field fails to satisfy any one of the constraints in the constraint set, locating a corresponding second area in the drawing according to the unsatisfied constraint, and re-parsing the second area to obtain an updated candidate field and corresponding updated evidence source information.
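
The control flow of the first aspect and its optional re-parsing branches can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `Candidate` type, the shape of the knowledge base (a dictionary of rule fragments that each carry an `id` and a regular-expression `pattern`), the `reparse` callback and the `max_rounds` limit are all hypothetical. The source does not say how rule fragments are represented, how areas are located, or how many re-parsing rounds are allowed.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Candidate:
    """A candidate field with its evidence coverage rate (hypothetical type)."""
    name: str
    value: str
    coverage: float                            # computed from the evidence source
    region: Tuple[float, float, float, float]  # bounding box on the drawing

# Hypothetical callback: re-parse the area tied to the candidate. The list holds
# the ids of unsatisfied constraints and is empty when coverage was too low.
Reparse = Callable[[Candidate, List[str]], Candidate]


def compile_constraints(fragments: List[dict]) -> List[Tuple[str, Callable[[str], bool]]]:
    """Compile rule fragments into (id, predicate) pairs; regex rules are assumed."""
    compiled = []
    for fragment in fragments:
        pattern = re.compile(fragment["pattern"])
        compiled.append(
            (fragment["id"], lambda value, p=pattern: p.fullmatch(value) is not None)
        )
    return compiled


def extract_field(candidate: Candidate, threshold: float,
                  knowledge_base: Dict[str, List[dict]], reparse: Reparse,
                  max_rounds: int = 3) -> Optional[dict]:
    """Gate on coverage, then on constraints; re-parse whenever either gate fails."""
    for _ in range(max_rounds):
        if candidate.coverage < threshold:
            # Low coverage: re-parse the first area, located from the candidate.
            candidate = reparse(candidate, [])
            continue
        constraints = compile_constraints(knowledge_base.get(candidate.name, []))
        failed = [cid for cid, check in constraints if not check(candidate.value)]
        if not failed:
            return {"field": candidate.name, "value": candidate.value}
        # Constraint violated: re-parse the second area, located from the violation.
        candidate = reparse(candidate, failed)
    return None  # no compliant value found within the assumed round limit
```

A field therefore reaches the structured output only after passing both gates. Returning `None` when the rounds run out is a design choice of this sketch, so that an unverified value is never emitted.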
Optionally, calculating the evidence coverage rate of each candidate field according to the evidence source information comprises: obtaining the evidence coverage rate according to the j-th token in the candidate field, the candidate evidence unit in the evidence source information that matches the j-th token, the spatial distance on the drawing plane between the j-th token and its corresponding candidate evidence unit, and the weight coefficient of the j-th token.

Optionally, the coverage rate threshold is obtained by: acquiring a base coverage rate threshold, a drawing page quality score, a current parser confidence and a field risk level coefficient; and determining the coverage rate threshold according to the base coverage rate threshold, the drawing page quality score, the current parser confidence and the field risk level coefficient.

Optionally, the drawing page quality score is obtained by: acquiring the blurriness, noise intensity, compression artifact degree and effective text area ratio of a drawing p