CN-115631503-B - Method and system for automatically extracting table data in two-dimensional drawing

Abstract

The invention provides a method and a system for automatically extracting table data in a two-dimensional drawing, belonging to the technical field of table data extraction. Straight line segment entities and text entities in the two-dimensional drawing are automatically extracted and classified. All straight line segment entities are then matched against the requirements for constructing a table to obtain a table line segment set from which the table can be constructed, and a blank table is constructed. The text in each cell is then determined from the coverage area of each cell in the blank table and the coordinates of all text entities, and is bound to that cell, finally yielding a table containing text data. The method thus automatically extracts table data from the two-dimensional drawing, after which the table contents can be stored directly in a data management system by other means, reducing the designer's workload of re-creating the table and filling in the data in the data management system.

Inventors

  • WU LIN
  • PENG WEI

Assignees

  • 杭州新迪数字工程系统有限公司

Dates

Publication Date
20260512
Application Date
20220908

Claims (8)

  1. A method for automatically extracting table data in a two-dimensional drawing, characterized by comprising the following steps:
     extracting straight line segment entities and text entities in a two-dimensional drawing to obtain a straight line segment set and a text set;
     grouping all straight line segment entities in the straight line segment set according to extending direction to obtain a horizontal line segment set and a vertical line segment set;
     grouping all straight line segment entities in the horizontal line segment set according to the start point coordinates and the lengths of the straight line segment entities to obtain a plurality of horizontal line segment subsets;
     grouping all straight line segment entities in the vertical line segment set according to the start point coordinates and the lengths of the straight line segment entities to obtain a plurality of vertical line segment subsets;
     determining, according to each horizontal line segment subset and each vertical line segment subset, a plurality of straight line segment entities capable of forming a table to obtain a table line segment set;
     constructing a blank table according to the table line segment set, wherein the blank table comprises a plurality of blank cells; and
     binding each text entity into a corresponding blank cell according to the coordinates of each text entity in the text set and the coverage area of each blank cell in the blank table to obtain a table containing data;
     wherein determining, according to each horizontal line segment subset and each vertical line segment subset, the plurality of straight line segment entities capable of forming a table to obtain the table line segment set specifically comprises:
     determining, among all the vertical line segment subsets, the vertical line segment subset with the maximum average length of straight line segment entities as an initial table line segment set; traversing each horizontal line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the initial table line segment set into the initial table line segment set to obtain an intermediate table line segment set; and traversing each vertical line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the intermediate table line segment set into the intermediate table line segment set to obtain the table line segment set;
     or
     determining, among all the horizontal line segment subsets, the horizontal line segment subset with the maximum average length of straight line segment entities as an initial table line segment set; traversing each vertical line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the initial table line segment set into the initial table line segment set to obtain an intermediate table line segment set; and traversing each horizontal line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the intermediate table line segment set into the intermediate table line segment set to obtain the table line segment set.
  2. The method for automatically extracting table data in a two-dimensional drawing according to claim 1, wherein, before the extracting of the straight line segment entities and text entities in the two-dimensional drawing, the method further comprises: judging whether an obtained entity is a minimum entity to obtain a minimum entity judging result, wherein a minimum entity is a line segment entity or a character entity, and line segment entities comprise straight line segment entities and curved line segment entities; if the minimum entity judging result is yes, performing no further processing on the entity; and if the minimum entity judging result is no, performing explosion processing on the entity until the entity is converted into a plurality of minimum entities.
  3. The method for automatically extracting table data in a two-dimensional drawing according to claim 1, wherein, after grouping all straight line segment entities in the straight line segment set according to extending direction to obtain the horizontal line segment set and the vertical line segment set, the method further comprises: in the horizontal line segment set, for any line segment entity, judging whether the distance between the end point coordinates of that line segment entity and the start point coordinates of another line segment entity is smaller than a threshold value, and if so, connecting the two line segment entities into one line segment entity; and in the vertical line segment set, for any line segment entity, judging whether the distance between the end point coordinates of that line segment entity and the start point coordinates of another line segment entity is smaller than a threshold value, and if so, connecting the two line segment entities into one line segment entity, the end point of a vertically extending line segment entity being taken as its lowest point.
  4. The method for automatically extracting table data in a two-dimensional drawing according to claim 1, wherein grouping all straight line segment entities in the horizontal line segment set according to the start point coordinates and the lengths of the straight line segment entities to obtain a plurality of horizontal line segment subsets specifically comprises: taking the straight line segment entities in the horizontal line segment set that have the same start point y-axis coordinate and the same length as one horizontal line segment subset, thereby obtaining a plurality of horizontal line segment subsets.
  5. The method for automatically extracting table data in a two-dimensional drawing according to claim 1, wherein grouping all straight line segment entities in the vertical line segment set according to the start point coordinates and the lengths of the straight line segment entities to obtain a plurality of vertical line segment subsets specifically comprises: taking the straight line segment entities in the vertical line segment set that have the same start point x-axis coordinate and the same length as one vertical line segment subset, thereby obtaining a plurality of vertical line segment subsets.
  6. The method for automatically extracting table data in a two-dimensional drawing according to claim 1, wherein binding each text entity into a corresponding blank cell according to the coordinates of each text entity in the text set and the coverage area of each blank cell in the blank table to obtain a table containing data specifically comprises: for any blank cell, determining the coverage area of the blank cell according to the coordinates of the two end points of a diagonal of the blank cell; and determining the text entities in the blank cell according to the coordinates of the text entities and the coverage area of the blank cell.
  7. A system for automatically extracting table data in a two-dimensional drawing, characterized by comprising the following components:
     an entity extraction module, used for extracting straight line segment entities and text entities in the two-dimensional drawing to obtain a straight line segment set and a text set;
     a line segment grouping module, used for grouping all straight line segment entities in the straight line segment set according to extending direction to obtain a horizontal line segment set and a vertical line segment set, and for grouping the straight line segment entities in those sets to obtain a plurality of horizontal line segment subsets and a plurality of vertical line segment subsets;
     a table line segment set determining module, used for determining, according to each horizontal line segment subset and each vertical line segment subset, a plurality of straight line segment entities capable of forming a table to obtain a table line segment set;
     a blank table construction module, used for constructing a blank table according to the table line segment set, wherein the blank table comprises a plurality of blank cells; and
     a text binding module, used for binding each text entity into a corresponding blank cell according to the coordinates of each text entity in the text set and the coverage area of each blank cell in the blank table to obtain a table containing data;
     wherein the table line segment set determining module is used for determining, among the vertical line segment subsets, the vertical line segment subset with the largest average length of straight line segment entities as an initial table line segment set, traversing each horizontal line segment subset, adding each straight line segment entity that intersects a straight line segment entity in the initial table line segment set into the initial table line segment set to obtain an intermediate table line segment set, traversing each vertical line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the intermediate table line segment set into the intermediate table line segment set to obtain the table line segment set;
     or
     for determining, among the horizontal line segment subsets, the horizontal line segment subset with the largest average length of straight line segment entities as an initial table line segment set, traversing each vertical line segment subset, adding each straight line segment entity that intersects a straight line segment entity in the initial table line segment set into the initial table line segment set to obtain an intermediate table line segment set, traversing each horizontal line segment subset, and adding each straight line segment entity that intersects a straight line segment entity in the intermediate table line segment set into the intermediate table line segment set to obtain the table line segment set.
  8. The system for automatically extracting table data in a two-dimensional drawing according to claim 7, wherein the system further comprises a minimum entity judging module and an explosion processing module; the minimum entity judging module is used for judging whether an obtained entity is a minimum entity to obtain a minimum entity judging result, wherein the minimum entity comprises a line segment entity and a character entity, and the line segment entity comprises a straight line segment entity and a curved line segment entity; if the minimum entity judging result is yes, no further processing is performed on the entity, and if the minimum entity judging result is no, the explosion processing module is called; and the explosion processing module is used for performing explosion processing on the entity until the entity is converted into a plurality of minimum entities.
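
The vertical-first alternative of claim 1 for determining the table line segment set can be sketched in Python as follows. This is an illustrative reading, not the patented implementation: the `Segment` type, the `intersects` test (a bounding-box check that is only valid for axis-aligned segments), the tolerance `tol`, and the function name `table_segment_set` are all assumptions introduced here. The sketch also assumes that each pass compares candidates against the set as it stood at the start of that pass.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Segment:
    """Axis-aligned straight line segment (hypothetical type for this sketch)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def length(self) -> float:
        # One of the two terms is zero for an axis-aligned segment.
        return abs(self.x2 - self.x1) + abs(self.y2 - self.y1)


def intersects(a: Segment, b: Segment, tol: float = 1e-6) -> bool:
    """For axis-aligned segments, touching or overlapping bounding boxes
    mean that the segments themselves touch or cross."""
    return (
        min(a.x1, a.x2) <= max(b.x1, b.x2) + tol
        and min(b.x1, b.x2) <= max(a.x1, a.x2) + tol
        and min(a.y1, a.y2) <= max(b.y1, b.y2) + tol
        and min(b.y1, b.y2) <= max(a.y1, a.y2) + tol
    )


def table_segment_set(h_subsets, v_subsets):
    """Seed with the vertical subset of greatest mean length, then grow it."""
    initial = list(max(v_subsets, key=lambda sub: mean(s.length for s in sub)))

    # Pass 1: add horizontals that intersect a segment of the initial set.
    intermediate = initial + [
        s for sub in h_subsets for s in sub
        if any(intersects(s, t) for t in initial)
    ]

    # Pass 2: add verticals that intersect a segment of the intermediate set.
    table = list(intermediate)
    for sub in v_subsets:
        for s in sub:
            if s not in table and any(intersects(s, t) for t in intermediate):
                table.append(s)
    return table
```

The horizontal-first alternative would follow by swapping the roles of `h_subsets` and `v_subsets`. Both subset lists are assumed to be non-empty and to contain no empty subsets.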

Description

Method and system for automatically extracting table data in two-dimensional drawing

Technical Field

The invention relates to the technical field of table data extraction, in particular to a method and a system for automatically extracting table data in a two-dimensional drawing.

Background

With the rapid development of industrialization, paper drawings are often replaced by electronic drawings in engineering projects, and drawing designers commonly append technical parameters to the drawings, such as size and specification information, so that the relevant processing parameters can be seen at a glance during processing. However, with the advance of digitization, technical information on drawings needs not only to be presented on the drawings but also to be entered synchronously into data management systems such as ERP and SAP. Meanwhile, some historical drawing data also has to be extracted manually and entered into the data management system, which is an enormous workload; having designers enter the data manually would undoubtedly increase their workload.

Disclosure of Invention

The invention aims to provide a method and a system for automatically extracting table data in a two-dimensional drawing, which realize the automatic extraction of the table data in the two-dimensional drawing and reduce the designer's workload of re-creating the table and filling in the data in a data management system.
In order to achieve the above object, the present invention provides the following solutions: A method for automatically extracting table data in a two-dimensional drawing comprises the following steps:
  • extracting straight line segment entities and text entities in a two-dimensional drawing to obtain a straight line segment set and a text set;
  • grouping all straight line segment entities in the straight line segment set according to the extending direction to obtain a horizontal line segment set and a vertical line segment set;
  • grouping all straight line segment entities in the horizontal line segment set according to the starting point coordinates of the straight line segment entities and the lengths of the straight line segment entities to obtain a plurality of horizontal line segment subsets;
  • grouping all straight line segment entities in the vertical line segment set according to the starting point coordinates of the straight line segment entities and the lengths of the straight line segment entities to obtain a plurality of vertical line segment subsets;
  • according to each horizontal line segment subset and each vertical line segment subset, determining a plurality of straight line segment entities capable of forming a table to obtain a table line segment set;
  • constructing a blank table according to the table line segment set, wherein the blank table comprises a plurality of blank cells;
  • and binding each text entity into a corresponding blank cell according to the coordinates of each text entity in the text set and the coverage area of each blank cell in the blank table to obtain a table containing data.
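The grouping and binding steps listed above can be illustrated with a minimal Python sketch. It is only one possible reading of the text. The `Segment` and `Cell` types, the function names, the tolerance `tol`, the rounding precision `ndigits`, and the representation of a text entity as a `(content, x, y)` tuple are all assumptions made for illustration. Which start coordinate keys a subset is left as the `start_of` parameter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Straight line segment from (x1, y1) to (x2, y2); hypothetical type."""
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass(frozen=True)
class Cell:
    """Rectangular coverage area of one blank cell; hypothetical type."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @classmethod
    def from_diagonal(cls, p, q):
        """Build the coverage area from the two end points of a cell diagonal."""
        return cls(min(p[0], q[0]), min(p[1], q[1]),
                   max(p[0], q[0]), max(p[1], q[1]))

    def covers(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def split_by_direction(segments, tol=1e-6):
    """Return (horizontal, vertical); segments in neither direction are dropped."""
    horizontal = [s for s in segments if abs(s.y1 - s.y2) <= tol]
    vertical = [s for s in segments
                if abs(s.x1 - s.x2) <= tol and abs(s.y1 - s.y2) > tol]
    return horizontal, vertical


def group_subsets(segments, start_of, ndigits=3):
    """Group axis-aligned segments sharing a start coordinate and a length."""
    groups = defaultdict(list)
    for s in segments:
        length = abs(s.x2 - s.x1) + abs(s.y2 - s.y1)
        key = (round(start_of(s), ndigits), round(length, ndigits))
        groups[key].append(s)
    return list(groups.values())


def bind_text(cells, texts):
    """Assign each (content, x, y) text to the first cell covering (x, y)."""
    table = {cell: [] for cell in cells}
    for content, x, y in texts:
        for cell in cells:
            if cell.covers(x, y):
                table[cell].append(content)
                break
    return table
```

For example, `group_subsets(horizontal, lambda s: s.y1)` keys horizontal segments on their start y-coordinate, and `bind_text` leaves any text whose coordinates fall outside every cell unbound.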
Optionally, before the extracting of the straight line segment entities and text entities in the two-dimensional drawing, the method for automatically extracting table data in a two-dimensional drawing further includes: judging whether an obtained entity is a minimum entity to obtain a minimum entity judging result, wherein a minimum entity is a line segment entity or a character entity, and line segment entities comprise straight line segment entities and curved line segment entities; if the minimum entity judging result is yes, performing no further processing on the entity; and if the minimum entity judging result is no, performing explosion processing on the entity until the entity is converted into a plurality of minimum entities.
Optionally, after grouping all the straight line segment entities in the straight line segment set according to the extending direction to obtain the horizontal line segment set and the vertical line segment set, the method for automatically extracting table data in a two-dimensional drawing further includes: in the horizontal line segment set, for any line segment entity, judging whether the distance between the end point coordinates of that line segment entity and the start point coordinates of another line segment entity is smaller than a threshold value, and if so, connecting the two line segment entities into one line segment entity; in the vertical line segment set, for any line segment entity, judging whether the distance between the end point coordinates of that line segment entity and the start point coordinates of another line segment entity is smaller than a threshold value, if so, connecting the two line segment entities into one line segment entity, and if so, taking the end