US-12620251-B2 - Systems and methods for automatic detection of features on a sheet
Abstract
A computer-implemented method for detecting a target object on a document page includes detecting a sample target area on a sample document page, generating an image by overlapping a plurality of sample document pages with one another, and detecting one or more cells within the sample target area on the image. The sample target area includes a sample target object. The method further includes extracting one or more informational features from each of the one or more cells. The one or more informational features define characteristics of a corresponding cell of the one or more cells. A machine learning model is trained using the one or more informational features extracted from each of the one or more cells, to detect the sample target object. A target object on a document page is detected using the trained machine learning model.
Inventors
- Jae Min Lee
- Joseph W. Wezorek
Assignees
- BLUEBEAM, INC.
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2024-03-08
Claims (20)
- 1 . A computer-implemented method for detecting a target object on a document page, the method comprising: scanning a sample document page to detect one or more first lines having a longitudinal length substantially equal to one or more terminal sides of the sample document page and located within a predetermined distance from each of the one or more terminal sides, wherein the one or more terminal sides define an outer boundary edge of the sample document page that collectively defines a center region of the sample document page; detecting a sample target area within the center region of the sample document page by identifying the one or more first lines located within the predetermined distance from the one or more terminal sides of the sample document page with the longitudinal length that is substantially equal to the one or more terminal sides; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image, wherein the sample target area includes a sample target object; extracting one or more informational features from each of the one or more cells, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
- 2 . The computer-implemented method of claim 1 , wherein detecting the sample target area on the sample document page further comprises: detecting one or more textual characters positioned adjacent to the one or more first lines.
- 3 . The computer-implemented method of claim 2 , wherein prior to scanning the sample document page, the method comprises: selectively adjusting the predetermined distance along a ratio of a total surface area of the sample document page, such that the predetermined distance corresponds to the ratio of the total surface area of the sample document page.
- 4 . The computer-implemented method of claim 3 , wherein the ratio includes 8% to 15% of the total surface area of the sample document page.
- 5 . The computer-implemented method of claim 1 , wherein detecting the sample target area within the center region of the sample document page comprises: detecting a first line of the one or more first lines within the predetermined distance from a first terminal side of the one or more terminal sides of the sample document page; detecting a second line of the one or more first lines within the predetermined distance from a second terminal side of the one or more terminal sides of the sample document page; determining a first offset distance between the first line of the one or more first lines and the first terminal side; determining a second offset distance between the second line of the one or more first lines and the second terminal side; and comparing the first offset distance and the second offset distance.
- 6 . The computer-implemented method of claim 5 , wherein detecting the sample target area on the sample document page further comprises: determining the first offset distance is greater than the second offset distance; and detecting one or more textual characters positioned adjacent to the first line upon determining the first offset distance is greater than the second offset distance.
- 7 . The computer-implemented method of claim 1 , wherein detecting the sample target area within the center region of the sample document page comprises: detecting zero lines within the predetermined distance from a first terminal side and a second terminal side of the sample document page; and detecting one or more textual characters positioned adjacent to at least one of the first terminal side or the second terminal side and within the predetermined distance.
- 8 . The computer-implemented method of claim 1 , wherein generating the image by overlapping the plurality of sample document pages comprises: determining a pixel overlap count for each of a plurality of locations along a surface of each of the plurality of sample document pages.
- 9 . The computer-implemented method of claim 8 , wherein extracting the one or more informational features from each of the one or more cells comprises: determining the pixel overlap count for each of at least a subset of the plurality of locations positioned within each of the one or more cells; and assigning one or more groupings to each of the one or more cells based on the pixel overlap count for each of at least the subset of the plurality of locations positioned within the one or more cells, such that locations having the same pixel overlap count are assigned to the same grouping.
- 10 . The computer-implemented method of claim 9 , wherein extracting the one or more informational features from each of the one or more cells further comprises: determining a size ratio of each of the one or more groupings; and using the size ratio of each grouping as part of the one or more informational features.
- 11 . The computer-implemented method of claim 9 , wherein extracting the one or more informational features from each of the one or more cells further comprises: determining a randomness of each of at least the subset of the plurality of locations positioned within each of the one or more cells; and using the randomness of each location as part of the one or more informational features.
- 12 . The computer-implemented method of claim 1 , wherein extracting the one or more informational features from each of the one or more cells comprises: determining a width of each of the one or more cells; determining a height of each of the one or more cells; and determining a size of each of the one or more cells based on the width and the height of the one or more cells.
- 13 . The computer-implemented method of claim 12 , wherein extracting the one or more informational features from each of the one or more cells further comprises: using the size of each cell as part of the one or more informational features.
- 14 . A system for detecting a target object on a document page, comprising: one or more processors; and at least one non-transitory computer readable medium storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: scanning a sample document page to detect one or more first lines having a longitudinal length similar to one or more terminal sides of the sample document page and located within a predetermined distance from each of the one or more terminal sides, wherein the one or more terminal sides define an outer boundary edge of the sample document page that collectively defines a center region of the sample document page; detecting a sample target area within the center region of the sample document page; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image by identifying one or more second lines located between the one or more first lines and the one or more terminal sides of the sample document page, wherein the sample target area includes a sample target object; extracting one or more informational features from each of the one or more cells, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
- 15 . The system of claim 14 , wherein generating the image by overlapping the plurality of sample document pages with one another comprises: determining a pixel overlap count for each of a plurality of locations along a surface of each of the plurality of sample document pages; and generating at least one color on the image at each of the plurality of locations, wherein the at least one color corresponds to the pixel overlap count at each of the plurality of locations.
- 16 . The system of claim 14 , wherein detecting the sample target area within the center region of the sample document page comprises: detecting a first line of the one or more first lines within the predetermined distance from at least one of a first terminal side or a second terminal side of the one or more terminal sides of the sample document page; and detecting one or more textual characters positioned adjacent to the first line of the one or more first lines.
- 17 . The system of claim 16 , wherein the predetermined distance corresponds to a ratio of a total surface area of the sample document page, and the ratio includes 8% to 15% of the total surface area of the sample document page.
- 18 . The system of claim 14 , wherein detecting the one or more cells within the sample target area on the image comprises: detecting a first line of the one or more first lines and a second line of the one or more second lines are within the sample target area having a predetermined thickness; and detecting the first line and the second line intersect with one another within the sample target area.
- 19 . The system of claim 14 , wherein the sample document page includes a drawing, the sample target area includes a title box on the drawing, and the sample target object includes a title or a sheet number in the title box.
- 20 . At least one non-transitory computer readable medium for automatically detecting a target object on a document page, the at least one non-transitory computer readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising: scanning a sample document page to detect one or more first lines (i) having a longitudinal length substantially equal to a longitudinal length of one or more terminal sides of the sample document page and (ii) located within a predetermined distance from each of the one or more terminal sides, wherein the one or more terminal sides define an outer boundary edge of the sample document page that collectively defines a center region of the sample document page; detecting a sample target area within the center region of the sample document page; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image, wherein the sample target area includes a sample target object; extracting one or more informational features for each of the one or more cells from the image generated by overlapping the plurality of sample document pages, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
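The overlap-image and grouping steps recited in claims 8 through 10 and claim 15 can be illustrated with a short sketch. This is not the patent's implementation; it is a minimal illustration assuming NumPy and assuming each sample page has already been rasterized to a binary array, with `overlap_image` and `cell_groupings` as hypothetical helper names.

```python
import numpy as np

def overlap_image(pages):
    """Sum binarized page rasters so each location holds a pixel overlap
    count (claims 8 and 15): how many sample pages mark that location."""
    stack = np.stack(pages)      # shape: (num_pages, height, width)
    return stack.sum(axis=0)     # per-location pixel overlap count

def cell_groupings(counts, cell_box):
    """Group locations inside one cell by overlap count, so locations with
    the same count share a grouping (claim 9), and return each grouping's
    size ratio within the cell (claim 10). cell_box = (top, left, h, w)."""
    top, left, h, w = cell_box
    cell = counts[top:top + h, left:left + w]
    values, sizes = np.unique(cell, return_counts=True)
    return {int(v): s / cell.size for v, s in zip(values, sizes)}
```

For example, overlapping two 2-by-2 pages and treating the whole image as one cell yields one grouping per distinct overlap count, with ratios that sum to one. Claim 15's color generation would then map each distinct count to a color.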
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
This patent application is a continuation of U.S. application Ser. No. 17/177,072, filed on Feb. 16, 2021, which claims the benefit of priority to U.S. Provisional Application No. 63/093,031, filed on Oct. 16, 2020, the entireties of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to systems and methods for automatically detecting features on a sheet. More particularly, the present disclosure relates to systems and methods for automatically determining a location and content of certain features on a drawing sheet that are unique to the particular drawing.
BACKGROUND
Architecture, engineering, and construction (AEC) industries actively use drawings to represent building designs. A large number of drawing sheets are usually needed to represent various aspects of a building. Drawing sheets typically include a title box or similar section containing information used for identifying a particular drawing, which may be important to discern the context of the drawing depicted on the sheet. For example, the title box of a drawing sheet may include a title and a number, which may be indicative of the content displayed on the drawing sheet. Despite drawings having a general format for sheet titles and sheet numbers, there remains a lack of a fixed or standard format for these features in AEC industries. As a result, each originator of a drawing may provide varying custom formats, thereby resulting in a large amount of variation across the industry. This variation may make the automatic detection of the sheet title and sheet number challenging. As such, despite the prevalence of digital representation of drawings, the detection of sheet titles and sheet numbers generally depends on manual methods. The background description provided herein is for the purpose of generally presenting the context of this disclosure.
Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art, or suggestions of the prior art, by inclusion in this section.
SUMMARY OF THE DISCLOSURE
One embodiment provides a computer-implemented method for detecting a target object on a document page, the method comprising: detecting a sample target area on a sample document page; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image, wherein the sample target area includes a sample target object; extracting one or more informational features from each of the one or more cells, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
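The scanning step described in claim 1 and the summary, which locates "first lines" roughly as long as a terminal side and lying within a predetermined distance of the page edge, can be sketched as follows. This is an illustrative simplification, not the disclosed implementation: `find_border_lines` is a hypothetical helper, lines are assumed to arrive as axis-aligned segments `(x1, y1, x2, y2)`, and the predetermined distance is modeled as a fixed fraction of the page dimension.

```python
def find_border_lines(lines, page_w, page_h, margin_ratio=0.1, length_tol=0.95):
    """Return segments that are nearly as long as a terminal side of the page
    and lie within margin_ratio of that side (a stand-in for the claimed
    'predetermined distance'). Only axis-aligned segments are considered."""
    border = []
    for x1, y1, x2, y2 in lines:
        if y1 == y2:  # horizontal candidate: compare against top/bottom sides
            near_edge = min(y1, page_h - y1) <= margin_ratio * page_h
            long_enough = abs(x2 - x1) >= length_tol * page_w
        elif x1 == x2:  # vertical candidate: compare against left/right sides
            near_edge = min(x1, page_w - x1) <= margin_ratio * page_w
            long_enough = abs(y2 - y1) >= length_tol * page_h
        else:
            continue  # skip diagonal segments in this simplified sketch
        if near_edge and long_enough:
            border.append((x1, y1, x2, y2))
    return border
```

On a 100-by-50 page, a full-width line near the bottom edge qualifies, while a full-width line through the center, or a short line near the top, does not. The sample target area would then be sought along the retained border lines.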
Another embodiment provides a system for detecting a target object on a document page, comprising: one or more processors; and at least one non-transitory computer readable medium storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: detecting a sample target area on a sample document page; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image, wherein the sample target area includes a sample target object; extracting one or more informational features from each of the one or more cells, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
Another embodiment provides at least one non-transitory computer readable medium for automatically detecting a target object on a document page, the at least one non-transitory computer readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising: detecting a sample target area on a sample document page; generating an image by overlapping a plurality of sample document pages with one another; detecting one or more cells within the sample target area on the image, wherein the sample target area includes a sample target object; extracting one or more informational features from each of the one or more cells, wherein the one or more informational features define characteristics of a corresponding cell of the one or more cells; training a machine learning model, using the one or more informational features extracted from each of the one or more cells, to detect the sample target object; and detecting the target object on the document page using the trained machine learning model.
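The informational features that the embodiments above feed to the machine learning model can be illustrated with a small sketch covering the cell geometry of claims 12 and 13 (width, height, and a size derived from them) combined with the grouping size ratios of claim 10. The helper name `cell_features` and the dictionary layout are illustrative assumptions, not the disclosed format.

```python
def cell_features(cell_box, groupings):
    """Assemble one cell's informational features: width, height, size
    (claims 12-13), plus per-overlap-count size ratios (claim 10).
    cell_box = (top, left, height, width); groupings maps an overlap
    count to that grouping's fraction of the cell area."""
    _, _, height, width = cell_box
    features = {
        "width": width,
        "height": height,
        "size": width * height,  # size determined from width and height
    }
    for count, ratio in groupings.items():
        features[f"ratio_{count}"] = ratio
    return features
```

A vector of such features per cell could then be passed to any standard classifier during the training step; the disclosure does not tie the method to a particular model architecture.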