CN-121997912-A - Large-model training data labeling method, device, and equipment for quantity-calculation drawings

Abstract

The invention provides a method, a device, and equipment for labeling large-model training data for quantity-calculation drawings, used to label key information in the quantity-calculation drawings of construction engineering cost estimation. The method comprises: displaying a quantity-calculation drawing to be labeled; determining the regions to be recognized; recognizing the text content and position information of the text blocks; displaying a tag list according to the keyword fields; after a user selects a tag, activating the interactive state of the region to be recognized so that each text block becomes an operable hot zone; after the user selects a hot zone, generating a key-value pair from the selected tag and the text of the hot zone; collecting the key-value pairs of every region to be recognized to generate answer content; and filling the answer content into the answer part of a preset question-answer template to obtain a labeling result. The method quickly matches fields to text, generates structured annotation data, improves labeling efficiency and consistency, and produces results that can be used directly for large-model training.
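The last two steps of the abstract (collecting key-value pairs into answer content and filling a question-answer template) can be sketched in Python as follows. This is only an illustration under stated assumptions: the template wording, the JSON answer format, the choice to store each tag's values as a list, and the names `QA_TEMPLATE`, `build_answer`, and `fill_template` are all hypothetical and are not specified in the source.

```python
import json

# Hypothetical question-answer template; the source does not give its wording.
QA_TEMPLATE = {
    "question": "Extract the key information from this drawing as JSON.",
    "answer": "",
}


def build_answer(pairs_per_region):
    """Merge the (tag, text) pairs of every region into one answer mapping.

    Values are kept in lists (an assumption) so that one keyword field
    can correspond to several texts on the same drawing.
    """
    answer = {}
    for pairs in pairs_per_region:
        for tag, text in pairs:
            answer.setdefault(tag, []).append(text.strip())
    return answer


def fill_template(template, answer):
    """Return a copy of the template with its answer part filled in."""
    result = dict(template)  # copy so the preset template is left untouched
    result["answer"] = json.dumps(answer, ensure_ascii=False, sort_keys=True)
    return result


# Two regions on one drawing; the tags and texts are made-up sample values.
regions = [
    [("component", "KL-1 "), ("size", "300x600")],
    [("component", "KZ-2")],
]
answer = build_answer(regions)
labeling_result = fill_template(QA_TEMPLATE, answer)
print(labeling_result["answer"])
```

Serializing the answer as JSON is one convenient choice for a training sample; any other structured text format would fit the same flow.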

Inventors

Fei Guoyou
LI QINGYUN
A RUNA
ZHANG JING
SHEN SHAONAN
LIU JUNNAN
LIU ZHEN

Assignees

广联达科技股份有限公司

Dates

Publication Date: 20260508
Application Date: 20260128

Claims (10)

1. A large-model training data labeling method for quantity-calculation drawings, characterized by comprising the following steps: displaying a quantity-calculation drawing to be labeled, wherein the drawing comprises a plurality of keyword fields; determining a region to be recognized in the drawing to be labeled; recognizing the region to be recognized to obtain the text content and position information of a plurality of text blocks; displaying a tag list according to the keyword fields, wherein the tag list comprises a tag corresponding to each keyword field; in response to a user's selection of a tag, recording the selected tag and activating an interactive state of the region to be recognized, wherein, when the region to be recognized is in the interactive state, each text block of the region is presented as an operable hot zone; in response to the user's selection of a hot zone, generating a key-value pair from the selected tag and the text content corresponding to the selected hot zone; generating answer content according to the key-value pairs corresponding to all regions to be recognized in the drawing to be labeled; and filling the answer content into an answer part of a preset question-answer template to generate a labeling result for the drawing to be labeled.
2. The large-model training data labeling method for quantity-calculation drawings according to claim 1, wherein the step of activating the interactive state of the region to be recognized comprises: displaying the text content as an overlay at the position indicated by the position information in the region to be recognized; and setting the position indicated by the position information as the operable hot zone.
3. The large-model training data labeling method for quantity-calculation drawings according to claim 2, wherein the step of displaying the tag list according to the keyword fields comprises: displaying each tag together with a content input box corresponding to the tag; the user's selection of a tag comprises a selection of the content input box; and the step of generating the key-value pair from the selected tag and the text content corresponding to the selected hot zone comprises: filling the text content corresponding to the selected hot zone into the selected content input box, and generating the key-value pair from the tag corresponding to the selected content input box and the text content in that input box.
4. The large-model training data labeling method for quantity-calculation drawings according to claim 2, wherein the step of displaying the text content as an overlay at the position indicated by the position information in the region to be recognized comprises: overlaying the text content at the position indicated by the position information as a semi-transparent highlighted area, so that a floating layer above the region to be recognized displays the text content.
5. The large-model training data labeling method for quantity-calculation drawings according to claim 1, wherein the step of determining the region to be recognized comprises: in response to a user's operation of a region outlining tool, drawing a closed contour on the drawing to be labeled, thereby determining the region to be recognized; and the method further comprises: displaying the tag list according to the keyword fields in response to the user's operation of the region outlining tool.
6. The large-model training data labeling method for quantity-calculation drawings according to claim 5, wherein the step of recognizing the region to be recognized to obtain the text content and position information of a plurality of text blocks comprises: after a closed contour is drawn on the drawing to be labeled, cropping the area within the closed contour and sending it to a text detection and recognition tool, wherein the text detection and recognition tool detects the text within the closed contour to obtain text blocks and their position information, crops the text blocks according to the position information, and recognizes the cropped text blocks to obtain their text content; and receiving the text content and position information of the text blocks returned by the text detection and recognition tool.
7. The large-model training data labeling method for quantity-calculation drawings according to claim 1, wherein, before the step of displaying the quantity-calculation drawing to be labeled, the method further comprises: acquiring a quantity-calculation drawing data set, wherein the data set comprises a plurality of quantity-calculation drawings and each of the drawings comprises the same keyword fields; displaying a keyword labeling page corresponding to the drawing to be labeled, wherein the keyword labeling page comprises an attribute name input box and an attribute type input box; and generating an attribute list comprising attribute names and attribute types according to the content entered by the user in the attribute name input box and the attribute type input box, wherein the attribute names in the attribute list are the field names of the keyword fields; and wherein the step of displaying the tag list according to the keyword fields comprises displaying the tag list according to the attribute list.
8. A large-model training data labeling device for quantity-calculation drawings, characterized by comprising: a first display module for displaying a quantity-calculation drawing to be labeled, wherein the drawing comprises a plurality of keyword fields; a first determining module for determining a region to be recognized in the drawing to be labeled; a recognition module for recognizing the region to be recognized to obtain the text content and position information of a plurality of text blocks; a second display module for displaying a tag list according to the keyword fields, wherein the tag list comprises a tag corresponding to each keyword field; a first response module for, in response to a user's selection of a tag, recording the selected tag and activating an interactive state of the region to be recognized, wherein, when the region to be recognized is in the interactive state, each text block of the region is presented as an operable hot zone; a second response module for, in response to the user's selection of a hot zone, generating a key-value pair from the selected tag and the text content corresponding to the selected hot zone; a first generation module for generating answer content according to the key-value pairs corresponding to all regions to be recognized in the drawing to be labeled; and a second generation module for filling the answer content into an answer part of a preset question-answer template to generate a labeling result for the drawing to be labeled.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
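The interaction described in claims 1 and 2 (a tag is selected, the region becomes interactive, and each recognized text block acts as a clickable hot zone) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: axis-aligned rectangular position information and the names `TextBlock`, `RegionSession`, `select_tag`, and `click` are hypothetical, since the claims fix neither a data model nor a geometry for the hot zones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    """One recognized text block; an axis-aligned box is an assumption."""
    text: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px, py):
        # Hit test: is the click point inside this block's rectangle?
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


class RegionSession:
    """Labeling state for one region to be recognized."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.selected_tag = None
        self.pairs = {}  # tag -> text content of the chosen hot zone

    @property
    def interactive(self):
        # The region is interactive only once a tag has been selected.
        return self.selected_tag is not None

    def select_tag(self, tag):
        self.selected_tag = tag

    def click(self, px, py):
        """Return the hot zone hit by the click and record a key-value pair."""
        if not self.interactive:
            return None
        for block in self.blocks:
            if block.contains(px, py):
                self.pairs[self.selected_tag] = block.text
                return block
        return None


# Made-up sample blocks, standing in for the output of a text recognition tool.
session = RegionSession([
    TextBlock("KL-1", 10, 10, 40, 12),
    TextBlock("300x600", 60, 10, 50, 12),
])
ignored = session.click(15, 15)   # no tag selected yet, so nothing happens
session.select_tag("component")
hit = session.click(15, 15)       # lands inside the first block
print(session.pairs)
```

In a real annotation page the same hit test would typically be delegated to the UI toolkit; the point of the sketch is only the ordering of tag selection before hot-zone selection.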

Description

Large-model training data labeling method, device, and equipment for quantity-calculation drawings

Technical Field

The invention relates to the technical field of data processing, and in particular to a method, a device, and equipment for labeling large-model training data for quantity-calculation drawings.

Background

In construction engineering cost accounting, information such as components, dimensions, material symbols, and text descriptions often has to be recognized and structurally extracted from quantity-calculation drawings to support work such as quantity calculation, bill-of-quantities preparation, and cost analysis. A quantity-calculation drawing generally contains component geometry, dimension annotations, material symbols, and a large amount of descriptive text. It is characterized by high information density, mixed layout of graphics and text, and varied forms of expression, and the same keyword may correspond to several texts or to several annotation styles. With the development of deep learning, a multi-modal large model can take the image pixels and text sequences of a quantity-calculation drawing as input at the same time and output structured key information end to end, which raises the degree of automation in drawing understanding and key information extraction. Training such a multi-modal large model usually requires a data set containing images, text, and structured labels, and building that data set depends on accurate labeling of the original drawing data. In the prior art, the labeling of image and text data relies mainly on manual operation, and common tools include Labelme and VGG Image Annotator (VIA).
For example, with Labelme, after an image is imported into the local software, the user usually has to outline each text object one by one, manually type the text content into a pop-up box, manually add the keyword field corresponding to the text, and then export a JSON file; the exported labeling result still needs post-processing before it becomes a key-value data set for model training. As another example, with VIA, the user generally has to load the image in a browser and define the keyword fields in advance; after a target contour is outlined, a keyword form pops up, and the user manually fills in the text content corresponding to each keyword and exports a JSON file. Here too, the exported result generally needs post-processing to meet the training data format. To improve efficiency, some tools introduce automatic text recognition, such as Baidu's PaddleLabel, which can automatically detect and recognize all the text in an image and export JSON. However, such schemes usually focus on detecting, recognizing, and outputting text, and still lack a process for matching keyword fields to text content as key-value pairs for key information extraction tasks. They therefore cannot directly generate the structured labels required for multi-modal model training, and the user still has to manually screen, classify, and extract key-value pairs from the recognition results.
In studying the existing labeling schemes, the inventors found the following problems. First, labor cost is high and overall efficiency is low, because image regions must be manually framed one by one and the corresponding text typed in. Second, labeling is error-prone and time-consuming, because traditional labeling tools do not support or integrate OCR, so the characters in the image must be read by eye and entered manually. Third, schemes with OCR capability can output all the text but still require additional manual extraction and post-processing, because they lack the key-value matching and templated organization that key information extraction requires. In summary, existing labeling tools struggle to process text information automatically and organize it structurally; the data they generate cannot be used directly for multi-modal large-model training, and the data templates required for training must still be built separately and the data post-processed. Therefore, when labeling quantity-calculation drawings, how to improve the efficiency of matching keyword fields to text content while reducing the manual text entry workload, and how to output the labeling result in a structured form that meets the requirements of multi-modal large-model training, has become a technical problem to be solved in the field.

Disclosure of Invention

The invention aims to provide a large-model training data labeling method, device, and equipment for quantity-calculation drawings to solve the above technical problems in the prior art. In one aspect, to achieve the above purpose, the invention provides a large model training data labeling me