CN-121997901-A - Automatic form filling method, device, equipment and medium based on LLM

Abstract

An LLM-based automatic form filling method, apparatus, device, and medium, relating to the technical field of data processing. The method comprises: acquiring a table to be filled from a document to be filled, and recording the position of the table to be filled; acquiring text to be filled that is input by a user; parsing the table to be filled and converting it into HTML format to obtain a first HTML table; splicing the first HTML table, the text to be filled, and a preset Prompt template, and inputting the spliced result into a large language model (LLM); analyzing, through the LLM, the entity-attribute relationship between the header fields in the first HTML table and the text to be filled; automatically filling the first HTML table to obtain a second HTML table; and inserting the second HTML table at the recorded position to obtain a filled document. Implementing the technical scheme provided by the application can improve the accuracy of automatic form filling.
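The splicing step mentioned in the abstract can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the wording of `PROMPT_TEMPLATE` and the helper name `build_prompt` are hypothetical, since the source does not disclose the actual Prompt template, and no LLM is called here.

```python
from string import Template

# Hypothetical prompt template; the patent does not disclose its wording.
PROMPT_TEMPLATE = Template(
    "You are filling in a form.\n"
    "Table (HTML):\n$table\n\n"
    "User text:\n$text\n\n"
    "Return the same table with the empty cells filled in, as HTML only."
)


def build_prompt(first_html_table: str, text_to_fill: str) -> str:
    """Splice the HTML table, the user's text, and the template into one prompt."""
    return PROMPT_TEMPLATE.substitute(table=first_html_table, text=text_to_fill)


# Example inputs (made up for illustration).
table = "<table><tr><th>Name</th><th>Age</th></tr><tr><td></td><td></td></tr></table>"
prompt = build_prompt(table, "My name is Li Hua and I am 30 years old.")
```

The resulting `prompt` string is what would be sent to the model; how the model is invoked is outside the scope of this sketch.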

Inventors

CHEN SHUO
MAO XIUPING
GUAN JIYU
WANG YOUJIN

Assignees

苏州创旅天下信息技术有限公司

Dates

Publication Date: 20260508
Application Date: 20251224

Claims (10)

1. An LLM-based method for automatically filling a form, comprising: acquiring a form to be filled in a document to be filled, and recording the position of the form to be filled; acquiring text to be filled that is input by a user; parsing the form to be filled, and converting the form to be filled into an HTML format to obtain a first HTML form; splicing the first HTML form, the text to be filled, and a preset Prompt template, and inputting the spliced result into an LLM large model; analyzing, through the LLM large model, an entity-attribute relationship between a header field in the first HTML form and the text to be filled; automatically filling the first HTML form based on the entity-attribute relationship to obtain a second HTML form; and inputting the second HTML form to the position of the form to be filled in the document to be filled, to obtain a filled document.
2. The method of claim 1, wherein parsing the form to be filled and converting the form to be filled into an HTML format to obtain the first HTML form comprises: identifying cell information, a table structure, header information, and the header field in the form to be filled; generating a corresponding row attribute tag and a column attribute tag based on the cell information; generating a header tag in the HTML format based on the header information; and generating the first HTML form based on the row attribute tag, the column attribute tag, the header tag, and the header field.
3. The method of claim 2, wherein analyzing, through the LLM large model, the entity-attribute relationship between the header field in the first HTML form and the text to be filled comprises: analyzing the semantics of the text to be filled through the LLM large model to obtain a plurality of attribute fields; and mapping each attribute field to the header field to obtain the entity-attribute relationship between the header field and the text to be filled.
4. The method of claim 3, wherein automatically filling the first HTML form based on the entity-attribute relationship to obtain the second HTML form comprises: locating the row attribute tags and the column attribute tags of the HTML character strings to be filled in the first HTML form; and filling each attribute field into the corresponding HTML character string to be filled according to the entity-attribute relationship, while retaining the row attribute tag and the column attribute tag of the HTML character string to be filled, to obtain the filled second HTML form in the HTML format.
5. The method of claim 1, further comprising, after inputting the second HTML form to the position of the form to be filled in the document to be filled to obtain the filled document: outputting the second HTML form in a specified format preset by the user.
6. The method of any one of claims 1-5, wherein the document to be filled comprises a plurality of forms to be filled, and the method comprises: parsing each form to be filled, and converting each form to be filled into an HTML format to obtain a plurality of first HTML forms; splicing each first HTML form, the text to be filled, and the preset Prompt template, and inputting the spliced result into the LLM large model; analyzing, through the LLM large model, the entity-attribute relationship between the header field in each first HTML form and the text to be filled; automatically filling each first HTML form based on the entity-attribute relationship to obtain a plurality of second HTML forms; and inputting each second HTML form to the position of the corresponding form to be filled in the document to be filled, to obtain the filled document.
7. The method according to any one of claims 1-5, wherein if the form to be filled is a spread form to be filled or a nested form to be filled, identifying and recording a spread structure or a nested structure of the form to be filled when converting the form to be filled into the HTML format, to obtain the first HTML form retaining the spread structure or the nested structure.
8. An LLM-based automatic form filling apparatus, comprising: a to-be-filled form acquisition module, configured to acquire a form to be filled in a document to be filled and record the position of the form to be filled; a to-be-filled text acquisition module, configured to acquire text to be filled that is input by a user; a format conversion module, configured to parse the form to be filled and convert the form to be filled into an HTML format to obtain a first HTML form; a data input module, configured to splice the first HTML form, the text to be filled, and a preset Prompt template, and input the spliced result into an LLM large model; a data mapping module, configured to analyze, through the LLM large model, an entity-attribute relationship between a header field in the first HTML form and the text to be filled; a form filling module, configured to automatically fill the first HTML form based on the entity-attribute relationship to obtain a second HTML form; and a document filling module, configured to input the second HTML form to the position of the form to be filled in the document to be filled, to obtain a filled document.
9. An electronic device comprising a processor, a memory, a user interface, and a network interface, wherein the memory is configured to store instructions, the user interface and the network interface are both configured to communicate with other devices, and the processor is configured to execute the instructions stored in the memory to cause the electronic device to perform the method of any one of claims 1 to 7.
10. A computer readable storage medium storing instructions which, when executed, perform the method of any one of claims 1 to 7.
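The filling step of claim 4 and the re-insertion step of claim 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. Several things are assumptions made for the example: the `data-row` and `data-col` attribute names standing in for the row and column attribute tags, the helper names `fill_table` and `reinsert`, the document being a plain string with the HTML table embedded in it, the recorded position being a pair of character offsets, and the hard-coded `mapping` that stands in for the entity-attribute relationship an LLM would return.

```python
import html
import re

# Matches an empty <td> carrying the (hypothetical) data-row/data-col attributes.
EMPTY_CELL = re.compile(r'<td data-row="(\d+)" data-col="(\d+)"></td>')


def fill_table(first_html: str, headers: list[str], mapping: dict[str, str]) -> str:
    """Fill empty cells by column header, keeping the row/column attributes."""

    def replace(match: re.Match) -> str:
        row, col = match.group(1), int(match.group(2))
        # Cells whose header has no mapped value stay empty.
        value = mapping.get(headers[col], "")
        return f'<td data-row="{row}" data-col="{col}">{html.escape(value)}</td>'

    return EMPTY_CELL.sub(replace, first_html)


def reinsert(document: str, position: tuple[int, int], second_html: str) -> str:
    """Put the filled table back at the recorded (start, end) offsets."""
    start, end = position
    return document[:start] + second_html + document[end:]


# Example data (made up for illustration).
first_html = (
    '<table><tr><th>Name</th><th>Age</th></tr>'
    '<tr><td data-row="1" data-col="0"></td><td data-row="1" data-col="1"></td></tr></table>'
)
document = "Intro. " + first_html + " Outro."
start = document.index(first_html)
position = (start, start + len(first_html))  # recorded before filling

mapping = {"Name": "Li Hua", "Age": "30"}  # stand-in for the LLM's output
second_html = fill_table(first_html, ["Name", "Age"], mapping)
filled_document = reinsert(document, position, second_html)
```

Only the cell contents change between `first_html` and `second_html`; the tags and their attributes are emitted unchanged, which is the property claim 4 describes as retaining the row attribute tag and the column attribute tag.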

Description

Automatic form filling method, device, equipment and medium based on LLM

Technical Field

The present application relates to the field of data processing technologies, and in particular to an LLM-based method, apparatus, device, and medium for automatically filling a form.

Background

With the rapid development of informatization, filling in forms of all kinds has become an indispensable part of daily work and life. In fields with extremely high requirements for data accuracy, such as finance, healthcare, and government affairs, traditional manual form filling is inefficient and prone to errors caused by human negligence, which brings potential risks to individuals and institutions.

At present, automatic form filling mainly relies on methods based on natural language processing (NLP) models, which extract information from the user input through pattern matching and keyword recognition and fill it into the corresponding form fields. Such methods work to some extent on structured or highly standardized text, but they generalize poorly and struggle to understand complex semantic contexts. In particular, when the user input contains varied phrasing, colloquial expressions, or cross-domain terms, a traditional NLP model often fails to capture the intent accurately, which leads to incorrect or missing entries and low accuracy of automatic form filling.

Disclosure of Invention

The embodiments of the application provide an automatic form filling method, apparatus, device, and medium based on a large language model (LLM), to solve the technical problem that the accuracy of automatic form filling is low because an NLP model cannot accurately capture intent, resulting in incorrect or missing entries.
The technical scheme of the embodiments of the application is realized as follows. In a first aspect, an embodiment of the present application provides an LLM-based method for automatically filling a form, comprising: acquiring a table to be filled from a document to be filled, and recording the position of the table to be filled; acquiring text to be filled that is input by a user; parsing the table to be filled and converting it into HTML format to obtain a first HTML table; splicing the first HTML table, the text to be filled, and a preset Prompt template, and inputting the spliced result into an LLM; analyzing, through the LLM, the entity-attribute relationship between the header fields in the first HTML table and the text to be filled; automatically filling the first HTML table based on the entity-attribute relationship to obtain a second HTML table; and inserting the second HTML table at the position of the table to be filled in the document to be filled to obtain a filled document. Specifically, parsing the table to be filled and converting it into HTML format to obtain the first HTML table comprises: identifying the cell information, table structure, header information, and header fields in the table to be filled; generating corresponding row attribute tags and column attribute tags based on the cell information; generating header tags in HTML format based on the header information; and generating the first HTML table based on the row attribute tags, the column attribute tags, the header tags, and the header fields.
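The conversion step described above can be sketched in Python as follows. It is a minimal illustration under stated assumptions rather than the disclosed implementation: the parsed table is assumed to arrive as a list of rows of cell dictionaries, the first row is assumed to be the header row, the `data-row` and `data-col` attributes are a hypothetical way of expressing the row and column attribute tags, and the helper name `table_to_html` is invented for the example.

```python
import html


def table_to_html(rows: list[list[dict]]) -> str:
    """Render a parsed table as HTML; the first row is treated as the header."""
    parts = ["<table>"]
    for r, row in enumerate(rows):
        tag = "th" if r == 0 else "td"  # header tags for the header row
        parts.append("<tr>")
        for c, cell in enumerate(row):
            # c is the cell's index within its row, not a grid column
            # corrected for spans; that simplification is an assumption.
            attrs = f' data-row="{r}" data-col="{c}"'
            # Emit span attributes only when a cell covers more than one slot.
            for name in ("rowspan", "colspan"):
                if cell.get(name, 1) > 1:
                    attrs += f' {name}="{cell[name]}"'
            text = html.escape(cell.get("text", ""))
            parts.append(f"<{tag}{attrs}>{text}</{tag}>")
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)


# Example: a header row and one empty row to be filled (made-up data).
rows = [
    [{"text": "Name"}, {"text": "Age"}],
    [{"text": ""}, {"text": ""}],
]
first_html = table_to_html(rows)
```

Carrying the row and column indices on every cell gives a later filling step a stable way to locate each empty cell without re-parsing the table layout.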
Optionally, analyzing the entity-attribute relationship between the header fields in the first HTML table and the text to be filled through the LLM specifically comprises: analyzing the semantics of the text to be filled through the LLM to obtain a plurality of attribute fields; and mapping each attribute field to the header fields to obtain the entity-attribute relationship between the header fields and the text to be filled. Optionally, automatically filling the first HTML table based on the entity-attribute relationship to obtain the second HTML table specifically comprises: locating the row attribute tags and column attribute tags of the HTML strings to be filled in the first HTML table; and filling each attribute field into the corresponding HTML string to be filled according to the entity-attribute relationship, while retaining the row attribute tags and column attribute tags of the HTML strings to be filled, to obtain the filled second HTML table in HTML format. Optionally, after the second HTML table is inserted at the position of the table to be filled in the document to be filled, the method further comprises outputting the second HTML table in a specified format preset by the user. The method comprises the steps of analyzing each to-be-filled table, converting each to-be-filled table into an HTML format to obtain a plura