CN-122021627-A - Text processing model training method and text processing method

Abstract

Embodiments of this specification provide a text processing model training method and a text processing method. The text processing model training method comprises: determining a target document and a corresponding target document processing result; obtaining, from the target document, a plurality of text data and text coordinate information of each text data in the target document; encoding each text data to obtain a target text code of each text data; encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; inputting the target text code and the target coordinate code into a text processing model for text processing to obtain a predicted text processing result corresponding to the target document; and training the text processing model according to the target document processing result and the predicted text processing result to obtain a target text processing model.

Inventors

  • ZHU ZHAOQING
  • Luo Chuwei
  • Shao Zirui
  • ZHENG QI
  • ZHANG JI

Assignees

  • Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)

Dates

Publication Date
2026-05-12
Application Date
2024-11-08

Claims (16)

  1. A text processing model training method, comprising: determining a target document and a corresponding target document processing result, and acquiring, from the target document, a plurality of text data and text coordinate information of each text data in the target document; encoding each text data to obtain a target text code of each text data, and encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; inputting the target text code and the target coordinate code into a text processing model for text processing to obtain a predicted text processing result corresponding to the target document; and training the text processing model according to the target document processing result and the predicted text processing result to obtain a target text processing model.
  2. The text processing model training method according to claim 1, wherein encoding each text data to obtain the target text code of each text data comprises: encoding each text data to obtain a text data code of each text data; forming a data code sequence from the plurality of text data codes, and determining a text position code for each text data code in the data code sequence, wherein the text position code represents the position of that text data code in the data code sequence; and taking the data code sequence and the text position codes as the target text codes of the text data.
  3. The text processing model training method according to claim 1, wherein encoding the text coordinate information to obtain the target coordinate code of the text coordinate information comprises: encoding the text coordinate information to obtain a coordinate information code of the text coordinate information; determining, from the plurality of text data, associated text data associated with the text coordinate information, and determining a coordinate information position code of the coordinate information code according to the text position code corresponding to the associated text data; and taking the coordinate information code and the coordinate information position code as the target coordinate code of the text coordinate information.
  4. The text processing model training method according to claim 2, wherein encoding the text coordinate information to obtain the coordinate information code of the text coordinate information comprises: determining, by using a coordinate encoding unit corresponding to the text processing model, a model training embedded vector corresponding to the text coordinate information, and encoding the text coordinate information based on the model training embedded vector to obtain the coordinate information code of the text coordinate information.
  5. The text processing model training method according to claim 4, wherein encoding the text coordinate information based on the model training embedded vector to obtain the coordinate information code of the text coordinate information comprises: performing information processing on the text coordinate information by using a linear layer in the coordinate encoding unit to obtain a key vector and a value vector corresponding to the text coordinate information, wherein the information processing comprises projection processing; determining the model training embedded vector as a query vector corresponding to the text coordinate information; and performing attention processing on the key vector, the value vector and the query vector by using an attention layer in the coordinate encoding unit to obtain the coordinate information code of the text coordinate information.
  6. The text processing model training method according to any one of claims 1 to 5, wherein the target document is a target document image, and acquiring, from the target document, the plurality of text data and the text coordinate information of each text data in the target document comprises: performing document segmentation on the target document image to obtain a plurality of document image blocks; marking, with a text data box, a text data area containing text in a target document image block, and performing text recognition on the text data area marked by the text data box to obtain the text data, wherein the target document image block is any one of the plurality of document image blocks; and taking the coordinate information of the text data box in the target document image as the text coordinate information of the text data.
  7. The text processing model training method according to any one of claims 1 to 5, wherein the target document is a document associated with a target question, the target document processing result is a target answer corresponding to the target question, and the predicted text processing result is a predicted answer corresponding to the target question; determining the target document and the corresponding target document processing result comprises: determining the target document associated with the target question, and determining the target answer corresponding to the target question; and inputting the target text code and the target coordinate code into the text processing model for text processing to obtain the predicted text processing result corresponding to the target document comprises: inputting a target question code corresponding to the target question, the target text code and the target coordinate code into the text processing model for question processing to obtain the predicted answer, wherein the target question code is obtained by encoding the target question.
  8. The text processing model training method according to claim 3, wherein determining the coordinate information position code of the coordinate information code according to the text position code corresponding to the associated text data comprises: determining character data contained in the associated text data, wherein the character data are arranged in a semantic order; determining, from the character data, target character data located at the first semantic position in the semantic order; and determining, from the text position codes corresponding to the associated text data, a target text position code corresponding to the target character data, and determining the target text position code as the coordinate information position code of the coordinate information code.
  9. The text processing model training method according to any one of claims 1 to 5, wherein training the text processing model according to the target document processing result and the predicted text processing result to obtain the target text processing model comprises: determining a first loss function by using the target document processing result and the predicted text processing result; determining target text coordinate information for the target document, and determining a second loss function based on the target text coordinate information and the text coordinate information; and adjusting model parameters of the text processing model based on the first loss function and the second loss function to obtain the target text processing model.
  10. A text processing method, comprising: determining a document to be processed, and acquiring, from the document to be processed, a plurality of text data and text coordinate information of each text data in the document to be processed; encoding each text data to obtain a target text code of each text data, and encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; and inputting the target text code and the target coordinate code into a target text processing model for text processing to obtain a text processing result corresponding to the document to be processed.
  11. The text processing method according to claim 10, wherein the document to be processed is a document associated with a question to be processed, and the text processing result is an answer corresponding to the question to be processed; determining the document to be processed comprises: receiving a text processing request sent by a client in response to a text processing operation, wherein the text processing request carries the question to be processed and the document to be processed associated with the question to be processed; and inputting the target text code and the target coordinate code into the target text processing model for text processing to obtain the text processing result corresponding to the document to be processed comprises: inputting a to-be-processed question code corresponding to the question to be processed, the target text code and the target coordinate code into the target text processing model for question processing to obtain the answer corresponding to the question to be processed, wherein the to-be-processed question code is obtained by encoding the question to be processed.
  12. The text processing method according to claim 10 or 11, further comprising, after inputting the target text code and the target coordinate code into the target text processing model for text processing to obtain the text processing result corresponding to the document to be processed: sending the text processing result to a client so that the client displays the text processing result through a text processing interface.
  13. A text processing method, applied to a cloud-side device, comprising: receiving a document to be processed sent by an end-side device, and acquiring, from the document to be processed, a plurality of text data and text coordinate information of each text data in the document to be processed; encoding each text data to obtain a target text code of each text data, and encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; inputting the target text code and the target coordinate code into a target text processing model for text processing to obtain a text processing result corresponding to the document to be processed; and sending the text processing result to the end-side device.
  14. A computing device, comprising a memory and a processor, wherein the memory is configured to store a computer program/instructions and the processor is configured to execute the computer program/instructions, which, when executed by the processor, implement the steps of the method of any one of claims 1 to 13.
  15. A computer-readable storage medium storing a computer program/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 13.
  16. A computer program product comprising a computer program/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 13.
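
The position-code bookkeeping in claims 2, 3 and 8 can be illustrated with a minimal sketch. It is not the applicant's implementation. The `TextSegment` type, the `(x0, y0, x1, y1)` box layout, and the function name `assign_position_codes` are hypothetical, and plain integer indices stand in for whatever position codes the model actually uses. Each token receives its index in the concatenated sequence, and each box reuses the index of the first token of its associated text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextSegment:
    """One recognized text item and its box in the document (hypothetical type)."""
    tokens: List[str]                       # tokens in semantic (reading) order
    box: Tuple[float, float, float, float]  # assumed layout: (x0, y0, x1, y1)


def assign_position_codes(segments):
    """Return (text_positions, coord_positions).

    text_positions[i][j] is the index of token j of segment i in the
    concatenated token sequence (claim 2). coord_positions[i] reuses the
    position of the first token of segment i (claims 3 and 8).
    """
    text_positions, coord_positions = [], []
    next_pos = 0
    for seg in segments:
        if not seg.tokens:
            raise ValueError("each segment needs at least one token")
        positions = list(range(next_pos, next_pos + len(seg.tokens)))
        text_positions.append(positions)
        coord_positions.append(positions[0])  # first semantic position
        next_pos += len(seg.tokens)
    return text_positions, coord_positions


segments = [
    TextSegment(["Invoice", "No", "."], (10, 10, 120, 30)),
    TextSegment(["2024", "-", "001"], (130, 10, 220, 30)),
]
text_pos, coord_pos = assign_position_codes(segments)
print(text_pos)   # [[0, 1, 2], [3, 4, 5]]
print(coord_pos)  # [0, 3]
```

Sharing the first token's position ties each coordinate code to the text it describes without lengthening the position range of the sequence.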

Description

Text processing model training method and text processing method

Technical Field

Embodiments of this specification relate to the technical field of artificial intelligence, and in particular to a text processing model training method. One or more embodiments of this specification also relate to two text processing methods, a computing device, a computer-readable storage medium, and a computer program product.

Background

With the continuous development of artificial intelligence technology, neural network models are applied to tasks in a wide range of scenarios; for example, they are applied to text processing. In current text processing by neural network models, the complexity of the text data leads to inaccurate text processing results. How to improve the accuracy of the text processing results of a neural network model has therefore become a problem that urgently needs to be solved.

Disclosure of Invention

In view of this, embodiments of this specification provide a text processing model training method. One or more embodiments of this specification also relate to two text processing methods, a text processing model training apparatus, two text processing apparatuses, a computing device, a computer-readable storage medium, and a computer program product, so as to address the technical shortcomings of the prior art.
According to a first aspect of embodiments of this specification, there is provided a text processing model training method, comprising: determining a target document and a corresponding target document processing result, and acquiring, from the target document, a plurality of text data and text coordinate information of each text data in the target document; encoding each text data to obtain a target text code of each text data, and encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; inputting the target text code and the target coordinate code into a text processing model for text processing to obtain a predicted text processing result corresponding to the target document; and training the text processing model according to the target document processing result and the predicted text processing result to obtain a target text processing model.

According to a second aspect of embodiments of this specification, there is provided a text processing model training apparatus, comprising: a data acquisition module configured to determine a target document and a corresponding target document processing result, and to acquire, from the target document, a plurality of text data and text coordinate information of each text data in the target document; an encoding module configured to encode each text data to obtain a target text code of each text data, and to encode the text coordinate information to obtain a target coordinate code of the text coordinate information; a text processing module configured to input the target text code and the target coordinate code into a text processing model for text processing to obtain a predicted text processing result corresponding to the target document; and a model training module configured to train the text processing model according to the target document processing result and the predicted text processing result to obtain a target text processing model.
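
Claim 5 describes one way the coordinate encoding step may be carried out: a linear layer projects the coordinate information into key and value vectors, a learned embedding serves as the query, and an attention layer combines them. The following is a toy sketch of that idea in plain Python, not the applicant's implementation. It assumes that each of the four box coordinates is projected to its own key/value pair, that attention is single-head scaled dot-product, and that weights are randomly initialized rather than trained. The class name `CoordinateEncoder` and all parameter names are hypothetical.

```python
import math
import random


def linear(x, weight, bias):
    """Apply y = W x + b to a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class CoordinateEncoder:
    """Toy coordinate encoding unit: linear projection + single-query attention."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.dim = dim
        # Assumption: every scalar coordinate gets its own key and value vector.
        self.w_key, self.b_key = init(dim, 1), [0.0] * dim
        self.w_val, self.b_val = init(dim, 1), [0.0] * dim
        # Embedding that plays the role of the query vector (learned in practice).
        self.query = [rng.uniform(-0.5, 0.5) for _ in range(dim)]

    def encode(self, box):
        """Map a box (x0, y0, x1, y1) to one vector of length dim."""
        keys = [linear([c], self.w_key, self.b_key) for c in box]
        values = [linear([c], self.w_val, self.b_val) for c in box]
        scale = math.sqrt(self.dim)
        scores = [sum(q * k for q, k in zip(self.query, key)) / scale
                  for key in keys]
        weights = softmax(scores)
        # Attention output: weighted sum of the value vectors.
        return [sum(w * v[d] for w, v in zip(weights, values))
                for d in range(self.dim)]


encoder = CoordinateEncoder(dim=8)
code = encoder.encode((0.10, 0.05, 0.45, 0.08))  # normalized box, assumed
print(len(code))  # 8
```

Because there is a single query, each box is compressed into one vector regardless of how many coordinate values describe it, which keeps the coordinate code short relative to the text codes. A practical version would use a tensor library with trainable parameters.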
According to a third aspect of embodiments of this specification, there is provided a text processing method, comprising: determining a document to be processed, and acquiring, from the document to be processed, a plurality of text data and text coordinate information of each text data in the document to be processed; encoding each text data to obtain a target text code of each text data, and encoding the text coordinate information to obtain a target coordinate code of the text coordinate information; and inputting the target text code and the target coordinate code into a target text processing model for text processing to obtain a text processing result corresponding to the document to be processed.

According to a fourth aspect of embodiments of this specification, there is provided a text processing apparatus, comprising: a data acquisition module configured to determine a document to be processed and to acquire, from the document to be processed, a plurality of text data and text coordinate information of each text data in the document to be processed; an encoding module configured to encode each text data to obtain a target text code of each text data, and to encode the text coordinate information to obtain a target coordinate code of the text coordinate information; and a text processing module configured to input the target text code and the target coordinate code into a target text processing model for text processing to obtain a text processing result corresponding to the document to be processed.

According to a fifth aspect of embodim