CN-121305599-B - Training method for certificate image recognition model, electronic equipment and medium
Abstract
The invention discloses a training method for a certificate image recognition model, an electronic device, and a medium, and relates to the technical field of certificate image recognition. The method comprises: recognizing a plurality of certificate images using a closed-source large model to obtain a recognition result for each certificate image, wherein each recognition result comprises a plurality of pieces of key information; labeling each certificate image with its recognition result to obtain labeled certificate images, wherein each labeled certificate image carries a plurality of labels associated with the key information; and training an open-source optical character recognition (OCR) initial model using the labeled certificate images as a sample set to obtain a final certificate image recognition model. The invention supports image recognition and attribute parsing for multiple languages and multiple certificate types, and offers high recognition speed, high accuracy, good stability, and a lightweight model.
Inventors
- LI ZHENG
- XUN SHUANGGUI
Assignees
- 杭州乒乓智能技术有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20251208
Claims (9)
- 1. A training method for a certificate image recognition model, the training method comprising: recognizing a plurality of certificate images using a closed-source large model to obtain a recognition result for each certificate image, wherein each recognition result comprises a plurality of pieces of key information; labeling each certificate image with its recognition result to obtain labeled certificate images, wherein each labeled certificate image carries a plurality of labels associated with the key information; and training an open-source optical character recognition (OCR) initial model using the labeled certificate images as a sample set to obtain a final certificate image recognition model, wherein the training comprises: training the initial model with a reinforcement learning algorithm, using as the sample set the labeled certificate images that carry labels associated with fields of global attributes of the certificate image and the values of those fields, so that the trained model parameters learn the overall features of the certificate image; when the recognition accuracy for the global attributes of the certificate image is detected to be greater than or equal to a preset global attribute recognition accuracy, saving the trained model parameters to obtain a certificate image global attribute recognition model; training the certificate image global attribute recognition model with the reinforcement learning algorithm, using as the sample set the labeled certificate images that carry labels associated with fields of local attributes of the certificate image and the values of those fields, so that the trained model parameters learn the local features of the certificate image; and when the recognition accuracy for the local attributes of the certificate image is detected to be greater than or equal to a preset local attribute recognition accuracy, saving the trained model parameters to obtain the final certificate image recognition model.
- 2. The training method of claim 1, wherein the key information comprises a field representing a global attribute of the certificate image and the value of that field; and wherein labeling each certificate image with its recognition result to obtain labeled certificate images, each labeled certificate image carrying labels associated with the key information, comprises: labeling the certificate image with the field representing the global attribute of the certificate image and its value, to obtain a labeled certificate image carrying a label associated with the field of the global attribute and its value.
- 3. The training method of claim 2, wherein the key information further comprises a field representing a local attribute of the certificate image and the value of that field; and wherein labeling each certificate image with its recognition result to obtain labeled certificate images, each labeled certificate image carrying labels associated with the key information, comprises: labeling the certificate image with the field representing the local attribute of the certificate image and its value, to obtain a labeled certificate image carrying a label associated with the field of the local attribute and its value.
- 4. The training method of claim 1, wherein the reinforcement learning algorithm uses a result reward value and a format reward value; and wherein training the initial model with the reinforcement learning algorithm, or training the certificate image global attribute recognition model with the reinforcement learning algorithm, comprises: determining a total reward value for each sample of each training round according to the result reward value and the format reward value; determining an advantage value for each sample of each training round according to the total reward value of the sample and the total number of samples in that training round; and determining a loss function according to the determined advantage values and group relative policy optimization, and training the model parameters of the initial model or of the certificate image global attribute recognition model according to the loss function.
- 5. The training method of claim 4, wherein the training method further comprises: updating a format reward value weight and a result reward value weight according to the fields and values predicted by the model; increasing the format reward value weight when the fields predicted by the model include a non-predefined field or lack a predefined field; and reducing the result reward value weight according to the increased format reward value weight.
- 6. The training method of any one of claims 1 to 3, wherein the training method further comprises: performing a data augmentation operation on the certificate images using the closed-source large model, wherein the data augmentation operation comprises at least one of image cropping, image rotation, image flipping, image illumination preprocessing, and modification of text information in the image.
- 7. The training method of claim 1, wherein the training method further comprises: recognizing a certificate image using a plurality of closed-source large models to obtain a plurality of recognition results for the certificate image; when a recognition result contains a line feed character, converting the line feed character into an explicitly displayed line feed character; when a space character is detected adjacent to a line feed character in a recognition result, deleting the space character; voting on the plurality of recognition results to obtain a voted recognition result for the certificate image; labeling the certificate image with the voted recognition result to obtain a labeled certificate image; and training the open-source optical character recognition (OCR) initial model using the labeled certificate images as a sample set to obtain the final certificate image recognition model.
- 8. An electronic device, comprising: one or more processors; and a memory having one or more computer programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the training method of any one of claims 1 to 7.
- 9. A computer readable medium on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the training method of any of claims 1 to 7.
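The reward and advantage computation recited in claims 4 and 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: this section does not give the formulas, so the weighted-sum reward, the constraint that the two weights sum to 1, the `step` and `cap` values, the per-sample weight update, and the mean/standard-deviation normalization commonly used in group relative policy optimization are all assumptions. The function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def adjust_weights(predicted_fields, predefined_fields, w_format=0.5, step=0.1, cap=0.9):
    """Return (w_result, w_format) after inspecting the predicted field names.

    Assumption: the two weights sum to 1, so raising one lowers the other.
    """
    predicted, predefined = set(predicted_fields), set(predefined_fields)
    has_unknown_field = bool(predicted - predefined)  # a non-predefined field appears
    has_missing_field = bool(predefined - predicted)  # a predefined field is absent
    if has_unknown_field or has_missing_field:
        w_format = min(cap, w_format + step)
    return 1.0 - w_format, w_format


def total_reward(result_reward, format_reward, w_result, w_format):
    """Weighted sum of the result reward and the format reward (assumed form)."""
    return w_result * result_reward + w_format * format_reward


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each total reward against the group mean and standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of three model outputs for one certificate image.
predefined = ["id_number", "name", "address"]
outputs = [
    {"fields": ["id_number", "name", "address"], "result": 1.0, "format": 1.0},
    {"fields": ["id_number", "name"], "result": 0.5, "format": 0.0},
    {"fields": ["id_number", "name", "nickname"], "result": 0.5, "format": 0.0},
]
rewards = []
for out in outputs:
    w_result, w_format = adjust_weights(out["fields"], predefined)
    rewards.append(total_reward(out["result"], out["format"], w_result, w_format))
advantages = group_relative_advantages(rewards)
```

In this sketch the well-formed output receives a positive advantage and the two malformed outputs receive equal negative advantages; a policy-gradient loss would then weight each output's log-probability by its advantage.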
Description
Training method for certificate image recognition model, electronic equipment and medium
Technical Field
The present invention relates to the technical field of certificate image recognition, and in particular to a training method for a certificate image recognition model, an electronic device, and a computer-readable medium.
Background
Certificate image recognition is a technique for automatically extracting structured data from certificates. Closed-source large models and lightweight open-source OCR (Optical Character Recognition) small models are the two mainstream approaches to implementing this technique. However, there are many types of certificates (such as identity cards, household registers, passports, driver's licenses, social security cards, and business licenses), the same type of certificate is written in different languages in different countries (such as Thailand, Vietnam, Indonesia, and India), and the formats and information items of certificates also differ. For such complex certificates, the recognition accuracy of a closed-source large model is generally better than that of a small model, but its recognition result is easily affected by which closed-source large model is selected and is therefore unstable, and closed-source large models commonly suffer from slow recognition (usually more than 10 s). An open-source OCR small model has a clear advantage in recognition speed, but its recognition accuracy is too low (only about 70%). Taking Vietnamese identity cards as an example, the recognition results of an open-source OCR small model frequently exhibit problems such as truncated residential addresses, text hallucination, and omitted or incorrectly segmented diacritics. Existing schemes therefore struggle to meet practical requirements in terms of recognition accuracy, recognition speed, and recognition stability.
Disclosure of Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, the invention provides a training method for a certificate image recognition model, an electronic device for executing the training method, and a computer-readable medium, which support image recognition and attribute parsing for multiple languages and multiple certificate types, and offer high recognition speed, high accuracy, good stability, and a lightweight model. To achieve the above object, as a first aspect of the present invention, there is provided a training method for a certificate image recognition model, wherein the training method comprises: recognizing a plurality of certificate images using a closed-source large model to obtain a recognition result for each certificate image, wherein each recognition result comprises a plurality of pieces of key information; labeling each certificate image with its recognition result to obtain labeled certificate images, wherein each labeled certificate image carries a plurality of labels associated with the key information; and training an open-source optical character recognition (OCR) initial model using the labeled certificate images as a sample set to obtain a final certificate image recognition model.
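The labeling step of the first aspect, combined with the multi-model cleaning and voting recited in claim 7, might be sketched as follows. This is a hedged illustration rather than the disclosed implementation: the per-field majority vote, the reading of "displayed line feed character" as the two-character escape `\n`, the function names, and the sample field values are all assumptions, and the closed-source model calls are replaced by hard-coded, made-up outputs.

```python
import re
from collections import Counter


def normalize(value: str) -> str:
    """Drop spaces next to line feeds, then make each line feed explicit."""
    value = re.sub(r"[ \t]*\n[ \t]*", "\n", value)  # delete adjacent spaces/tabs
    return value.replace("\n", "\\n")  # real line feed -> visible "\n" (assumed)


def vote(results: list[dict[str, str]]) -> dict[str, str]:
    """Per-field majority vote over the normalized outputs of several models."""
    voted = {}
    for field in sorted({name for result in results for name in result}):
        counts = Counter(normalize(r[field]) for r in results if field in r)
        voted[field] = counts.most_common(1)[0][0]
    return voted


def label_image(image_path: str, results: list[dict[str, str]]) -> dict:
    """Pair an image with its voted key information to form one training sample."""
    return {"image": image_path, "labels": vote(results)}


# Made-up outputs standing in for three closed-source large models.
model_outputs = [
    {"name": "NGUYEN VAN A", "address": "12 Le Loi \nHa Noi"},
    {"name": "NGUYEN VAN A", "address": "12 Le Loi\n Ha Noi"},
    {"name": "NGUYEN VAN B", "address": "12 Le Loi\nHa Noi"},
]
sample = label_image("cards/0001.jpg", model_outputs)
```

Normalizing before voting lets outputs that differ only in whitespace around a line break count as the same answer, which is one plausible reason for performing the cleanup ahead of the vote.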
Optionally, the key information comprises a field representing a global attribute of the certificate image and the value of that field; and labeling each certificate image with its recognition result to obtain labeled certificate images, each labeled certificate image carrying labels associated with the key information, comprises: labeling the certificate image with the field representing the global attribute of the certificate image and its value, to obtain a labeled certificate image carrying a label associated with the field of the global attribute and its value.
Optionally, the key information further comprises a field representing a local attribute of the certificate image and the value of that field; and labeling each certificate image with its recognition result to obtain labeled certificate images, each labeled certificate image carrying labels associated with the key information, comprises: labeling the certificate image with the field representing the local attribute of the certificate image and its value, to obtain a labeled certificate image carrying a label associated with the field of the local attribute and its value.
Optionally, training the open-source optical character recognition (OCR) initial model using the labeled certificate images as a sample set to obtain a final certificate image recognition model comprises: Tra