EP-3890295-B1 - INFORMATION PROCESSING APPARATUS FOR OBTAINING CHARACTER STRING


Inventors

SOGA, MASAYA

Dates

Publication Date: 20260506
Application Date: 20210323

Claims (10)

An information processing apparatus comprising: obtaining means for obtaining a first scan image and a second scan image; character recognition means for obtaining a character recognition result by performing character recognition processing on a text region in the first scan image; learning means for, if a correction is made to at least a part of a character string of the obtained character recognition result for setting attribute information about the first scan image based on the obtained character recognition result, learning correction content of the correction; and similar form determination means configured to analyze text regions included in the second scan image to obtain layout information, and to determine whether the second scan image is a similar form to the first scan image based on the layout information, characterized in that: the learning means is configured to: determine (S702) whether the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in a predetermined term dictionary, learn (S703), as correction content, a regular expression based on the deleted part of the character string in a case where it is determined that the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in the predetermined term dictionary, and otherwise learn (S704), as correction content, the character string before replacement and the replaced character string, wherein the character recognition means is configured to, if the character recognition processing is performed on a text region in the second scan image and it is determined that the second scan image is a similar form to the first scan image by the similar form determination means, correct a character recognition result of the text region in the second scan image based on the regular expression, character string before replacement, and replaced character string learned as correction content by the learning means, and output the corrected character recognition result.
The information processing apparatus according to claim 1, wherein the character recognition means is configured to, if the character string of the character recognition result of the text region in the second scan image matches the regular expression learned by the learning means, correct the character recognition result of the text region in the second scan image by deleting a part of the character string of the character recognition result of the text region in the second scan image, the part matching the regular expression, and output the corrected character recognition result.
The information processing apparatus according to claim 1 or claim 2, wherein the character recognition means is configured to, if the character string of the character recognition result of the text region in the second scan image matches the character string before replacement learned as correction content by the learning means, correct the character recognition result by replacing the character string of the character recognition result of the text region in the second scan image with the replaced character string learned as correction content by the learning means, and output the corrected character recognition result.
An information processing method to be performed by an information processing apparatus, the information processing method comprising: obtaining a first scan image and a second scan image; obtaining a character recognition result by performing character recognition processing on a text region in the first scan image; if a correction is made to at least a part of a character string of the obtained character recognition result for setting attribute information about the first scan image based on the obtained character recognition result, learning correction content of the correction; and analyzing text regions in the second scan image to obtain layout information, and determining whether the second scan image is a similar form to the first scan image based on the layout information, characterized in that: the learning includes: determining (S702) whether the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in a predetermined term dictionary, and learning (S703), as correction content, a regular expression based on the deleted part of the character string in a case where it is determined that the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in the predetermined term dictionary, and otherwise learning (S704), as correction content, the character string before replacement and the replaced character string; and, if the character recognition processing is performed on a text region in the second scan image and it is determined that the second scan image is a similar form to the determined first scan image, a character recognition result of the text region in the second scan image is corrected based on the regular expression, character string before replacement, and replaced character string learned as correction content.
The information processing method according to claim 4, wherein, if the character string of the character recognition result of the text region in the second scan image matches the learned regular expression, the character recognition result of the text region in the second scan image is corrected by deleting a part of the character string of the character recognition result of the text region in the second scan image, the part matching the regular expression.
The information processing method according to claim 4 or claim 5, wherein, if the character string of the character recognition result of the text region in the second scan image matches the character string before replacement learned as correction content, the character recognition result is corrected by replacing the character string of the character recognition result of the text region in the second scan image with the replaced character string learned as correction content.
A computer program comprising instructions that, when executed by a computer, cause the computer to perform: obtaining a first scan image and a second scan image; obtaining a character recognition result by performing character recognition processing on a text region in the first scan image; if a correction is made to at least a part of a character string of the obtained character recognition result for setting attribute information about the first scan image based on the obtained character recognition result, learning correction content of the correction; and analyzing text regions in the second scan image to obtain layout information, and determining whether the second scan image is a similar form to the first scan image based on the layout information, characterized in that: the learning includes: determining (S702) whether the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in a predetermined term dictionary, and learning (S703), as correction content, a regular expression based on the deleted part of the character string in a case where it is determined that the correction is deletion of the part of the character string of the character recognition result and the deleted part of the character string is a character string registered in the predetermined term dictionary, and otherwise learning (S704), as correction content, the character string before replacement and the replaced character string; and, if the character recognition processing is performed on a text region in the second scan image and it is determined that the second scan image is a similar form to the determined first scan image, a character recognition result of the text region in the second scan image is corrected based on the regular expression, character string before replacement, and replaced character string learned as correction content.
The computer program according to claim 7, wherein, if the character string of the character recognition result of the text region in the second scan image matches the learned regular expression, the character recognition result of the text region in the second scan image is corrected by deleting a part of the character string of the character recognition result of the text region in the second scan image, the part matching the regular expression.
The computer program according to claim 7 or claim 8, wherein, if the character string of the character recognition result of the text region in the second scan image matches the character string before replacement learned as correction content, the character recognition result is corrected by replacing the character string of the character recognition result of the text region in the second scan image with the replaced character string learned as correction content.
A computer-readable storage medium storing a computer program according to any of claims 7 to 9.
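
The learning and correction steps recited in the claims above (S702 to S704, followed by correction of a similar form) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the names `TERM_DICTIONARY`, `CorrectionRules`, `deleted_part`, `learn`, and `apply_corrections` are hypothetical, the dictionary contents are invented for the example, the regular expression is assumed to be the escaped deleted term followed by optional whitespace, and the similar form determination is reduced to a boolean supplied by the caller.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical contents; the claims only require "a predetermined term dictionary".
TERM_DICTIONARY = {"Invoice", "Quotation", "Order Form"}


@dataclass
class CorrectionRules:
    """Correction content learned from user edits on the first scan image."""
    deletion_patterns: list[re.Pattern] = field(default_factory=list)
    replacements: dict[str, str] = field(default_factory=dict)


def deleted_part(before: str, after: str) -> str | None:
    """Return the contiguous substring removed from `before` to obtain `after`,
    or None if the edit is not a pure deletion."""
    if len(after) >= len(before):
        return None
    # Length of the common prefix of the two strings.
    i = 0
    while i < len(after) and before[i] == after[i]:
        i += 1
    # What remains of `after` must be a suffix of `before`.
    tail = after[i:]
    if tail and not before.endswith(tail):
        return None
    return before[i:len(before) - len(tail)]


def learn(rules: CorrectionRules, ocr_text: str, corrected_text: str) -> None:
    """S702: classify the correction; S703: learn a regex; S704: learn a pair."""
    removed = deleted_part(ocr_text, corrected_text)
    if removed is not None and removed.strip() in TERM_DICTIONARY:
        # Assumption: the regex is the escaped term plus optional whitespace.
        pattern = re.compile(re.escape(removed.strip()) + r"\s*")
        rules.deletion_patterns.append(pattern)
    else:
        rules.replacements[ocr_text] = corrected_text


def apply_corrections(rules: CorrectionRules, ocr_text: str,
                      is_similar_form: bool) -> str:
    """Correct an OCR result from the second scan image if the form is similar."""
    if not is_similar_form:
        return ocr_text
    # Whole-string match against a learned "before replacement" string.
    if ocr_text in rules.replacements:
        return rules.replacements[ocr_text]
    # Delete every part that matches a learned regular expression.
    for pattern in rules.deletion_patterns:
        ocr_text = pattern.sub("", ocr_text)
    return ocr_text
```

Under these assumptions, after `learn(rules, "Invoice 12345", "12345")` the call `apply_corrections(rules, "Invoice 67890", True)` returns `"67890"`, while the same call with `False` leaves the string unchanged. The order in which replacement and deletion are tried is a choice made for this sketch; the claims do not prescribe one.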

Description

BACKGROUND

Field

The present invention relates to an apparatus and a method for obtaining a desired character string based on a character recognition result of a scan image. The invention also relates to a computer program comprising instructions that, when executed by a computer, cause the computer to perform a method for obtaining a desired character string based on a character recognition result of a scan image, and a computer-readable storage medium storing such a program.

Description of the Related Art

There have conventionally been systems that set a filename of a form image obtained by scanning a paper form, based on a recognition result obtained by performing character recognition processing on the form image. Japanese Patent Application Laid-Open No. S62-051866 discusses performing character recognition processing on a predetermined region in a form image and using the result of the character recognition processing as the filename of the form image. Japanese Patent Application Laid-Open No. 2007-503032 discusses imposing file naming rules when optical character recognition (OCR) is performed on an extraction region specified by a user and the OCR result is used as a filename. As the file naming rules, Japanese Patent Application Laid-Open No. 2007-503032 discusses imposing conditions on the length of the filename (maximum length and minimum length), deleting prohibited characters, and preventing reuse of the same filename. However, the technique discussed in Japanese Patent Application Laid-Open No. 2007-503032 involves setting conditions in advance, such as the characters prohibited from use in a filename. Thus, it has been difficult for the user to flexibly set the conditions by using recognition results of form images.

As further prior art there may be mentioned US2019/138592 A1, which discloses an apparatus and method for transformation of a digital scanner image using machine-learning algorithms. In that disclosure, a portable USB device is configured for connection to a scanner port, and a device processor uses OCR to generate an editable PDF file and uses a machine-learning algorithm to apply auto-corrections to the PDF file. The processor communicates with a user interface configured to display each line from the scanned digital image in line with the corresponding auto-corrected text.

There may also be mentioned US8331739 B1, which discloses a method of identifying and correcting OCR errors. Multiple OCR engines process a text image and convert it into texts, and an error probability estimator compares the outcomes of the multiple OCR engines for mismatches and determines an error probability for each of the mismatches. If the error probability of a mismatch exceeds an error probability threshold, a suspect is generated and grouped together with similar suspects in a cluster. A question for the cluster is generated and rendered to a human operator for answering, and the answer from the human operator is then applied to all suspects in the cluster to correct OCR errors in the resulting text.

SUMMARY

In accordance with a first aspect of the invention there is provided an information processing apparatus according to claim 1. In accordance with a second aspect of the invention there is provided an information processing method in accordance with claim 4. In accordance with a third aspect of the invention there is provided a computer program in accordance with claim 7. In accordance with a fourth aspect of the invention there is provided a storage medium in accordance with claim 10. Further features of the present disclosure will become apparent from the following description of exemplary embodiments with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating an overall configuration of a system.
FIG. 2 illustrates a hardware configuration example of a multifunction peripheral (MFP).
FIG. 3 illustrates a hardware configuration example of a client personal computer (PC) and an MFP cooperation service.
FIG. 4 illustrates a software configuration example of the system.
FIG. 5 is a sequence diagram illustrating a processing procedure between apparatuses.
FIGS. 6A and 6B are diagrams illustrating examples of screens displayed on the MFP or the client PC.
FIG. 7 is a flowchart illustrating details of processing for learning a correction to a character recognition result.
FIG. 8 is a diagram illustrating an example of an attribute setting screen when a character recognition result is corrected.
FIG. 9 is a flowchart illustrating details of correction processing in performing character recognition processing on a new image.
FIG. 10 is a diagram illustrating an example of the attribute setting screen.
FIG. 11 is a diagram illustrating details of data output as a character recognition result.
FIG. 12 is a diagram illustrating an example of the attribute setting screen.
FIG. 13 is a diagram illustrating details of data output as a character recognition resu