CN-115294581-B - Error character recognition method and device, electronic equipment and storage medium
Abstract
The disclosure provides a method and an apparatus for recognizing erroneous characters, an electronic device, and a storage medium, and belongs to the field of image processing. The method comprises: obtaining a text image to be recognized; processing the text image to obtain a character recognition result, wherein the character recognition result comprises at least one character and the recognition probability of that character; when it is determined, based on the recognition probability, that a target error character exists in the text image, obtaining context information of the target error character in the text image; determining position information of the target character in a preset correct character dictionary, the target character being the character to which the target error character corresponds in the character recognition result; and processing the context information and the position information to obtain the error category of the target error character. The disclosure thereby makes it possible to identify the error category of an erroneous character.
Inventors
- QIN YONG
Assignees
- 深圳市星桐科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20220801
Claims (7)
- 1. A method for recognizing an erroneous character, the method comprising: acquiring a text image to be recognized; processing the text image to be recognized to obtain a character recognition result of the text image to be recognized, wherein the character recognition result comprises at least one character and the recognition probability of the character; when it is determined, based on the recognition probability, that a target error character exists in the text image to be recognized, acquiring context information of the target error character in the text image to be recognized; determining position information of the target character in a correct character dictionary based on the target character of the target error character in the character recognition result and the preset correct character dictionary; adding the context information and the position information point by point to obtain first feature information of the target error character; processing the first feature information to obtain a second feature vector of the target error character, wherein the second feature vector carries classification feature information of the target error character; processing the second feature vector to calculate a preset number of classification probabilities of the target error character; acquiring, according to the position information, the preset number of error categories corresponding to the target character from a preset error character dictionary, wherein the correct character dictionary stores a plurality of preset characters, the error character dictionary stores the preset number of error categories of each preset character, and the preset characters are stored in the same order in the correct character dictionary and in the error character dictionary; and determining the error category of the target error character among the preset number of error categories based on the preset number of classification probabilities.
- 2. The method according to claim 1, wherein the processing the text image to be recognized to obtain a character recognition result of the text image to be recognized includes: extracting features of the text image to be recognized to obtain a feature mapping vector of the text image to be recognized; constructing a plurality of pieces of context information based on the feature mapping vector; processing each piece of context information to obtain a character recognition probability corresponding to each piece of context information; determining a recognition result corresponding to each piece of context information based on the correct character dictionary and the character recognition probability; and obtaining at least one character and the recognition probability of the character based on the recognition result, and generating the character recognition result of the text image to be recognized.
- 3. The method according to claim 1, wherein the determining the position information of the target character in the correct character dictionary based on the target character of the target error character in the character recognition result and the preset correct character dictionary includes: acquiring, from the character recognition result, a character whose recognition probability is smaller than a probability threshold as the target character of the target error character in the character recognition result; acquiring sequence information of the target character in the correct character dictionary; and encoding the sequence information to determine the position information of the target character in the correct character dictionary.
- 4. An apparatus for recognizing an erroneous character, the apparatus comprising: an acquisition module, configured to acquire a text image to be recognized; a first recognition module, configured to process the text image to be recognized to obtain a character recognition result of the text image to be recognized, wherein the character recognition result comprises at least one character and the recognition probability of the character; and a second recognition module, configured to: acquire context information of a target error character in the text image to be recognized when it is determined, based on the recognition probability, that the target error character exists in the text image to be recognized; determine position information of the target character in a correct character dictionary based on the target character of the target error character in the character recognition result and the preset correct character dictionary; add the context information and the position information point by point to obtain first feature information of the target error character; process the first feature information to obtain a second feature vector of the target error character, wherein the second feature vector carries classification feature information of the target error character; process the second feature vector to calculate a preset number of classification probabilities of the target error character; acquire, according to the position information, the preset number of error categories corresponding to the target character from a preset error character dictionary, wherein the correct character dictionary stores a plurality of preset characters, the error character dictionary stores the preset number of error categories of each preset character, and the preset characters are stored in the same order in the correct character dictionary and in the error character dictionary; and determine the error category of the target error character among the preset number of error categories based on the preset number of classification probabilities.
- 5. The apparatus of claim 4, wherein the second recognition module is configured to: acquire, from the character recognition result, a character whose recognition probability is smaller than a probability threshold as the target character of the target error character in the character recognition result; acquire sequence information of the target character in the correct character dictionary; and encode the sequence information to determine the position information of the target character in the correct character dictionary.
- 6. An electronic device, comprising: a processor; and a memory in which a program is stored, wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to any of claims 1-3.
- 7. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-3.
Description
Error character recognition method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of image processing, and in particular, to a method and apparatus for recognizing an erroneous character, an electronic device, and a storage medium.

Background

In educational scenarios, such as word dictation during homework correction, it is important to judge which character a student has miswritten and to indicate where the error is. By the number of text lines in the input image, text recognition can be divided into single-line recognition and multi-line recognition; in terms of labeling, single-line, sequence-based methods are the mainstream. This has produced a text recognition paradigm in which a rectification stage, a feature extraction stage, and a recognition decoding stage are combined in sequence. Most methods follow this paradigm and make targeted improvements for problems such as curved text and blurred text. For Chinese recognition, however, few methods specifically recognize wrongly written characters, and most of those perform only binary classification: they decide whether a character is wrongly written but cannot identify what the specific error is.

Disclosure of Invention

In view of this, the embodiments of the present disclosure provide a method, an apparatus, an electronic device, and a storage medium for recognizing an erroneous character, so as to solve the problem that the error category of an erroneous character cannot be identified.
According to an aspect of the present disclosure, there is provided a method for recognizing an erroneous character, the method including: acquiring a text image to be recognized; processing the text image to be recognized to obtain a character recognition result of the text image to be recognized, wherein the character recognition result comprises at least one character and the recognition probability of the character; when it is determined, based on the recognition probability, that a target error character exists in the text image to be recognized, acquiring context information of the target error character in the text image to be recognized; determining position information of the target character in a correct character dictionary based on the target character of the target error character in the character recognition result and the preset correct character dictionary; and processing the context information and the position information to obtain the error category of the target error character.
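The detection and dictionary-lookup steps above can be illustrated with a minimal Python sketch. It assumes the character recognition result is available as a list of (character, probability) pairs and that a character counts as a target error character when its probability falls below a threshold, as in claim 3. The dictionary contents, the threshold value 0.8, and the name `find_suspect_characters` are hypothetical and chosen only for illustration; the encoding of the position is not shown because the source does not specify it.

```python
from typing import List, Tuple

# Hypothetical toy dictionary; the patent does not list its contents.
CORRECT_DICT: List[str] = ["天", "地", "人", "和"]


def find_suspect_characters(
    recognized: List[Tuple[str, float]], threshold: float = 0.8
) -> List[Tuple[int, str, int]]:
    """Return (index in text, character, position in CORRECT_DICT) for each
    character whose recognition probability is below the threshold."""
    suspects = []
    for idx, (char, prob) in enumerate(recognized):
        # A low probability marks the character as a target error character.
        if prob < threshold and char in CORRECT_DICT:
            # The position in the correct character dictionary is unique.
            suspects.append((idx, char, CORRECT_DICT.index(char)))
    return suspects


result = [("天", 0.97), ("地", 0.42), ("人", 0.91)]
print(find_suspect_characters(result))  # [(1, '地', 1)]
```

Only the second character falls below the assumed threshold, so it alone is passed on, together with its dictionary position, to the error classification step.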
According to another aspect of the present disclosure, there is provided a text recognition apparatus, including: an acquisition module, configured to acquire a text image to be recognized; a first recognition module, configured to process the text image to be recognized to obtain a character recognition result of the text image to be recognized, wherein the character recognition result comprises at least one character and the recognition probability of the character; and a second recognition module, configured to acquire context information of a target error character in the text image to be recognized when it is determined, based on the recognition probability, that the target error character exists in the text image to be recognized, to determine position information of the target character in a correct character dictionary based on the target character of the target error character in the character recognition result and the preset correct character dictionary, and to process the context information and the position information to obtain the error category of the target error character.

According to another aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory in which a program is stored, wherein the program includes instructions that, when executed by the processor, cause the processor to perform the above method for recognizing an erroneous character.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the above method for recognizing an erroneous character.
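The processing performed by the second recognition module, as detailed in claim 1 (point-wise addition, classification, and lookup in the error character dictionary), might be sketched as follows. This is an illustration under stated assumptions, not the patented network: a single linear layer stands in for whatever model produces the second feature vector and the classification probabilities, and the error category names, dictionary contents, weights, and the name `classify_error` are all hypothetical.

```python
import math
from typing import List

# Hypothetical dictionaries. ERROR_DICT is stored in the same order as
# CORRECT_DICT, and every entry holds the same preset number of categories.
CORRECT_DICT: List[str] = ["天", "地"]
ERROR_DICT: List[List[str]] = [
    ["missing stroke", "extra stroke", "wrong structure"],
    ["missing stroke", "wrong radical", "wrong structure"],
]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_error(
    context_vec: List[float],
    position_vec: List[float],
    weights: List[List[float]],
    position: int,
) -> str:
    """Pick the error category of one target error character."""
    # Point-wise addition of context and position information.
    fused = [c + p for c, p in zip(context_vec, position_vec)]
    # Assumption: one linear layer replaces the real classification network.
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    probs = softmax(scores)
    # The shared storage order lets the position index ERROR_DICT directly.
    categories = ERROR_DICT[position]
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best]


weights = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
print(classify_error([0.2, 0.1], [0.0, 1.0], weights, position=1))
# wrong radical
```

Because the classifier only ever outputs the preset number of probabilities, the position index is what ties those probabilities to the categories of one specific character.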
In the present disclosure, the error category of the target error character is identified from the position information of the target character in the correct character dictionary together with the context information of the target error character in the text image to be recognized. Because the position of the target character in the correct character dictionary is unique, the position information narrows the classification to the error categories of that target character. This reduces the number of candidate classes, effectively shrinks the solution space, and improves the efficiency of erroneous character recognition while still identifying the error category.

Drawings

Further details, features and advantages of the present disclosure are disclosed in th