CN-122023190-A - Image enhancement method, device, equipment and medium for OCR system
Abstract
The invention provides an image enhancement method, device, equipment and medium for an OCR system, and relates to the technical field of optical character recognition. In the prior art, images to be recognized are often degraded by blur, noise, or uneven illumination, and traditional image enhancement methods repair the degraded image directly, which tends to introduce artifacts or works only for specific degradation types and therefore adapts poorly to complex, variable real-world scenes. To address these problems, the invention extracts degradation features from the image to be recognized and generates degradation description information, performs enhancement processing on preset text templates based on the degradation description information to generate a target text template set matched with the visual characteristics of the image to be recognized, and finally achieves high-precision character recognition by matching the image to be recognized with the target text template set.
Inventors
- YAN MING
- WANG RUI
Assignees
- 天津新康医疗健康新技术科技发展有限公司
- 天津心悦医学影像诊断有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260415
Claims (10)
- 1. An image enhancement method for an OCR system, comprising: acquiring an image to be recognized; when the sharpness of the image to be recognized does not meet a preset condition, extracting degradation features from the image to be recognized and generating degradation description information based on the degradation features; performing enhancement processing on a preset text template based on the degradation description information to obtain a target text template set matched with the visual characteristics of the image to be recognized; and matching the image to be recognized with the target text template set to complete optical character recognition.
- 2. The image enhancement method for an OCR system according to claim 1, further comprising, when the sharpness of the image to be recognized meets the preset condition: preprocessing the image to be recognized to obtain a processed image to be recognized; wherein the preprocessing comprises cropping invalid blank regions, skew correction, contrast enhancement, and resolution adjustment of the image to be recognized.
- 3. The image enhancement method for an OCR system according to claim 1, wherein determining that the sharpness of the image to be recognized does not meet the preset condition comprises: calculating a global or local sharpness index of the image to be recognized, and determining that the sharpness of the image to be recognized does not meet the preset condition when the sharpness index is lower than a preset index threshold; and/or scoring the image to be recognized with a pre-trained image quality evaluation model, and determining that the sharpness of the image to be recognized does not meet the preset condition when the score is lower than a preset scoring threshold; and/or performing preliminary character recognition on the image to be recognized to obtain a recognition confidence, and determining that the sharpness of the image to be recognized does not meet the preset condition when the recognition confidence is lower than a preset confidence threshold.
- 4. The image enhancement method for an OCR system according to claim 1, wherein the degradation description information includes a blur parameter, a noise parameter, and an illumination parameter, and the enhancement processing of the preset text template based on the degradation description information includes at least one of: performing blur processing on the preset text template based on the blur parameter; adding noise to the preset text template based on the noise parameter; and performing illumination simulation processing on the preset text template based on the illumination parameter.
- 5. The image enhancement method for an OCR system according to claim 1, wherein matching the image to be recognized with the target text template set comprises: segmenting a text line in the image to be recognized to obtain a plurality of character fragments; constructing a multi-dimensional constraint field, wherein the multi-dimensional constraint field comprises a geometric constraint field for constraining the spatial positions of characters and a semantic constraint field for constraining the plausibility of character sequences; performing position optimization and combination on the character fragments under the combined action of the multi-dimensional constraint field; and, during the position optimization and combination, matching the character fragments, or character candidate regions formed by combining the fragments, with the target text template set, and feeding the character-category confidence obtained by the matching back to the multi-dimensional constraint field so as to cooperatively optimize the position optimization and combination.
- 6. The image enhancement method for an OCR system according to claim 5, wherein matching the character fragments, or character candidate regions formed by combining the fragments, with the target text template set comprises: during the iterative position optimization, querying the target text template set, based on the current geometric state of a character fragment, for the character category matching the character fragment and the corresponding confidence; and calculating, based on the semantic constraint field and the confidence, the probability that the current character fragment sequence forms semantically plausible text.
- 7. The image enhancement method for an OCR system according to claim 5, wherein the geometric constraint field is constructed based on at least one typesetting prior among character spacing, baseline alignment, and character size consistency; and the semantic constraint field is constructed based on at least one of n-gram statistics, word frequency, and a neural network language model.
- 8. An image enhancement device for an OCR system, wherein the device employs the image enhancement method for an OCR system according to any one of claims 1 to 7 and comprises the following modules: an acquisition module for acquiring the image to be recognized; an extraction module, connected with the acquisition module, for extracting degradation features from the image to be recognized and generating degradation description information based on the degradation features when the sharpness of the image to be recognized does not meet the preset condition; a processing module, connected with the extraction module, for performing enhancement processing on a preset text template based on the degradation description information to obtain a target text template set matched with the visual characteristics of the image to be recognized; and a matching module, connected with the processing module and the acquisition module, for matching the image to be recognized with the target text template set to complete optical character recognition.
- 9. An electronic device comprising a processor and a memory storing computer program instructions; wherein the processor, when executing the computer program instructions, implements the image enhancement method for an OCR system according to any one of claims 1 to 7.
- 10. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the image enhancement method for an OCR system according to any one of claims 1 to 7.
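The flow of claims 1, 3, and 4 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the Laplacian-variance sharpness index, the box blur, the Gaussian noise model, the global illumination gain, the normalized cross-correlation matcher, and all function and parameter names are assumptions chosen for illustration. The degradation parameters are supplied by hand rather than estimated from the image, and the preprocessing of claim 2 is omitted.

```python
import random
from typing import Dict, List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def laplacian_variance(img: Image) -> float:
    """Assumed global sharpness index: variance of a 4-neighbour Laplacian."""
    h, w = len(img), len(img[0])
    vals = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def box_blur(img: Image, radius: int) -> Image:
    """Mean filter of size (2r+1)x(2r+1); windows are clipped at the borders."""
    if radius <= 0:
        return [row[:] for row in img]
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def degrade_template(tpl: Image, blur_radius: int, noise_sigma: float,
                     gain: float, rng: random.Random) -> Image:
    """Apply blur, a global illumination gain, and Gaussian noise to a clean template."""
    blurred = box_blur(tpl, blur_radius)
    return [
        [min(1.0, max(0.0, gain * p + rng.gauss(0.0, noise_sigma))) for p in row]
        for row in blurred
    ]


def ncc(a: Image, b: Image) -> float:
    """Normalized cross-correlation between two images of the same size."""
    fa = [p for row in a for p in row]
    fb = [p for row in b for p in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = (sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb)) ** 0.5
    return num / den if den else 0.0


def recognize(img: Image, templates: Dict[str, Image], params: dict,
              sharpness_threshold: float, seed: int = 0) -> str:
    """Degrade the templates only when the image is judged unsharp, then match."""
    if laplacian_variance(img) < sharpness_threshold:
        rng = random.Random(seed)
        templates = {c: degrade_template(t, rng=rng, **params)
                     for c, t in templates.items()}
    return max(templates, key=lambda c: ncc(img, templates[c]))


# Toy 7x7 templates: a vertical stroke ("I") and a horizontal stroke ("-").
SIZE = 7
templates = {
    "I": [[1.0 if x == 3 else 0.0 for x in range(SIZE)] for y in range(SIZE)],
    "-": [[1.0 if y == 3 else 0.0 for x in range(SIZE)] for y in range(SIZE)],
}
params = {"blur_radius": 1, "noise_sigma": 0.02, "gain": 0.7}
# Simulate a degraded capture of "I" with an independent noise seed.
observed = degrade_template(templates["I"], rng=random.Random(42), **params)
print(recognize(observed, templates, params, sharpness_threshold=0.5))  # prints "I"
```

In this sketch the observed pixels are never modified; only the clean templates are pushed toward the degraded appearance, which is the design choice the claims describe. In the toy case the blurred, dimmed stroke correlates more strongly with the degraded "I" template than with the clean one.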
Description
Image enhancement method, device, equipment and medium for OCR system
Technical Field
The present invention relates to the field of optical character recognition technology, and in particular to an image enhancement method, apparatus, device, and medium for an OCR system.
Background
Optical character recognition (OCR) technology serves as a core bridge between physical text and digital information and is widely used in fields such as document digitization, intelligent office, autonomous driving, and mobile payment verification. In practical applications, the recognition accuracy of an OCR system depends heavily on the quality of the image to be recognized. That quality is limited by the shooting environment (for example low illumination, backlight, and shake), by hardware (for example insufficient camera resolution and a dirty lens), and by the state of the text carrier (for example aged paper documents, blurred printing, and stains or occlusion), so the image to be recognized is often degraded to varying degrees. These degradation features directly damage the original structure and visual recognizability of the characters, making it difficult for a traditional OCR system to extract character features accurately and leading to problems such as misrecognized and missed characters.
At present, a common countermeasure is to preprocess the image to be recognized with image enhancement techniques to improve image quality and thereby facilitate subsequent character recognition. However, when the prior art processes a low-quality text image by directly enhancing the image to be recognized, this approach has two key drawbacks. On the one hand, directly enhancing a severely degraded image can cause irreversible loss of the original text features. On the other hand, the enhancement process has difficulty maintaining visual consistency among different characters and easily causes confusion between similar characters. These drawbacks seriously restrict the recognition accuracy and robustness of OCR systems in complex real-world scenes. In view of the above, there is a need in the art for improvement.
Disclosure of Invention
In the prior art, images to be recognized are often degraded by blur, noise, or uneven illumination, and traditional image enhancement methods repair the degraded image directly, which tends to introduce artifacts or works only for specific degradation types and therefore adapts poorly to complex, variable real-world scenes. To address these problems, the invention provides an image enhancement method for an OCR system, comprising the following steps: acquiring an image to be recognized; when the sharpness of the image to be recognized does not meet a preset condition, extracting degradation features from the image to be recognized and generating degradation description information based on the degradation features; performing enhancement processing on a preset text template based on the degradation description information to obtain a target text template set matched with the visual characteristics of the image to be recognized; and matching the image to be recognized with the target text template set to complete optical character recognition.
Further, when the sharpness of the image to be recognized meets the preset condition, the method further includes: preprocessing the image to be recognized to obtain a processed image to be recognized; the preprocessing comprises cropping invalid blank regions, skew correction, contrast enhancement, and resolution adjustment of the image to be recognized.
Further, determining that the sharpness of the image to be recognized does not meet the preset condition includes: calculating a global or local sharpness index of the image to be recognized, and determining that the sharpness of the image to be recognized does not meet the preset condition when the sharpness index is lower than a preset index threshold; and/or scoring the image to be recognized with a pre-trained image quality evaluation model, and determining that the sharpness of the image to be recognized does not meet the preset condition when the score is lower than a preset scoring threshold; and/or performing preliminary character recognition on the image to be recognized to obtain a recognition confidence, and determining that the sharpness of the image to be recognized does not meet the preset condition when the recognition confidence is lower than a preset confidence threshold.
Further, the degradation description information includes a blur parameter, a noise parameter, and an illumination parameter, and performing enhancement processing on the preset text template based on the degradation description information includes at least