KR-20260063781-A - Improved Korean character recognition device and method, recording medium
Abstract
The present invention proposes an improved Korean character recognition device and method. The Korean character recognition device of the present invention includes a character recognition processing unit that recognizes characters in a character area image using a Vision Transformer-based algorithm designed to improve Korean character recognition performance. The character recognition processing unit may comprise an information acquisition unit that stores characters formed by combining Korean consonants and vowels in dictionary form; a node count designation unit that designates the number of nodes in the final output layer as the number of Korean characters, based on the stored dictionary; a prediction unit that predicts the location of characters in the character area image based on the number of Korean characters; and a learning unit that learns the characters in the predicted area.
Inventors
- 유준혁
- 송현석
Assignees
- 대구대학교 산학협력단
Dates
- Publication Date
- 20260507
- Application Date
- 20241031
Claims (12)
- A Korean character recognition device comprising: an input unit that receives a character area image containing Korean characters; a character recognition processing unit that recognizes characters in the character area image using a Vision Transformer-based Korean character recognition algorithm with improved Korean character recognition performance; and an output unit that outputs the characters recognized by the character recognition processing unit.
- The Korean character recognition device of claim 1, wherein the character recognition processing unit comprises: an information acquisition unit that stores characters formed by combining Korean consonants and vowels in dictionary form; a node count designation unit that designates the number of nodes in the final output layer as the number of Korean characters, based on the stored dictionary; a prediction unit that predicts the location of characters in the character area image based on the number of Korean characters; and a learning unit that learns the characters in the predicted area.
- The Korean character recognition device of claim 2, wherein the information acquisition unit uses pickle, Python's binary serialization module.
- The Korean character recognition device of claim 2, wherein the information acquisition unit stores characters formed by combining Korean consonants from 'ㄱ' to 'ㅎ' with Korean vowels from 'ㅏ' to 'ㅣ' in the form of a pickle file.
- The Korean character recognition device of claim 2, wherein the prediction unit uses only the Character A3 module among the A3 (Adaptive Addressing and Aggregation) modules of the MGP-STR (Multi-Granularity Prediction for Scene Text Recognition) model.
- The Korean character recognition device of claim 2, wherein the learning unit is trained on 1,000,000 character images artificially generated by a synthetic text image generator.
- The Korean character recognition device of claim 1, wherein the device provides recognition accuracy against Ground Truth (GT), an evaluation metric, expressed as a percentage.
- A Korean character recognition method based on a Vision Transformer (ViT), comprising: an input step of receiving a character area image containing Korean characters; an information acquisition step of storing characters formed by combining Korean consonants and vowels in dictionary form; a node count designation step of designating the number of nodes in the final output layer as the number of Korean characters, based on the stored dictionary; a prediction step of predicting the location of characters in the character area image based on the number of Korean characters; and a recognition step of recognizing the Korean characters by learning the characters in the predicted area.
- The Korean character recognition method of claim 8, further comprising a step of providing the recognition accuracy of the Korean characters using Ground Truth (GT) as an evaluation metric.
- The Korean character recognition method of claim 8, wherein the information acquisition step stores characters formed by combining Korean consonants from 'ㄱ' to 'ㅎ' with Korean vowels from 'ㅏ' to 'ㅣ' in the form of a pickle file.
- The Korean character recognition method of claim 8, wherein the prediction step uses only the Character A3 module among the A3 (Adaptive Addressing and Aggregation) modules of the MGP-STR (Multi-Granularity Prediction for Scene Text Recognition) model.
- A program recorded on a computer-readable recording medium that, when executed by a processor, implements the Korean character recognition method of any one of claims 8 to 11.
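Claims 2 through 4 describe storing composable Hangul characters in a pickle dictionary and sizing the final output layer by the number of dictionary entries. The following is a minimal, hypothetical sketch of that bookkeeping, not the patented implementation: it enumerates the full Unicode block of 11,172 composable syllables (of which the KS X 1001 complete set covers 2,350), and all names are illustrative.

```python
import pickle

# Unicode composes each Hangul syllable from an initial consonant (19 choices),
# a medial vowel (21 choices), and an optional final consonant (28 slots,
# including "none"), starting at code point U+AC00 ('가').
SYLLABLE_BASE = 0xAC00
NUM_INITIALS, NUM_MEDIALS, NUM_FINALS = 19, 21, 28


def build_hangul_dictionary():
    """Map each composable Hangul syllable to a class index."""
    return {chr(SYLLABLE_BASE + i): i
            for i in range(NUM_INITIALS * NUM_MEDIALS * NUM_FINALS)}


hangul_dict = build_hangul_dictionary()

# Serialize the dictionary with pickle, as in claims 3-4; its entry count
# then fixes the node count of the final output layer, as in claim 2.
blob = pickle.dumps(hangul_dict)
num_output_nodes = len(pickle.loads(blob))  # 19 * 21 * 28 = 11,172
```

In practice the dictionary could be restricted to the 2,350 KS X 1001 complete-type syllables; the output-layer width would then shrink accordingly.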
Description
Improved Korean character recognition device and method, and recording medium

The present invention relates to a device and method for recognizing Korean characters with improved recognition capability and performance over existing approaches.

Recognizing specific characters in images or video is an actively researched topic in the fields of Optical Character Recognition (OCR) and Scene Text Recognition (STR). Unlike OCR, which recognizes structured text or characters in document images, scene text recognition must handle characters in varied scenes, such as unstructured text or natural images. These scene text recognition technologies are Vision Transformer-based deep learning models that have primarily targeted recognition of English letters and digits.

Consequently, scene text recognition is not yet effective for Hangul. Hangul has a more complex visual character structure than English, and the number of character classes to distinguish is far larger: under the KS X 1001 complete-type standard, Hangul comprises 2,350 characters, whereas English has only 52 letters counting both uppercase and lowercase. Recognizing Hangul therefore requires a more sophisticated character recognition model and a larger amount of training data. For this reason, as is well known, while scene text recognition of Hangul is possible, its accuracy has lagged behind that of English letters and digits. Nevertheless, character recognition for varied scenarios, such as unstructured text and natural images, remains essential.
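The description above motivates a prediction head that addresses far more classes for Hangul than for English. As a rough, hypothetical illustration of the addressing-and-aggregation idea behind an A3-style module (not the actual MGP-STR code), learned per-position queries can attend over ViT patch tokens and pool them into one feature per predicted character slot; all shapes and names below are assumptions, and random arrays stand in for learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

num_tokens, dim = 196, 64       # e.g. 14x14 ViT patch tokens (assumed sizes)
max_chars = 25                  # maximum characters predicted per image
num_classes = 11172             # composable Hangul syllables

tokens = rng.standard_normal((num_tokens, dim))       # ViT encoder output
queries = rng.standard_normal((max_chars, dim))       # learned addressing queries
classifier = rng.standard_normal((dim, num_classes))  # final output layer


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


# Addressing: one attention distribution over patch tokens per character slot.
attn = softmax(queries @ tokens.T / np.sqrt(dim))   # (max_chars, num_tokens)
# Aggregation: pool tokens into one feature vector per character slot.
char_feats = attn @ tokens                          # (max_chars, dim)
# Per-slot logits over the Hangul dictionary.
logits = char_feats @ classifier                    # (max_chars, num_classes)
```

The point of the sketch is the class count: the same head over English letters would need only 52 output columns, while the Hangul dictionary forces 2,350 (KS X 1001) or up to 11,172 (full Unicode composition).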
For instance, illegal activities and security threats within South Korean territorial waters, including smuggling vessels, intrusions by enemy ships, and defecting North Korean fishing boats, have been on a continuous upward trend. Addressing these issues requires accurately and rapidly recognizing text in images such as those of vessels.

FIG. 1 is a configuration diagram illustrating a Korean character recognition device according to an embodiment of the present invention. FIG. 2 is a configuration diagram illustrating the character recognition processing unit of FIG. 1. FIG. 3 is a configuration diagram showing a character recognition model of a Korean character recognition device according to an embodiment of the present invention. FIG. 4 is a flowchart illustrating a Korean character recognition method according to an embodiment of the present invention. FIG. 5 is an example diagram showing a Korean character recognition result according to an embodiment of the present invention.

The present invention is capable of various modifications and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. However, this is not intended to limit the invention to the specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and scope. In describing the invention, detailed descriptions of related prior art are omitted where they might obscure the essence of the invention. Terms such as "first" and "second" may be used to describe various components, but the components are not limited by these terms; the terms serve only to distinguish one component from another. The terms used in this invention merely describe specific embodiments and are not intended to limit the invention.
A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "comprising" or "having" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should not be understood as precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Expressions such as "unit" or "part" used in this invention mean that the corresponding component may be a device capable of performing a specific function, software capable of performing a specific function, or a combination of such a device and software; it is not necessarily limited to the expressed function. The foregoing is provided merely to aid a more comprehensive understanding of the invention, and a person of ordinary skill in the art to which this invention pertains can derive various modifications and variations from this description. Accordingly, the scope of the present invention is not limited to the described embodiments, and all things equivalent to or having equivalent variations to