KR-102962083-B1 - METHOD FOR DETERMINING PRODUCT INFORMATION BASED ON OCR USING LABEL IMAGE AND SYSTEM THEREOF

Abstract

An image-based product determination system according to one embodiment includes: a shooting device for capturing a product label image, a product image, and a brand image; and an artificial intelligence (AI)-based product determination unit that generates combined data based on the label image, the product image, and the brand image using cross-attention, and generates product information based on the combined data.

Inventors

김필중
한종완

Assignees

주식회사 무신사

Dates

Publication Date: 20260511
Application Date: 20250821

Claims (13)

An image-based product determination system comprising: a shooting device that captures a label image, a product image, and a brand image of a product; and an AI-based product determination unit that generates combined data from the label image, the product image, and the brand image using cross-attention, and generates product information based on the combined data, wherein label data, product attribute data, and brand data are generated based on the label image, the product image, and the brand image, and the product determination unit pre-trains a model at a first time point using cross-attention on the label data, the product attribute data, and the brand data, and post-trains the model at a second time point following the first time point using cross-attention on the label image, the product image, and the brand image, wherein the image-based product determination system further comprises a conversion module that generates the label data, the product attribute data, and the brand data based on the label image, the product image, and the brand image, and wherein the product determination unit, at a third time point between the first time point and the second time point, calculates a loss value based on a first loss calculated from the label image, the product image, and the brand image and a second loss calculated from the output of the conversion module, and trains the model based on the loss value.
The image-based product determination system of claim 1, wherein the product determination unit comprises: a QKV generator that generates a Q vector from the label image, a K vector from the product image, and a V vector from the brand image; an attention module that calculates an attention score based on the Q vector and the K vector; and a weighted adder that generates the combined data based on the attention score and the V vector.
The image-based product determination system of claim 1, further comprising: a first neural network module that generates label data from the label image; a second neural network module that generates product attribute data from the product image; and a third neural network module that generates brand data from the brand image, wherein the product determination unit generates the combined data based on the label data, the product attribute data, and the brand data.
The image-based product determination system of claim 3, wherein the second neural network module is trained on a training dataset in which the input data consists of product appearance images and degraded images generated from the product appearance images and the ground truth data consists of color labels, using a loss function defined based on a classification loss and an attribute loss, wherein the classification loss is calculated based on the color label and a first color output obtained from the product appearance image, the attribute loss is calculated based on the first color output and a second color output obtained from the degraded image, and the loss function applies a first weight to the classification loss and a second weight to the attribute loss, the first weight being greater than the second weight.
(Deleted)
(Deleted)
The image-based product determination system of claim 1, wherein the product determination unit trains the model at the third time point by applying a first weight to the first loss and a second weight, greater than the first weight, to the second loss, and trains the model at a fourth time point, which follows the third time point and precedes the second time point, by increasing the first weight and decreasing the second weight.
The image-based product determination system of claim 1, further comprising a verification unit that verifies the product information based on product data stored in a database.
The image-based product determination system of claim 8, wherein, if the product information is incomplete, the verification unit extracts at least one candidate product from the product data in the database and determines a final product based on the at least one candidate product and the product image.
The image-based product determination system of claim 9, wherein the verification unit transmits a failure signal to the product determination unit if the verification result is a failure, and the product determination unit performs training by calculating an additional loss based on the failure signal.
The image-based product determination system of claim 9, wherein the verification unit: determines the verification result to be a success if the product information is complete and product data corresponding to the product information exists in the database; determines the verification result to be a failure if the product information is complete and product data corresponding to the product information does not exist in the database; and, if the product information is incomplete, extracts product data whose similarity to the product information is greater than or equal to a threshold value as the at least one candidate product and determines the verification result to be a partial success.
An image-based product determination method performed by an electronic device, the method comprising: acquiring a label image, a product image, and a brand image of a product; generating combined data from the label image, the product image, and the brand image through cross-attention using an artificial intelligence module; and generating product information based on the combined data, wherein label data, product attribute data, and brand data are generated based on the label image, the product image, and the brand image, and the artificial intelligence module pre-trains a model at a first time point using cross-attention on the label data, the product attribute data, and the brand data, and post-trains the model at a second time point following the first time point using cross-attention on the label image, the product image, and the brand image, wherein the method further comprises generating the label data, the product attribute data, and the brand data based on the label image, the product image, and the brand image using a conversion module, and wherein the artificial intelligence module, at a third time point between the first time point and the second time point, calculates a loss value based on a first loss calculated from the label image, the product image, and the brand image and a second loss calculated from the output of the conversion module, and trains the model based on the loss value.
An electronic device comprising a processor and a memory connected to the processor, wherein the memory is configured to store a program, the processor is configured to execute the program, and the steps of the image-based product determination method of claim 12 are implemented when the program is executed.
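The claim reciting a QKV generator, an attention module, and a weighted adder can be illustrated with a minimal single-head sketch. The claims do not give a formula, so the scaled dot-product form, the softmax normalization, the projection matrices `w_q`, `w_k`, `w_v`, and the requirement that the product and brand inputs carry the same number of tokens are all assumptions made for illustration. Each image is assumed to have already been encoded into a list of feature vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(tokens, weight):
    """Apply a row-major projection matrix to every token vector."""
    return [[sum(w * x for w, x in zip(row, t)) for row in weight] for t in tokens]


def cross_attention(label_tokens, product_tokens, brand_tokens, w_q, w_k, w_v):
    """Combine three modalities: Q from label, K from product, V from brand."""
    if len(product_tokens) != len(brand_tokens):
        # Assumption: each attention score must pair with exactly one V vector.
        raise ValueError("product and brand inputs need the same token count")
    q = project(label_tokens, w_q)    # QKV generator
    k = project(product_tokens, w_k)
    v = project(brand_tokens, w_v)
    scale = math.sqrt(len(k[0]))      # assumed scaled dot-product form
    combined = []
    for q_i in q:
        # Attention module: one score per (label token, product token) pair.
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k]
        weights = softmax(scores)
        # Weighted adder: attention-weighted sum of the V vectors.
        combined.append(
            [sum(w * v_j[d] for w, v_j in zip(weights, v)) for d in range(len(v[0]))]
        )
    return combined
```

The output holds one combined vector per label token, each a convex combination of the brand-derived V vectors.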
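The loss recited for training the second neural network module (a classification loss weighted more heavily than an attribute loss) can be sketched as follows. The claims specify only which quantities each loss compares; the use of cross-entropy for the classification loss, mean squared difference for the attribute loss, and the default weights `w_cls` and `w_attr` are assumptions chosen for illustration.

```python
import math


def classification_loss(probs_clean, label_index):
    """Cross-entropy between the first color output and the color label."""
    return -math.log(max(probs_clean[label_index], 1e-12))


def attribute_loss(probs_clean, probs_degraded):
    """Mean squared difference between the first and second color outputs.

    Assumed distance: the source only says the loss is calculated from the
    output for the appearance image and the output for the degraded image.
    """
    diffs = [(a - b) ** 2 for a, b in zip(probs_clean, probs_degraded)]
    return sum(diffs) / len(diffs)


def color_loss(probs_clean, probs_degraded, label_index, w_cls=1.0, w_attr=0.5):
    """Weighted sum in which the classification weight exceeds the attribute weight."""
    if not w_cls > w_attr:
        raise ValueError("the first weight must be greater than the second weight")
    return (w_cls * classification_loss(probs_clean, label_index)
            + w_attr * attribute_loss(probs_clean, probs_degraded))
```

Under these assumptions the attribute term penalizes a model whose color prediction changes when the same product image is degraded, while the larger classification weight keeps label accuracy as the dominant objective.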
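The claims describe a transition between pre-training and post-training in which the second loss (from the conversion module output) initially outweighs the first loss (from the images), after which the first weight increases and the second decreases. A minimal sketch of such a schedule follows. The linear interpolation, the start and end values, and the constraint that the two weights sum to one are assumptions; the claims state only the ordering and the direction of change.

```python
def loss_weights(progress, w1_start=0.2, w1_end=0.8):
    """Return (first_weight, second_weight) for a point in the transition.

    progress is 0.0 at the third time point and grows toward 1.0 as training
    approaches the second time point (post-training). Values are hypothetical.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    w1 = w1_start + (w1_end - w1_start) * progress
    return w1, 1.0 - w1


def transition_loss(first_loss, second_loss, progress):
    """Loss value combining the image-based loss and the conversion-module loss."""
    w1, w2 = loss_weights(progress)
    return w1 * first_loss + w2 * second_loss
```

With the assumed defaults, `loss_weights(0.0)` gives the second loss the larger weight, and any later `progress` value raises the first weight while lowering the second.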
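The three verification outcomes recited in the claims (success, failure, partial success) can be sketched as a small decision function. The field names in `REQUIRED_FIELDS`, the field-matching similarity measure, and the default threshold are hypothetical. The claims also do not say what happens when the product information is incomplete and no record reaches the threshold; this sketch reports a failure in that case, which is an assumption.

```python
REQUIRED_FIELDS = ("brand", "style_number", "color")  # hypothetical schema


def similarity(info, record):
    """Fraction of required fields in which the record matches the known info."""
    matches = sum(
        1 for f in REQUIRED_FIELDS
        if info.get(f) is not None and info.get(f) == record.get(f)
    )
    return matches / len(REQUIRED_FIELDS)


def verify(info, database, threshold=0.5):
    """Return (result, candidates) for generated product information."""
    complete = all(info.get(f) is not None for f in REQUIRED_FIELDS)
    if complete:
        found = any(
            all(rec.get(f) == info[f] for f in REQUIRED_FIELDS) for rec in database
        )
        return ("success" if found else "failure"), []
    # Incomplete information: collect records that clear the similarity threshold.
    candidates = [rec for rec in database if similarity(info, rec) >= threshold]
    if not candidates:
        return "failure", []  # assumption: this case is not addressed in the source
    return "partial_success", candidates
```

In the claimed system, the candidates returned for a partial success would then be compared against the product image to determine a final product, a step this sketch leaves out.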

Description

Method for Determining Product Information Based on OCR Using Label Image and System Thereof

The disclosed content relates to a method and system for determining product information based on OCR using label images.

With the recent expansion of the recommerce market, providing accurate product information is becoming increasingly important for online second-hand trading platforms. Particularly for fashion products such as clothing, consumers identify which new product an item corresponds to by checking the style number, brand name, and manufacturing information listed on the care label; based on this, they judge authenticity and a reasonable price relative to the new product.

The technology commonly used to verify this information is label reading based on Optical Character Recognition (OCR), which extracts text information from images of product care labels. However, existing OCR-based technologies suffer from reduced character recognition accuracy when labels are damaged or wrinkled by washing or use, and from recognition difficulties caused by the variety of languages, fonts, and layouts.

To address these issues, recommerce platforms have recently been attempting to extract product attributes using information beyond label images to ensure product reliability. For example, there are attempts to extract attributes by utilizing product databases or by analyzing the entire product image and brand image together.

FIG. 1 is a schematic block diagram of an image-based product determination system according to one embodiment.
FIG. 2 is a block diagram of an electronic device according to one embodiment.
FIG. 3 is a diagram illustrating the learning process of an artificial neural network module according to one embodiment.
FIG. 4 is a diagram illustrating the inference process of an artificial neural network module according to one embodiment.
FIG. 5 is a diagram illustrating the inference process of an artificial neural network module according to one embodiment.
FIG. 6 is a block diagram of an electronic device according to one embodiment.
FIG. 7 is a flowchart of an artificial intelligence learning method according to one embodiment.
FIG. 8 is a diagram illustrating the learning process of a data generation unit according to one embodiment.
FIG. 9 is a drawing for explaining a transformer architecture according to one embodiment.
FIG. 10 is a block diagram of a coupler according to one embodiment.
FIG. 11 is a drawing for explaining a display screen of an electronic device according to one embodiment.

The various embodiments described in this specification are illustrative, for the purpose of clearly explaining the technical concept of this disclosure, and are not intended to limit it to specific embodiments. The technical concept of this disclosure includes various modifications, equivalents, and alternatives of each embodiment described in this specification, as well as embodiments selectively combined from all or part of each embodiment. Furthermore, the scope of the technical concept of this disclosure is not limited to the various embodiments presented below or to the specific descriptions thereof.

Terms used in this specification, including technical or scientific terms, may have the meaning generally understood by those skilled in the art to which this disclosure pertains, unless otherwise defined.

Expressions used herein such as “include,” “may include,” “be provided with,” “may be provided with,” “have,” and “may have” imply the existence of the subject feature (e.g., a function, operation, or component) and do not exclude the existence of additional features. That is, such expressions should be understood as open-ended terms implying the possibility of including other embodiments.

In this specification, singular expressions include plural expressions unless the context clearly specifies them as singular. Additionally, plural expressions include singular expressions unless the context clearly specifies them as plural. Throughout the specification, when a part is described as including a certain component, this means that, unless specifically stated otherwise, it does not exclude other components but may include additional components.

Additionally, the terms 'module' or 'part' as used in the specification refer to software or hardware components, and a 'module' or 'part' performs certain roles. However, the meaning of 'module' or 'part' is not limited to software or hardware. A 'module' or 'part' may be configured to reside in an addressable storage medium or configured to run on one or more processors. Thus, as an example, a 'module' or 'part' may include components such as software components, object-oriented software components, class components, and task components, and at least one of processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, or variables