JP-2026076139-A - Object detection method and apparatus

JP 2026076139 A

Abstract

[Problem] To provide a method for detecting an object. [Solution] The method includes the steps of: detecting at least one object candidate from a target image and generating object candidate information including the location information and confidence score of the object candidate; generating first characteristic information from the object candidate; calculating a similarity score between the first characteristic information and the second characteristic information of a reference object; and selecting a valid object from the object candidates based on the confidence score and the similarity score, wherein the reference object includes a positive reference object or a negative reference object. [Selection Diagram] Figure 1
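For orientation, the following is a minimal sketch in Python of the pipeline described in the abstract. The `detector` and `feature_extractor` callables, the cosine-similarity measure, and the multiplicative combination of confidence and similarity are assumptions for illustration only; the publication does not prescribe these interfaces or formulas.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cosine of the angle between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def detect_valid_objects(image, detector, feature_extractor,
                         pos_refs, neg_refs,
                         conf_threshold=0.3, detection_threshold=0.5):
    """Detect candidates and keep those matching a positive reference object.

    detector(image)               -> iterable of (box, confidence) candidates
    feature_extractor(image, box) -> embedding for the boxed region
    pos_refs / neg_refs           -> embeddings of positive / negative reference objects
    """
    valid = []
    for box, confidence in detector(image):
        if confidence < conf_threshold:
            continue  # drop weak candidates early
        feat = feature_extractor(image, box)  # "first characteristic information"
        pos_sim = max((cosine_similarity(feat, r) for r in pos_refs), default=-1.0)
        neg_sim = max((cosine_similarity(feat, r) for r in neg_refs), default=-1.0)
        # detection score combining confidence and similarity (one possible choice)
        pos_score, neg_score = confidence * pos_sim, confidence * neg_sim
        if pos_score >= detection_threshold and pos_score > neg_score:
            valid.append((box, pos_score))
    return valid
```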

Inventors

  • Lee, Hong Seok
  • Yun, Seung Jong

Assignees

  • Neurocle Inc.

Dates

Publication Date
2026-05-11
Application Date
2025-10-23
Priority Date
2024-10-23

Claims (14)

  1. A method for detecting an object, the method comprising the steps of: detecting at least one object candidate from a target image and generating object candidate information including location information and a confidence score of the object candidate; generating first characteristic information from the object candidate; calculating a similarity score between the first characteristic information and second characteristic information of a reference object; and selecting a valid object from among the object candidates based on the confidence score and the similarity score, wherein the reference object includes a positive reference object or a negative reference object.
  2. The method according to claim 1, wherein the step of detecting the object candidate is performed by inputting the target image into a first machine learning model and outputting the object candidate information, the first machine learning model being trained to output object candidate information from an input image.
  3. The method according to claim 1, wherein the step of generating the first characteristic information comprises the steps of: inputting the target image into a second machine learning model to output first overall characteristic information; and extracting the first characteristic information corresponding to the object candidate from the first overall characteristic information.
  4. The method according to claim 3, further comprising the step of generating the second characteristic information, wherein the step of generating the second characteristic information comprises the steps of: inputting a reference image containing the reference object into a third machine learning model to output second overall characteristic information; and extracting the second characteristic information corresponding to the reference object from the second overall characteristic information.
  5. The method according to claim 4, wherein the third machine learning model is trained to emphasize differences between the positive reference object and the negative reference object.
  6. The method according to claim 1, wherein the step of selecting the valid object comprises the steps of: calculating a detection score from the confidence score and the similarity score; selecting valid detection scores by comparing the detection score with a predetermined detection threshold; and determining the valid object based on the valid detection scores.
  7. The method according to claim 6, wherein the step of determining the valid object is performed based on a comparison between the valid detection score that each object candidate has from the positive reference object and the valid detection score it has from the negative reference object.
  8. The method according to claim 6, wherein the step of determining the valid object is performed by excluding, from the object candidates, any object candidate having a valid detection score from the negative reference object.
  9. The method according to claim 8, wherein the step of determining the valid object is performed by selecting, as the valid object, an object candidate having a valid detection score from the positive reference object from among the object candidates.
  10. The method according to claim 6, further comprising the step of adjusting a confidence threshold for selecting the object candidate based on the confidence score, or the detection threshold.
  11. The method according to claim 1, further comprising the steps of: specifying the reference object from the target image; and generating the second characteristic information for the specified reference object.
  12. The method according to claim 1, further comprising the step of counting the valid objects.
  13. A computer program stored on a recording medium for performing the method according to any one of claims 1 to 12.
  14. An object detection device comprising: a memory storing a program for object detection; and a processor that, by executing the program, detects at least one object candidate from a target image, generates object candidate information including location information and a confidence score of the object candidate, generates first characteristic information from the object candidate, calculates a similarity score between the first characteristic information and second characteristic information of a reference object, and selects a valid object from among the object candidates based on the confidence score and the similarity score, wherein the reference object includes a positive reference object or a negative reference object.
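Claims 3 and 4 describe producing overall characteristic information (a feature map) from a whole image and then extracting per-object features from it. The sketch below illustrates one common way to perform that extraction, average pooling over the region of the feature map covered by a candidate's box; the pooling choice, coordinate convention, and array layout are assumptions, not something the claims specify.

```python
import numpy as np

def extract_object_feature(feature_map: np.ndarray, box, image_size) -> np.ndarray:
    """Extract an object embedding from an overall feature map.

    feature_map: (C, H, W) overall characteristic information.
    box:         (x1, y1, x2, y2) in image pixel coordinates.
    image_size:  (img_w, img_h) of the original image.
    Returns a C-dimensional embedding, L2-normalized for cosine similarity.
    """
    c, h, w = feature_map.shape
    img_w, img_h = image_size
    # map pixel coordinates onto the (usually downsampled) feature grid
    x1 = int(box[0] / img_w * w); x2 = max(x1 + 1, int(box[2] / img_w * w))
    y1 = int(box[1] / img_h * h); y2 = max(y1 + 1, int(box[3] / img_h * h))
    region = feature_map[:, y1:y2, x1:x2]
    emb = region.mean(axis=(1, 2))  # average-pool the box region per channel
    return emb / (np.linalg.norm(emb) + 1e-8)
```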
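Claims 6 through 9 describe computing a detection score from the confidence and similarity scores, thresholding it, excluding candidates that score as valid against a negative reference, and keeping those that score as valid against a positive reference; claim 12 adds counting. A minimal sketch follows, assuming a multiplicative detection score (the claims require only that the score be computed from both inputs):

```python
def select_and_count(candidates, detection_threshold=0.5):
    """candidates: iterable of dicts with 'confidence', 'pos_sim', 'neg_sim' keys."""
    valid = []
    for cand in candidates:
        pos_score = cand["confidence"] * cand["pos_sim"]
        neg_score = cand["confidence"] * cand["neg_sim"]
        if neg_score >= detection_threshold:
            continue  # claim 8: exclude candidates with a valid negative detection score
        if pos_score >= detection_threshold:
            valid.append(cand)  # claim 9: keep candidates with a valid positive score
    return valid, len(valid)    # claim 12: count the valid objects
```

The variant of claim 7 would instead compare `pos_score` against `neg_score` directly for each candidate rather than excluding on the negative threshold alone.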

Description

This application relates to an object detection method and apparatus.

Object counting technology uses computer vision and artificial intelligence to automatically count the number of objects in images and videos. It is widely used across industries, including manufacturing, agriculture, and retail, for quality control, inventory management, and productivity improvement. However, existing object counting techniques require accurate labeling of every object, which makes labeling images containing a large number of objects time-consuming and expensive. Furthermore, when objects are densely packed, existing object detection models struggle to accurately distinguish and count individual objects; this is a problem on manufacturing lines and in agricultural environments where objects overlap or sit close together. In addition, applying a model to a new environment or object requires substantial training data and time: existing models are optimized for specific environments, so applying them elsewhere requires additional data collection and retraining.

A brief description of each drawing is provided to aid understanding of the drawings cited in this application.

  • Figure 1 is a flowchart of the object detection method according to an embodiment of the present application.
  • Figure 2 is a flowchart illustrating an embodiment of step S130 in Figure 1.
  • Figure 3 is a block diagram of an object detection device according to an embodiment of the present application.
  • Figure 4 is a diagram illustrating the object detection process according to an embodiment of the present application.
  • Figure 5 is a diagram illustrating the object detection process according to an embodiment of the present application.

Because the technical concept of this application can be modified in various ways and has various embodiments, specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technical concept of this application to those specific embodiments, but rather to include all modifications, equivalents, and substitutes that fall within its scope. In explaining the technical concept of this application, detailed explanations of related prior art are omitted where they would unnecessarily obscure the gist of this application.

The terminology used herein is for illustrative purposes only and is not intended to limit or restrict this application. Singular expressions include plural expressions unless the context clearly indicates otherwise. Numbers used herein (e.g., first, second, etc.) are merely identifiers that distinguish one component from another. In this specification, when a part is described as being connected to another part, this includes not only direct connections but also indirect connections through intervening components. When a part is described as containing a component, this means, unless otherwise stated, that it may contain other components as well, rather than excluding them. Furthermore, in this application, the term "or" means an inclusive "or," not an exclusive "or." That is, unless otherwise specified or clear from context, "X utilizes A or B" means any of the natural inclusive combinations: X utilizes A; X utilizes B; or X utilizes both A and B.
The term "and/or" as used herein refers to and includes all possible combinations of one or more of the enumerated related items. Terms such as "~part," "~device," "~unit," and "~module" as used in this application refer to a unit that processes at least one function or operation; such a unit may be embodied in hardware, software, or a combination of the two, such as a processor, microprocessor, microcontroller, CPU (Central Processing Unit), GPU (Graphics Processing Unit), APU (Accelerated Processing Unit), DSP (Digital Signal Processor), ASIC (Application-Specific Integrated Circuit), or FPGA (Field-Programmable Gate Array). It should also be clarified that the classification of components in this application is based merely on the primary function each component performs. That is, two or more components described below may be integrated into a single component, or a single component may be subdivided into two or more components by function. Moreover, each component described below may additionally perform some or all of the functions of other components in addition to its own primary function, and some of the primary functions of a component may be performed exclusively by other components. In this specification, the term "artificial intelligence learning model" may be used in various senses, such as artificial intelligence model.