DE-102024133111-A1 - Method and device for recognizing an object, as well as machine learning model and computer program product

Abstract

The present disclosure relates to a method for recognizing an object. The method comprises the following steps: i) comparing a detected object from a second frame with at least one detected object from a first frame using at least one of the following parameters: a) distance, in particular the Euclidean distance, between the objects; b) difference between the objects in their geometric dimensions, in particular the height and/or width of the objects; c) difference between the objects in their preferably optical and/or acoustic appearance; d) equality of the objects with respect to an object type; ii) evaluating the results from step i) to determine whether the detected object of the second frame is the detected object of the first frame. The present disclosure further relates to a device (1) for recognizing an object, as well as a machine learning model and a computer program product.

Inventors

Florian Mauer-Endler
Michael Wielpütz
Daniel Eckstein

Assignees

BEEBUCKET GmbH

Dates

Publication Date: 2026-05-13
Application Date: 2024-11-12

Claims (20)

A method for recognizing an object, comprising the steps of: i) comparing a recognized object from a second frame with a recognized object from a first frame based on at least one of the following parameters: a) distance, in particular the Euclidean distance between the objects; b) difference between the objects in their geometric dimensions, in particular the height and/or width of the objects; c) difference between the objects in their preferably optical and/or acoustic appearance; d) similarity of the objects with respect to an object type; ii) evaluating the results from step i) to determine whether the recognized object from the second frame is the recognized object from the first frame.
Method according to claim 1, wherein step i) is performed using at least two of the parameters a) to d).
Method according to claim 1 or 2, wherein step i) is performed using parameters a) and b).
Method according to one of the preceding claims, wherein step i) is carried out using at least three of the parameters a) to d).
Method according to one of the preceding claims, wherein step i) is performed using parameters a) and b) and c).
Method according to one of the preceding claims, wherein step i) is performed using all four parameters a) to d).
Method according to one of the preceding claims, wherein step i) is performed using a further parameter: • Degree of occlusion of at least one of the objects by a third object.
Method according to one of the preceding claims, wherein step i) is performed using a further parameter: • Degree of occlusion of the object of the second frame by a third object of the first frame.
Method according to one of the preceding claims, wherein the results from the at least one comparison of step i) are used as metadata forming the basis for step ii).
Method according to one of the preceding claims, wherein the method uses an AI model to determine whether the detected object of the second frame is the detected object of the first frame.
Method according to one of the preceding claims, wherein the method uses a random forest algorithm to determine whether the detected object of the second frame is the detected object of the first frame.
Method according to one of the preceding claims, wherein the method uses an XGBoost algorithm to determine whether the detected object of the second frame is the detected object of the first frame.
Method for detecting and recognizing an object, comprising the steps of: • providing a first frame; • detecting an object in the first frame; • providing a second frame; • detecting an object in the second frame; • performing a method according to one of claims 1 to 12.
Method according to claim 13, wherein the second frame was created later in time than the first frame.
Method according to claim 13 or 14, wherein information about the detected objects is provided in the form of data.
Machine learning model which has been trained using a method according to one of claims 1 to 12.
Model according to claim 16, comprising a random forest algorithm to determine whether a detected object in a second frame is a detected object in a first frame.
Model according to claim 16 or 17, comprising an XGBoost algorithm to determine whether a detected object in a second frame is a detected object in a first frame.
Computer program product comprising program code stored on a computer-readable medium for carrying out a method according to one of claims 1 to 12, and in particular comprising a model according to one of claims 16 to 18.
Device (1) for recognizing an object, comprising a comparison device (2) configured to compare a recognized object (O2) of a second frame (E2) with a recognized object (O1) of a first frame (E1) based on at least one of the following parameters: a) distance, in particular Euclidean distance between the objects; b) difference between the objects in their geometric dimensions, in particular the height and/or width of the objects; c) difference between the objects in their preferably optical and/or acoustic appearance; d) similarity of the objects with respect to an object type; an evaluation device (3) configured to evaluate the results of at least one comparison by the comparison device (2) in order to determine whether the object (O2) of the second frame (E2) is the object (O1) of the first frame (E1).
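The claims above name a degree of occlusion as a further parameter but do not define how it is measured. The following is a minimal sketch under one possible reading, not the applicant's definition: it assumes that each object is described by an axis-aligned bounding box (x_min, y_min, x_max, y_max) and that the degree of occlusion is the fraction of the object's box area covered by the third object's box. The function name occlusion_degree and the box layout are hypothetical.

```python
def occlusion_degree(obj_box, occluder_box):
    """Return the fraction (0.0 to 1.0) of obj_box covered by occluder_box.

    Both boxes are assumed to be (x_min, y_min, x_max, y_max) tuples.
    This area-ratio definition is an assumption made for illustration.
    """
    ox1, oy1, ox2, oy2 = obj_box
    tx1, ty1, tx2, ty2 = occluder_box

    # Width and height of the overlapping rectangle (zero if disjoint).
    overlap_w = max(0.0, min(ox2, tx2) - max(ox1, tx1))
    overlap_h = max(0.0, min(oy2, ty2) - max(oy1, ty1))

    obj_area = (ox2 - ox1) * (oy2 - oy1)
    if obj_area <= 0:
        # Degenerate box: treat as not occluded rather than dividing by zero.
        return 0.0
    return (overlap_w * overlap_h) / obj_area
```

Under this assumption, a value of 0.0 means the third object does not cover the object at all and 1.0 means it covers the object's box completely. The value could be passed to the evaluation step alongside parameters a) to d).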

Description

The present disclosure relates to a method and a device for recognizing an object, as well as to a machine learning model and a computer program product.
Object recognition, also known in technical circles as object tracking or at least as a subset thereof, is becoming increasingly relevant in a wide variety of computer-based applications. For example, it is crucial for many machine vision applications. From analyzing pedestrian and traffic flows to analyzing movement patterns in sports, object recognition plays an indispensable role in gaining valuable insights.
In machine vision applications, object recognition is typically a step that follows object detection. Object detection involves identifying an object, such as a person, car, or other item. This is usually based on individual frames from a video sequence, for example from a surveillance camera, in which a single object is detected. Object tracking then determines whether the object in each subsequent frame is the same as the initially identified object.
Analytical methods for object tracking use previously known movement patterns to predict the likely path of an object. These methods usually require a relatively large amount of computing time and often deliver an insufficient success rate in assessing whether a currently observed object corresponds to a previously recognized object.
Against this background, there is a need to improve object recognition. This is based on the expectation that an improved analytical method will allow object recognition to be carried out faster and/or with a higher success rate.
Fundamental improvements in object recognition are offered by a method, in particular an analytical method, which includes or consists of the following steps: i) comparing a detected object of a second frame with at least one detected object of a first frame, the first frame preferably having been created earlier in time, using at least one of the following parameters: a) distance, in particular the Euclidean distance, between the objects; b) difference between the objects in their geometric dimensions, in particular the height and/or width of the objects; c) difference between the objects in their preferably optical and/or acoustic appearance; d) equality of the objects with respect to an object type; ii) evaluating the results from step i) to determine whether the detected object of the second frame is the detected object of the first frame.
The improved method is based on the idea of using a procedure that a human would use to assess whether an object is one that has already been recognized. Such a procedure relies, for example, on parameters of the type described above. It has been shown that object recognition is facilitated even when only one of these parameters is used. In particular, this results in advantages in the robustness and in the execution speed of the method.
The improved method can be designed such that step i) is executed using at least two of the parameters a) to d). It has been shown that this further improves object recognition, in particular with further advantages in the robustness and execution speed of the method. These advantages are especially noticeable, for example, when step i) is executed using parameters a) and b).
The improved method can further be designed such that step i) is executed using at least three of the parameters a) to d). It has been shown that this further improves object recognition.
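As a rough illustration of steps i) and ii) described above, the following sketch computes parameters a) to d) for one pair of detections and then applies a simple decision rule. Everything in it is an assumption made for illustration: the Detection record, the use of bounding-box centres for the distance, the appearance descriptor (for example a normalized colour histogram), and the threshold values. The threshold rule in is_same_object is only a stand-in for the evaluation step; the disclosure itself names AI models such as a random forest or XGBoost for that purpose.

```python
from dataclasses import dataclass
import math


@dataclass
class Detection:
    """Hypothetical record for one detected object in a frame."""
    cx: float           # bounding-box centre x, in pixels
    cy: float           # bounding-box centre y, in pixels
    width: float        # bounding-box width, in pixels
    height: float       # bounding-box height, in pixels
    appearance: tuple   # assumed appearance descriptor, e.g. a colour histogram
    obj_type: str       # detector class label, e.g. "person"


def compare(first: Detection, second: Detection) -> dict:
    """Step i): compute parameters a) to d) for one pair of detections."""
    return {
        # a) Euclidean distance between the two box centres
        "distance": math.hypot(second.cx - first.cx, second.cy - first.cy),
        # b) differences in geometric dimensions
        "width_diff": abs(second.width - first.width),
        "height_diff": abs(second.height - first.height),
        # c) appearance difference, here the Euclidean distance of descriptors
        "appearance_diff": math.dist(first.appearance, second.appearance),
        # d) equality of the object type
        "same_type": first.obj_type == second.obj_type,
    }


def is_same_object(features: dict,
                   max_distance: float = 50.0,
                   max_size_diff: float = 20.0,
                   max_appearance_diff: float = 0.5) -> bool:
    """Step ii): stand-in decision rule with arbitrary, assumed thresholds."""
    return (
        features["same_type"]
        and features["distance"] <= max_distance
        and features["width_diff"] <= max_size_diff
        and features["height_diff"] <= max_size_diff
        and features["appearance_diff"] <= max_appearance_diff
    )


# Usage: one object in the first frame, two candidates in the second frame.
o1 = Detection(100, 100, 40, 80, (0.2, 0.5, 0.3), "person")
o2 = Detection(103, 104, 42, 79, (0.2, 0.5, 0.3), "person")
o3 = Detection(400, 100, 90, 60, (0.7, 0.2, 0.1), "car")

features_near = compare(o1, o2)
features_far = compare(o1, o3)
```

The dictionary returned by compare keeps the comparison results in a named, structured form, so a trained classifier could consume the same values as a feature vector in place of the threshold rule.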
Using at least three of the parameters results, in particular, in further advantages in the robustness and execution speed of the method. These further advantages are particularly noticeable, for example, when step i) is executed using parameters a), b), and c).
The improved method can further be designed such that step i) is executed using all four parameters a) to d). It has been shown that this further improves object recognition, in particular with further advantages in the robustness and execution speed of the method.
The improved method can further be designed such that step i) is executed using at least one additional parameter. Such an additional parameter can be the degree of occlusion of at least one of the objects by a third object. Additionally or alternatively, such an additional parameter can be the degree of occlusion of the object in the second frame by a third object, preferably in the first frame. It has been shown that this further improves object recognition, in particular with further advantages in the robustness and execution speed of the method.
The improved method can further be designed such that the results from at least one comparison in step i) are used as metadata for step ii). This ensures that the comparison results are presented in a struct