JP-2026075586-A - Methods and systems for object recognition
Abstract
[Problem] To provide a method, and a non-transitory computer-readable medium, for improving object recognition performance by identifying problems such as incomplete and redundant edges and correcting the contour. [Solution] In an object recognition system, the method includes detecting contours from measurement data of one or more objects provided by one or more vision sensors, detecting indentations in the contours of the one or more objects, evaluating the correctness of each indentation, performing an action selected from maintain, complete, and remove for each indentation based on the evaluated correctness to generate a corrected contour, and performing an object recognition process on the corrected contour. [Selection Diagram] Figure 5
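The indentation detection and evaluation steps can be illustrated with a minimal sketch. It assumes the contour is a simple polygon given as a list of (x, y) tuples, and it treats each run of contour vertices lying off the convex hull as one indentation, measuring the distance from its deepest point to the hull (the quantity compared against a threshold in claim 2). The function names, the dictionary keys, and the run-based definition of an indentation are illustrative assumptions, not the method of the disclosure. Because this section does not say which side of the threshold counts as correct, the sketch only reports the comparison.

```python
from math import hypot


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices with collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def find_indentations(contour, eps=1e-9):
    """Assumption: an indentation is a run of contour vertices between two
    consecutive hull vertices; its depth is the largest distance from a run
    vertex to the hull edge that bridges the run."""
    hull = set(convex_hull(contour))
    n = len(contour)
    start = next(i for i, p in enumerate(contour) if p in hull)
    found, run, last_hull = [], [], start
    for k in range(1, n + 1):
        idx = (start + k) % n
        if contour[idx] not in hull:
            run.append(idx)
            continue
        if run:
            a, b = contour[last_hull], contour[idx]
            depth, deepest = max(
                (point_to_segment(contour[i], a, b), contour[i]) for i in run
            )
            if depth > eps:  # skip vertices that merely lie on a hull edge
                found.append({"deepest_point": deepest, "depth": depth})
            run = []
        last_hull = idx
    return found


def flag_indentations(contour, depth_threshold):
    """Attach the threshold comparison; its interpretation is left to the caller."""
    return [
        dict(ind, exceeds_threshold=ind["depth"] > depth_threshold)
        for ind in find_indentations(contour)
    ]
```

For a 10 x 10 square with a notch whose tip is at (5, 6), given as `[(0, 0), (10, 0), (10, 10), (6, 10), (5, 6), (4, 10), (0, 10)]`, `find_indentations` reports a single indentation with its deepest point at (5, 6) and a depth of 4.0.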
Inventors
- ジエ フー
- 木村 宣隆
Assignees
- 株式会社日立製作所
Dates
- Publication Date: 2026-05-08
- Application Date: 2025-06-16
- Priority Date: 2024-10-22
Claims (19)
- A method for an object recognition system, the method comprising: detecting contours from measurement data of one or more objects provided by one or more vision sensors; detecting indentations in the contours of the one or more objects; evaluating the correctness of each of the indentations; performing, for each of the indentations, one action selected from maintain, complete, and remove based on the evaluated correctness, to generate a corrected contour; and performing object recognition processing on the corrected contour.
- The method according to claim 1, wherein the correctness of each indentation is evaluated based on whether the distance between the extreme point of the indentation and the convex hull of the contour exceeds a threshold.
- The method according to claim 1, wherein the correctness of each indentation is evaluated based on whether the ratio of the side lengths of the circumscribing rectangle of the contour associated with the indentation exceeds a threshold.
- The method according to claim 1, wherein completion is selected as the action for each indentation that is evaluated as longer than a preset threshold and as incorrect.
- The method according to claim 1, wherein, when depth information between the one or more objects and the one or more vision sensors indicates that the depth difference in the region adjacent to an indentation exceeds a threshold, completion is selected as the action for each indentation evaluated as incorrect, and when the depth difference does not exceed the threshold, removal is selected as the action for each indentation evaluated as incorrect.
- The method according to claim 1, wherein, when an indentation in a first contour of an object among the one or more objects causes a difference between the first contour and a second contour of the same object, removal is selected as the action for each indentation evaluated as incorrect, and when the indentation in the first contour also occurs in the second contour, completion is selected as the action for each indentation evaluated as incorrect.
- A method for an object recognition system, the method comprising: detecting contours from measurement data of one or more objects provided by one or more vision sensors; detecting indentations in the contours of the one or more objects; evaluating the quality of each contour based on its indentations; performing object recognition processing on the contours; and performing an action on an object recognized by the object recognition processing, based on the quality of the contour associated with the recognized object, wherein the action is selected from assigning a pickup to a robot and adding a slight movement by the robot.
- The method according to claim 7, wherein the quality of each contour is evaluated based on whether the number of indentations in the contour exceeds a threshold.
- The method according to claim 7, wherein the quality of each contour is evaluated based on whether the average or median indentation of the contour exceeds a threshold.
- The method according to claim 7, wherein the quality of each contour is evaluated based on whether the maximum indentation of the contour exceeds a threshold.
- The method according to claim 7, wherein assigning the pickup to the robot is selected as the action when the contour associated with the recognized object is evaluated to have a quality above a threshold, or when the indentations detected in that contour are maintained.
- The method according to claim 7, wherein adding the slight movement by the robot is selected as the action when the contour associated with the recognized object is evaluated to have a quality below a threshold, or when at least one of the indentations in that contour is completed or removed.
- The method according to claim 7, further comprising updating statistics of the object recognition processing based on one or more of: the number of objects having a quality above a threshold, the average indentation of one or more objects having a quality below the threshold, the median indentation of one or more objects having a quality below the threshold, and the maximum indentation of one or more objects having a quality below the threshold.
- A non-transitory computer-readable medium that stores instructions for an object recognition system, the instructions comprising: detecting contours from measurement data of one or more objects provided by one or more vision sensors; detecting indentations in the contours of the one or more objects; evaluating the correctness of each of the indentations; performing, for each of the indentations, one action selected from maintain, complete, and remove based on the evaluated correctness, to generate a corrected contour; and performing object recognition processing on the corrected contour.
- The non-transitory computer-readable medium according to claim 14, wherein the correctness of each indentation is evaluated based on whether the distance between the extreme point of the indentation and the convex hull of the contour exceeds a threshold.
- The non-transitory computer-readable medium according to claim 14, wherein the correctness of each indentation is evaluated based on whether the ratio of the side lengths of the circumscribing rectangle of the contour associated with the indentation exceeds a threshold.
- The non-transitory computer-readable medium according to claim 14, wherein completion is selected as the action for each indentation that is evaluated as longer than a preset threshold and as incorrect.
- The non-transitory computer-readable medium according to claim 14, wherein, when depth information between the one or more objects and the one or more vision sensors indicates that the depth difference in the region adjacent to an indentation exceeds a threshold, completion is selected as the action for each indentation evaluated as incorrect, and when the depth difference does not exceed the threshold, removal is selected as the action for each indentation evaluated as incorrect.
- The non-transitory computer-readable medium according to claim 14, wherein, when an indentation in a first contour of an object among the one or more objects causes a difference between the first contour and a second contour of the same object, removal is selected as the action for each indentation evaluated as incorrect, and when the indentation in the first contour also occurs in the second contour, completion is selected as the action for each indentation evaluated as incorrect.
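The decision rules in claims 5 and 7 to 12 can be sketched as a few small functions. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, action labels, and threshold parameters are hypothetical; "maintain" is assumed to apply to indentations evaluated as correct; more or deeper indentations are assumed to mean lower contour quality; and the two alternative conditions in claims 11 and 12 are combined so that a pickup is assigned only when the quality check passes and every indentation was maintained.

```python
from statistics import mean, median


def choose_correction(is_correct, depth_difference, depth_threshold):
    """Pick maintain / complete / remove for one indentation.

    Follows the depth rule of claim 5 for indentations judged incorrect.
    Assumption: indentations judged correct are maintained.
    """
    if is_correct:
        return "maintain"
    if depth_difference > depth_threshold:
        return "complete"
    return "remove"


def summarize(indent_depths):
    """Count, mean, median, and maximum of a contour's indentation depths."""
    if not indent_depths:
        return {"count": 0, "mean": 0.0, "median": 0.0, "max": 0.0}
    return {
        "count": len(indent_depths),
        "mean": mean(indent_depths),
        "median": median(indent_depths),
        "max": max(indent_depths),
    }


def contour_quality_ok(indent_depths, max_count, max_depth):
    """Assumption: quality is acceptable when neither the number of
    indentations nor the deepest indentation exceeds its threshold."""
    stats = summarize(indent_depths)
    return stats["count"] <= max_count and stats["max"] <= max_depth


def choose_robot_action(quality_ok, corrections):
    """Assign a pickup only if quality is acceptable and nothing was corrected;
    otherwise request a slight movement by the robot (hypothetical labels)."""
    if quality_ok and all(c == "maintain" for c in corrections):
        return "assign_pickup"
    return "slight_movement"
```

For example, a contour with one incorrect indentation whose adjacent depth difference exceeds the threshold yields `"complete"` from `choose_correction`, and `choose_robot_action` then returns `"slight_movement"` because a correction was applied.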
Description
This disclosure relates generally to object recognition systems and, more specifically, to object recognition in industrial and logistics systems.
Object recognition is a central problem in computer vision. Object recognition technology is widely used in warehouse automation, logistics, and retail. Common applications include bin picking, piece picking, palletizing, and depalletizing. One of the most important pieces of information for object recognition is the object's contour (in most cases, the terms "edge," "boundary," and "contour" are used interchangeably). Numerous object recognition methods that use object contours have been developed, including learning-based and rule-based methods.
However, detected contours often have problems such as incomplete or redundant edges. Improving object recognition performance by identifying these problems and correcting the contours is therefore highly desirable. Current object contour detection methods, however, are evaluated at the pixel level, and research focuses on improving the detection methods themselves, for example by designing new neural networks, rather than on identifying and correcting contour problems.
In the development of a mixed stock keeping unit (SKU) depalletizer, problematic object contours cause recognition errors. For example, multiple objects may be mistakenly recognized as one, or one object may be mistakenly recognized as two. To address this, the example implementations described herein focus on how to evaluate the accuracy of the geometric shape of object contours.
In object recognition, the contours of detected objects (e.g., obtained through edge/boundary detection) do not always have the correct shape, yet the detected contours directly affect object recognition accuracy. Figure 1 provides two examples of detected object contours. In Figure 1, the solid black contours are the ground truth, and the dashed contours are the detected contours.
In the left image, the object may still be recognized correctly even though the detected contour contains redundant lines. In the right image, the central boundary is incomplete, which could cause two objects to be misrecognized as a single object. Although the two detected contours are similar in shape, they affect the object recognition result differently, so their assessed accuracy should differ: the contour on the left should be considered relatively good, and the contour on the right poor. A method for evaluating contours is therefore needed to determine whether a detected contour is correct. The geometric accuracy of an object's contour should be judged by whether that contour leads to correct object recognition.
Current pixel-level evaluation methods used for edge/boundary detection are not suitable for evaluating contours in object recognition. Figure 2 illustrates this: the pixel-level accuracy of a detected contour is not a suitable indicator of whether the contour leads to correct object recognition, because in object recognition the geometric accuracy of the contour matters more than its pixel-level accuracy. As shown in Figure 2, the pixel-level overlap between the detected contour (dashed line) and the ground truth (solid line) is not high, yet object recognition using the detected contour may still be correct. Other methods are therefore needed to evaluate whether correct object recognition is possible from a contour.
If the evaluation result is insufficient, the detected contour needs to be corrected to improve contour quality and object recognition accuracy. One example of correction, shown in Figure 3, is modifying the contour by interpolating the incomplete lines, which turns the contour in the left image into the contour in the right image. Note that the modification shown in Figure 3 is not always appropriate unless the goal is to create a convex polygon.
For example, applying the modification in Figure 3 to the object in the left image of Figure 1 would result in the object being mistakenly recognized as two separate objects. It is therefore important to determine the appropriate modification for the contour of each object.
The geometric shape of an object's contour contains important information for evaluating the contour and for designing methods to improve it. Indentations are one such geometric attribute and are used in the examples described herein. Indentation statistics are also useful for classifying objects, and object class information can in turn be used to design criteria for indentation evaluation and correction. Whether and how to modify the contour to obtain correct object recognition is decided based on the evaluation of the contour and its indentations.
The embodiments described herein include an object recognition system that uses a vision sensor to measure an object, uses the measured data to detect the object's contour, detects indentations in the object's contour, evaluates the correctness of each indentation based on at least one of th