JP-7855258-B1 - Three-dimensional object detection method and apparatus

Abstract

[Problem] To provide a method and apparatus for detecting three-dimensional objects. [Solution] A three-dimensional object detection device according to an embodiment of the present disclosure includes: a sensor that captures a plurality of two-dimensional images of an external object; a processor that detects the external object based on the plurality of two-dimensional images, generates a plurality of two-dimensional bounding boxes for the external object, and obtains a plurality of three-dimensional bounding boxes for the external object based on the plurality of two-dimensional bounding boxes; and a display that externally displays a final three-dimensional bounding box from among the plurality of three-dimensional bounding boxes. [Selection Diagram] Figure 1

Inventors

ナム，ウンソン
チュン，ギチョル
シン，グァンチョル
イ，ギフン

Assignees

シサーンエーアイカンパニーリミテッド

Dates

Publication Date: 20260508
Application Date: 20241218

Claims (10)

1. A three-dimensional object detection device comprising: a sensor that captures a plurality of two-dimensional images of an external object; a processor that detects the external object based on the plurality of two-dimensional images, generates a plurality of two-dimensional bounding boxes for the external object, and obtains a plurality of three-dimensional bounding boxes for the external object based on the plurality of two-dimensional bounding boxes; and a display that externally displays a final three-dimensional bounding box from among the plurality of three-dimensional bounding boxes, wherein the sensor includes a plurality of cameras that capture the plurality of two-dimensional images by X-ray imaging, each of the plurality of cameras captures the plurality of two-dimensional images by a fan-beam imaging method, imaging the external object one row at a time as the external object moves along the z-axis, the width direction of each of the plurality of two-dimensional bounding boxes corresponds to the z-axis direction, and the processor (i) projects each of the plurality of two-dimensional bounding boxes onto voxels of the xy plane of a three-dimensional voxel space; (ii) obtains three-dimensional voxels for the external object by selecting, from the voxels of the xy plane, only the voxels that belong to all of the plurality of two-dimensional bounding boxes; (iii) generates the plurality of three-dimensional bounding boxes using the selected three-dimensional voxels; and (iv) sets the width value of the plurality of two-dimensional bounding boxes as the z-axis value of each of the plurality of three-dimensional bounding boxes.
2. The three-dimensional object detection device according to claim 1, wherein the processor removes false positives from the plurality of three-dimensional bounding boxes to obtain the final three-dimensional bounding box.
3. The three-dimensional object detection device according to claim 2, wherein the processor projects each of the plurality of three-dimensional bounding boxes onto a plurality of two-dimensional bounding boxes, compares the projected plurality of two-dimensional bounding boxes with the plurality of previously generated two-dimensional bounding boxes, selects a final two-dimensional bounding box that minimizes the error with respect to the plurality of previously generated two-dimensional bounding boxes, and generates the final three-dimensional bounding box using the selected final two-dimensional bounding box.
4. The three-dimensional object detection device according to claim 1, wherein each of the plurality of cameras captures the plurality of two-dimensional images in a state in which geometric calibration has been performed.
5. The three-dimensional object detection device according to claim 4, wherein the processor generates the plurality of three-dimensional bounding boxes from each of the plurality of two-dimensional bounding boxes based on calibration information for the plurality of cameras.
6. A method by which a three-dimensional object detection device detects a three-dimensional object, the method comprising the steps of: capturing a plurality of two-dimensional images of an external object; detecting the external object based on the plurality of two-dimensional images; generating a plurality of two-dimensional bounding boxes for the external object; obtaining a plurality of three-dimensional bounding boxes for the external object based on the plurality of two-dimensional bounding boxes; externally displaying a final three-dimensional bounding box from among the plurality of three-dimensional bounding boxes; capturing the plurality of two-dimensional images by X-ray imaging with a plurality of cameras included in the three-dimensional object detection device; capturing the plurality of two-dimensional images by a fan-beam imaging method, imaging the external object one row at a time while the external object moves along the z-axis; projecting each of the plurality of two-dimensional bounding boxes onto voxels of the xy plane of a three-dimensional voxel space; obtaining three-dimensional voxels for the external object by selecting, from the voxels of the xy plane, only the voxels that belong to all of the plurality of two-dimensional bounding boxes; generating the plurality of three-dimensional bounding boxes using the selected three-dimensional voxels; and setting the width values of the plurality of two-dimensional bounding boxes as the respective z-axis values of the plurality of three-dimensional bounding boxes, wherein the width direction of each of the plurality of two-dimensional bounding boxes corresponds to the z-axis direction.
7. The method according to claim 6, further comprising the step of removing false positives from the plurality of three-dimensional bounding boxes to obtain the final three-dimensional bounding box.
8. The method according to claim 7, further comprising the steps of: projecting each of the plurality of three-dimensional bounding boxes onto a plurality of two-dimensional bounding boxes; comparing the projected plurality of two-dimensional bounding boxes with the plurality of previously generated two-dimensional bounding boxes; selecting a final two-dimensional bounding box that minimizes the error with respect to the plurality of previously generated two-dimensional bounding boxes; and generating the final three-dimensional bounding box using the selected final two-dimensional bounding box.
9. The method according to claim 6, further comprising the step of capturing the plurality of two-dimensional images with the plurality of cameras in a state in which geometric calibration has been performed.
10. The method according to claim 9, further comprising the step of generating the plurality of three-dimensional bounding boxes from each of the plurality of two-dimensional bounding boxes based on calibration information for the plurality of cameras.
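Steps (i) to (iv) of claims 1 and 6 can be illustrated with a minimal Python sketch for a single object. The claims do not specify the projection geometry, so the sketch assumes a simplified fan-beam model in which each camera is a point source in the xy plane and the non-z extent of each two-dimensional bounding box is an interval of ray angles. The function names (fan_angle, carve_xy, box_3d), the source positions, and the grid are hypothetical and chosen only for illustration; a real device would use its own calibration information instead.

```python
import numpy as np

def fan_angle(source, xs, ys):
    """Angle of the ray from a point source to each (x, y) cell center.
    Assumed stand-in for the calibrated fan-beam projection."""
    return np.arctan2(ys - source[1], xs - source[0])

def carve_xy(views, grid_x, grid_y):
    """Steps (i)-(ii): project every 2D box onto the xy cells and keep
    only the cells that fall inside all of the boxes."""
    xs, ys = np.meshgrid(grid_x, grid_y, indexing="xy")
    keep = np.ones(xs.shape, dtype=bool)
    for source, (u_min, u_max) in views:
        u = fan_angle(source, xs, ys)
        keep &= (u >= u_min) & (u <= u_max)
    return xs, ys, keep

def box_3d(views, z_range, grid_x, grid_y):
    """Steps (iii)-(iv): bound the selected cells in x and y, and take the
    z extent directly from the width of the 2D boxes (z_range)."""
    xs, ys, keep = carve_xy(views, grid_x, grid_y)
    if not keep.any():
        return None  # the boxes do not overlap anywhere on the grid
    z_min, z_max = z_range
    return (float(xs[keep].min()), float(ys[keep].min()), float(z_min),
            float(xs[keep].max()), float(ys[keep].max()), float(z_max))

# Two hypothetical sources, one below and one to the left of the object.
# Each view is (source position, (min ray angle, max ray angle)) in radians.
views = [((0.0, -10.0), (1.45, 1.69)),
         ((-10.0, 0.0), (-0.12, 0.12))]
grid = np.linspace(-5.0, 5.0, 21)          # 0.5-unit cells on each axis
print(box_3d(views, (3.0, 7.5), grid, grid))
```

The intersection is computed only in the xy plane because, under the claimed arrangement, the width of every two-dimensional box already lies along the z-axis and can be copied into the three-dimensional box without further projection.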

Description

This disclosure relates to a three-dimensional object detection method and apparatus that use multi-view images acquired with X-ray computed tomography (CT) equipment.
Three-dimensional reconstruction technology is used in a variety of fields, including medical imaging, X-ray object detection, autonomous driving and robotics, gaming and VR/AR, architecture and urban planning, and film and visual effects. In particular, X-ray-based baggage (object) detection technology has greatly improved the accuracy and efficiency of security screening at airports and ports. However, conventional X-ray equipment for baggage detection provides baggage information as a two-dimensional image, and a two-dimensional image alone is limited in how accurately the shape and position of an object can be determined.
[Brief description of the drawings]
A flowchart showing a three-dimensional object detection method according to an embodiment of this disclosure.
A block diagram showing the configuration of an object detection device according to an embodiment of this disclosure.
A figure showing an example of an image acquisition method according to an embodiment of this disclosure.
A figure showing the two-dimensional bounding box detection process according to an embodiment of this disclosure.
A figure showing the process by which an object detection device according to an embodiment of this disclosure performs three-dimensional voxelization of an object.
A figure showing the process by which an object detection device according to an embodiment of this disclosure performs three-dimensional voxelization of an object.
A figure showing a three-dimensional bounding box reconstructed according to an embodiment of this disclosure.
A figure showing a three-dimensional bounding box reconstructed according to an embodiment of this disclosure.
A figure showing the process by which an object detection device according to an embodiment of this disclosure visualizes a three-dimensional bounding box.
[Explanation of terms used in this specification] All embodiments described below are illustrative and provided to aid in understanding this disclosure, and they can be modified and implemented in various forms different from those described herein. In describing this disclosure, a specific description of a relevant known function or component is omitted if it is determined that the description would unnecessarily obscure the essence of this disclosure. To aid understanding, the attached drawings are not drawn to actual scale, and the dimensions of some components may be exaggerated. Where reference numbers are assigned to components, the same component is given the same reference number wherever possible, even when it appears in different drawings. Furthermore, when describing the components of the embodiments of this disclosure, terms such as "first," "second," "A," "B," "(a)," and "(b)" may be used. These terms merely distinguish one component from another and do not limit the nature, order, or sequence of the component. When a component is stated to be "connected" or "joined" to another component, it should be understood that the component may be directly connected or joined to the other component, but that a further component may also be "connected" or "joined" between the two. Therefore, the embodiments and configurations shown in the drawings described herein are merely the most preferred embodiments of this disclosure and do not represent all of its technical ideas; various variations of this disclosure are possible.
Furthermore, terms or words used in this specification and claims should not be limited to their ordinary or dictionary meanings, but should be interpreted with meanings and concepts consistent with the technical idea of this disclosure, in accordance with the principle that inventors may appropriately define the concepts of terms in order to best describe their disclosure. Singular expressions used in this application include plural expressions unless the context clearly indicates otherwise.
[Three-dimensional object detection method: Figure 1] Figure 1 is a flowchart showing a three-dimensional object detection method according to an embodiment of the present disclosure. As shown in Figure 1, the three-dimensional object detection method (S100) includes steps S110, S120, S130, S140, S150, S151, and S160, each of which is explained below. First, the object detection device according to the embodiment of this disclosure (described later with reference to Figure 2) captures multiple images of an external object using multiple cameras (S110). The object detection device can capture the images using a fan-beam X-ray imaging method, imaging the object one row at a time. The object may include a hazardous article (e.g., a gun, sword, scissors, or