KR-102961310-B1 - Method and device for object detection in point cloud based on feature extraction method robust to outliers

KR 102961310 B1

Abstract

A method for detecting objects in a point cloud performed by a processor is disclosed. The method for detecting objects in a point cloud comprises the steps of: dividing an XY plane in a 3D point cloud into a plurality of grids to form vertical columns along the Z-axis; applying LiDAR points included in each of the vertical columns to a first neural network to generate point feature vectors; calculating attention scores for the point feature vectors; and detecting objects in the 3D point cloud according to the calculated attention scores.
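The first step the abstract describes is pillarization: the XY plane is divided into a grid, and every LiDAR point is assigned to the vertical column (pillar) above its grid cell. The sketch below illustrates that grouping with numpy; the function name, grid bounds, and cell size are illustrative assumptions, not values from the patent.

```python
import numpy as np

def pillarize(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), grid_size=0.5):
    """Assign each LiDAR point (x, y, z, ...) to a vertical pillar on the
    XY grid; pillars extend along the full Z-axis (illustrative sketch)."""
    xs, ys = points[:, 0], points[:, 1]
    # Keep only points inside the grid bounds.
    mask = (xs >= x_range[0]) & (xs < x_range[1]) & \
           (ys >= y_range[0]) & (ys < y_range[1])
    pts = points[mask]
    # Integer cell indices on the XY plane.
    ix = ((pts[:, 0] - x_range[0]) // grid_size).astype(int)
    iy = ((pts[:, 1] - y_range[0]) // grid_size).astype(int)
    n_y = int(round((y_range[1] - y_range[0]) / grid_size))
    pillar_id = ix * n_y + iy  # flat index of the pillar on the XY grid
    # Collect the points that fall into each non-empty pillar.
    pillars = {}
    for pid, p in zip(pillar_id, pts):
        pillars.setdefault(int(pid), []).append(p)
    return {pid: np.stack(ps) for pid, ps in pillars.items()}
```

Each non-empty pillar's points would then be fed to the first neural network (per claim 2, a multi-layer perceptron) to obtain one k-dimensional feature vector per point.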

Inventors

  • 천창환
  • 신승환

Assignees

  • (주)뷰런테크놀로지

Dates

Publication Date
2026-05-08
Application Date
2025-07-29

Claims (8)

  1. A method for detecting objects in a point cloud, performed by a processor, the method comprising: dividing an XY plane of a 3D point cloud into a plurality of grids to form vertical columns along the Z-axis; applying the LiDAR points included in each of the vertical columns to a first neural network to generate point feature vectors each having k dimensions; calculating attention scores for the point feature vectors; and detecting objects in the 3D point cloud according to the calculated attention scores, wherein calculating the attention scores for the point feature vectors comprises: dividing the k-dimensional point feature vectors of n points into M groups; applying the n point feature vectors of k/M dimensions, divided into the M groups, to a convolution block to output n compressed feature vectors divided into the M groups; and calculating attention scores for the n compressed feature vectors divided into the M groups.
  2. The method of claim 1, wherein the first neural network is a multi-layer perceptron.
  3. The method of claim 1, wherein calculating the attention scores for the n compressed feature vectors divided into the M groups comprises applying the n compressed feature vectors divided into the M groups to a softmax function to calculate the attention scores.
  4. The method of claim 1, wherein detecting objects in the 3D point cloud according to the calculated attention scores comprises: generating group-wise weighted sums from the n point feature vectors of k/M dimensions divided into the M groups and the attention scores divided into the M groups; performing a concatenation operation that combines the weighted sums to output a single feature vector; and detecting objects in the 3D point cloud using the single feature vector.
  5. A computing device comprising: a processor that executes instructions for object detection in a point cloud; and a memory storing the instructions, wherein the instructions are implemented to: divide an XY plane of a 3D point cloud into a plurality of grids to form pillars along the Z-axis; apply the LiDAR points included in each of the pillars to a first neural network to generate point feature vectors having k dimensions; calculate attention scores for the point feature vectors; and detect objects in the 3D point cloud according to the calculated attention scores, and wherein the instructions for calculating the attention scores for the point feature vectors are implemented to: divide the k-dimensional point feature vectors of n points into M groups; apply the n point feature vectors of k/M dimensions, divided into the M groups, to a convolution block to output n compressed feature vectors divided into the M groups; and calculate attention scores for the n compressed feature vectors divided into the M groups.
  6. The computing device of claim 5, wherein the first neural network is a multi-layer perceptron.
  7. The computing device of claim 5, wherein the instructions for calculating the attention scores for the n compressed feature vectors divided into the M groups are implemented to calculate the attention scores by applying a softmax function to the n compressed feature vectors divided into the M groups.
  8. The computing device of claim 5, wherein the instructions for detecting objects in the 3D point cloud according to the calculated attention scores are implemented to: generate group-wise weighted sums from the n point feature vectors of k/M dimensions divided into the M groups and the attention scores divided into the M groups; perform a concatenation operation that combines the weighted sums to output a single feature vector; and detect objects in the 3D point cloud using the single feature vector.
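Claims 1, 3, and 4 together describe a grouped attention pooling: the k-dimensional features of a pillar's n points are split into M channel groups, each group is compressed, attention scores are obtained by softmax over the n points, and the group-wise weighted sums are concatenated into a single feature vector. The numpy sketch below illustrates that flow; the scalar projection standing in for the convolution block, the random weights, and all names are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def grouped_attention_pool(feats, M, rng=None):
    """feats: (n, k) point feature vectors of one pillar.
    Splits channels into M groups, compresses each point's group features
    to one score (a stand-in for the claimed convolution block), softmax-
    normalizes the scores over the n points, and concatenates the M
    group-wise weighted sums into a single (k,) feature vector."""
    n, k = feats.shape
    assert k % M == 0, "k must be divisible by M"
    rng = np.random.default_rng(0) if rng is None else rng
    groups = feats.reshape(n, M, k // M)              # (n, M, k/M)
    # Stand-in for the conv block: one learned projection per group.
    w = rng.standard_normal((M, k // M))
    compressed = np.einsum('nmd,md->nm', groups, w)   # (n, M) compressed
    scores = softmax(compressed, axis=0)              # attention over points
    pooled = np.einsum('nm,nmd->md', scores, groups)  # (M, k/M) weighted sums
    return pooled.reshape(k)                          # concatenated vector
```

Because the softmax weights sum to one over the n points, each pooled channel is a convex combination of the point features, which keeps outlier points from dominating the pillar representation, in line with the stated robustness goal.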

Description

Method and device for object detection in point cloud based on feature extraction method robust to outliers

The present invention relates to a method and apparatus for object detection in a point cloud, and more specifically, to an object detection method and apparatus based on a feature extraction method robust to outliers.

LiDAR-based object detection technology is used in the field of autonomous driving to recognize objects in the surrounding environment in real time and to estimate their location and size. To this end, processing is required to accurately extract information regarding the presence and boundaries of objects from point clouds. However, since point clouds have irregular distributions and low density, an effective feature extraction method is essential.

Detailed descriptions of each drawing are provided to aid a fuller understanding of the drawings cited in the detailed description of the present invention. Figure 1 shows a block diagram of a LiDAR system according to an embodiment of the present invention. Figure 2 shows a structural diagram for detecting objects in a point cloud according to an embodiment of the present invention. Figure 3 shows a detailed structural diagram of the pillar extraction layer illustrated in Figure 2. Figure 4 shows a flowchart explaining an object detection method in a point cloud according to an embodiment of the present invention. Figure 5 shows a flowchart explaining in detail the operation of outputting a feature vector for each of the pillars illustrated in Figure 4.

Specific structural or functional descriptions of embodiments according to the concept of the present invention disclosed herein are provided merely for the purpose of explaining those embodiments; embodiments according to the concept of the present invention may be implemented in various forms and are not limited to the embodiments described herein.
Embodiments according to the concept of the present invention may be subject to various modifications and may take various forms; therefore, embodiments are illustrated in the drawings and described in detail in this specification. However, this is not intended to limit the embodiments according to the concept of the present invention to the specific disclosed forms, and they include all modifications, equivalents, or substitutions that fall within the spirit and scope of the present invention.

Terms such as "first" or "second" may be used to describe various components, but the components should not be limited by these terms. The terms serve only to distinguish one component from another; for example, without departing from the scope of rights according to the concept of the present invention, a first component may be named a second component, and similarly, a second component may be named a first component.

When one component is said to be "connected" or "coupled" to another component, it should be understood that while it may be directly connected or coupled to that other component, other components may also exist in between. Conversely, when one component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other components exist in between. Other expressions describing the relationship between components, such as "between" and "directly between," or "adjacent to" and "directly adjacent to," should be interpreted in the same way.

The terms used herein are used merely to describe specific embodiments and are not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.
In this specification, terms such as "comprising" or "having" are intended to indicate the existence of the described features, numbers, steps, actions, components, parts, or combinations thereof, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meanings as generally understood by those skilled in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant technology, and should not be interpreted in an idealized or overly formal sense unless explicitly so defined in this specification.

Hereinafter, the present invention will be described in detail by explaining preferred embodiments with reference to the attached drawings.

Figure 1 shows a block diagram of a LiDAR system according to an embodiment of the present invention. Referring to Figure 1, the LiDAR system (100) is a system capable of detecting an object (30) and tracking the object (30). The object (30) is