KR-102961597-B1 - MAP GENERATION MODEL BUILDING APPARATUS AND MAP GENERATION APPARATUS USING THE SAME

KR 102961597 B1

Abstract

A map generation model building apparatus according to the present invention comprises: a memory in which a map generation model building program is stored; and a processor that executes the map generation model building program. The map generation model building program generates embedding data by applying a captured image, taken by a moving device moving through a learning space, to an encoder module; generates spatial map data by recording the embedding data in map base data based on the position information of the moving device; generates a rendered image from the position information of the moving device on the spatial map data using a decoder module; and trains the map generation model by updating the encoder module and the decoder module based on a comparison of the rendered image and the captured image through a loss function. The map base data includes a plurality of grids in which the embedding data is recorded, the embedding data includes RGB information and depth information for each pixel of the captured image, and the encoder module records the embedding data in the plurality of grids of the map base data.
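The core recording step of the abstract, writing per-pixel embeddings into the grids of the map base data and keeping a per-grid average, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names (`depth_to_cell`, `record_embeddings`), the single-ray projection model, and the cell size are all assumptions for the sketch.

```python
import math

def depth_to_cell(depth, pose, cell_size=0.5):
    """Project a measured depth into map coordinates along the device's
    heading and quantize to a grid index. A real encoder would use the
    full camera model per pixel; this single-ray version is an
    illustrative simplification."""
    x0, y0, theta = pose                      # device position and heading
    x = x0 + depth * math.cos(theta)
    y = y0 + depth * math.sin(theta)
    return (int(x // cell_size), int(y // cell_size))

def record_embeddings(grid, counts, observations):
    """Record per-pixel embeddings (e.g. RGB + depth features) into grid
    cells, keeping a running mean: the value stored in a cell is the
    average of everything recorded there so far."""
    for cell, emb in observations:
        counts[cell] = counts.get(cell, 0) + 1
        n = counts[cell]
        old = grid.get(cell, [0.0] * len(emb))
        # Incremental mean: new = old + (sample - old) / n
        grid[cell] = [o + (e - o) / n for o, e in zip(old, emb)]
    return grid, counts
```

For example, recording `[1.0, 2.0]` and then `[3.0, 4.0]` into the same cell leaves the cell holding their average `[2.0, 3.0]`, which mirrors how later observations of a location refine, rather than overwrite, earlier ones.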

Inventors

  • 오성회
  • 권오빈
  • 박정호

Assignees

  • 서울대학교산학협력단 (Seoul National University R&DB Foundation)

Dates

Publication Date
2026-05-07
Application Date
2023-08-07
Priority Date
2023-05-02

Claims (14)

  1. A map generation model building apparatus for building a map generation model that generates map data in which image data is recorded, the apparatus comprising: a memory in which a map generation model building program is stored; and a processor that executes the map generation model building program, wherein the map generation model building program: generates embedding data by applying a captured image, taken by a moving device moving through a target space, to an encoder module; generates spatial map data in which the embedding data is recorded for each location by recording the embedding data of the captured image in map base data based on the position information of the moving device at the time the captured image was taken; generates a rendered image from the embedding data corresponding to the position information of the moving device on the spatial map data using a decoder module; and trains the map generation model by updating the encoder module and the decoder module based on a comparison of the rendered image and the captured image through a loss function, wherein the map base data includes a plurality of grids in which the embedding data is recorded, the embedding data includes RGB information for each pixel of the captured image and depth information of the captured image, and the encoder module records the embedding data in the plurality of grids of the map base data based on the position information of the moving device.
  2. The apparatus of claim 1, wherein the encoder module is trained to set the center point of the map base data as the initial position of the moving device.
  3. The apparatus of claim 2, wherein the encoder module is trained to generate the embedding data by calculating the 3D coordinates of each pixel relative to the initial position using the position information of the moving device and the depth information of the captured image.
  4. The apparatus of claim 3, wherein the encoder module is trained to generate the spatial map data by recording, in each grid, the average of the embedding data of the multiple pixels that fall into the same grid according to their 3D coordinates.
  5. The apparatus of claim 4, wherein the encoder module is trained to update the spatial map data by recording, in a grid, the average of first embedding data already recorded in the grid and second embedding data to be recorded in the grid.
  6. The apparatus of claim 1, wherein the map generation model building program generates target map data by recording a target image in the map base data using the encoder module, and trains the map generation model to estimate the location corresponding to the target image on the spatial map data by matching the target map data with the spatial map data using a location estimation module.
  7. The apparatus of claim 6, wherein the encoder module is trained to generate the target map data by recording the target image in the map base data relative to the center point of the map base data.
  8. The apparatus of claim 6, wherein the location estimation module is trained to search, through matching of embedding data between the target map data and the spatial map data, for an expected location in the spatial map data where the moving device may be located and an expected direction in which the moving device faces at the expected location.
  9. The apparatus of claim 1, wherein the map generation model building program trains the map generation model to correct the position information of the moving device by using a position correction module to compare, through a loss function, a rendered image generated by the decoder module from the position information of the moving device on the spatial map data with a captured image taken by the moving device.
  10. A map generating apparatus comprising: a memory in which a map generation program is stored; and a processor that executes the map generation program, wherein the map generation program receives a captured image from a moving device that moves while capturing a target space, and generates spatial map data for the target space by applying the captured image and the position information of the moving device at the time of capture to a map generation model, wherein the map generation model is machine-trained, through an encoder module and a decoder module, to generate spatial map data recording RGB information of objects included in the captured image and depth information of the captured image, the encoder module generates embedding data including RGB information for each pixel of the captured image and depth information of the captured image, and records the embedding data of the captured image in a plurality of grids of map base data based on the position information of the moving device at the time the captured image was taken, thereby generating spatial map data in which the embedding data is recorded for each location, and the decoder module generates a rendered image by rendering the embedding data recorded in the spatial map data based on the position information of the moving device on the spatial map data.
  11. The apparatus of claim 10, wherein the map generation program estimates the location corresponding to a target image on the spatial map data by applying the target image to the map generation model, the map generation model is machine-trained, through the encoder module and a location estimation module, to estimate the location corresponding to the target image on the spatial map data, the encoder module generates target map data by recording the target image in the map base data, and the location estimation module estimates the location corresponding to the target image on the spatial map data by matching the target map data with the spatial map data.
  12. The apparatus of claim 11, wherein the encoder module generates the target map data by recording the target image in the map base data relative to the center point of the map base data, and the location estimation module searches, through matching of the embedding data of the target map data and the spatial map data, for an expected location in the spatial map data where the moving device may be located and an expected direction in which the moving device faces at the expected location.
  13. The apparatus of claim 12, wherein the map generation program outputs the expected location onto the spatial map data, and transmits information on the expected location and the expected direction to a control unit of the moving device to move the moving device to the expected location.
  14. The apparatus of claim 10, wherein the map generation program corrects the position information of the moving device on the spatial map data through the map generation model, the map generation model is machine-trained, through the decoder module and a position correction module, to correct the position information of the moving device on the spatial map data, and the position correction module corrects the position information of the moving device by comparing, through a loss function, a rendered image generated by the decoder module from the position information of the moving device on the spatial map data with a captured image taken by the moving device.
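The location estimation described in claims 8 and 12, matching the embeddings of a target map against the spatial map to find where the device is, can be illustrated with a brute-force search. This is a didactic sketch under stated assumptions: the patent trains a module to perform the matching, whereas this version exhaustively scores candidate grid offsets with a squared-error distance; `match_location` and the candidate list are hypothetical names, and the claimed search over the expected viewing direction (which would additionally rotate the target grid before comparison) is omitted for brevity.

```python
def match_location(spatial_map, target_map, candidates):
    """Estimate the grid offset at which the target map's embeddings best
    align with the spatial map. Both maps are dicts mapping grid cells
    (x, y) to embedding vectors; candidates is a list of (dx, dy) offsets
    to try. Returns the offset with the lowest mean squared embedding
    distance over overlapping cells."""
    best, best_err = None, float("inf")
    for dx, dy in candidates:
        err, n = 0.0, 0
        for (cx, cy), emb in target_map.items():
            cell = (cx + dx, cy + dy)   # shift target grid onto the map
            if cell in spatial_map:
                err += sum((a - b) ** 2
                           for a, b in zip(spatial_map[cell], emb))
                n += 1
        if n and err / n < best_err:
            best, best_err = (dx, dy), err / n
    return best
```

For instance, if the spatial map holds embedding `[2.0]` at cell `(1, 0)` and the target map holds `[2.0]` at its own origin, the search returns the offset `(1, 0)`, i.e. the expected location of the device on the spatial map.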

Description

Map generation model building apparatus and map generation apparatus using the same

The present invention relates to an apparatus for generating map data in which image data is recorded. Mobile robots are being used in a growing range of fields thanks to advances in sensors and controllers. Representative examples include robotic vacuum cleaners in the home, service robots in public spaces, conveyor robots at production sites, and worker-support robots, and both the range of applications and the demand for mobile robots are expected to increase explosively. Such mobile robots recognize their own location without prior information about the surrounding environment and build maps that record information about that environment. The map-building and localization processes are performed together and interdependently, which is called Simultaneous Localization and Mapping (SLAM). To build maps, mobile robots acquire information about their surroundings, and a Time-of-Flight (TOF) camera can be used for this purpose. A TOF camera acquires three-dimensional distance information by measuring the time it takes for infrared light emitted from a light-emitting part to be reflected by an object and return to a light-receiving part. Since three-dimensional distance information can be computed from infrared intensity images without a separate, complex calculation step, it can be acquired in real time. Grid-based mapping has been used as a method for mobile robots to build maps of surrounding-environment information with such TOF cameras: the surrounding environment is divided into small 3D grids, and each grid is filled with environment information based on the robot's current position and the TOF camera's 3D distance information.
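The classical grid update referred to here is usually implemented as a Bayesian log-odds filter per cell: each TOF measurement either raises or lowers the cell's occupancy probability. The sketch below is a generic textbook formulation, not taken from the patent; the inverse-sensor-model values `p_hit` and `p_miss` are illustrative assumptions.

```python
import math

def logodds(p):
    """Convert a probability to log-odds form, which makes the Bayesian
    update a simple addition."""
    return math.log(p / (1.0 - p))

def update_cell(l_prior, hit, p_hit=0.7, p_miss=0.4):
    """One occupancy update for a single grid cell: add the log-odds of
    the sensor model for a 'hit' (beam endpoint in the cell) or a 'miss'
    (beam passed through the cell)."""
    return l_prior + (logodds(p_hit) if hit else logodds(p_miss))

def probability(l):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))
```

Starting from the uninformed prior (log-odds 0, probability 0.5), a few consecutive hits drive the cell's occupancy probability well above 0.5, while misses drive it down; the accumulating-pose-error problem described next is precisely that hits get added to the wrong cells as the robot's position estimate drifts.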
In other words, a 3D grid map is built by probabilistically recording, for each grid cell in 3D space, the probability that an object occupies it. However, this grid-based method suffers from decreasing map accuracy as the robot's driving distance or the size of the target space grows, because accumulating error in the robot's position information makes the data registered in the grids inaccurate. Moreover, conventional map generation methods store only the overall structure of the environment or the presence or absence of obstacles, and visual information such as the shapes of observed objects is mostly represented as sparse graphs because of capacity constraints. Conventional methods that can render environment information as images on 2D or 3D maps, in turn, focus on obtaining high-resolution images, which makes data processing slow and memory-intensive and thus difficult to apply to real tasks. Technology capable of overcoming these limitations is therefore required.

FIG. 1 is a conceptual diagram of a map generation model building apparatus according to one embodiment of the present invention. FIG. 2 is a block diagram schematically showing the configuration of a map generation model. FIG. 3 is an exemplary diagram schematically illustrating the training process for generating map data through an encoder module and a decoder module. FIG. 4 is an exemplary diagram schematically illustrating the training process for estimating the expected location corresponding to a target image through a location estimation module. FIG. 5 is an exemplary diagram schematically illustrating the training process for correcting the position information of a moving device through a position correction module. FIG. 6 is a conceptual diagram schematically showing a map generating apparatus according to an embodiment of the present invention. FIG. 7 is a conceptual diagram schematically illustrating the operation of a map generation program. FIGS. 8 to 11 are exemplary diagrams for explaining the operation of a map generating apparatus.

The present invention will be described in detail below with reference to the attached drawings. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. The attached drawings are intended only to facilitate understanding of the embodiments disclosed in this specification, and the technical concept disclosed herein is not limited by them. In order to explain the present invention clearly, parts unrelated to the description have been omitted from the drawings, and the size, form, and shape of each component shown in the drawings may vary. Identical or similar parts throughout the specification are denoted by identical or