KR-20260062768-A - JOINT OPTIMIZATION METHOD OF TEXTURE AND GEOMETRY ENHANCEMENT FOR HIGH-QUALITY 3D MESH CONTENT CREATION FROM A SINGLE IMAGE


Abstract

A joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image includes: generating a coarse 3D textured mesh from a single image using a pre-trained large reconstruction model; extracting a sparse vertex mesh from the coarse 3D textured mesh, extracting a normal map image from the sparse vertex mesh, and generating a high-density vertex mesh by performing continuous remeshing based on inverse rendering on the normal map image; and generating an enhanced 3D textured mesh with enhanced geometry and texture by applying a texture enhancement diffusion model to the high-density vertex mesh.

Inventors

조경은
박지선

Assignees

Agency for Defense Development (국방과학연구소)
Dongguk University Industry-Academic Cooperation Foundation (동국대학교 산학협력단)

Dates

Publication Date: 2026-05-07
Application Date: 2024-12-19
Priority Date: 2024-10-29

Claims (16)

A joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image, the method comprising: generating a coarse 3D textured mesh from a single image using a pre-trained large reconstruction model; extracting a sparse vertex mesh from the coarse 3D textured mesh, extracting a normal map image from the sparse vertex mesh, and generating a high-density vertex mesh by performing continuous remeshing based on inverse rendering on the normal map image; and generating an enhanced 3D textured mesh with enhanced geometry and texture by applying a texture enhancement diffusion model to the high-density vertex mesh.
The method of claim 1, further comprising generating multi-view normal images and multi-view RGB images from the single image using a multi-view diffusion model, wherein the large reconstruction model generates the coarse 3D textured mesh using the multi-view RGB images.
The method of claim 1, wherein extracting the normal map image from the sparse vertex mesh comprises: generating multi-view normal map images from the sparse vertex mesh using a depth-based normal map generation model; generating enhanced multi-view normal map images from the multi-view normal map images using a normal enhancement network; and generating enhanced multi-view alpha images from the enhanced multi-view normal map images using an alpha image generation model.
The method of claim 3, wherein generating the high-density vertex mesh by performing continuous remeshing based on inverse rendering on the normal map image comprises generating the high-density vertex mesh by performing continuous remeshing based on inverse rendering using the enhanced multi-view normal map images and the enhanced multi-view alpha images.
The method of claim 4, wherein generating the high-density vertex mesh comprises: computing, through inverse rendering, a loss between a normal map image rendered from the input vertex and face values of a 3D mesh and the enhanced multi-view normal map images; and optimizing the vertices and faces of the sparse vertex mesh through the loss function to generate the high-density vertex mesh.
The method of claim 1, wherein generating the enhanced 3D textured mesh comprises generating a low-resolution high-density mesh texture map by performing high-density mesh texture mapping using a low-resolution texture map extracted from the sparse vertex mesh, the high-density vertex mesh, and the coarse 3D textured mesh.
The method of claim 6, wherein generating the enhanced 3D textured mesh further comprises generating enhanced rendered images by inputting, into an RGB enhancement network, multi-view RGB images rendered at six fixed viewpoints from the high-density vertex mesh and the low-resolution high-density mesh texture map.
The method of claim 7, wherein generating the enhanced 3D textured mesh further comprises generating a high-resolution high-density mesh texture map by updating texture information through inverse texture mapping applied to the enhanced rendered images.
The method of claim 7, wherein the enhanced 3D textured mesh is generated by combining the high-resolution high-density mesh texture map and the high-density vertex mesh.
A joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image, the method comprising: extracting a sparse vertex mesh from a coarse 3D textured mesh generated from a single image; extracting a normal map image from the sparse vertex mesh; generating a high-density vertex mesh by performing continuous remeshing based on inverse rendering on the normal map image; and generating an enhanced 3D textured mesh with enhanced geometry and texture by applying a texture enhancement diffusion model to the high-density vertex mesh.
The method of claim 10, wherein extracting the normal map image from the sparse vertex mesh comprises: generating multi-view normal images from the sparse vertex mesh using a depth-based normal map generation model; generating enhanced multi-view normal images from the multi-view normal images using a normal enhancement network; and generating enhanced multi-view alpha images from the enhanced multi-view normal images using an alpha image generation model.
The method of claim 11, wherein generating the high-density vertex mesh by performing continuous remeshing based on inverse rendering on the normal map image comprises generating the high-density vertex mesh by performing continuous remeshing based on inverse rendering using the enhanced multi-view normal images and the enhanced multi-view alpha images.
The method of claim 12, wherein generating the high-density vertex mesh comprises: computing, through inverse rendering, a loss between a normal map image rendered from the input vertex and face values of a 3D mesh and the enhanced multi-view normal map images; and optimizing the vertices and faces of the sparse vertex mesh through the loss function to generate the high-density vertex mesh.
The method of claim 11, wherein generating the enhanced 3D textured mesh comprises generating a low-resolution high-density mesh texture map by performing high-density mesh texture mapping using a low-resolution texture map extracted from the sparse vertex mesh, the high-density vertex mesh, and the coarse 3D textured mesh.
The method of claim 14, wherein generating the enhanced 3D textured mesh further comprises generating enhanced rendered images by inputting, into an RGB enhancement network, multi-view RGB images rendered at six fixed viewpoints from the high-density vertex mesh and the low-resolution high-density mesh texture map.
The method of claim 14, wherein generating the enhanced 3D textured mesh further comprises: generating a high-resolution high-density mesh texture map by updating texture information through inverse texture mapping applied to the enhanced rendered images; and generating the enhanced 3D textured mesh by combining the high-resolution high-density mesh texture map and the high-density vertex mesh.
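
The normal-map steps recited in the claims (a depth-based normal map, followed by a loss between rendered and enhanced normal maps) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the claims do not specify how normals are derived from depth or which loss is used. `normals_from_depth` uses plain finite differences in place of the claimed generation model, and `masked_normal_loss` uses an alpha-weighted L1 difference as one plausible way the alpha images could restrict the loss to the object. Both function names are hypothetical.

```python
import numpy as np


def normals_from_depth(depth):
    """Estimate unit normals (H, W, 3) from a depth map (H, W).

    Assumption: the depth map is treated as a height field z(x, y) on a unit
    pixel grid, so the unnormalized normal is (-dz/dx, -dz/dy, 1).
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(np.asarray(depth, dtype=np.float64))
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def masked_normal_loss(rendered, target, alpha):
    """Alpha-weighted mean L1 difference between two sets of normal maps.

    rendered, target: arrays of shape (V, H, W, 3), one normal map per view.
    alpha: array of shape (V, H, W) in [0, 1]; pixels with alpha 0 are ignored.
    """
    per_pixel = np.abs(rendered - target).sum(axis=-1)  # (V, H, W)
    total_weight = max(float(alpha.sum()), 1e-8)        # avoid division by zero
    return float((per_pixel * alpha).sum() / total_weight)


# Usage: compare a flat surface with a ramp that rises one unit per pixel in x.
flat = normals_from_depth(np.zeros((4, 5)))
ramp = normals_from_depth(np.tile(np.arange(5.0), (4, 1)))
mask = np.ones((1, 4, 5))
print(masked_normal_loss(ramp[None], flat[None], mask))
```

In an actual inverse-rendering setup the rendered normals would come from a differentiable renderer so that the loss can be backpropagated to the mesh vertices; this sketch only shows the shape of the comparison.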

Description

Joint Optimization Method of Texture and Geometry Enhancement for High-Quality 3D Mesh Content Creation from a Single Image

The present invention relates to a joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image.

Technologies exist for generating 3D content from images, but existing techniques such as Structure from Motion (SfM), Multi-View Stereo (MVS), and Neural Radiance Fields (NeRF) require imagery captured from multiple viewpoints. Recently, research has been conducted on Large 3D Reconstruction Models (LRMs), which generate a three-dimensional (3D) mesh from a single image after large-scale training on images and 3D shapes. However, AI algorithms that estimate 3D shape and generate a mesh from a single image have difficulty achieving two goals at once: generalizing well enough to predict the shapes of diverse objects, and increasing the detail of the generated 3D objects.

To address this problem, existing methods train image-based 3D mesh generation networks for generalization using mesh information at a fixed 3D resolution. However, because the capacity of the network is limited, this approach is limited in its ability to represent complex shapes and fine details. Consequently, 3D mesh content generated with LRM technology suffers from low-quality 3D geometry and texture.

FIG. 1 is a block diagram showing the overall structure of a joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image according to an embodiment of the present invention.

FIG. 2 shows the structure of the geometry enhancement method within the joint optimization method according to an embodiment of the present invention.

FIG. 3 shows the structure of the texture enhancement method within the joint optimization method according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the attached drawings so that those skilled in the art can easily implement the present invention. The present invention may be embodied in various different forms and is not limited to the embodiments described herein. To explain the present invention clearly, parts unrelated to the explanation have been omitted, and the same reference numerals are used for identical or similar components throughout the specification. Furthermore, throughout the specification, when a part is described as "including" a certain component, this means that, unless specifically stated otherwise, other components are not excluded and may additionally be included.

Referring to FIG. 1, the joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image according to an embodiment of the present invention can perform geometry enhancement, which refines meshes of complex shapes without depending on a fixed 3D mesh resolution, by applying a continuous remeshing technique based on inverse rendering.
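
The idea behind inverse-rendering-based geometry enhancement (adjusting mesh vertices until normals rendered from the mesh match target normals) can be sketched with a toy example. This is not the method of the embodiment: per-face normals stand in for a rendered normal map, central finite differences stand in for a differentiable renderer, and the edge split, collapse, and flip operations of continuous remeshing are omitted entirely. All function names, the two-triangle mesh, and the target normals are hypothetical.

```python
import numpy as np


def face_normals(vertices, faces):
    """Unit normal of each triangle; a stand-in for a rendered normal map."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n, axis=1, keepdims=True)


def normal_loss(vertices, faces, target):
    """Mean squared distance between current and target face normals."""
    diff = face_normals(vertices, faces) - target
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def numerical_grad(vertices, faces, target, eps=1e-5):
    """Central-difference gradient of the loss with respect to every coordinate."""
    grad = np.zeros_like(vertices)
    for idx in np.ndindex(vertices.shape):
        step = np.zeros_like(vertices)
        step[idx] = eps
        grad[idx] = (normal_loss(vertices + step, faces, target)
                     - normal_loss(vertices - step, faces, target)) / (2 * eps)
    return grad


def optimize_vertices(vertices, faces, target, steps=200, lr=0.5):
    """Gradient descent with backtracking, so the loss never increases."""
    v = np.asarray(vertices, dtype=np.float64).copy()
    for _ in range(steps):
        g = numerical_grad(v, faces, target)
        current = normal_loss(v, faces, target)
        step = lr
        # Halve the step until the candidate update does not raise the loss.
        while step > 1e-8 and normal_loss(v - step * g, faces, target) > current:
            step *= 0.5
        if step <= 1e-8:
            break
        v = v - step * g
    return v


# A flat unit square made of two triangles, and target normals for a slight fold.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float64)
faces = np.array([[0, 1, 2], [0, 2, 3]])
target = np.array([[0.3, -0.3, 1.0], [-0.3, 0.3, 1.0]])
target /= np.linalg.norm(target, axis=1, keepdims=True)

refined = optimize_vertices(vertices, faces, target)
print(normal_loss(vertices, faces, target), normal_loss(refined, faces, target))
```

A real pipeline would replace the finite-difference gradient with automatic differentiation through a rasterizer and would interleave these vertex updates with remeshing operations that add or remove vertices where the surface needs more or less detail.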
In addition, the joint optimization method according to an embodiment of the present invention can perform texture enhancement, which optimizes the fine detail of the remeshed mesh's texture through a texture enhancement diffusion model.

The joint optimization method of texture and geometry enhancement for creating high-quality 3D mesh content from a single image according to an embodiment of the present invention can be implemented as a computer device, a computer program executable on the computer device, and a storage medium storing the computer program.

More specifically, the joint optimization method according to an embodiment of the present invention may include a coarse 3D mesh reconstruction step (S10), a geometry enhancement step (S20), and a texture enhancement step (S30).

First, the coarse 3D mesh reconstruction step (S10) is described. In the coarse 3D mesh reconstruction step (S10), a coarse 3D textured mesh can be generated from a single image (RGB) through a pre-trained large 3D reconstruction model (3D LRM) (13). The generated coarse 3D textured mesh may