US-20260127818-A1 - Method of Generating Three-Dimensional Model from Single Image
Abstract
A method of generating a three-dimensional model from a single image is disclosed. The method includes the following steps: inputting a plurality of two-dimensional images containing an assembled product, conducting a manual annotation process on product components to establish a component data set; training the images in the component data set and establishing a semantic segmentation network model, in which the semantic segmentation network model converts the graphic characteristics of the graphic data into component images; inputting the image to be converted, identifying the product category of the image to be converted, and selecting the corresponding component data set; conducting component segmentation by the semantic segmentation network model and separating the image to be converted into multiple components; and combining the multiple components by the geometry information and object description in the description file to form a three-dimensional product model.
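As an orientation aid, the abstract can be read as a four-stage pipeline. The following Python sketch is purely illustrative: every function, class, and dataset name is a hypothetical placeholder, not something disclosed in the specification.

```python
# Hypothetical sketch of the claimed pipeline; all names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    contour: list                                     # coordinate positions / vectors
    description: dict = field(default_factory=dict)   # per-component description file

def identify_category(image) -> str:
    """Stub: classify the product category of the input image."""
    return "furniture"

def segment_components(image, data_set: str) -> list:
    """Stub: apply the trained semantic segmentation network model."""
    return [Component("seat", [(0, 0)]), Component("leg", [(1, 1)])]

def combine(components) -> dict:
    """Stub: assemble components using geometry information and object descriptions."""
    return {"model": [c.name for c in components]}

def image_to_3d(image) -> dict:
    category = identify_category(image)           # identify the product category
    data_set = category + "_components"           # select the matching component data set
    parts = segment_components(image, data_set)   # component segmentation
    return combine(parts)                         # form the three-dimensional product model

print(image_to_3d(object()))  # {'model': ['seat', 'leg']}
```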
Inventors
- Che-Rung Lee
- Iuan-Kai Fang
Assignees
- K.E.A. Design Consultants, Inc.
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-11-04
- Priority Date: 2024-11-06
Claims (10)
- 1. A method of generating a three-dimensional model from a single image, the method comprising: inputting a plurality of two-dimensional images including an assembled product, conducting a manual annotation process on product components of the plurality of two-dimensional images to establish a component data set including a plurality of records of graphic data; training the plurality of records of graphic data in the component data set and establishing a semantic segmentation network model, the semantic segmentation network model being configured to convert graphic characteristics of the plurality of records of graphic data into component images; inputting an image to be converted, identifying a product category of the image to be converted, and selecting the corresponding component data set according to the product category; conducting component segmentation by the semantic segmentation network model and separating the image to be converted into a plurality of components, with each of the plurality of components including a description file; and combining the plurality of components by geometry information and an object description in the description file to form a three-dimensional product model.
- 2. The method according to claim 1, wherein the manual annotation process includes performing manual background removal and color block segmentation and labeling on the plurality of two-dimensional images, and the plurality of records of graphic data includes an original graphic data, a background-removed graphic data, and a color block labeled graphic data.
- 3. The method according to claim 2, wherein the color block segmentation and labeling includes adding a semantic annotation to the color block labeled graphic data, and the semantic annotation includes an addition or removal of the components, a combination of component types, a conversion of component functions, a change of component materials, and a perceptual size difference.
- 4. The method according to claim 1, wherein the semantic segmentation network model is an encoder-and-decoder architecture based on a conditional generative adversarial network, the component image is generated from the plurality of records of graphic data, and an output result is determined by a Markov discriminator.
- 5. The method according to claim 4, wherein the semantic segmentation network model includes a self-propagation mechanism and a self-attention mechanism.
- 6. The method according to claim 1, wherein the geometry information includes a contour detection result of the plurality of components, the contour detection result includes a coordinate position and vector information, and the object description includes relative positional relationships of the plurality of components.
- 7. The method according to claim 1, further comprising: refining the three-dimensional product model by modifying respective components of the three-dimensional product model to form a three-dimensional fine model.
- 8. The method according to claim 1, further comprising: importing the three-dimensional product model into three-dimensional drawing software to generate a three-dimensional drawing model corresponding to the assembled product.
- 9. The method according to claim 1, wherein the assembled product includes a furniture product, a home appliance product, or an automotive product.
- 10. The method according to claim 9, wherein the furniture product includes a chair, table, bed, sofa, or cabinet.
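Claims 4 and 5 specify an encoder-and-decoder conditional GAN whose output is judged by a Markov discriminator. In the pix2pix literature such a discriminator is commonly realized as a PatchGAN, which scores overlapping local patches rather than the whole image; the PyTorch sketch below assumes that reading and is not the network actually disclosed.

```python
# A PatchGAN ("Markov") discriminator in the pix2pix style; an assumption
# about claim 4's wording, not the patent's disclosed network.
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Scores overlapping patches as real/fake; pixels farther apart than one
    patch are treated as independent, hence the "Markov" description."""

    def __init__(self, in_channels: int = 6):  # input image + candidate map, concatenated
        super().__init__()

        def block(c_in, c_out, stride):
            return [nn.Conv2d(c_in, c_out, 4, stride, 1),
                    nn.BatchNorm2d(c_out),
                    nn.LeakyReLU(0.2, inplace=True)]

        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            *block(64, 128, 2),
            *block(128, 256, 2),
            *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1),          # one real/fake score per patch
        )

    def forward(self, image, candidate):
        return self.net(torch.cat([image, candidate], dim=1))

d = PatchDiscriminator()
scores = d(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
print(scores.shape)  # torch.Size([1, 1, 30, 30]): a grid of per-patch decisions
```

Claim 5's self-attention mechanism would typically be inserted between the intermediate blocks of the generator and discriminator so that distant component regions can influence one another; it is omitted here for brevity.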
Description
CROSS-REFERENCE TO RELATED APPLICATION

The present application claims priority to, and the benefit of, Taiwan Patent Application No. 113142428, filed on Nov. 6, 2024, in the Taiwan Intellectual Property Office, the disclosure of which is incorporated by reference herein in its entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present disclosure relates to a method for generating a three-dimensional model from a single image, and more particularly to a method for constructing a three-dimensional model from a single image of an assembled product, in which the generated model includes detailed component models of the assembled product.

2. Description of the Related Art

Converting two-dimensional (2D) images into three-dimensional (3D) objects or models is a classic problem in the field of computer vision. During the process of capturing 2D images, many important geometric properties may be lost or distorted, leading to ambiguity and making the reconstruction theoretically intractable. To overcome the difficulties of such transformation, existing technologies often rely on multi-view image synthesis, which involves identifying corresponding points across multiple images taken from different angles and establishing spatial correspondences to reconstruct a 3D object. However, multi-view image synthesis may fail to accurately represent the original object's characteristics, and obtaining multiple images from different viewpoints is not always feasible in real-world scenarios.

For example, in the interior design industry, designers frequently use 3D modeling software (such as SketchUp, Rhinoceros 3D, or 3D Max) to visualize design concepts. By scaling, translating, and rotating different objects within a virtual space, designers are able to simulate and present various design outcomes for clients' review. Nevertheless, during discussions regarding additional objects for the interior space, only a single image may be provided by the clients, which may be sourced from the internet, magazines, or casual photographs. It is impractical to expect that clients will supply images of the desired object from multiple angles for reconstruction and modeling. Consequently, designers must manually create a 3D model of the required object based on the provided image before integrating it into the existing design. This modeling process consumes a significant amount of time and labor, increasing the overall project cost and reducing design efficiency.

Moreover, most existing 2D-to-3D model conversion technologies are limited to reconstructing only the overall contour of the object in the image. If the object is an assembled product, the resulting 3D model typically cannot be decomposed into its constituent components. Therefore, individual parts must be redrawn or modeled separately, which compromises usability and flexibility.

In view of the above, although certain technologies for converting 2D images into 3D models have been proposed, they generally rely on multi-image synthesis and still suffer from limitations in conversion accuracy. In particular, the resulting 3D models do not allow individual components of assembled products to be extracted and manipulated, thereby restricting practical applications. To address these issues, a method for generating a three-dimensional model from a single image is conceived and developed, so as to overcome the shortcomings of existing techniques and enhance implementation and industrial utility.
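For context, the multi-view pipeline criticized in the related art above is conventionally built from feature matching followed by triangulation. The OpenCV sketch below assumes two views of the object and a known camera intrinsic matrix K; it is illustrative background only, and the disclosed method exists precisely to avoid this multi-image requirement.

```python
# Conventional two-view reconstruction (prior art): match corresponding
# points, recover the relative camera pose, and triangulate 3D points.
import cv2
import numpy as np

def reconstruct_points(img1, img2, K):
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Identify corresponding points across the two views.
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Establish the spatial correspondence (relative camera pose).
    E, _ = cv2.findEssentialMat(pts1, pts2, K)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

    # Triangulate matched points into 3D coordinates.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    return (pts4d[:3] / pts4d[3]).T   # N x 3 reconstructed points
```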
SUMMARY OF THE INVENTION

In view of the aforementioned problems in the prior art, an object of the present disclosure is to provide a method for generating a three-dimensional model from a single image, so as to address the issues in conventional conversion methods, which are incapable of accurately constructing 3D models and of generating disassembled component models of a product.

According to one purpose of the present disclosure, a method of generating a three-dimensional model from a single image is provided. The method includes the following steps: inputting a plurality of two-dimensional images including an assembled product, conducting a manual annotation process on product components of the plurality of two-dimensional images to establish a component data set; training a plurality of records of graphic data in the component data set and establishing a semantic segmentation network model, in which the semantic segmentation network model converts graphic characteristics of the plurality of records of graphic data into component images; inputting an image to be converted, identifying a product category of the image to be converted, and selecting the corresponding component data set according to the product category; conducting component segmentation by the semantic segmentation network model and separating the image to be converted into a plurality of components, each of the plurality of components including a description file; and combining the plurality of components by geometry information and an object description in the description file to form a three-dimensional product model.
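The summary attaches a description file to each segmented component but does not fix its format. The sketch below assumes a JSON encoding with hypothetical field names, purely to illustrate how geometry information and the object description could drive the final combination step.

```python
# Hypothetical per-component description file and combination step; the
# field names are illustrative assumptions, not disclosed in the patent.
import json

description = {
    "component": "chair_leg_front_left",
    "geometry": {                                  # geometry information
        "contour": [[12, 340], [12, 480], [28, 480], [28, 340]],  # coordinates
        "vector": {"axis": [0, 0, 1], "length_px": 140},          # vector information
    },
    "object": {                                    # object description
        "attaches_to": "seat",                     # relative positional relationship
        "offset": [-0.18, 0.2, 0.0],
    },
}

def combine(component_files):
    """Place each component relative to its parent to form the product model."""
    model = {}
    for text in component_files:
        d = json.loads(text)
        model[d["component"]] = {
            "parent": d["object"].get("attaches_to"),
            "offset": d["object"].get("offset", [0, 0, 0]),
        }
    return model

print(combine([json.dumps(description)]))
```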