CN-122023646-A - Effect graph generation and local modification system and method
Abstract
The application discloses an effect graph generation and local modification system and method, belonging to the technical field of effect graph generation. In the method, a sketch input and preprocessing module extracts the outline features, core objects and background areas of a two-dimensional sketch; these, together with preset style parameters, are input into a pre-trained cross-modal generation model that maps the abstract two-dimensional sketch image to the initial geometric structure and texture features of a 3D model meeting the style requirements. Meanwhile, a natural language prompt analysis module, using an object and 3D model component mapping library, parses a natural language modification instruction input by the user into a modification object and a specific operation, and a local modification execution module adjusts the mesh vertex coordinates and rendering parameters of the target component. This reduces the difficulty of local modification and realizes, from a two-dimensional sketch image input to a 3D model, the full workflow of "initial geometric structure and texture feature generation - local modification - high-definition rendering", thereby shortening the generation period of the 3D model and lowering the difficulty of modifying it.
Inventors
- Ge Peijuan
- Xu Tao
- Wang Yang
- Zhu Xuedan
- Xu Mingxi
- Jiang Jun
- Wang Chenyue
- Zhang Ling
- Zou Yibin
Assignees
- Shanghai Investigation, Design & Research Institute Co., Ltd. (上海勘测设计研究院有限公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-01-05
Claims (10)
- 1. An effect graph generation and local modification system, comprising: a sketch input and preprocessing module, used for extracting outline features, core objects and background areas of a two-dimensional sketch image uploaded by a user; a sketch and 3D feature mapping module, used for generating an initial geometric structure and texture features of a 3D model based on a pre-trained cross-modal generation model, according to the outline features, core objects and background areas of the two-dimensional sketch image and preset style parameters; a natural language prompt analysis module, used for determining a first modification object and a first modification instruction at least according to a first natural language prompt input by the user, and determining a first component to be modified corresponding to the first modification object in the 3D model based on a preset object and 3D model component mapping library; a local modification execution module, used for updating mesh vertex coordinates and rendering parameters of the first component to be modified at least according to the first modification instruction, to obtain a target 3D model; and a 3D rendering and output module, used for rendering the illumination, shadow and material of the target 3D model and generating a high-definition effect graph of the target 3D model.
- 2. The system of claim 1, wherein the sketch input and preprocessing module comprises a contour feature extraction unit and a semantic segmentation unit; the contour feature extraction unit is used for extracting contour features of the two-dimensional sketch image based on a preset edge detection algorithm; the semantic segmentation unit is used for carrying out semantic segmentation on the two-dimensional sketch image based on a pre-trained deep learning model, identifying the category label and the spatial position of each pixel in the two-dimensional sketch image, and determining the core object and the background area of the two-dimensional sketch image based on the category label and the spatial position of each pixel.
- 3. The system of claim 2, wherein the contour features of the two-dimensional sketch image include at least geometric features and spatial distribution features; the generating process of the initial geometric structure and texture features of the 3D model comprises the following steps: generating a two-dimensional feature vector of the two-dimensional sketch according to the geometric features, the spatial distribution features, and the category label and spatial position of each pixel, wherein the two-dimensional feature vector at least comprises joint shape, semantic and spatial information representing the two-dimensional sketch; determining, according to preset style parameters, a 3D basic model corresponding to the preset style parameters from the preset object and 3D model component mapping library; fusing the two-dimensional feature vector of the two-dimensional sketch with the 3D basic model to obtain a fusion result; and adjusting the fusion result according to the preset style parameters to generate the initial geometric structure and texture features of the 3D model.
- 4. The system of claim 1, wherein the natural language prompt analysis module includes at least an instruction receiving unit and the local modification execution module includes at least an iterative optimization unit; the instruction receiving unit is at least used for: determining a second modification object and a second modification instruction according to a second natural language prompt input by the user, and determining a second component to be modified corresponding to the second modification object in the 3D model based on the preset object and 3D model component mapping library; the iterative optimization unit is at least used for: updating mesh vertex coordinates and rendering parameters of the second component to be modified according to the second component to be modified and the second modification instruction, to obtain an iterated 3D model; and generating the target 3D model according to a confirmation modification instruction input by the user and the iterated 3D model.
- 5. The system of claim 4, wherein the natural language prompt analysis module further comprises at least a preview unit configured to: display the first component to be modified and the target 3D model after the mesh vertex coordinates and rendering parameters of the first component to be modified are updated, and/or display the second component to be modified and/or the target 3D model after the mesh vertex coordinates and rendering parameters of the second component to be modified are updated.
- 6. The system according to claim 1, wherein the preset object and 3D model component mapping library includes at least a component ID corresponding to the core object, and a mapping relationship between the core object and the component ID; the natural language prompt analysis module at least further comprises a component-to-be-modified determining unit, which is at least used for: determining a target core object in the 3D model according to the first modification instruction, wherein the target core object represents the core object corresponding to the first modification instruction; and determining the component ID corresponding to the target core object as the first component to be modified according to the mapping relationship between the core object and the component ID.
- 7. The system of claim 1, wherein the training process of the pre-trained cross-modal generation model comprises: collecting at least one two-dimensional sketch data set and adding a cross-modal alignment label to each two-dimensional sketch in the data set, wherein the cross-modal alignment label represents the 3D model category corresponding to the two-dimensional sketch; collecting at least one 3D model data set and adding a category label to each model in the data set; constructing training sample pairs for model training according to the semantic consistency between the cross-modal alignment labels and the category labels, wherein the training sample pairs at least comprise positive training sample pairs representing sketch and model combinations of the same category and negative training sample pairs representing sketch and model combinations of different categories; calculating, through the cross-modal generation model to be trained, the cosine similarity between the two-dimensional sketch and the model in each training sample pair, to obtain the cosine similarity of each positive training sample pair and of each negative training sample pair; and updating the model parameters of the cross-modal generation model to be trained with the objectives of maximizing the cosine similarity of each positive training sample pair and minimizing the cosine similarity of each negative training sample pair, to obtain the pre-trained cross-modal generation model.
- 8. The system according to claim 1, wherein the process of constructing the preset object and 3D model component mapping library comprises: collecting WordNet semantic network data, extracting hierarchical relations and semantic attributes of natural language words, and constructing a word relation semantic graph; and, according to the word relation semantic graph and the component labeling data of a standardized 3D model dataset, aggregating the natural language words that point to the same 3D component and mapping them to a unified component ID, to obtain the preset object and 3D model component mapping library.
- 9. The system of claim 1, further comprising a history version storage module configured to: store at least the initial geometric structure and texture features of the 3D model, the target 3D model, and the high-definition effect graph of the target 3D model.
- 10. An effect graph generation and local modification method, characterized by comprising the following steps: extracting outline features, core objects and background areas of a two-dimensional sketch image uploaded by a user; generating an initial geometric structure and texture features of a 3D model based on a pre-trained cross-modal generation model, according to the outline features, core objects and background areas of the two-dimensional sketch image and preset style parameters; determining a first modification object and a first modification instruction at least according to a first natural language prompt input by the user, and determining a first component to be modified corresponding to the first modification object in the 3D model based on a preset object and 3D model component mapping library; updating mesh vertex coordinates and rendering parameters of the first component to be modified at least according to the first modification instruction, to obtain a target 3D model; and performing illumination, shadow and material rendering on the target 3D model to generate a high-definition effect graph of the target 3D model.
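Claim 2 leaves the "preset edge detection algorithm" for contour feature extraction unspecified. As an illustration only, the step could be realized with a Sobel gradient filter (Canny would be an equally plausible choice); the toy sketch array, the threshold value and the function name below are all assumptions, not taken from the patent.

```python
import numpy as np

def sobel_edges(img, threshold=0.25):
    """Illustrative stand-in for the claim's 'preset edge detection algorithm':
    Sobel gradient magnitude, normalized and thresholded to a binary contour
    mask. The patent does not name the algorithm; this is one common choice."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical-gradient kernel is the transpose
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(1, h - 1):          # skip the 1-pixel border
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    mag = np.hypot(gx, gy)             # gradient magnitude
    if mag.max() > 0:
        mag /= mag.max()
    return (mag > threshold).astype(np.uint8)

# A toy 8x8 "sketch": a bright square on a dark background.
sketch = np.zeros((8, 8))
sketch[2:6, 2:6] = 1.0
edges = sobel_edges(sketch)
```

On this toy input the mask fires on the square's boundary and stays zero in its flat interior, which is the contour-versus-region separation the preprocessing module needs before semantic segmentation assigns category labels.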
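Claim 7's training objective (maximize cosine similarity for positive sketch/model pairs, minimize it for negatives) can be written as a simple pairwise loss. The hinge margin and the exact loss form below are assumptions; the claim specifies only the maximization/minimization targets and the cosine similarity measure.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between a sketch embedding and a 3D model embedding."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pair_loss(sketch_emb, model_emb, is_positive, margin=0.2):
    """Positive (same-category) pairs are pulled toward similarity 1;
    negative (different-category) pairs are pushed below the margin.
    The hinge/margin formulation is an illustrative choice, not the patent's."""
    s = cosine_similarity(sketch_emb, model_emb)
    return 1.0 - s if is_positive else max(0.0, s - margin)

# Toy embeddings: identical vectors mimic a same-category pair,
# orthogonal vectors a different-category pair.
sketch_house = np.array([1.0, 0.0])
model_house = np.array([1.0, 0.0])
model_tree = np.array([0.0, 1.0])
```

Summing `pair_loss` over all positive and negative training sample pairs and minimizing the sum by gradient descent realizes the claim's dual objective as a single scalar loss.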
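Claims 6 and 8 together describe the object and 3D model component mapping library: synonymous natural language words are aggregated (via WordNet relations in claim 8) under one unified component ID, which claim 6's lookup then resolves from the parsed modification object. A minimal dictionary-based sketch, with entirely hypothetical vocabulary, component IDs and function names:

```python
# Hypothetical mini mapping library: several natural-language words that
# point to the same 3D component are collapsed onto one unified component ID,
# mirroring claim 8's WordNet-based aggregation step.
SYNONYM_GROUPS = {
    "roof": ["roof", "rooftop", "housetop"],
    "door": ["door", "doorway", "entrance"],
    "window": ["window", "windowpane"],
}

COMPONENT_MAP = {}  # word -> unified component ID
for comp, words in SYNONYM_GROUPS.items():
    for w in words:
        COMPONENT_MAP[w] = f"CMP_{comp.upper()}"

def resolve_component(modification_object):
    """Claim 6's lookup: map the parsed modification object to the component
    ID of the component to be modified; None if the word is unknown."""
    return COMPONENT_MAP.get(modification_object.lower())
```

In a full implementation the synonym groups would come from WordNet synsets joined against the component labels of a standardized 3D model dataset, rather than being hand-written.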
Description
Effect graph generation and local modification system and method

Technical Field

The application belongs to the technical field of effect graph generation, and particularly relates to an effect graph generation and local modification system and method.

Background

With the development of the digital design field, 3D effect graphs, as an intuitive visual expression tool, are widely applied in fields such as industrial design, architectural design and game development. Traditional 3D modeling relies on professional software, involves a complex workflow, places high demands on operator skill, and struggles to meet the requirements of rapid prototyping. In the prior art, text-based 3D generation relies on accurate text descriptions and places high demands on the user's expressive ability, while image-based 3D generation requires a real image as input and cannot directly process abstract expressions such as a user's hand-drawn sketch. In addition, the local modification function of existing 3D generation tools depends on manual box selection or parameter adjustment and lacks the ability to accurately locate a modification region through natural language; the user must master knowledge of the model's hierarchical structure, so the operation threshold is high. These problems severely restrict the widespread application of 3D modeling and the improvement of design efficiency.

Disclosure of Invention

The embodiments of the application aim to provide an effect graph generation and local modification system and method, which can solve the problems that traditional 3D modeling cannot directly work from a hand-drawn sketch and that local modification efficiency is low.
In order to solve the above technical problems, the application is realized as follows. In a first aspect, an embodiment of the present application provides an effect graph generation and local modification system, including: a sketch input and preprocessing module, used for extracting outline features, core objects and background areas of a two-dimensional sketch image uploaded by a user; a sketch and 3D feature mapping module, used for generating an initial geometric structure and texture features of a 3D model based on a pre-trained cross-modal generation model, according to the outline features, core objects and background areas of the two-dimensional sketch image and preset style parameters; a natural language prompt analysis module, used for determining a first modification object and a first modification instruction at least according to a first natural language prompt input by the user, and determining a first component to be modified corresponding to the first modification object in the 3D model based on a preset object and 3D model component mapping library; a local modification execution module, used for updating mesh vertex coordinates and rendering parameters of the first component to be modified at least according to the first modification instruction, to obtain a target 3D model; and a 3D rendering and output module, used for rendering the illumination, shadow and material of the target 3D model and generating a high-definition effect graph of the target 3D model.
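The five modules of the first aspect can be wired together as a single pipeline. The following skeleton is purely illustrative: every function name, data shape and stub return value is an assumption standing in for the real preprocessing, cross-modal generation, prompt analysis, local modification and rendering components.

```python
def preprocess_sketch(sketch_pixels):
    # Stand-in for the sketch input and preprocessing module: would extract
    # contour features, core objects and the background area.
    return {"contours": [], "core_objects": ["roof", "wall"], "background": []}

def generate_initial_model(features, style):
    # Stand-in for the cross-modal generation model: one geometry/texture
    # stub per detected core object.
    return {obj: {"vertices": [(0.0, 0.0, 0.0)], "material": style}
            for obj in features["core_objects"]}

def parse_prompt(prompt, component_map):
    # Stand-in for the natural language prompt analysis module: the first
    # word found in the mapping library is taken as the modification object.
    for word in prompt.lower().split():
        if word in component_map:
            return component_map[word], prompt
    return None, prompt

def apply_modification(model, component, instruction):
    # Stand-in for the local modification execution module: would update the
    # mesh vertex coordinates and rendering parameters of the component.
    model[component]["last_edit"] = instruction
    return model

def render(model):
    # Stand-in for the 3D rendering and output module (illumination,
    # shadow and material rendering).
    return f"HD effect graph of {sorted(model)}"

# End-to-end run of the pipeline on a dummy sketch.
features = preprocess_sketch(None)
model = generate_initial_model(features, style="modern")
component, instruction = parse_prompt("make the roof red",
                                      {"roof": "roof", "rooftop": "roof"})
model = apply_modification(model, component, instruction)
effect_graph = render(model)
```

The point of the skeleton is the data flow, not the stubs: each module consumes exactly what the previous one produces, which is the "generation - local modification - high-definition rendering" chain the first aspect describes.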
Further, the sketch input and preprocessing module comprises a contour feature extraction unit and a semantic segmentation unit; the contour feature extraction unit is used for extracting contour features of the two-dimensional sketch image based on a preset edge detection algorithm; the semantic segmentation unit is used for performing semantic segmentation on the two-dimensional sketch image based on a pre-trained deep learning model, identifying the category label and spatial position of each pixel in the two-dimensional sketch image, and determining the core object and background area of the two-dimensional sketch image based on the category label and spatial position of each pixel. Further, the contour features of the two-dimensional sketch image at least comprise geometric features and spatial distribution features; the generating process of the initial geometric structure and texture features of the 3D model comprises the following steps: generating a two-dimensional feature vector of the two-dimensional sketch according to the geometric features, the spatial distribution features, and the category label and spatial position of each pixel, wherein the two-dimensional feature vector at least comprises joint shape, semantic and spatial information representing the two-dimensional sketch; determining, according to preset style parameters, a 3D basic model corresponding to the preset style parameters from the preset object and 3D model component mapping library; fusing the two-dimensional feature vector of the two-dimensional sketch with the 3D basic model to obtain a fusion result; and adjusting