US-12626101-B2 - Method for constructing design concept generation network (DCGN) and method for automatically generating conceptual scheme

US 12626101 B2

Abstract

A method for constructing a design concept generation network (DCGN) and a method for automatically generating a conceptual scheme are provided. A DCGN includes a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a cross-attention (CA) layer, and an optimization module. A word importance constraint is ingeniously introduced based on an attention mechanism of a Transformer to record input word constraint information contained in a generated text sequence. This can effectively ensure the reliability and effectiveness of a generated conceptual scheme and is conducive to capturing potential semantic importance information and implementing semantic knowledge reasoning.

Inventors

  • Wu Zhao
  • Yiwei Jiang
  • Bo Wu
  • Xingyu CHEN
  • Miao Yu
  • Xin Guo
  • Kai Zhang
  • Qian Zhao
  • Hui Yu
  • Jun Li
  • Bing LAI
  • Chong Jiang

Assignees

  • SICHUAN UNIVERSITY

Dates

Publication Date
2026-05-12
Application Date
2023-03-13
Priority Date
2022-07-04

Claims (14)

  1. A method for constructing a design concept generation network (DCGN), wherein the DCGN comprises a Transformer encoder, a Transformer decoder, an importance constraint matrix generation module, an importance constraint embedding layer, a cross-attention (CA) layer, and an optimization module; and the method comprises the following steps: S1: obtaining, by the Transformer encoder, a feature of a hidden layer of the Transformer encoder based on input words in a sample; S2: obtaining, by the Transformer decoder, a feature of a hidden layer of the Transformer decoder based on a target sequence in the sample; S3: obtaining, by the importance constraint matrix generation module, an importance constraint matrix based on the input words and the target sequence in the sample; S4: mapping, by the importance constraint embedding layer, the importance constraint matrix to a distributed vector space to obtain two input word importance embedding features; S5: obtaining, by the CA layer, a generated sequence based on the feature of the hidden layer of the Transformer encoder, the feature of the hidden layer of the Transformer decoder, and the two input word importance embedding features; and S6: constructing a loss function based on the generated sequence and the target sequence, and adjusting, by the optimization module, network parameters based on the loss function; and repeating S1 to S6 until the loss function meets a specified requirement to obtain the DCGN;
wherein in S2, the Transformer decoder maps a target sequence y_{:t−1} = [y_0, y_1, …, y_{t−1}] at a moment t−1 to a distributed feature representation through a self-attention layer to obtain the feature h_d^t of the hidden layer of the Transformer decoder:
h_d^t = SA(W_d^K y_{:t−1}, W_d^V y_{:t−1}, W_d^Q y_{:t−1}),
wherein SA(·) represents the self-attention operation; and W_d^K, W_d^V, and W_d^Q represent weight matrices of the self-attention layer of the Transformer decoder;
wherein in S3, f(x, w̄, y_{:t}) represents an input word importance constraint vector C_{:t} contained in the target sequence y_{:t} and is calculated as follows:
f(x, w̄, y_{:t}) = w̄ · c_t,
wherein · represents a dot product operation of a vector or a matrix; and w̄ = [w̄_1, w̄_2, …, w̄_i, …, w̄_m] ∈ ℝ^m represents a relative importance vector of the input words x in the target sequence y_{:t}, calculated as follows:
w̄_i = [(w_i − w_min)/(w_max − w_min) × (M − 1)], ∀i ∈ {1, 2, …, m},
wherein w̄_i represents a relative importance of an i-th input word in the target sequence y_{:t}; w_i represents an absolute importance of the i-th input word in the target sequence y_{:t}; w_min and w_max represent the minimum and maximum absolute importance of the input words in the target sequence y_{:t}, respectively; [·] represents a rounding operation; and M ≥ m > 1, where M represents a maximum number of input words contained in samples in an entire training sample set; and c_t ∈ ℝ^m represents an input word constraint contained in the target sequence y_{:t}; when the target sequence y_{:t} contains the i-th input word x_i, the i-th element of c_t is 1, and the vector is calculated as follows:
c_t = (c_t^i)_{i=1}^m, with c_t^i = 1 if x_i ⊂ y_{:t} and c_t^i = 0 if x_i ⊄ y_{:t}.
  2. The method according to claim 1, wherein in S1, the Transformer encoder obtains the feature h_e of the hidden layer of the Transformer encoder by using the following formula:
h_e = SA(W_e^K x, W_e^V x, W_e^Q x), (1)
wherein x represents the input words; SA(·) represents the self-attention operation; and W_e^K, W_e^V, and W_e^Q represent the weight matrices of a self-attention layer of the Transformer encoder.
  3. The method according to claim 1, wherein in S4, two importance constraint embedding matrices W_c^K and W_c^V are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features h_{:t}^K and h_{:t}^V, which are as follows at a moment t during generation:
h_{:t}^K = W_c^K(C_{:t−1}) = W_c^K[f(x, w̄, y_{:t−1})], (7)
h_{:t}^V = W_c^V(C_{:t−1}) = W_c^V[f(x, w̄, y_{:t−1})]. (8)
  4. The method according to claim 3, wherein in S5, the CA layer fuses the feature h_e of the hidden layer of the Transformer encoder, the feature h_d^t of the hidden layer of the Transformer decoder, and the two input word importance embedding features h_{:t}^K and h_{:t}^V to obtain a generated sequence y_{:t}^o at the moment t:
y_{:t}^o = CA(W_d^K h_e, W_d^V h_e, h_{:t}^K, h_{:t}^V, W_d^Q h_d^t), (9)
wherein W_d^K, W_d^V, and W_d^Q represent weight matrices of a self-attention layer of the Transformer decoder; and a j-th element of the CA function is expressed as follows:
CA(q, k, h^k, h^v, v)_j = Σ_{i=0}^{M−1} α_ij (v_i + h_ij^v), (10)
wherein q = W_d^K h_e; k = W_d^V h_e; v = W_d^Q h_d^t; α_ij = softmax(e_ij) = exp(e_ij) / Σ_{l=0}^{d−1} exp(e_il); e_ij = q_j (k_i + h_ij^k)^T / √d; i = 0, 1, …, M−1; j and l = 0, 1, …, d−1; and (·)^T represents a transpose operation.
  5. The method according to claim 1, wherein in S6, for given N samples {x^(n), y^(n)}_{n=1}^N, the loss function constructed based on the generated sequences and the target sequences of the samples is as follows:
Loss_DCGN = (1/(TN)) Σ_{n=1}^N Σ_{t=1}^T err(y_{:t}^o, y_{:t}), (11)
wherein err(y_{:t}^o, y_{:t}) represents an error between a generated sequence y_{:t}^o and a target sequence y_{:t} at a moment t, and wherein T represents a length of the target sequence in the sample.
  6. A method for automatically generating a conceptual scheme, performed by the DCGN constructed by using the method according to claim 1 and comprising the following steps: L1: obtaining, by the Transformer encoder, the feature of the hidden layer of the Transformer encoder based on the input words; L2: obtaining, by the Transformer decoder, the feature of the hidden layer of the Transformer decoder at a moment t based on the generated sequence at a moment t−1; L3: obtaining, by the importance constraint matrix generation module, the importance constraint matrix based on the input words in the sample and the generated sequence at the moment t−1; L4: mapping, by the importance constraint embedding layer, the importance constraint matrix to the distributed vector space to obtain the two input word importance embedding features; and L5: obtaining, by the CA layer, the generated sequence based on the feature of the hidden layer of the Transformer encoder, the feature of the hidden layer of the Transformer decoder, and the two input word importance embedding features.
  7. The method for automatically generating the conceptual scheme according to claim 6, wherein in L1, the input words are keywords constructed based on a design problem, at least one design incentive, keywords constructed based on design requirements, or a combination of at least two of these input word sources.
  8. The method for automatically generating the conceptual scheme according to claim 7, wherein in L2, the feature h_d^t of the hidden layer of the Transformer decoder at the moment t is calculated based on the generated sequence at the moment t−1 by using the following formula:
h_d^t = SA(W_d^K y_{:t−1}, W_d^V y_{:t−1}, W_d^Q y_{:t−1}), (12)
wherein y_{:t−1} represents an input sequence of the Transformer decoder at the moment t during generation, y_{:t−1} = [y_0^o, y_{:t−1}^o], y_0^o represents a given sequence at a start moment, and y_{:t−1}^o represents the generated sequence at the moment t−1.
  9. The method for automatically generating the conceptual scheme according to claim 6, wherein in L2, the feature h_d^t of the hidden layer of the Transformer decoder at the moment t is calculated based on the generated sequence at the moment t−1 by using the following formula:
h_d^t = SA(W_d^K y_{:t−1}, W_d^V y_{:t−1}, W_d^Q y_{:t−1}), (12)
wherein y_{:t−1} represents an input sequence of the Transformer decoder at the moment t during generation, y_{:t−1} = [y_0^o, y_{:t−1}^o], y_0^o represents a given sequence at a start moment, and y_{:t−1}^o represents the generated sequence at the moment t−1.
  10. The method according to claim 1, wherein in S4, two importance constraint embedding matrices W_c^K and W_c^V are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features h_{:t}^K and h_{:t}^V, which are as follows at a moment t during generation:
h_{:t}^K = W_c^K(C_{:t−1}) = W_c^K[f(x, w̄, y_{:t−1})],
h_{:t}^V = W_c^V(C_{:t−1}) = W_c^V[f(x, w̄, y_{:t−1})].
  11. The method for automatically generating the conceptual scheme according to claim 6, wherein in S1 of the method for constructing the DCGN, the Transformer encoder obtains the feature h_e of the hidden layer of the Transformer encoder by using the following formula:
h_e = SA(W_e^K x, W_e^V x, W_e^Q x),
wherein x represents the input words; SA(·) represents the self-attention operation; and W_e^K, W_e^V, and W_e^Q represent the weight matrices of a self-attention layer of the Transformer encoder.
  12. The method for automatically generating the conceptual scheme according to claim 6, wherein in S4 of the method for constructing the DCGN, two importance constraint embedding matrices W_c^K and W_c^V are introduced, and the importance constraint matrix is mapped to the distributed vector space to obtain the two input word importance embedding features h_{:t}^K and h_{:t}^V, which are as follows at the moment t during generation:
h_{:t}^K = W_c^K(C_{:t−1}) = W_c^K[f(x, w̄, y_{:t−1})],
h_{:t}^V = W_c^V(C_{:t−1}) = W_c^V[f(x, w̄, y_{:t−1})].
  13. The method for automatically generating the conceptual scheme according to claim 12, wherein in S5 of the method for constructing the DCGN, the CA layer fuses the feature h_e of the hidden layer of the Transformer encoder, the feature h_d^t of the hidden layer of the Transformer decoder, and the two input word importance embedding features h_{:t}^K and h_{:t}^V to obtain a generated sequence y_{:t}^o at the moment t:
y_{:t}^o = CA(W_d^K h_e, W_d^V h_e, h_{:t}^K, h_{:t}^V, W_d^Q h_d^t), (9)
wherein W_d^K, W_d^V, and W_d^Q represent weight matrices of a self-attention layer of the Transformer decoder; and a j-th element of the CA function is expressed as follows:
CA(q, k, h^k, h^v, v)_j = Σ_{i=0}^{M−1} α_ij (v_i + h_ij^v), (10)
wherein q = W_d^K h_e; k = W_d^V h_e; v = W_d^Q h_d^t; α_ij = softmax(e_ij) = exp(e_ij) / Σ_{l=0}^{d−1} exp(e_il); e_ij = q_j (k_i + h_ij^k)^T / √d; i = 0, 1, …, M−1; j and l = 0, 1, …, d−1; and (·)^T represents a transpose operation.
  14. The method for automatically generating the conceptual scheme according to claim 6, wherein in S6 of the method for constructing the DCGN, for given N samples {x^(n), y^(n)}_{n=1}^N, the loss function constructed based on the generated sequence and the target sequence is as follows:
Loss_DCGN = (1/(TN)) Σ_{n=1}^N Σ_{t=1}^T err(y_{:t}^o, y_{:t}), (11)
wherein err(y_{:t}^o, y_{:t}) represents an error between a generated sequence y_{:t}^o and a target sequence y_{:t} at the moment t, and wherein T represents a length of the target sequence in the sample.
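The importance constraint defined in claim 1 can be illustrated with a short sketch. This is a minimal NumPy illustration, not the patented implementation: it assumes the product in f(x, w̄, y_{:t}) = w̄ · c_t is taken elementwise so that the result remains an m-dimensional constraint vector, and the function names (`relative_importance`, `constraint_vector`, `importance_constraint`) are illustrative, not from the patent.

```python
import numpy as np

def relative_importance(w, M):
    """Quantize absolute importances w (shape (m,)) onto {0, ..., M-1}:
    w_bar_i = round((w_i - w_min) / (w_max - w_min) * (M - 1))."""
    w = np.asarray(w, dtype=float)
    scaled = (w - w.min()) / (w.max() - w.min()) * (M - 1)
    return np.rint(scaled).astype(int)

def constraint_vector(input_words, generated_tokens):
    """c_t: 1 where the i-th input word already appears in y_:t, else 0."""
    covered = set(generated_tokens)
    return np.array([1 if x in covered else 0 for x in input_words])

def importance_constraint(input_words, w, generated_tokens, M):
    """f(x, w_bar, y_:t): combine w_bar and c_t (elementwise, by assumption)
    to get the m-dimensional constraint fed to the embedding layer."""
    w_bar = relative_importance(w, M)
    c_t = constraint_vector(input_words, generated_tokens)
    return w_bar * c_t
```

For example, if only the second of three input words has been emitted so far, only its (quantized) importance survives in f, so the vector records which constraints the partial sequence has satisfied and how much they matter.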
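The constrained cross-attention of claims 4 and 13 can likewise be sketched. This is a hedged simplification, not the claimed formula verbatim: the constraint embeddings h^K and h^V are broadcast per key position rather than per (i, j) pair, attention weights are normalized over the key index i as in a conventional Transformer, scaling uses √d, and the function names are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def constrained_cross_attention(q, k, v, h_k, h_v):
    """Cross-attention with constraint embeddings added to keys and values,
    in the spirit of equation (10): out_j = sum_i alpha_ij * (v_i + h_v_i),
    with logits e_ji = q_j (k_i + h_k_i)^T / sqrt(d).

    q: (T, d) decoder-side queries; k, v: (M, d) encoder-side keys/values;
    h_k, h_v: (M, d) importance constraint embeddings."""
    d = q.shape[-1]
    e = q @ (k + h_k).T / np.sqrt(d)   # (T, M) attention logits
    alpha = softmax(e, axis=-1)        # normalize over the M key positions
    return alpha @ (v + h_v)           # (T, d) fused representation
```

With zero constraint embeddings this reduces to plain scaled dot-product attention; nonzero h^K and h^V bias the attention toward, and inject importance information about, the constrained input words.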

Description

CROSS REFERENCE TO THE RELATED APPLICATIONS

This application is a continuation-in-part application of International Application No. PCT/CN2022/125347, filed on Oct. 14, 2022, which is based upon and claims priority to Chinese Patent Application No. 202210780085.4, filed on Jul. 4, 2022, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure belongs to the technical field of product design, relates to the automatic generation of a conceptual scheme in product design, and particularly relates to the construction of a design concept generation network (DCGN) and the automatic generation of a conceptual scheme based on the DCGN.

BACKGROUND

Innovative design is the basis of product development. As the core of innovative design, concept design determines most of the costs, quality, and performance during product development and is essential to product innovation. For example, in view of the problem of how to provide a usable water source for residents in coastal areas, a system for purifying seawater into drinking water or a design concept of using solar energy to desalinate seawater to produce canned drinking water or beverage products is proposed. As another example, when exploring a future public transportation system, a design concept of providing a personalized positioning seat service in the future public transportation system is put forward, which is beneficial for taking better care of vulnerable people. These design concepts provide designers or enterprises with design ideas in an early stage of product development and are conducive to the generation of a product conceptual design scheme. Prior design data is an important innovation source. As a core of innovative product concept design, conceptual scheme generation is a process of extracting valuable design knowledge from the prior design data and further transferring and reorganizing cross-field design knowledge to generate a creative conceptual scheme.
With the advent of an era of big data and big knowledge, engineering data applied to concept design is increasing. This brings abundant innovation sources to the research of conceptual scheme generation. Fully applying the data to conceptual scheme generation is beneficial to expanding the design space and producing more design concepts. However, there are more severe challenges, mainly in two aspects. First, with the explosive growth of design data, the amount of knowledge applied to concept design is also gradually increasing. It is increasingly difficult to reason, transfer, and reorganize a large amount of design knowledge to produce creative conceptual schemes based on the manual experience and design heuristics of designers. Second, design knowledge mainly comes from descriptions of existing product design schemes in different fields and is often complex and diverse with various knowledge types, such as functions, structures, scientific effects, cases, and others. In addition, the association relationships between pieces of knowledge are complex and flexible. It is increasingly difficult to obtain valuable design knowledge based on design problems or design constraints and to combine multi-type cross-field design knowledge to generate new conceptual schemes. As deep learning technology rapidly develops, many automatic generation technologies have been developed and successfully applied to various intelligent tasks, such as machine translation, image generation, speech recognition, and the like. The latest deep generative models have made important breakthroughs in many aspects of engineering design, such as structure optimization, material design, shape synthesis, and the like. There are also studies that use topology optimization and generative models, such as generative adversarial networks, to automatically generate design concepts in the forms of images, spatial shapes, and the like.
These design concepts are either too abstract to understand or too detailed, and are not suitable for conceptual scheme design exploration in the early stage. It is found through research that text is the most general and common form of describing design concepts and can cover rich and valuable design knowledge. How to learn the potential combination rules of reasoning, transferring, and reorganizing design knowledge from massive cross-field text data through a simple and effective model, and how to generate conceptual schemes suitable for the early stage, is an important problem to be resolved in current product design.

SUMMARY

In view of the current lack of a method for automatically generating a conceptual scheme in the field of product design, an objective of the present disclosure is to provide a method for constructing a DCGN and a method for automatically generating a conceptual scheme through a DCGN. Reasoning, transfer, reorganization, and other potential rules of cross-field design knowledge can be adaptively learned from massive text data based on design