CN-121787291-B - Multi-mode semantic topology and constraint reasoning CAD sequence generation method based on time scale
Abstract
The invention relates to a time-scale-based multi-modal semantic topology and constraint reasoning method for CAD sequence generation, and belongs to the technical field at the intersection of computer-aided design and artificial intelligence. The method builds an end-to-end automatic generation framework that runs from multi-modal input to an editable parametric CAD model, improving the semantic consistency, structural rationality and logical correctness of the generated sequence. It fuses multi-source data such as images, sketches and point clouds, and captures design intent using time-scale-aware features and dynamic topological graph evolution. A sequence synthesis mechanism guided by grammar and geometric constraints addresses the error accumulation and logic drift of traditional autoregressive models, and a geometric-feedback-driven optimization engine enables efficient and stable generation of the parametric model. The method thereby improves generation quality, degree of automation and cross-scene applicability, and provides key technical support for intelligent manufacturing and intelligent design.
Inventors
- WANG XIANGXIANG
- QIN SHENGYU
- YU YONGBIN
- Fan Manping
- LI CHENBO
- LIU JUN
- Xue Kaiyi
- WANG JINGYA
- Han Xindie
Assignees
- University of Electronic Science and Technology of China (电子科技大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-03-05
Claims (9)
- 1. A time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method, characterized by comprising the following steps: Step A, image, sketch and point cloud data of the same component are each encoded by a modality-specific encoder and then projected into a unified semantic space to obtain unified semantic features, and collaborative alignment loss optimization is performed; Step B, the unified semantic features are converted into a semantic topological graph that explicitly characterizes entity, operation and constraint relations, wherein the evolution of the semantic topological graph structure is defined as node operation prediction and connection prediction on a time scale; Step C, a natural language input instruction is decomposed into structured command triplets and corresponding embedding vectors are generated to obtain a command sequence, wherein each triplet comprises an operation type, a parameter list and a parameter type; the semantic topological graph is input into a Transformer decoder to obtain candidate tokens and their probability distribution; each generated candidate token is added to the command sequence and subjected to real-time grammar checking and geometric feasibility discrimination; after verification, a dynamic beam search algorithm controlled by a time-scale adaptive scheduler performs path selection and sequence expansion; Step D, the generated CAD command sequence and the semantic topological graph are converted into a CAD-CL intermediate representation, global optimization and code transformation are then performed based on geometric algebra and a temporal state dependency graph, and the final target code is generated and verified according to a kernel capability cognition graph; wherein step A is specifically as follows: Step A1, three deformable convolution layers with different dilation rates are applied in parallel to the input image to obtain a group of feature maps, and the group of feature maps is concatenated to obtain a fused feature tensor; global average pooling is applied to the fused feature tensor to obtain a channel descriptor, the weight of each channel is learned from the channel descriptor through two fully connected layers, and the fused feature tensor is weighted by these weights to obtain the final image feature; Step A2, for the input point cloud, a local dynamic graph is constructed for each point based on its local neighborhood, and relational dynamics learning and graph signal evolution are then performed on the local dynamic graph: the geometric difference information between nodes of the local dynamic graph is mapped to high-dimensional potential energy features by a multi-layer perceptron, and the high-dimensional potential energy features, after symmetric max pooling, drive the update of each node; after the local dynamic graph has been updated to convergence, a steady-state neighborhood feature set is obtained, max pooling and average pooling are applied to it in parallel, and the two results are concatenated to obtain the final local geometric semantic feature of the point; Step A3, a sketch sequence is processed, each element of which is indexed by a time point on a hybrid time scale comprising continuous drawing phases and discrete events; the sketch sequence is parsed into a topological graph whose node set comprises the sketch primitives and whose edge set comprises topological connection edges and spatial adjacency edges; the topological graph is passed through heterogeneous graph convolution and an attention mechanism to obtain topology-enhanced features, which are then concatenated with the temporal dynamic features of the sketch strokes to obtain the feature representation of the nodes of the topological graph; Step A4, the features of the three modalities are each fed into a shared semantic projector to obtain the unified semantic features; Step A5, L2 normalization is applied to the unified semantic features, and training is then optimized with a cross-modal contrastive loss, an adversarial modal alignment loss and a time scale smoothness loss.
- 2. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 1, wherein step B is specifically as follows: Step B1, a three-layer semantic topological graph is generated from the unified semantic features, comprising an entity layer, in which nodes represent basic geometric primitives and edges represent spatial adjacency relations, an operation layer, in which nodes represent feature operations and edges represent dependency relations among the operations, and a constraint layer; Step B2, iterative graph structure learning and time scale dynamics: at each time point, a controller takes the current semantic topological graph and the unified semantic features as inputs and computes the delta-derivative of the semantic topological graph, which takes the form of two predicted actions, namely node operation prediction and edge connection prediction; the action space of node operation prediction comprises adding an entity node, adding an operation node, adding a constraint node, and no operation; if the predicted action adds a node, the connections of the new node to existing nodes in the semantic topological graph are further predicted and the types of the edges are determined; Step B3, for any two nodes representing geometric elements in the dynamic semantic topological graph, the basic spatial relationship between them, comprising relative displacement, relative scaling and relative orientation, is passed through a lightweight multi-layer perceptron to obtain a spatial relationship query vector; each semantic topological graph node is associated with multi-modal feature segments at initialization: for the image modality, with the corresponding image features and their spatial position encoding; for the point cloud modality, with the corresponding point cluster features and the center coordinates of their three-dimensional bounding box; and for the sketch modality, with the sketch stroke features and their two-dimensional trajectory coordinates; then, guided by the spatial relationship query vector, attention aggregation is performed separately over the feature pools of the image, point cloud and sketch modalities: similarity scores between the spatial relationship query vector and the node-associated features of each modality are computed, and the features are weighted and summed according to the similarity scores, yielding three independent modality-specific semantic enhancement vectors; finally, the basic spatial relationship and the cross-modal semantic enhancement vectors are concatenated to obtain a geometric semantic relationship embedding vector, which is attached to the edge features of the edge connecting the two nodes.
- 3. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 2, wherein step C is specifically as follows: Step C1, each CAD command is decomposed into a structured command triplet comprising an operation type, a parameter list and a parameter type; embeddings are learned separately for the operation type, the parameter type and the parameter values, and the embedding vector of the CAD command is the sum of these component embeddings; Step C2, at each decoding time step, the Transformer decoder generates its internal decoding state from the current context information, which comprises the generated command sequence and the dynamic semantic topological graph, and computes the probability distribution over candidate tokens from that state; each candidate token is checked by a real-time grammar checker preconfigured with a context-free grammar and type system for CAD commands, and the probability of a candidate token is set to zero if adding it to the current sequence would cause a grammar error; a parameterized geometric feasibility discriminator then evaluates the probability that the whole sequence, after the candidate token is added, can construct a valid three-dimensional model, and this probability is used to adjust the final score of the candidate: [equation not reproduced in this text]; the terms of the equation are: the comprehensive evaluation score of the candidate token; the internal decoding state of the Transformer decoder at the current time step; the probability generated by the Transformer decoder; the geometric feasibility weight coefficient; the probability predicted by the parameterized geometric feasibility discriminator; and the CAD command sequence prefix generated up to the current decoding time step; the width of the beam search is dynamically adjusted according to a time scale exponential function of the generation progress.
- 4. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 3, wherein step D is specifically as follows: Step D11, the command triplets and the semantic topological graph are converted into a domain-specific intermediate representation CAD-CL; first, a functor from the generic command category to the strongly typed CAD-CL category is constructed, each command triplet is combined with the node and edge semantics of the semantic topological graph, and the concrete semantics of parameter references are resolved through the pullback operation of category theory; the types of the domain-specific intermediate representation CAD-CL are determined at compile time, type checking is performed on the compilation time scale, and a type error, as a right-scattered event, immediately triggers a compilation interrupt and an error report; a parameter semantic passport is constructed and maintained for each parameter in the compilation pipeline, the parameter semantic passport comprising basic physical dimensions, valid numerical ranges, design intent tracing, geometric constraint context, version and evolution history, and verification and state metadata; Step D12, symbolic geometric reasoning and temporal state dependency analysis are performed on the domain-specific intermediate representation CAD-CL and its attached parameter semantic passports; first, on the basis of parameter semantic analysis and feasibility verification, a dependency graph that preserves Lie group symmetry is constructed from the CAD-CL and its attached parameter semantic passports, and geometric operations are formalized as group actions of the three-dimensional special Euclidean group or the rotation group; then, with geometric algebra as a unified language, the geometric entities and their operations in the front-end parameter semantic passports are formalized and the parameter expressions are uniformly simplified on the basis of geometric algebra; all parameterized constraints are expressed as a system of polynomial equations with the driving dimension parameters as variables, forming a polynomial ideal, and the original constraint system is converted into an algebraically equivalent canonical form by computing the Gröbner basis of the ideal; the key states of the CAD kernel are formalized as shared resources, the read and write effects of each CAD-CL command on these resources are annotated, and a temporal state dependency graph is constructed from these annotations, in which nodes are CAD-CL commands and edges represent the read-write dependencies among commands arising from the shared resources; based on the temporal state dependency graph, the optimization process is modeled as a dynamical system on a time scale whose state is the current code sequence and the dependency graph; under the constraints of the temporal state dependency graph, commands that can be reordered, parallelized or eliminated are identified, and extrude commands executed on different sketches are scheduled for parallel execution if no state conflict exists; Step D13, the optimized intermediate representation CAD-CL is compiled into target code for a specific CAD kernel; a multi-objective code generator driven by a kernel capability cognition graph is constructed, the graph encoding in structured form the core attributes of the APIs of different kernels; the code generation process is formulated as a multi-constraint satisfaction problem, and a solver, using the cognition graph, selects by heuristic search a Pareto-optimal API call sequence for the given CAD-CL program that simultaneously satisfies the multi-objective constraints of performance, precision and compatibility; a simulator checks whether the preconditions and postconditions of each API call are satisfied and whether the sequence may trigger an exception known to the kernel; any potential failure generates a right-scattered failure event on the verification time scale, which triggers the back end to immediately regenerate and reschedule the code, and the stability of the verification process is ensured by a Lyapunov function on the time scale; Step D2, a geometric-feedback-driven optimization mechanism, in which the optimization process is formulated as a dynamic equation on a parameter optimization time scale; Step D21, delta-gradient descent on a manifold, with parameter updates based on the time scale delta-gradient: the parameter vector to be optimized lies on a manifold and encapsulates all adjustable driving parameters of the parametric CAD model; the delta-gradient of the multi-modal loss function is computed, and the parameter update rule is: [equation not reproduced in this text]; the terms of the equation are: the next iteration time point on the parameter optimization time scale; the adaptive learning rate, adjusted through a time scale exponential function; and the parameter vector at the current time point; Step D22, dynamic topology-aware parameter dependency graph and causally navigated optimization: a parameter dependency graph is constructed whose nodes comprise parameter nodes and geometric entity nodes; a one-to-one mapping between the geometric entity nodes and the entity nodes of the semantic topological graph provides topology awareness, and initial edge connections are established automatically from the semantics and explicit geometric constraints of the CAD command sequence; the edge weights evolve dynamically on the parameter optimization time scale, their update following a delta-dynamic process based on perturbation analysis: at each time point, the system applies a small perturbation to a parameter node and observes, through lightweight forward simulation, the resulting change in the target node, and the causal weight of the target node is updated iteratively by: [equation not reproduced in this text]; the terms of the equation are: the learning rate of the causal weight update, which controls the update step size; the small perturbation amount; and the change in the state of the target node caused by the perturbation; when the optimization algorithm decides at some time point to adjust a target parameter, the system immediately queries the parameter dependency graph and, by graph traversal, precisely locates the set of all downstream nodes having a strong causal connection with the target parameter; it then starts an incremental symbolic reconstruction mechanism, which represents the parametric modeling history as a computational graph and, through partial evaluation, re-evaluates and geometrically reconstructs only the computational subgraph induced by the downstream node set; before a parameter update instruction is submitted to the kernel for execution, the system performs one lightweight forward simulation along the parameter dependency graph to predict geometric constraint conflicts that the update may cause; once a failure event is predicted, the system starts a repair strategy generator, which traces back the critical causal path causing the conflict on the basis of the parameter dependency graph and solves for a local parameter adjustment strategy using Lyapunov stability theory on time scales.
- 5. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 4, wherein the relational dynamics learning and graph signal evolution are specifically as follows: a relational dynamics learning layer is designed that drives the node features of the local dynamic graph to evolve on a microscopic time scale until convergence; specifically, a geometric potential function encodes, for each edge of the local dynamic graph, the geometric difference information between its two end points, and is defined as: [equation not reproduced in this text]; the terms of the equation are: the feature vectors of the two end nodes at the current time point; a three-layer multi-layer perceptron with shared parameters; the high-dimensional potential energy feature; the Euclidean distance; the feature point in the point cloud; and its neighborhood point; the feature update of each node is driven by the potential energy of its neighborhood nodes and follows a delta-update rule on the microscopic time scale: [equation not reproduced in this text]; the terms of the equation are: the local neighborhood of the node; the next time point on the microscopic time scale; the graininess function; and the time decay exponent; the update process is executed in a loop until a steady state is reached.
- 6. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 5, wherein the heterogeneous graph convolution and attention mechanism are specifically as follows: an independent convolution weight matrix is designed for each class of topological edge of the topological graph, forming a heterogeneous graph convolution architecture, and the layer-wise feature update formula for a node in the topological graph of the sketch is: [equation not reproduced in this text]; the terms of the equation are: the neighbor set of the node under a given relation; the activation function; the learnable weight matrix for that relation; the feature vector of a neighbor node at the current layer; the learnable weight matrix for node self-connections; and the feature vector of the central node itself at the current layer; the topology-aware attention weight is computed by an attribute-enhanced graph attention mechanism: [equation not reproduced in this text]; the terms of the equation are: the topological attribute encoding function, which fuses the geometric features of edge type, length and joint curvature; the learnable attention parameter vector; the leaky rectified linear activation function (LeakyReLU); and the vector concatenation operation.
- 7. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 6, wherein the time scale smoothness loss is: [equation not reproduced in this text]; the terms of the equation are: two time points on the hybrid time scale; the time scale exponential function defined on the hybrid time scale; the L2-normalized feature vectors of the unified semantic vectors at the two time points; and the decay exponent; the cross-modal contrastive loss is specifically as follows: positive sample pairs are constructed from multi-modal data of the same task and negative sample pairs from different tasks, and the cross-modal contrastive loss pulls the positive pairs together and pushes the negative pairs apart; the adversarial modal alignment loss minimizes the distribution difference between the features of different modalities through adversarial training between a three-layer multi-layer perceptron discriminator and the modality-specific encoders.
- 8. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 7, wherein the state update of the semantic topological graph follows the basic dynamic equation on the time scale: [equation not reproduced in this text]; the terms of the equation are: the delta-derivative of the graph structure; the current time point in the evolution of the topological graph; the next time point on the design time scale; and the graininess function on the design time scale; the process loops until the controller predicts no operation, which marks convergence of the design logic; when training the controller, an asymptotic stability regularization term based on the time scale exponential function is designed: [equation not reproduced in this text]; the terms of the equation are: the initial time point of the evolution process; the graph structure change trend at the initial time point of the design process; the time scale exponential function; the stability decay exponent; and the squared Frobenius norm.
- 9. The time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method according to claim 8, further comprising an impulsive backtracking and correction control mechanism modeled as an impulsive control system on the generation time scale, the mechanism comprising three steps: Step C31, fault mapping and localization: an error report is received from step D, and the fault is mapped back to a specific command step in the sequence by analyzing the error type and context, the error event being regarded as an impulsive disturbance occurring at that time point; Step C32, generation state rollback: the internal state of step C is rolled back to the step immediately preceding the failed command, which is a safe state, and the previously generated sequence prefix that has been verified as error-free is frozen; Step C33, adaptive constraint reinforcement and guided exploration: when generation is resumed from the rollback step, the system dynamically activates an adaptive constraint reinforcement and guided exploration mechanism, which first inserts the command or parameter combination that caused the last failure into a dynamic tabu list and adaptively adjusts the exploration-exploitation balance according to the decay characteristic of the time scale exponential function: in the initial stage after rollback, the decision weight of the geometric feasibility discriminator is significantly increased so that the known fault region is avoided, and as generation proceeds the constraint weight decays smoothly, guiding the model to explore new generation paths within the feasible solution space; the stability of the impulsive control is ensured by a Lyapunov function on the time scale.
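The candidate scoring of step C2 (claim 3) and the post-rollback constraint reinforcement of step C33 (claim 9) can be illustrated with a minimal sketch. It is a sketch under stated assumptions, not the claimed implementation: the log-linear score form, the boost-then-decay weight schedule, the function names (`score_candidates`, `geo_weight_after_rollback`), the toy grammar and the toy feasibility values are all hypothetical. The time scale exponential is evaluated on the integer time scale, where a decay rate d between 0 and 1 gives (1 - d)^(t - t0).

```python
import math


def time_scale_exp(decay, t, t0=0):
    """Decaying time scale exponential on the integer time scale (unit graininess).

    For 0 < decay < 1 this is e_{-decay}(t, t0) = (1 - decay) ** (t - t0).
    """
    return (1.0 - decay) ** (t - t0)


def geo_weight_after_rollback(base, boost, decay, step, rollback_step):
    """Assumed schedule: the feasibility weight is boosted at rollback, then decays to base."""
    return base + boost * time_scale_exp(decay, step, rollback_step)


def score_candidates(decoder_probs, prefix, is_grammatical, geo_feasibility,
                     geo_weight, tabu=frozenset()):
    """Score the candidate tokens for one decoding step.

    Assumed score form: log p_decoder + geo_weight * log p_feasible.
    Candidates that break the grammar or sit in the tabu list are dropped,
    which corresponds to setting their probability to zero.
    """
    scores = {}
    for token, p_dec in decoder_probs.items():
        if (len(prefix), token) in tabu:
            continue
        if p_dec <= 0.0 or not is_grammatical(prefix, token):
            continue
        p_geo = geo_feasibility(prefix, token)
        if p_geo <= 0.0:
            continue
        scores[token] = math.log(p_dec) + geo_weight * math.log(p_geo)
    return scores


# Toy stand-ins (hypothetical): a one-rule grammar and fixed feasibility values.
def toy_grammar(prefix, token):
    return token != "EXTRUDE" or "CLOSE_LOOP" in prefix


def toy_feasibility(prefix, token):
    return {"LINE": 0.9, "ARC": 0.1}.get(token, 0.5)


probs = {"LINE": 0.5, "ARC": 0.3, "EXTRUDE": 0.2}
prefix = ["SKETCH"]
scores = score_candidates(probs, prefix, toy_grammar, toy_feasibility, geo_weight=1.0)
best = max(scores, key=scores.get)

# After a failure of LINE at step 1, that pair enters the tabu list and the
# feasibility weight is raised; it then decays as generation proceeds.
retry_weight = geo_weight_after_rollback(1.0, 2.0, 0.5, step=1, rollback_step=1)
retry = score_candidates(probs, prefix, toy_grammar, toy_feasibility,
                         geo_weight=retry_weight, tabu=frozenset({(1, "LINE")}))
```

In this toy run the grammar removes `EXTRUDE` because no loop has been closed, and the tabu entry removes `LINE` on the retry. A real system would replace the two toy callbacks with the grammar checker and the learned feasibility discriminator described in the claims.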
Description
Multi-mode semantic topology and constraint reasoning CAD sequence generation method based on time scale
Technical Field
The invention belongs to the technical field at the intersection of computer-aided design (CAD) and artificial intelligence, and in particular relates to a time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method.
Background
Computer-aided design (CAD) is a core tool of modern industrial design and manufacturing, and plays an irreplaceable role in fields such as machinery, architecture and aerospace. In recent years, making CAD technology intelligent and automated has become a key direction of industry development. AI-based CAD model generation is now a research hotspot, and initial progress has been made in three-dimensional reconstruction and sequence generation from point clouds, images or sketches. For example, models such as DeepCAD can generate a CAD command sequence from an image, and approaches such as Point2Cyl attempt to reconstruct basic geometry from a point cloud. However, existing methods have several key problems. First, most methods rely on single-modality input (such as images or point clouds) and have difficulty fusing multi-source heterogeneous data such as sketches, images and point clouds, which limits their applicability in complex real-world scenes. Second, existing generation methods often ignore the hierarchical logic of the design process and do not model the human design paradigm of "whole first, details later", so the generated CAD sequences are deficient in semantic consistency and structural rationality. Third, traditional generative models such as Transformers tend to produce invalid commands or logic errors when generating long sequences, which seriously affects the editability and reusability of the generated models.
More importantly, existing methods generally treat the design process as a single, uniform time stream and fail to model the hybrid time characteristics inherent in CAD design. In practice, a designer's thinking and behavior frequently switch between continuous parameter adjustment and discrete command operations, which constitutes a typical hybrid time domain. Because traditional approaches lack a unified modeling framework for such hybrid time dynamics, the generated design sequences are logically disjointed in time and fail to capture the evolution of the designer's intent. Time scale theory, a mathematical framework that unifies continuous and discrete dynamic systems, provides a suitable theoretical basis for solving these problems. However, no research has yet applied time scale theory to the intelligent generation of CAD sequences to achieve accurate description and reasoning of the hybrid time dynamics of the design process. In addition, most existing methods cannot realize an end-to-end automatic process from multi-modal input to an editable parametric model; a great deal of manual intervention is still needed, and the dual requirements of intelligent manufacturing for efficiency and quality are difficult to meet.
Disclosure of Invention
To address the above challenges, an intelligent method is needed that can fuse multi-modal inputs, model the design hierarchy, and generate high-quality editable CAD sequences. The invention provides a time-scale-based multi-modal semantic topology and constraint reasoning CAD sequence generation method, which realizes end-to-end automatic generation of a parametric CAD model sequence from multi-modal inputs such as sketches, images and point clouds, improves the rationality, editability and cross-scene applicability of the generated model, and provides key technical support for intelligent manufacturing and intelligent design.
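As background for the time scale framework referred to above, the following minimal sketch shows how one delta-derivative covers both continuous phases and discrete events. It assumes a hybrid time scale represented as a sorted list of disjoint closed intervals, with a degenerate interval standing for an isolated event; the representation, the function names and the forward-difference step `h` are illustrative choices, not part of the disclosed method.

```python
def forward_jump(intervals, t):
    """Forward jump sigma(t) on a time scale given as sorted, disjoint closed intervals.

    A degenerate interval (a, a) stands for an isolated discrete event.
    """
    for k, (a, b) in enumerate(intervals):
        if a <= t < b:
            return t  # right-dense point inside a continuous phase
        if t == b:
            # right-scattered point: jump to the start of the next interval;
            # the last point of the time scale maps to itself
            return intervals[k + 1][0] if k + 1 < len(intervals) else t
    raise ValueError("t is not a point of the time scale")


def graininess(intervals, t):
    """Graininess mu(t) = sigma(t) - t: zero in continuous phases, positive at jumps."""
    return forward_jump(intervals, t) - t


def delta_derivative(f, intervals, t, h=1e-6):
    """Delta-derivative of f at t.

    At a right-scattered point it is the exact difference quotient over the jump.
    At a right-dense point it is the ordinary derivative, approximated here by a
    forward difference (not meaningful at the very last point of the time scale).
    """
    mu = graininess(intervals, t)
    if mu > 0:
        return (f(forward_jump(intervals, t)) - f(t)) / mu
    return (f(t + h) - f(t)) / h


def square(t):
    return t * t


# Continuous drawing on [0, 1], a discrete event at t = 2, continuous again on [3, 4].
intervals = [(0.0, 1.0), (2.0, 2.0), (3.0, 4.0)]
d_dense = delta_derivative(square, intervals, 0.5)      # close to the ordinary derivative 2t
d_scattered = delta_derivative(square, intervals, 1.0)  # (4 - 1) / 1 over the jump from 1 to 2
```

At a right-scattered point the identity f(sigma(t)) = f(t) + mu(t) * f_delta(t) holds exactly, which is why a single update rule can advance a state across both continuous adjustment and discrete command steps.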
The following operations are specifically performed: Step A, image, sketch and point cloud data of the same component are each encoded by a modality-specific encoder and then projected into a unified semantic space to obtain unified semantic features, and collaborative alignment loss optimization is performed; Step B, the unified semantic features are converted into a semantic topological graph that explicitly characterizes entity, operation and constraint relations, wherein the evolution of the graph structure is defined as node operation prediction and connection prediction on a time scale; Step C, a natural language input instruction is decomposed into structured command triplets and corresponding embedding vectors are generated to obtain a command sequence, wherein each triplet comprises an operation type, a parameter list and a parameter type; the semantic topological graph is input into a Transformer decoder to obtain candidate tokens and their probability distribution; each generated candidate token is added to the command sequence, performing real-time grammar checking and geome