
CN-121686508-B - Two-dimensional engineering drawing element identification method and system based on time sequence-frequency spectrum joint diffusion


Abstract

The application discloses a two-dimensional engineering drawing element identification method and system based on time sequence-frequency spectrum joint diffusion. The method comprises the following steps: drawing standardization and context coding; defining a response spectrum space diffusion process; guided denoising based on spectrum cross-attention; response spectrum decoding and element positioning; and bounding box refinement and multi-attribute identification. The application unifies identification and generation within the same diffusion framework: element recognition is no longer a feature matching process but a process of generating heatmaps from noise under the guidance of the global context. Because the whole process is carried out in the frequency domain, the model can access the global structural information of the drawing at every step. The cross-attention mechanism lets the model reason over the drawing as an expert would, coordinating global and local information efficiently.

Inventors

  • CAI QING
  • SHU WEINONG
  • XUE HONGJING

Assignees

  • Beijing Institute of Architectural Design Co., Ltd. (北京市建筑设计研究院股份有限公司)

Dates

Publication Date
2026-05-12
Application Date
2025-12-25

Claims (9)

  1. A two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion, characterized by comprising the following steps:
     Step 1, drawing standardization and context coding: performing standardized preprocessing on an input two-dimensional engineering drawing in vector or raster format, including rasterization, size normalization and binarization; inputting the standardized drawing image into a base feature extraction network to obtain a deep feature map; and performing a multi-channel two-dimensional fast Fourier transform on the deep feature map to construct an image spectrum context serving as a global static condition;
     Step 2, defining the response spectrum space diffusion process: constructing a ground-state response spectrum for the element categories to be identified; and defining a forward diffusion process that, starting from the ground-state response spectrum, gradually injects complex Gaussian noise according to a preset variance schedule until the spectrum is converted into a pure random noise spectrum;
     Step 3, guided denoising based on spectrum cross-attention: constructing a time-sequence denoising network; at each time step of the reverse diffusion, taking the current noisy response spectrum as the query and the image spectrum context generated in step 1 as the key and the value, predicting the noise spectrum to be removed through a time sequence-spectrum cross-attention module, and iteratively updating the response spectrum;
     Step 4, response spectrum decoding and element positioning: after the reverse diffusion process is finished, performing an inverse fast Fourier transform on each channel of the finally denoised response spectrum to obtain the corresponding spatial-domain response heatmaps, and extracting the category and center-point coordinates of each element from each heatmap through peak detection and non-maximum suppression;
     Step 5, bounding box refinement and multi-attribute identification: taking the center points obtained in step 4 as references, extracting local features from the multi-scale deep feature maps generated by the base feature extraction network in step 1, and inputting them into a lightweight regression and classification head network to regress an accurate bounding box for each element and, optionally, to identify finer-grained attributes of the element.
  2. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein the standardized preprocessing in step 1 is specifically: the vector drawing or raster drawing is uniformly rendered and scaled or padded to a fixed size, lines are rendered at unit pixel width, and the image is normalized to a single-channel binary image; the base feature extraction network adopts a convolutional neural network or a vision Transformer pre-trained on a large-scale image dataset, and its parameters are kept frozen in the element identification task.
  3. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein the image spectrum context in step 1 is constructed as follows [the original formula is not reproduced in the source; the notation here is reconstructed from the definitions that follow]: $S = \mathrm{Shift}(\mathrm{FFT2}(F))$; wherein $F$ is the deep feature map with $C$ channels output by the base feature extraction network, $\mathrm{FFT2}$ is the two-dimensional fast Fourier transform, and $\mathrm{Shift}$ is the function that shifts the zero-frequency component to the center of the spectrum, finally yielding the complex tensor $S$.
  4. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein the time sequence-spectrum cross-attention module in step 3 is calculated as follows [the original formula is not reproduced in the source; the notation here is reconstructed from the definitions that follow]: $\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\left(\mathrm{Re}(Q K^{H}) / \sqrt{d_k}\right) V$; wherein the query $Q$ is generated by a complex linear projection after the channel features of the noisy response spectrum $Y_t$ are combined with the time-step embedding $e_t$; the key $K$ and the value $V$ are each generated from the image spectrum context $S$ through different complex linear projections; $(\cdot)^{H}$ denotes the conjugate transpose, $\mathrm{Re}(\cdot)$ takes the real part, and $d_k$ is the feature dimension of the key $K$.
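  5. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein in step 3 the training loss of the time-sequence denoising network $\epsilon_\theta$ consists of an L2-norm distance [the original formula is not reproduced in the source; the notation here is reconstructed from the definitions that follow]: $\mathcal{L} = \mathbb{E}_{t, Y_0, \epsilon} \left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\, Y_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\ t,\ S\right) \right\|_2^2$; wherein $Y_0$ is the Fourier transform of the target response heatmap, $\epsilon$ is standard complex Gaussian noise, $\bar{\alpha}_t$ is the cumulative variance schedule parameter of the forward diffusion process, and $\epsilon_\theta(\cdot)$ is the network's prediction of the noise given the time step $t$, the noisy response spectrum and the spectrum context $S$.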
  6. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein in step 3 the time-sequence denoising network adopts a spectral Transformer U-Net architecture in which the standard convolution blocks in the encoder and decoder are replaced by spectral Transformer blocks; each block comprises, in order, a time sequence-spectrum cross-attention module, a spectral self-attention module operating over the channel and frequency dimensions, and a complex feed-forward network; and the linear projection and normalization layers inside the network are designed to support complex-valued operations.
  7. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein in step 4 the peak detection specifically comprises: applying confidence-threshold filtering to each heatmap, then performing non-maximum suppression through a max-pooling operation, and retaining as element center points only those points whose value equals the maximum of their neighborhood.
  8. The two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to claim 1, wherein the input features of the bounding box refinement in step 5 come from the multi-stage feature maps of the base feature extraction network; after these are fused through a feature pyramid network structure, local features are extracted around the detected center points by a RoIAlign operation; and the lightweight regression and classification head network comprises parallel fully connected layers for regressing the center-point offset, width and height of the bounding box and for classifying predefined fine-grained attributes.
  9. A system applying the two-dimensional engineering drawing element identification method based on time sequence-frequency spectrum joint diffusion according to any one of claims 1 to 8, comprising: a drawing preprocessing and context coding module, configured to perform the operations of step 1, standardizing the input drawing and encoding it into a static image spectrum context that serves as the global condition; a guided response spectrum generation module, configured to perform the operations of step 2 and step 3, whose core is a time-sequence denoising network based on spectrum cross-attention that, guided by the image spectrum context, gradually converts a random noise spectrum through iterative denoising into a final response spectrum containing the position information of all element categories; and a result decoding and refining module, configured to perform the operations of step 4 and step 5, obtaining spatial heatmaps by applying an inverse Fourier transform to the final response spectrum and, in combination with a lightweight detection head, finally outputting the category, precise position and attribute information of all elements in the drawing.
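
The spectral operations named in claims 1, 3, 4 and 7 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, array shapes, threshold and window size are hypothetical; the forward diffusion uses the closed form common to DDPM-style models, which the claims do not spell out; and the attention expression is inferred from the term definitions in claim 4 (conjugate transpose, real part, key dimension), with the time-step embedding and the learned complex projections left out. The denoising network itself is not modeled.

```python
import numpy as np


def spectrum_context(features):
    """Per-channel 2D FFT with the zero-frequency bin shifted to the center."""
    spec = np.fft.fft2(features, axes=(-2, -1))
    return np.fft.fftshift(spec, axes=(-2, -1))


def complex_gaussian(shape, rng):
    """Standard complex Gaussian noise (unit variance per entry)."""
    real = rng.standard_normal(shape)
    imag = rng.standard_normal(shape)
    return (real + 1j * imag) / np.sqrt(2.0)


def forward_diffuse(y0, alpha_bar_t, rng):
    """Noise a clean response spectrum in one shot (assumed DDPM closed form)."""
    eps = complex_gaussian(y0.shape, rng)
    y_t = np.sqrt(alpha_bar_t) * y0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return y_t, eps


def cross_attention(q, k, v):
    """softmax(Re(Q K^H) / sqrt(d_k)) V for complex Q, K, V (inferred form)."""
    scores = np.real(q @ k.conj().T) / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def decode_peaks(spectrum, threshold=0.5, window=3):
    """Inverse FFT per channel, then threshold plus local-maximum suppression."""
    unshifted = np.fft.ifftshift(spectrum, axes=(-2, -1))
    heatmaps = np.real(np.fft.ifft2(unshifted, axes=(-2, -1)))
    pad = window // 2
    peaks = []
    for category, heat in enumerate(heatmaps):
        padded = np.pad(heat, pad, constant_values=-np.inf)
        # Sliding-window maximum plays the role of stride-1 max pooling.
        windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
        local_max = windows.max(axis=(-2, -1))
        keep = (heat >= threshold) & (heat == local_max)
        ys, xs = np.nonzero(keep)
        peaks.extend((category, int(y), int(x)) for y, x in zip(ys, xs))
    return peaks


if __name__ == "__main__":
    # Round trip: a two-category heatmap, encoded to a spectrum and decoded back.
    target = np.zeros((2, 16, 16))
    target[0, 5, 7] = 1.0
    target[1, 2, 3] = 1.0
    print(decode_peaks(spectrum_context(target)))
```

In this sketch the same `spectrum_context` transform encodes both the backbone features (claim 3) and the target heatmap used for training (claim 5), and `decode_peaks` inverts it before keeping only points that pass the threshold and equal their neighborhood maximum, which is what the max-pooling formulation in claim 7 amounts to.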

Description

Two-dimensional engineering drawing element identification method and system based on time sequence-frequency spectrum joint diffusion

Technical Field

The invention relates to the technical field of computer vision and artificial intelligence, in particular to a two-dimensional engineering drawing element identification method and system based on time sequence-frequency spectrum joint diffusion.

Background

Two-dimensional engineering drawings, such as building plans, mechanical drawings and pipeline layouts, are the core carriers for transmitting and archiving information in the engineering field. Automatically identifying and locating the various standardized elements in these drawings (e.g., doors, windows, equipment) is critical to drawing digitization, reverse construction of BIM (building information modeling) models, equipment asset management and automated design review.

Currently, the technical routes for drawing element recognition fall mainly into traditional methods and deep-learning-based methods. Traditional methods mostly rely on template matching, the Hough transform or manually designed feature descriptors. They are very sensitive to the scale, rotation and occlusion of elements and to drawing noise, generalize poorly, and are difficult to adapt to complex and variable real-world engineering drawings.

Deep-learning-based methods, particularly convolutional neural networks (CNNs) in object detection frameworks such as Faster R-CNN and YOLO, have made remarkable progress in element recognition tasks. These methods treat element identification as a general object detection problem and can learn rich visual features by training on large amounts of labeled data. However, they are essentially "discriminative" models whose identification process searches and matches locally in the pixel space or feature space of the image, with the following limitations:

  1. Underuse of global context. Limited by the receptive field of conventional CNNs, these methods struggle to exploit the global layout and structural information of the drawing. For example, a "door" element always appears on a "wall", and this global structural constraint is difficult to model adequately within a local detection box.
  2. Sensitivity to noise and style variations. Methods based on pixels or local features are susceptible to low-level visual variations such as line thickness, print quality and scanning artifacts.
  3. Paradigm limitation. Recognition is treated as a process of finding matches in a huge search space; although this conceptual model is direct, it may not fully mine the deep structural information contained in the data.

In recent years, diffusion models have made a revolutionary breakthrough in the field of image generation as a powerful class of generative models. Their process of gradually recovering data from noise resembles the human cognitive process of moving from a blurred impression to a clear one. In recognition tasks, however, diffusion models are often used only as data augmentation or preprocessing tools, and their potential as core inference engines has not been fully exploited.

Therefore, the field urgently needs a new technical paradigm that goes beyond the traditional discriminative recognition framework, deeply fuses the global structural information of the drawing with the local feature patterns of the elements, and achieves element recognition in a more robust and more efficient manner.

The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a two-dimensional drawing element identification method and system based on time sequence-frequency spectrum joint diffusion, so as to solve the problems described in the background, namely insufficient use of global context, sensitivity to noise and a limited identification paradigm.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

The invention provides a two-dimensional drawing element identification method based on time sequence-frequency spectrum joint diffusion, which comprises the following steps:

Step 1, drawing standardization and context coding: performing standardized preprocessing on an input two-dimensional engineering drawing in vector or raster format, the standardized preprocessing comprising rasterization, size normalization and binarization; inputting the standardized drawing image into a base feature extraction network to obtain a deep feature map; and performing a multi-channel two-dimensional fast Fourier transform on the deep feature map to construct an image spectrum context serving