CN-121998941-A - Tooth detection and FDI recognition system and method
Abstract
The invention discloses a tooth detection and FDI recognition system and method. The system comprises a hybrid frequency-spatial attention module, a Transformer decoding module, an arch-aware query graph prior module, and a class uniqueness gating module. The hybrid frequency-spatial attention module improves the representation of tooth structure under reflection, blur, noise, and occlusion, which improves detection robustness and localization accuracy and thereby supports accurate localization and initial classification by the Transformer decoding module. The arch-aware query graph prior module and the class uniqueness gating module cooperate with the Transformer decoding module to ensure that its final output satisfies the clinical requirements of dental arch topological continuity and global uniqueness of FDI numbers.
Inventors
- ZHU JIANGPING
- WU XI
- HUANG RUIJIE
- WU PEI
- PENG YIRAN
Assignees
- 成都苍岷科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260127
Claims (10)
- 1. A tooth detection and FDI recognition system, characterized by comprising a hybrid frequency-spatial attention module, a Transformer decoding module, an arch-aware query graph prior module, and a class uniqueness gating module, wherein the hybrid frequency-spatial attention module, the arch-aware query graph prior module, and the class uniqueness gating module are each connected to the Transformer decoding module; the hybrid frequency-spatial attention module receives an input feature map of an oral image and enhances the feature map passing through it; the Transformer decoding module comprises a fixed number of learnable queries, iteratively updates the queries through a multi-layer decoder, uses the enhanced features from the hybrid frequency-spatial attention module as encoder memory, and outputs a tooth detection box and a class prediction for each query; the arch-aware query graph prior module cooperates with the Transformer decoding module, constructs query adjacency relations in the self-attention stage of the Transformer decoding module, and generates an attention mask from the reference points of the queries so as to restrict query interaction to local neighborhoods consistent with the anatomical structure of the dental arch; and the class uniqueness gating module cooperates with the Transformer decoding module and, during model training, applies a differentiable uniqueness constraint to the class predictions output by the Transformer decoding module so as to suppress repeated prediction of the same tooth class by multiple queries.
- 2. The tooth detection and FDI recognition system of claim 1, wherein the hybrid frequency-spatial attention module comprises a frequency-domain branch and a spatial branch that operate in parallel; the frequency-domain branch applies a fast Fourier transform to the input features to obtain a spectrum, decomposes the spectrum according to preset band thresholds into a low-frequency component and at least one high-frequency component, applies an inverse Fourier transform to each band component to obtain spatial-domain band features, generates spatial modulation weights from the input features through depthwise separable convolution, uses the weights to modulate each band feature, and finally merges all modulated band features to obtain frequency-domain enhanced features; the spatial branch, reflecting the transverse extent of the dental arch and the longitudinal structure of each tooth, extracts directional features with mutually orthogonal strip convolution kernels and integrates them to obtain spatially enhanced features; and the frequency-domain enhanced features and the spatially enhanced features are fused through a selective kernel fusion mechanism and output through a residual connection.
- 3. The tooth detection and FDI recognition system of claim 1, wherein the arch-aware query graph prior module constructs the query adjacency relations in one of the following ways: 1) calculating the Euclidean distance between the reference points of every pair of queries and, for each query i, selecting the k+1 nearest queries as its permitted attention set, where the k+1 nearest queries may include query i itself; or 2) dividing the queries into an upper arch group and a lower arch group according to the vertical coordinate of the reference point of each query vector at the current decoding layer, estimating within each group a one-dimensional parameter for each query along the dental arch curve from the reference point coordinates, selecting for each query the k nearest neighbors within its own arch group based on the one-dimensional parameter, constructing a k-nearest-neighbor graph, and generating a self-attention mask from the k-nearest-neighbor graph, so that in the self-attention computation of the Transformer decoding module each query attends only to the k nearest neighbors within its own arch group and interaction with all other queries is blocked.
- 4. The tooth detection and FDI recognition system of claim 1, wherein the class uniqueness gating module constructs an approximately one-to-one soft matching matrix between the set of FDI classes and the set of primary queries in the current image, and adds the log-probabilities of the soft matching matrix, as a penalty term, to the unnormalized classification scores (logits).
- 5. The tooth detection and FDI recognition system of claim 1, further comprising an input and preprocessing module for receiving the oral image and applying resizing, normalization, optional brightness/contrast correction, and cropping to obtain the oral image input tensor for the network.
- 6. The tooth detection and FDI recognition system of claim 1, further comprising an output module for receiving the output of the Transformer decoding module, processing the tooth detection boxes and class predictions, outputting for each tooth a bounding box in normalized or pixel form, an FDI number, and a confidence score, and filtering the output set by a confidence threshold.
- 7. The tooth detection and FDI recognition system of claim 1, further comprising a multi-scale feature extraction network for extracting multi-scale feature maps from the input oral image and outputting them to the hybrid frequency-spatial attention module, wherein the multi-scale feature extraction network comprises a convolutional neural network backbone and a feature fusion network (FPN/PAN), the backbone extracts multi-scale feature maps from the input oral image, and the feature fusion network further fuses these feature maps to obtain the multi-scale feature maps that are output.
- 8. A tooth detection and FDI recognition method, comprising the steps of: acquiring an oral image to be analyzed; inputting the oral image into the tooth detection and FDI recognition system according to any one of claims 1-7; and parsing the output of the tooth detection and FDI recognition system to obtain the bounding box of each tooth in the oral image and the FDI number corresponding to each bounding box.
- 9. An electronic device comprising the tooth detection and FDI recognition system according to any one of claims 1-7.
- 10. An electronic storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the tooth detection and FDI recognition method of claim 8.
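The frequency-domain branch of claim 2 can be illustrated with a minimal NumPy sketch. It is an illustration under assumptions, not the claimed implementation: the function names, the single radial cutoff of 0.25 cycles per sample, and the scalar band weights are hypothetical. In the claim the modulation weights are spatial maps produced by a depthwise separable convolution of the input features, which is replaced here by constants.

```python
import numpy as np

def split_bands(feature, cutoff=0.25):
    """Split a 2D feature map into low- and high-frequency spatial components.

    `cutoff` is a hypothetical band threshold in cycles per sample.
    """
    h, w = feature.shape
    spectrum = np.fft.fft2(feature)
    # Radial frequency of every FFT bin.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff
    # Inverse transform each band separately to get spatial-domain band features.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high

def modulate_and_merge(bands, weights):
    """Weight each band feature and sum the results.

    Assumption: scalar weights stand in for the learned spatial
    modulation weights described in the claim.
    """
    return sum(w * b for w, b in zip(weights, bands))

rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 8))
low, high = split_bands(feature)
# With unit weights the two bands recombine into the original feature map.
merged = modulate_and_merge([low, high], [1.0, 1.0])
```

Because the two band masks are complementary, unit weights reproduce the input exactly; weights above 1 on the high band would emphasize edges, which is one plausible use of the modulation step.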
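Option 1) of claim 3, the Euclidean k-nearest-neighbor attention set, can be sketched in plain Python as follows. The function name, the example reference points, and the convention that True means "attention allowed" are assumptions made for illustration; some frameworks use the inverse convention, in which case the mask would need to be negated.

```python
import math

def knn_attention_mask(ref_points, k):
    """Return mask[i][j] == True where query i may attend to query j.

    Each query keeps its k+1 nearest reference points by Euclidean
    distance; the query itself (distance 0) is among them.
    """
    n = len(ref_points)
    mask = [[False] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(ref_points):
        dists = [(math.hypot(xi - xj, yi - yj), j)
                 for j, (xj, yj) in enumerate(ref_points)]
        for _, j in sorted(dists)[: k + 1]:
            mask[i][j] = True
    return mask

# Hypothetical normalized reference points: three near the upper arch,
# two near the lower arch.
points = [(0.10, 0.30), (0.25, 0.28), (0.45, 0.30), (0.20, 0.70), (0.40, 0.70)]
mask = knn_attention_mask(points, k=1)
```

With k=1 each query in this toy layout attends only to itself and its closest neighbor, and no upper-arch query is connected to a lower-arch query.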
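The gating of claim 4 can be illustrated with a small sketch. The claim does not state how the soft matching matrix is built; the sketch assumes Sinkhorn normalization (alternating row and column normalization), which is one common way to obtain an approximately one-to-one soft assignment. The temperature `tau`, the iteration count, the penalty weight, and the square 3x3 toy scores are all hypothetical, and a training implementation would run the same arithmetic inside an automatic differentiation framework.

```python
import math

def sinkhorn(scores, tau=0.1, n_iters=50):
    """Approximately doubly-stochastic matrix from a square score matrix.

    Rows are queries, columns are FDI classes. tau and n_iters are
    hypothetical hyperparameters chosen for this toy example.
    """
    # Subtract each row's maximum before exponentiating for stability;
    # the row normalization below absorbs this per-row scaling.
    p = [[math.exp((v - max(row)) / tau) for v in row] for row in scores]
    n_cols = len(p[0])
    for _ in range(n_iters):
        # Row step: each query spreads one unit of mass over the classes.
        rows = []
        for row in p:
            s = sum(row)
            rows.append([v / s for v in row])
        # Column step: each class receives one unit of mass in total.
        col_sums = [sum(row[c] for row in rows) for c in range(n_cols)]
        p = [[v / col_sums[c] for c, v in enumerate(row)] for row in rows]
    return p

def gate_logits(logits, match, weight=1.0, eps=1e-9):
    """Add the log of the soft matching matrix to the logits as a penalty."""
    return [[v + weight * math.log(m + eps) for v, m in zip(row, mrow)]
            for row, mrow in zip(logits, match)]

def argmax_per_row(matrix):
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]

# Toy logits: queries 0 and 1 both prefer class 0 before gating.
logits = [[4.0, 1.0, 0.0],
          [3.5, 1.5, 0.0],
          [0.0, 0.5, 3.0]]
match = sinkhorn(logits)
gated = gate_logits(logits, match)
before = argmax_per_row(logits)
after = argmax_per_row(gated)
```

In this example the penalty pushes query 1 away from class 0, which query 0 claims more strongly, toward its second choice, so each class ends up predicted by a single query.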
Description
Tooth detection and FDI recognition system and method
Technical Field
The invention relates to the technical field of intelligent analysis of oral images and computer vision, and in particular to a tooth detection and FDI recognition system and method.
Background
Oral images are valuable in clinical diagnosis and treatment, automatic entry of electronic dental records (EDR), remote dental care, and similar scenarios. Structured interpretation of an oral image generally requires first locating the teeth and assigning them standardized numbers. Compared with natural scene images, intraoral photographs, rendered intraoral scans, and panoramic radiographs often exhibit specular reflection, overexposure, motion blur, occlusion, similar coloring of gums and teeth, and large variations in viewing angle and scale, so the edge and texture features of teeth are unstable. In addition, teeth are densely packed and similar in shape, which aggravates confusion in detection and classification.
Current methods for structured interpretation of oral images include the following. Anchor-based detectors (e.g., Faster R-CNN, SSD, YOLO) typically output a large number of candidate boxes, remove duplicates with non-maximum suppression (NMS), and then classify each box independently to obtain the tooth number.
Because teeth are tightly arranged and similar in shape, NMS thresholds, anchor scales, and related settings need frequent tuning. Under reflection, blur, or occlusion, candidate box scores are unstable, which easily leads to true boxes being suppressed or false boxes being retained. The same FDI class may be predicted by multiple boxes, and additional rule-based post-processing (such as sorting by x coordinate, grouping by dental arch, or keeping only the highest-scoring box among duplicate classes) is needed to repair the result.
Transformer detectors such as DETR and its variant Deformable DETR model object detection as set prediction and apply one-to-one training supervision through Hungarian matching, reducing dependence on anchors and NMS. However, such models generally predict class distributions independently and lack explicit constraints on the global uniqueness of FDI numbers and on the topological order of the dental arch. Under oral image noise and cross-device domain shift, they remain prone to duplicate numbers, confusion across arches, and local order swaps.
Some dental methods introduce adjacency graphs or relation branches, or perform graph optimization and sequence matching at inference time to correct the numbering. These methods often depend on the quality of the initial detections, are structurally complex with high inference cost, and construct graph edges unreliably under strong noise, which tends to propagate errors.
Therefore, a new tooth detection and recognition method for complex oral images is needed to improve the accuracy of tooth detection.
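The rule-based post-processing mentioned in the background (keeping only the highest-scoring box among duplicate FDI classes) can be sketched as follows. The dictionary layout, field names, and example values are hypothetical and serve only to illustrate the kind of heuristic that existing detectors rely on.

```python
def keep_best_per_class(detections):
    """Keep only the highest-scoring detection for each FDI number."""
    best = {}
    for det in detections:
        fdi = det["fdi"]
        if fdi not in best or det["score"] > best[fdi]["score"]:
            best[fdi] = det
    # Return the survivors in ascending FDI order.
    return [best[fdi] for fdi in sorted(best)]

# Hypothetical detections: FDI number 11 is predicted twice.
detections = [
    {"fdi": 11, "score": 0.91, "box": (0.40, 0.20, 0.48, 0.35)},
    {"fdi": 12, "score": 0.55, "box": (0.49, 0.21, 0.56, 0.35)},
    {"fdi": 11, "score": 0.62, "box": (0.50, 0.20, 0.57, 0.36)},
]
kept = keep_best_per_class(detections)
```

The heuristic removes the duplicate but simply discards the lower-scoring box. It cannot tell whether that box was a mislabeled neighboring tooth, which is the limitation the background attributes to such post-processing.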
Disclosure of Invention
To solve the above problems, the invention aims to provide a tooth detection and FDI recognition system and method. The hybrid frequency-spatial attention module improves the representation of tooth structure under reflection, blur, noise, and occlusion, which improves detection robustness and localization accuracy and thereby supports accurate localization and initial classification by the Transformer decoding module. The arch-aware query graph prior module and the class uniqueness gating module cooperate with the Transformer decoding module to ensure that its final output satisfies the clinical requirements of dental arch topological continuity and global uniqueness of FDI numbers.
The invention is realized by the following technical scheme:
A tooth detection and FDI recognition system includes a hybrid frequency-spatial attention module (HFSA), a Transformer decoding module, an arch-aware query graph prior module (QGP), and a class uniqueness gating module (UniGate);
The hybrid frequency-spatial attention module receives an input feature map of an oral image and enhances the feature map passing through it;
The Transformer decoding module comprises a fixed number of learnable queries, iteratively updates the queries through a multi-layer decoder, uses the enhanced features from the hybrid frequency-spatial attention module as encoder memory, and outputs a tooth detection box and a class prediction for each query;
The arch-aware query graph prior module cooperates with the Transformer decoding module, constructing a query adjacency relation in a self-attention sta