CN-122023524-A - Unsupervised anatomical landmark positioning method based on double-layer alignment and super-resolution head
Abstract
The invention discloses an unsupervised method for accurately positioning anatomical landmarks based on double-layer alignment and a super-resolution head, which aims to solve the domain shift problem caused by differences between the imaging devices of different clinical centers. The method comprises the following main steps: S1, constructing an input-layer alignment module that uses adaptive instance normalization (AdaIN) to generate target-style images while preserving the original spatial anatomical structure; S2, constructing an output-layer alignment module that adopts a mean teacher framework and guides student model learning through high-quality pseudo labels and an adaptive landmark occlusion strategy; S3, introducing a lightweight super-resolution head that fuses the multi-scale features extracted by a backbone network and uses pixel rearrangement to generate a high-resolution heat map. Compared with existing methods, the method has the following advantages: (1) input-output double-layer alignment effectively improves cross-domain adaptability; (2) the super-resolution head allows coordinates to be decoded directly without complex post-processing, which reduces quantization error and computational complexity.
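As an illustration of step S1, the following minimal Python sketch applies adaptive instance normalization to toy feature channels. It assumes the usual AdaIN formulation, in which each source channel is normalized with its own mean and standard deviation and then rescaled with the statistics of the target channel. The helper names `channel_stats`, `adain` and `blend`, the `eps` stabilizer and the blending weight `alpha` are illustrative assumptions, and the encoder and decoder networks of a real pipeline are omitted.

```python
import math


def channel_stats(channel, eps=1e-5):
    """Mean and standard deviation of one feature channel (a flat list of floats)."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def adain(content, style, eps=1e-5):
    """Give every content channel the mean and std of the matching style channel.

    Only a per-channel scale and shift is applied, so the spatial ordering of
    values inside each content channel is left unchanged.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = channel_stats(c_chan, eps)
        s_mean, s_std = channel_stats(s_chan, eps)
        out.append([s_std * (v - c_mean) / c_std + s_mean for v in c_chan])
    return out


def blend(content, stylized, alpha):
    """Interpolate between content and stylized features (alpha=0 keeps the content).

    The interpolation weight is an assumption of this sketch, modeled on the
    source-target trade-off parameter that controls the degree of stylization.
    """
    return [[(1 - alpha) * c + alpha * t for c, t in zip(c_chan, t_chan)]
            for c_chan, t_chan in zip(content, stylized)]


# Toy data: one channel with four spatial positions.
content = [[0.0, 1.0, 2.0, 3.0]]      # source features
style = [[10.0, 10.0, 14.0, 14.0]]    # target features (mean 12, std 2)
stylized = adain(content, style)
half_stylized = blend(content, stylized, 0.5)
```

After the call, `stylized[0]` carries the mean and spread of the target channel while its values keep the same left-to-right ordering as the source channel, which is the property the abstract describes as preserving the spatial anatomical structure.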
Inventors
- LU GANG
- LI JIAHENG
- XIAO YIBO
- LIN XIANGHONG
Assignees
- Northwest Normal University (西北师范大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-28
Claims (10)
- 1. An unsupervised anatomical landmark positioning method based on double-layer alignment and a super-resolution head, characterized by comprising the following steps: S1, acquiring source domain and target domain images, and using adaptive instance normalization (AdaIN) at the input layer to convert the source domain images into images with the target domain style while preserving their spatial anatomical structure; S2, adopting a teacher-student training paradigm at the output layer, in which pseudo labels generated by a teacher model and a consistency regularization constraint on unlabeled samples guide the student model to learn in the target domain; and S3, extracting multi-scale features with a backbone network, generating a high-resolution heat map by fusing the multi-scale features and the low-resolution heat map through a lightweight super-resolution head module, and finally decoding the heat map to obtain the spatial coordinates of the anatomical landmarks.
- 2. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 1, wherein step S1 comprises: given source domain training data $D_s = \{(x_i^s, p_i^s)\}_{i=1}^{N_s}$ and target domain training data $D_t = \{x_i^t\}_{i=1}^{N_t}$, wherein $x_i \in \mathbb{R}^{h \times w}$ represents the i-th cephalometric X-ray image, $p_i \in \mathbb{R}^{L \times 2}$ represents the real coordinates of the corresponding L landmarks, h and w represent the height and width of the image, respectively, and $N_s$ and $N_t$ represent the numbers of training samples of the source domain and the target domain, respectively; adaptive instance normalization (AdaIN) combines the spatial anatomy of the original image with the style information of the target image to accomplish efficient style transfer within the input space; given a source image $x_s$ and a target image $x_t$, the source domain and target domain features are fused by AdaIN according to the following formula: $\mathrm{AdaIN}(x_s, x_t) = \sigma(x_t)\,\frac{x_s - \mu(x_s)}{\sigma(x_s)} + \mu(x_t)$; wherein $\mu(\cdot)$ and $\sigma(\cdot)$ represent the channel-wise mean and standard deviation, respectively; finally, the decoder g transforms the fused features into an image in the target domain style, wherein $\alpha$ is the source-target trade-off parameter that controls the degree of stylization, and the decoder g is made up of multiple convolutional layers, ReLU activation functions, mirror (reflection) padding and nearest-neighbor upsampling operations.
- 3. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 1, wherein step S2 adopts a teacher-student training paradigm at the output layer, specifically comprising: adopting a teacher-student collaborative training paradigm in which, in the fully supervised training stage, the supervised branch is optimized by minimizing the mean square error loss $\mathcal{L}_{sup}$ between the real heat map $H$ and the heat map $\hat{H}$ predicted by the student model, expressed as: $\mathcal{L}_{sup} = \frac{1}{L}\sum_{l=1}^{L} \lVert H_l - \hat{H}_l \rVert_2^2$; wherein $H$ represents the real heat map constructed from L two-dimensional Gaussian kernels and y represents the pixel position: $H_l(y) = \exp\left(-\frac{\lVert y - p_l \rVert_2^2}{2\sigma^2}\right)$, with $p_l$ the real coordinate of the l-th landmark and $\sigma$ the standard deviation of the Gaussian kernel.
- 4. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 3, wherein in step S2 a dual-branch perturbation strategy is adopted for the unlabeled samples: samples processed by spatial augmentation $A_1$ are input into the student branch, and samples subjected to the double perturbation of spatial augmentation $A_2$ and stylization are input into the teacher branch; to establish a closed supervision loop, the outputs of the two branches are spatially aligned through inverse spatial transformation; and an adaptive landmark occlusion strategy is adopted, which makes adjustments by occluding the image regions corresponding to high-confidence points.
- 5. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 4, wherein adopting the adaptive landmark occlusion strategy, which makes adjustments by occluding the image regions corresponding to high-confidence points, specifically comprises: first, applying the inverse transformation $A_2^{-1}$ to project the predicted heat map generated by the teacher model back to the original input image space; then using the known transformation matrix $A_1$ to align the heat map to the input space of the student model; and then replacing the central regions of randomly selected high-confidence landmarks with pixel blocks randomly sampled from other positions in the same image.
- 6. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 4, wherein step S2 further provides a pseudo-label learning mechanism based on high-confidence screening; specifically, the l-th landmark contributes to the pseudo-label supervision only when the maximum activation value of the l-th heat map $\hat{H}_l$ is greater than the adaptive landmark occlusion threshold $\tau$; the threshold is obtained through dynamic estimation, the top p% quantile of the maximum activation values of the heat maps corresponding to the landmarks being selected as the limit; in the loss on unlabeled samples, the consistency term of each landmark is weighted by $\mathbb{1}(\max \hat{H}_l > \tau)$, wherein $\mathbb{1}(\cdot)$ is the indicator function, returning 1 when the specified condition is met and 0 otherwise, so that the model effectively captures global-local spatial context relations based on anatomical prior knowledge; the joint training objective function of the positioning network is defined as: $\mathcal{L} = \mathcal{L}_{sup} + \lambda\,\mathcal{L}_{unsup}$; wherein $\lambda$ is a weight coefficient; the exponential moving average (EMA) helps the model converge more stably during training and enhances its adaptability, and the teacher weights $\theta_t$ are updated at each step from the student weights $\theta_s$ by EMA: $\theta_t \leftarrow \beta\,\theta_t + (1 - \beta)\,\theta_s$; the smoothing coefficient hyperparameter $\beta$ is set to 0.999.
- 7. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 1, wherein in step S3 the backbone network adopts HRNet-48 to extract multi-scale features rich in semantic and spatial detail; for an input image $x$, the backbone network $B$ generates the multi-scale features and a low-resolution heat map: $(\{F_k\}, H_{lr}) = B(x)$; wherein $F_k$ denotes the feature map with a stride of $k$ pixels relative to the input image, and $H_{lr}$ refers to the low-resolution heat map whose size is 1/32 of the original image resolution.
- 8. The method for accurately positioning unsupervised anatomical landmarks based on double-layer alignment and a super-resolution head according to claim 7, wherein, to integrate cross-level features efficiently, the fusion module $\Phi$ first maps the features with a convolution operator and upsamples them to ensure consistent spatial resolution; the aligned features are concatenated along the channel dimension; the concatenated features undergo a preliminary dimension compression through a pointwise convolution (PW Conv) and are then input to two cascaded functional blocks, each consisting of a depthwise convolution (DW Conv), batch normalization (BN), a pointwise convolution and a GELU activation function; the final fused representation is generated by a pointwise convolution operation; the fusion is defined mathematically as: $M = \Phi(\{F_k\}, H_{lr})$; wherein $H_{lr}$ is the low-resolution heat map; the fused feature M is then input into an encoder composed of PW convolution operations and ReLU to generate a landmark embedding $E_l$ for each landmark; for the l-th landmark, a large-kernel convolution is first applied to generate a feature map with $s^2$ channels, which is then upsampled into a high-resolution heat map by a pixel rearrangement (PS) operation: $H_l^{hr} = \mathrm{PS}(\mathrm{Conv}(E_l))$; by reorganizing channel-dimension information, the pixel rearrangement operation raises the spatial resolution without introducing interpolation errors, thereby effectively avoiding the quantization bias of traditional upsampling methods.
- 9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the unsupervised anatomical landmark point accurate positioning method based on dual layer alignment and super resolution head according to any one of claims 1 to 8 when executing the program.
- 10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the unsupervised anatomical landmark point accurate positioning method based on dual layer alignment and super resolution head according to any one of claims 1 to 8.
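The pixel rearrangement recited in claim 8 can be sketched in a few lines of Python. The example is a minimal illustration, not the claimed implementation: it assumes the channel layout used by common pixel-shuffle operators (channel index `i * s + j` fills sub-position `(i, j)` of every `s x s` output block), it skips the large-kernel convolution that would produce the `s * s` channels, and the argmax decoding in `decode_argmax` is an assumed stand-in for the coordinate decoding step. All function and variable names are hypothetical.

```python
def pixel_shuffle(channels, s):
    """Rearrange s*s channel maps of size H x W into one (s*H) x (s*W) map.

    Channel k = i * s + j supplies the output pixel at row offset i and
    column offset j inside each s x s block. Values are only moved, never
    interpolated.
    """
    if len(channels) != s * s:
        raise ValueError("expected s*s channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * s) for _ in range(h * s)]
    for k, chan in enumerate(channels):
        i, j = divmod(k, s)
        for r in range(h):
            for c in range(w):
                out[r * s + i][c * s + j] = chan[r][c]
    return out


def decode_argmax(heatmap):
    """Return (row, col) of the largest heat-map value."""
    best = max((v, r, c)
               for r, row in enumerate(heatmap)
               for c, v in enumerate(row))
    return best[1], best[2]


# Toy input: s = 2, so four 2 x 2 channel maps for a single landmark.
channels = [
    [[0.1, 0.2], [0.3, 0.4]],   # sub-position (0, 0)
    [[0.0, 0.0], [0.0, 0.9]],   # sub-position (0, 1)
    [[0.0, 0.0], [0.0, 0.0]],   # sub-position (1, 0)
    [[0.0, 0.0], [0.0, 0.0]],   # sub-position (1, 1)
]
hr = pixel_shuffle(channels, 2)   # 4 x 4 high-resolution heat map
row, col = decode_argmax(hr)      # landmark location on the 4 x 4 grid
```

Because the operation only moves values from the channel dimension into spatial positions, the peak stored in the second channel ends up at a sub-cell location that the original 2 x 2 grid could not represent.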
Description
Unsupervised anatomical landmark positioning method based on double-layer alignment and super-resolution head
Technical Field
The invention relates to the field of computer vision and medical image processing, in particular to an unsupervised domain adaptation method for positioning anatomical landmarks in cephalometric X-ray images.
Background
As a routine examination in stomatology, the lateral cephalometric radiograph has irreplaceable value in clinical applications such as malocclusion analysis, the design of orthodontic and orthognathic treatment plans, and the evaluation of growth and development. The accuracy of landmark positioning in cephalometric measurement directly determines the credibility of the measured parameters, and thereby influences clinical diagnosis and the formulation of treatment strategies. Traditional deep learning methods perform well when training data and test data follow the same distribution, but in real clinical settings the data exhibit marked domain shift because of factors such as the imaging equipment used by different medical centers and differences between patient populations. Such domain shift severely hampers the generalization ability of a model, so that a model trained on one dataset performs poorly on data from another center. Existing unsupervised domain adaptation methods usually attempt alignment at the feature layer or the output layer, but suffer from two problems: (1) traditional adversarial training leads to unstable training or mode collapse; (2) existing heat map regression methods have larger quantization error when the output resolution is low. Therefore, there is a need for an effective method that simultaneously addresses cross-domain feature alignment and improves positioning accuracy.
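The quantization problem of low-resolution heat maps can be made concrete with a small one-dimensional Python sketch. It assumes an idealized heat map whose peak falls on the low-resolution cell nearest to the true landmark and a plain argmax decoding; the function names, the one-dimensional setting and the image width of 64 pixels are illustrative assumptions.

```python
def decode_low_res(true_x, stride):
    """Coordinate recovered by argmax on a heat map downsampled by `stride`.

    The peak is assumed to sit on the low-resolution cell nearest to the
    true position; mapping the cell index back to input pixels multiplies
    it by the stride, so every finer offset is lost.
    """
    cell = (true_x + stride // 2) // stride   # nearest low-resolution cell
    return cell * stride


def worst_case_error(stride, width=64):
    """Largest absolute decoding error over all integer positions in the image."""
    return max(abs(decode_low_res(x, stride) - x) for x in range(width))


# Error grows with the downsampling factor of the heat map.
errors = {stride: worst_case_error(stride) for stride in (1, 4, 8, 32)}
```

Under these assumptions the worst-case error is half the stride for even strides, which is why a heat map at a coarse resolution cannot localize a landmark precisely unless its resolution is restored or the offset is recovered by post-processing.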
A search found application publication number CN109166183B, an anatomical landmark identification method comprising: establishing a first three-dimensional model from three-dimensional data; marking the anatomical landmarks of the first three-dimensional model with corresponding mesh labels; flattening the first three-dimensional model to obtain two-dimensional image data; flattening the mesh labels corresponding to the first three-dimensional model to obtain two-dimensional label data; training on the two-dimensional image data and the two-dimensional label data to obtain a second model; feeding multiple groups of prediction data to the second model to obtain multiple groups of prediction results; applying the inverse of the global seamless parameterization process to the prediction results to obtain multiple third three-dimensional models; determining a target model from the third three-dimensional models; and determining the landmarks corresponding to the target mesh labels according to the target model and a preset energy function. While remaining practically usable, that method reduces computational complexity and improves time efficiency.
In other words, CN109166183B proposes an anatomical landmark recognition method based on three-dimensional volume data. The method constructs a three-dimensional model, flattens it, maps the three-dimensional anatomical landmarks into two-dimensional label data, trains a model on the two-dimensional images and two-dimensional labels, reconstructs the three-dimensional model by inverse parameterization of several prediction results, and determines the landmark positions with a preset energy function. The method reduces computational complexity to a certain extent and improves processing efficiency. However, it has two limitations. First, it does not explicitly model cross-domain imaging differences, so its generalization capability is limited. Second, the landmark positions pass through a multi-stage "three-dimensional to two-dimensional to three-dimensional" mapping and inverse computation, and a prediction deviation or parameter error at any stage can be amplified by the subsequent inverse transformation, which degrades the final positioning accuracy; the effect is especially pronounced in fine-grained anatomical structure positioning tasks. Compared with CN109166183B, the present method does not depend on three-dimensional modeling or parametric inversion; instead, it achieves high-precision positioning of cross-domain anatomical landmarks in two-dimensional image space through a double-layer aligned unsupervised learning mechanism and a super-resolution heat map encoding strategy, and is more suitable for complex and multi-center clinica