CN-121746441-B - Multi-mode image registration method, system and computer equipment based on deformation adaptation

CN121746441B

Abstract

The invention discloses a deformation-adaptation-based multi-modal image registration method, system, and computer device. Multiple groups of multi-modal images are collected and gray-level normalized to construct a diverse registration data set. A registration network model is constructed comprising a pyramid coding module, a deformation-adaptive module, a cross-modal interaction module, and a registration parameter estimation module. Image pairs are fed through these modules in sequence to extract basic feature maps, deformation feature maps, and interaction-enhanced feature maps, and the network finally outputs an estimated transformation parameter matrix. Training is supervised by a preset loss function, the network parameters with the lowest loss are selected, and a trained registration model is obtained. In actual application, an image pair to be registered is input into the trained model to obtain the transformation parameter matrix and complete registration. The method effectively improves multi-modal image registration performance, and retains good robustness and adaptability especially under severe geometric distortion and pronounced modality differences.

Inventors

  • ZHU QING
  • LI TIANMING
  • WANG YAONAN
  • LUO JIANQIAO
  • HUANG JIANAN
  • ZENG CHAO
  • HUANG MANYA

Assignees

  • Hunan University (湖南大学)

Dates

Publication Date
2026-05-08
Application Date
2026-02-28

Claims (8)

  1. A multi-modal image registration method based on deformation adaptation, the method comprising the steps of:
     S1, acquiring a plurality of groups of multi-modal images and performing gray-level normalization, dividing the normalized images into reference images and images to be registered, and combining them into a plurality of groups of image pairs, so as to construct a diverse registration data set;
     S2, constructing a deformation-adaptation-based multi-modal image registration neural network model, the model comprising a pyramid coding module, a deformation-adaptive module, a cross-modal interaction module, and a registration parameter estimation module connected in sequence;
     S3, sequentially inputting a pair of multi-modal images from the data set into the pyramid coding module to obtain the basic feature maps of the image pair, inputting these into the deformation-adaptive module to obtain the deformation feature maps of the image pair, inputting those into the cross-modal interaction module to generate the interaction-enhanced feature maps of the image pair, and finally inputting the interaction-enhanced feature maps into the registration parameter estimation module to output the estimated transformation parameter matrix of the corresponding image pair;
     wherein the deformation-adaptive module comprises a deformable convolution layer, a similarity calculation layer, a normalization layer, and a standard deviation layer connected in sequence; the stride of the deformable convolution layer is 1 and its kernel size is 3 × 3; the similarity calculation layer computes the cosine similarity between the central pixel and its adjacent pixels; inputting the basic feature maps of the image pair into the deformation-adaptive module in S3 to obtain the deformation feature maps of the image pair is specifically:
     (D_r, D_t) = A(F_r, F_t)    (3)
     where A denotes the deformation-adaptive module, F_r and F_t respectively denote the basic feature maps of the reference image and of the image to be registered in the selected image pair, and D_r and D_t respectively denote the deformation-adaptive feature maps of the reference image and of the image to be registered in the selected image pair;
     the cross-modal interaction module comprises a feature fusion layer, an attention weight calculation layer, and a dual-branch output convolution layer connected in sequence, wherein the dual-branch output convolution layer consists of two independent groups of convolution layers with stride 1 and kernel size 1 × 1; in S3, generating the interaction-enhanced feature maps of the image pair from the deformation feature maps through the cross-modal interaction module is specifically:
     (E_r, E_t) = C(D_r, D_t)    (4)
     where C denotes the cross-modal interaction module, D_r and D_t respectively denote the deformation-adaptive feature maps of the reference image and of the image to be registered in the selected image pair, and E_r and E_t respectively denote the interaction-enhanced feature maps of the reference image and of the image to be registered in the selected image pair;
     S4, based on the basic feature maps and the estimated transformation parameter matrices of the image pairs, calculating the total loss value of each image pair with a preset loss function, supervising the training of the multi-modal image registration neural network model, and selecting the network parameters with the minimum loss to update the model, thereby obtaining a trained multi-modal image registration neural network model;
     S5, acquiring multi-modal images to be registered in an actual scene to form an image pair to be registered, processing the pair with the trained multi-modal image registration neural network model to obtain an image transformation parameter matrix, and transforming the image to be registered in the pair according to that matrix to obtain the registered image.
  2. The deformation-adaptation-based multi-modal image registration method of claim 1, wherein in S1 a plurality of groups of multi-modal images are acquired and gray-level normalization is performed, the normalization being specifically expressed as:
     I'(x, y) = (I(x, y) − I_min) / (I_max − I_min)    (1)
     where I(x, y) and I'(x, y) respectively denote the pixel value at position (x, y) in the original image and in the normalized image, and I_max and I_min respectively denote the maximum and minimum pixel values in the original image.
  3. The deformation-adaptation-based multi-modal image registration method of claim 2, wherein the pyramid coding module comprises a starting convolution module and a three-level residual module connected in sequence; the starting convolution module comprises a convolution layer, a normalization layer, and an activation function layer connected in sequence, the convolution layer having stride 1 and kernel size 3 × 3; each residual module comprises a first convolution layer and a second convolution layer connected in sequence, the first having stride 2 and kernel size 3 × 3 and the second having stride 1 and kernel size 3 × 3, with a 1 × 1 convolution adjusting the channel dimension between layers to realize the residual connection; in S3, sequentially inputting a pair of multi-modal images from the data set into the pyramid coding module to obtain the basic feature maps of the image pair is specifically:
     (F_r, F_t) = P(I_r, I_t)    (2)
     where P denotes the pyramid coding module, I_r and I_t respectively denote the reference image and the image to be registered in the selected image pair, and F_r and F_t respectively denote the basic feature maps of the reference image and of the image to be registered in the selected image pair.
  4. The deformation-adaptation-based multi-modal image registration method of claim 3, wherein the registration parameter estimation module comprises a neighborhood cost volume layer, a first convolution layer, a second convolution layer, a normalization layer, an activation function layer, a max-pooling layer, and a third convolution layer connected in sequence; the first and third convolution layers have stride 1 and kernel size 1 × 1, and the second convolution layer has stride 1 and kernel size 3 × 3; in S3, sending the interaction-enhanced feature maps of the image pair into the registration parameter estimation module to output the transformation parameter matrix of the corresponding image pair is specifically:
     θ = R(E_r, E_t)    (5)
     where θ denotes the image transformation parameter matrix of the image pair, R denotes the registration parameter estimation module, and E_r and E_t respectively denote the interaction-enhanced feature maps of the reference image and of the image to be registered in the selected image pair.
  5. The deformation-adaptation-based multi-modal image registration method of claim 4, wherein the loss function in S4 comprises a consistency loss function and a registration ground-truth loss function; the consistency loss between the basic feature map of the reference image and that of the image to be registered, and the ground-truth loss between the estimated and true transformation parameter matrices, are calculated in turn, specifically expressed as:
     L_total = λ_1 L_con + λ_2 L_gt    (6)
     L_con = ‖T(F_t, θ) − F_r‖_2    (7)
     L_gt = ‖θ_e − θ_g‖_1    (8)
     where L_total denotes the total loss value, L_con the consistency loss value, and L_gt the registration ground-truth loss value; λ_1 and λ_2 are weight coefficients; T denotes the image transformation operation and θ the image transformation parameter matrix of the image pair; ‖·‖_2 denotes the Euclidean norm; F_r and F_t respectively denote the basic feature maps of the reference image and of the image to be registered in the selected image pair; ‖·‖_1 denotes the 1-norm; and θ_e and θ_g are respectively the estimated and true transformation parameter matrices.
  6. The deformation-adaptation-based multi-modal image registration method of claim 5, wherein transforming the image to be registered in the image pair according to the image transformation parameter matrix in S5 comprises:
     I_reg = T(I_t, θ)    (9)
     where I_reg denotes the registered image, I_t the image to be registered, T the image transformation operation, and θ the image transformation parameter matrix of the image pair.
  7. A deformation-adaptation-based multi-modal image registration system performing the method of any one of claims 1-6, comprising a data set determination module, a multi-modal image registration neural network model building module, an estimated transformation parameter matrix determination module for selected image pairs, a model training module, and an image registration module;
     the data set determination module is used for collecting a plurality of groups of multi-modal images and performing gray-level normalization, dividing the normalized images into reference images and images to be registered, and combining them into a plurality of groups of image pairs so as to construct a diverse registration data set;
     the multi-modal image registration neural network model building module is used for constructing a deformation-adaptation-based multi-modal image registration neural network model comprising a pyramid coding module, a deformation-adaptive module, a cross-modal interaction module, and a registration parameter estimation module connected in sequence;
     the estimated transformation parameter matrix determination module is used for sequentially inputting a pair of multi-modal images from the data set into the pyramid coding module to obtain the basic feature maps of the image pair, inputting these into the deformation-adaptive module to obtain the deformation feature maps, inputting those into the cross-modal interaction module to generate the interaction-enhanced feature maps, and finally sending the interaction-enhanced feature maps into the registration parameter estimation module to output the transformation parameter matrix of the corresponding image pair;
     the model training module is used for calculating the total loss value of each image pair with a preset loss function based on the basic feature maps and the estimated transformation parameter matrix, supervising the training of the multi-modal image registration neural network model, and selecting the network parameters with the minimum loss to update the model, thereby obtaining a trained multi-modal image registration neural network model;
     the image registration module is used for processing the multi-modal images to be registered with the trained multi-modal image registration neural network model to obtain an image transformation parameter matrix, and transforming the image to be registered in the pair according to that matrix to obtain the registered image.
  8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
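The similarity calculation layer of the deformation-adaptive module in claim 1 computes cosine similarity between the central pixel and its adjacent pixels. A minimal NumPy sketch is given below; the averaging over the 8-neighborhood and the function name are illustrative assumptions, since the claim does not fix the aggregation rule, and border pixels here wrap around for simplicity:

```python
import numpy as np

def neighbor_cosine_similarity(feat: np.ndarray) -> np.ndarray:
    """For a C x H x W feature map, return the mean cosine similarity
    between each pixel's feature vector and its 8 neighbors.
    Sketch only: neighborhood aggregation by averaging is an assumption,
    and np.roll wraps around at the borders."""
    eps = 1e-8
    norm = np.sqrt((feat ** 2).sum(axis=0)) + eps   # H x W vector norms
    unit = feat / norm                              # unit feature vectors
    sim = np.zeros(feat.shape[1:])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue                            # skip the center itself
            shifted = np.roll(np.roll(unit, dy, axis=1), dx, axis=2)
            sim += (unit * shifted).sum(axis=0)     # cosine similarity
    return sim / 8.0                                # average over 8 neighbors
```

On a spatially constant feature map every neighbor has the same direction, so the similarity map is 1 everywhere, which is a quick sanity check for the sketch.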
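The gray-level normalization of formula (1) in claim 2 is a standard min-max mapping of pixel values to [0, 1]. A minimal NumPy version follows; the function name and the zero-output convention for a flat image are illustrative choices, not from the patent:

```python
import numpy as np

def gray_normalize(img: np.ndarray) -> np.ndarray:
    """Min-max gray-level normalization:
    I'(x, y) = (I(x, y) - I_min) / (I_max - I_min)."""
    img = img.astype(np.float64)
    i_min, i_max = img.min(), img.max()
    if i_max == i_min:                  # flat image: avoid division by zero
        return np.zeros_like(img)
    return (img - i_min) / (i_max - i_min)
```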
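In the pyramid coding module of claim 3, the first convolution of each residual level has stride 2, so each of the three levels halves the spatial resolution. The sketch below illustrates only that shape hierarchy, replacing the learned convolutions with 2x2 average pooling; it is not the patent's feature extractor:

```python
import numpy as np

def pyramid_levels(img: np.ndarray, levels: int = 3) -> list:
    """Illustrate the resolution hierarchy of a three-level encoder with
    stride-2 first convolutions: each level halves height and width.
    Learned feature extraction is replaced by 2x2 average pooling,
    purely to show the shapes involved."""
    feats = []
    cur = img.astype(float)
    for _ in range(levels):
        h, w = cur.shape
        cur = cur[: h - h % 2, : w - w % 2]                 # crop to even size
        cur = (cur[0::2, 0::2] + cur[1::2, 0::2]
               + cur[0::2, 1::2] + cur[1::2, 1::2]) / 4.0   # 2x2 mean pool
        feats.append(cur)
    return feats
```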
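The neighborhood cost volume layer in claim 4 can be read as a local correlation between the two feature maps over a small window of integer displacements. The NumPy sketch below assumes a search radius and a dot-product cost, neither of which is specified in the claim, and wraps at the borders for brevity:

```python
import numpy as np

def neighborhood_cost_volume(f_ref: np.ndarray, f_mov: np.ndarray,
                             radius: int = 1) -> np.ndarray:
    """Correlate two C x H x W feature maps over a (2r+1)^2 neighborhood
    of integer displacements, yielding a (2r+1)^2 x H x W cost volume.
    Sketch only: the radius and the normalization by C are assumptions."""
    c, h, w = f_ref.shape
    d = 2 * radius + 1
    cost = np.zeros((d * d, h, w))
    k = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(np.roll(f_mov, dy, axis=1), dx, axis=2)
            cost[k] = (f_ref * shifted).sum(axis=0) / c   # per-pixel correlation
            k += 1
    return cost
```

With radius 1 the zero-displacement slice sits at index 4, and for identical inputs it equals the per-pixel feature energy, a quick way to check the displacement ordering.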
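The total loss of formulas (6)-(8) in claim 5 combines a feature-consistency term (Euclidean norm between the warped moving features and the reference features) with a ground-truth term (1-norm between estimated and true transformation parameters). A minimal NumPy version follows; the warping is assumed to have been applied already, and the weight values are placeholders since the claim does not fix them:

```python
import numpy as np

def total_loss(f_ref, f_mov_warped, theta_est, theta_gt,
               lam1: float = 1.0, lam2: float = 1.0) -> float:
    """L_total = lam1 * ||f_mov_warped - f_ref||_2 + lam2 * ||theta_est - theta_gt||_1.
    lam1/lam2 are the weight coefficients; their values are assumptions."""
    l_con = np.linalg.norm(f_mov_warped - f_ref)   # Euclidean (Frobenius) norm
    l_gt = np.abs(theta_est - theta_gt).sum()      # 1-norm of parameter error
    return lam1 * l_con + lam2 * l_gt
```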
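The transformation step of formula (9) in claim 6 applies the parameter matrix to the image to be registered. The patent does not state the parameterization of the matrix; the sketch below assumes a 2x3 affine matrix and uses inverse mapping with nearest-neighbor sampling, with out-of-bounds pixels set to 0:

```python
import numpy as np

def warp_affine(img: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine matrix theta = [A | t] to an H x W image by
    inverse mapping: for each output pixel, sample the source location
    A_inv @ (dst - t) with nearest-neighbor interpolation.
    The affine parameterization itself is an assumption."""
    h, w = img.shape
    out = np.zeros_like(img)
    a, t = theta[:, :2], theta[:, 2]
    a_inv = np.linalg.inv(a)
    for y in range(h):
        for x in range(w):
            sx, sy = a_inv @ (np.array([x, y], dtype=float) - t)
            xi, yi = int(round(sx)), int(round(sy))
            if 0 <= xi < w and 0 <= yi < h:   # zero outside the source
                out[y, x] = img[yi, xi]
    return out
```

The identity matrix leaves the image unchanged, and a pure translation shifts its columns, which makes the inverse-mapping convention easy to verify.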

Description

Multi-mode image registration method, system and computer equipment based on deformation adaptation

Technical Field

The present invention relates to the field of image data processing technologies, and in particular to a deformation-adaptation-based multi-modal image registration method, system, and computer device.

Background

The rapid development of multi-source imaging devices has led to the widespread use of optical, infrared, radar, and other novel sensors in remote sensing and mapping, medical imaging, autonomous driving, and industrial inspection. Multi-modal images acquired by different imaging mechanisms present complementary structural and spectral information, providing a more comprehensive basis for environment perception and data analysis. However, such images differ significantly in brightness distribution, texture detail, and spatial geometry, making accurate cross-modal registration a long-standing challenge. Traditional registration methods usually determine the spatial transformation from feature-point matching or gray-level similarity, but when faced with significant brightness and structural differences between modalities they often suffer from features that are hard to put in correspondence and from sensitivity to noise, limiting their overall accuracy and robustness. Deep learning methods alleviate this problem to some extent, but existing networks focus on globally rigid or simple non-rigid transformations and remain markedly inadequate at adapting to severe geometric distortion. To overcome these challenges, the invention provides a deformation-adaptation-based multi-modal image registration method that effectively improves multi-modal registration performance and retains good robustness and adaptability under severe geometric distortion and pronounced modality differences.
Disclosure of Invention

In view of the above technical problems, the invention provides a deformation-adaptation-based multi-modal image registration method, system, and computer device. The technical solution adopted to solve the technical problems is as follows: a multi-modal image registration method based on deformation adaptation, the method comprising the steps of: S1, acquiring a plurality of groups of multi-modal images and performing gray-level normalization, dividing the normalized images into reference images and images to be registered, and combining them into a plurality of groups of image pairs so as to construct a diverse registration data set; S2, constructing a deformation-adaptation-based multi-modal image registration neural network model comprising a pyramid coding module, a deformation-adaptive module, a cross-modal interaction module, and a registration parameter estimation module connected in sequence; S3, sequentially inputting a pair of multi-modal images from the data set into the pyramid coding module to obtain the basic feature maps of the image pair, inputting these into the deformation-adaptive module to obtain the deformation feature maps, inputting those into the cross-modal interaction module to generate the interaction-enhanced feature maps, and finally inputting the interaction-enhanced feature maps into the registration parameter estimation module to output the estimated transformation parameter matrix of the corresponding image pair; S4, based on the basic feature maps and the estimated transformation parameter matrices of the image pairs, calculating the total loss value of each image pair with a preset loss function, supervising the training of the multi-modal image registration neural network model, and selecting the network parameters with the minimum loss to update the model, thereby obtaining a trained multi-modal image registration neural network model; S5, acquiring multi-modal images to be registered in an actual scene to form an image pair to be registered, processing the pair with the trained model to obtain an image transformation parameter matrix, and transforming the image to be registered in the pair according to that matrix to obtain the registered image. Preferably, in S1, a plurality of groups of multi-modal images are collected and gray-level normalization is performed, which can be specifically expressed as:

I'(x, y) = (I(x, y) − I_min) / (I_max − I_min)    (1)

where I(x, y) and I'(x, y) respectively denote the pixel value at position (x, y) in the original image and in the normalized image, and I_max and I_min respectively denote the maximum and minimum pixel values in the original image. Preferably, the pyramid coding module comprises a s