CN-121982101-A - Colon endoscope camera pose estimation method based on iterative refinement strategy

Abstract

A colonoscope camera pose estimation method based on an iterative refinement strategy, relating to the technical fields of medical image processing and computer vision. The method comprises constructing and preprocessing an image dataset; constructing the overall network framework; designing and implementing a feature extraction network, a camera motion classification network, and an iterative relative pose refinement network; and training and using the network model. The camera motion classification network distinguishes the two camera motion modes, insertion and withdrawal, and the coarse-to-fine strategy of the iterative relative pose refinement network improves pose estimation accuracy. Together these improve the learning ability and generalization of the method and enable accurate estimation of the colonoscope camera pose. The method can be widely applied to colonoscope camera pose estimation tasks and supplies camera poses for clinical applications such as colonoscopic three-dimensional reconstruction and polyp screening.

Inventors

LI ZHI
ZHANG ZHIPENG
MA RONGKANG
ZHANG GUIXU
ZENG TIEYONG

Assignees

East China Normal University (华东师范大学)

Dates

Publication Date: 20260505
Application Date: 20260129

Claims (4)

1. A colonoscope camera pose estimation method based on an iterative refinement strategy, characterized by comprising the following steps:
Step S1: construction of the dataset. The existing SimCol3D dataset is divided in proportion into a training set, a validation set, and a test set, wherein the training set is used for model training and parameter optimization, the validation set is used to evaluate model performance at different training stages, and the test set is used for the final performance test of the model;
Step S2: preprocessing of the dataset. The image data in the training set, the validation set, and the test set are preprocessed, the preprocessing comprising unifying the image resolution, normalizing pixel values, adding noise, and adjusting contrast, so as to reduce the risk of overfitting and enhance the robustness of the model; within the same camera motion sequence, two frames separated by a set time interval are paired to form an image pair used as the network input;
Step S3: construction of the colonoscope camera pose estimation model based on the iterative refinement strategy. The constructed model consists of a feature extraction network, a camera motion classification network, and an iterative relative pose refinement network, specifically:
S3-1: design and implementation of the feature extraction network. A ResNet-18 network is used as the feature extraction network to extract multi-scale features from each input endoscope image, and the top-level feature map is selected as the image representation. Channel-wise global average pooling (GAP) is applied to the top-level feature maps to generate pose features, which are concatenated along the channel dimension to obtain the pose feature vector. The pose feature vector provides the feature input for the subsequent camera motion classification network and iterative relative pose refinement network;
S3-2: design and implementation of the camera motion classification network. The camera motion classification network (CMCN) performs binary classification on the pose feature vector to determine whether the input image pair belongs to insertion motion or withdrawal motion, the two classes being defined by the camera's translation along a coordinate axis. The network outputs a probability vector giving the probabilities that the motion of one image relative to the other belongs to each of the two classes. This probability vector is combined with the pose estimate output by the iterative relative pose refinement network to generate the final camera pose estimate;
S3-3: design and implementation of the iterative relative pose refinement network. The iterative relative pose refinement network (IRPRN) adopts a coarse-to-fine iterative refinement strategy. The network first applies an initial rotation head and an initial translation head to the pose feature vector to generate an initial relative pose consisting of an initial rotation vector and an initial translation vector. Each relative pose refinement layer in the network then applies dynamic convolution to extract features from the feature map of the previous layer and produces that layer's rotation vector offset and translation vector offset, from which the layer's refined relative pose is generated. In the relative pose finally generated by the network, the rotation vector and the translation vector each comprise an insertion-motion part and a withdrawal-motion part. Finally, the probability vector output by the camera motion classification network and the relative pose output by the iterative relative pose refinement network are combined by weighted summation to obtain the final camera pose output by the model;
Step S4: training and use of the network model. The model constructed in step S3 is trained with the training data processed in steps S1 and S2. After each training round, the absolute translation error (ATE), relative translation error (RTE), and rotation error (RE) are computed on the validation set to verify the training effect, and the final model is saved after training. When the saved model is used to estimate poses for colonoscope images in the test set, a pair of colonoscope images to be estimated is input, and the model outputs the camera pose for use in subsequent three-dimensional reconstruction or polyp screening tasks.
2. The colonoscope camera pose estimation method based on an iterative refinement strategy according to claim 1, wherein the existing SimCol3D dataset is divided into a training set, a validation set, and a test set in a ratio of 7:2:1.
3. The colonoscope camera pose estimation method based on an iterative refinement strategy according to claim 1, wherein the camera motion classification network in step S3-2 adopts a multi-layer fully connected network structure in which each linear layer is followed in sequence by a ReLU activation function and Dropout, and the camera motion classification result is finally output through a Softmax function.
4. The colonoscope camera pose estimation method based on an iterative refinement strategy according to claim 1, wherein the iterative relative pose refinement network in step S3-3 adopts a coarse-to-fine iterative refinement strategy in which three relative pose refinement layers iteratively refine the initial relative pose, the dynamic convolution in each layer adaptively adjusts the convolution kernel weights to focus on edge and corner regions, and all rotation heads and translation heads in the network are multi-layer fully connected networks.
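
The formulas referred to in claim 1 can be sketched as follows. All notation here is assumed for illustration, since the original symbols are not present in this text: the image pair is written (I_a, I_b), ResNet-18 is written Φ, the heads are written H_r and H_t, and X_0 stands for whatever feature map feeds the first refinement layer, which the surviving text does not specify. The additive update is also an assumption, chosen because the claim describes the per-layer outputs as "offsets". With three refinement layers (claim 4), i runs from 1 to 3.

```latex
% Assumed notation, not taken from the source text.
\begin{align}
F_a &= \Phi(I_a), \quad F_b = \Phi(I_b) \\
f &= \mathrm{Concat}\bigl(\mathrm{GAP}(F_a),\ \mathrm{GAP}(F_b)\bigr) \\
p &= \mathrm{CMCN}(f) = \bigl(p^{\mathrm{ins}},\ p^{\mathrm{wd}}\bigr) \\
r_0 &= H_r^{0}(f), \quad t_0 = H_t^{0}(f) \\
X_i &= \mathrm{DConv}_i(X_{i-1}) \\
\Delta r_i &= H_r^{i}(X_i), \quad \Delta t_i = H_t^{i}(X_i) \\
r_i &= r_{i-1} + \Delta r_i, \quad t_i = t_{i-1} + \Delta t_i \\
\hat{r} &= p^{\mathrm{ins}}\, r^{\mathrm{ins}} + p^{\mathrm{wd}}\, r^{\mathrm{wd}}, \quad
\hat{t} = p^{\mathrm{ins}}\, t^{\mathrm{ins}} + p^{\mathrm{wd}}\, t^{\mathrm{wd}}
\end{align}
```

Here the superscripts ins and wd mark the insertion and withdrawal parts of the last layer's rotation and translation vectors, and the final line is the probability-weighted summation that yields the output pose.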

Description

Colon endoscope camera pose estimation method based on iterative refinement strategy

Technical Field

The invention relates to the technical fields of medical image processing and computer vision, and in particular to a colonoscope camera pose estimation method based on an iterative refinement strategy. The method is suited to clinical scenarios such as colonoscopic three-dimensional reconstruction and polyp screening, and addresses the difficulty of estimating camera pose in endoscopic scenes caused by complex structure, self-occlusion, and illumination changes.

Background

Colonoscopy is a core means of colon cancer screening. Three-dimensional reconstruction of the colonoscopic scene can help doctors find areas of the colon wall that have not been adequately examined, thereby reducing the miss rate for polyps on the intestinal wall. Accurate camera pose estimation is a key prerequisite for such reconstruction. Existing endoscope pose estimation methods fall mainly into two categories. The first is based on self-supervised learning: depth and pose networks are trained with an image warping loss and need no labeled data, but these methods cannot learn the insertion and withdrawal motions specific to a colonoscope camera and tend to converge to local optima, which biases the camera pose estimate. The second is based on supervised learning and relies on synthetic data for training, but the camera motion in existing synthetic datasets differs greatly from real endoscope motion, so generalization is poor, and the pose estimation networks used lack an iterative refinement mechanism, making it hard to cope with the complexity of colonoscopic scenes.
In addition, some feature-based SLAM or SfM methods are limited by self-occlusion, missing texture, and inconsistent illumination in colonoscopic scenes, and have a low reconstruction success rate. Designing a colonoscope camera pose estimation method that can learn the camera's motion modes, generalizes well, and estimates accurately is therefore important for advancing the clinical application of endoscopic three-dimensional reconstruction and polyp screening.

Disclosure of Invention

The invention aims to provide a colonoscope camera pose estimation method based on an iterative refinement strategy that addresses the shortcomings of existing methods: inability to learn camera motion, poor generalization, and low estimation accuracy. The method obtains training images and ground-truth labels from the SimCol3D dataset, uses a camera motion classification network to learn the camera motion mode, and combines it with the coarse-to-fine iterative refinement strategy of an iterative relative pose refinement network to improve camera pose estimation accuracy, finally achieving accurate prediction of the colonoscope camera pose.
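
How the motion classification and the coarse-to-fine refinement combine can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `softmax`, `refine`, and `fuse`, the 6-component pose layout, the additive offset update, and every numeric value are hypothetical. In the method itself the offsets come from learned rotation and translation heads rather than fixed lists.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def refine(initial_pose, offsets):
    """Coarse-to-fine refinement: add each layer's offset to the running pose.

    initial_pose: 6 floats (assumed layout: 3 rotation-vector components
    followed by 3 translation components).
    offsets: one 6-float offset per refinement layer (hypothetical values
    standing in for the outputs of the rotation and translation heads).
    """
    pose = list(initial_pose)
    for delta in offsets:
        pose = [p + d for p, d in zip(pose, delta)]
    return pose


def fuse(probs, pose_insertion, pose_withdrawal):
    """Probability-weighted sum of the insertion and withdrawal pose branches."""
    p_ins, p_wd = probs
    return [p_ins * a + p_wd * b for a, b in zip(pose_insertion, pose_withdrawal)]


# Hypothetical classifier logits favouring insertion motion.
probs = softmax([2.0, 0.0])

# Three refinement layers per branch, each nudging the last translation component.
pose_ins = refine([0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [[0.0] * 5 + [0.1]] * 3)
pose_wd = refine([0.0, 0.0, 0.0, 0.0, 0.0, -1.0], [[0.0] * 5 + [-0.1]] * 3)

final_pose = fuse(probs, pose_ins, pose_wd)
print(final_pose)
```

Because the two branches are blended rather than selected by a hard decision, a confident classifier output pulls the result towards one branch while an uncertain one averages them.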
The specific technical scheme for achieving the aim of the invention is as follows:

A colonoscope camera pose estimation method based on an iterative refinement strategy obtains training data from the SimCol3D dataset and estimates the camera pose for colonoscope images through the cooperation of a feature extraction network, a camera motion classification network, and an iterative relative pose refinement network, specifically comprising the following steps:

Step S1: construction of the dataset. The dataset is divided into a training set, a validation set, and a test set in a ratio of 7:2:1, wherein the training set is used for model training and parameter optimization, the validation set is used to evaluate model performance at different training stages, and the test set is used for the final performance test of the model;

Step S2: preprocessing of the dataset. The image data in the training set, the validation set, and the test set are preprocessed, the preprocessing comprising unifying the image resolution, normalizing pixel values, adding noise, and adjusting contrast, so as to reduce the risk of overfitting and enhance the robustness of the model; within the same camera motion sequence, two frames separated by a set time interval are paired to form an image pair used as the network input;

Step S3: construction of the colonoscope camera pose estimation model based on the iterative refinement strategy. The constructed model consists of a feature extraction network, a camera motion classification network, and an iterative relative pose refinement network, specifically:

3-1 design and implementation of feature ext