US-12620091-B2 - Methods of automatic segmentation of anatomy in artifact affected CT images with deep neural networks and applications of same

US12620091B2

Abstract

Methods and systems for segmentation of structures of interest (SOI) in a CT image post-operatively acquired from an implant user in a region of interest in which an implant is implanted. The method includes inputting the post-operatively acquired CT (Post-CT) image to trained networks to generate a dense deformation field (DDF) from the input Post-CT image to an atlas image; and warping a segmentation mesh of the SOI in the atlas image to the input Post-CT image using the DDF so as to generate the segmentation mesh of the SOI in the input Post-CT image, wherein the segmentation mesh of the SOI in the atlas image is generated by applying an active shape model-based method to the atlas image.

Inventors

  • Benoit M. Dawant
  • Jianing Wang
  • Jack H. Noble
  • Robert F. Labadie

Assignees

  • VANDERBILT UNIVERSITY

Dates

Publication Date
2026-05-05
Application Date
2022-04-26

Claims (20)

  1. A method for segmentation of structures of interest (SOI) in a computed tomography (CT) image post-operatively acquired from an implant user in a region of interest in which an implant is implanted, comprising: providing an atlas image, a dataset and networks, wherein the dataset comprises a plurality of CT image pairs, randomly partitioned into a training set, a validation set, and a testing set, wherein each CT image pair has a pre-implantation CT (Pre-CT) image and a post-implantation CT (Post-CT) image respectively acquired in a region of a respective implant recipient before and after an implant is implanted in the region, so that the Pre-CT image and the Post-CT image of each CT image pair are an artifact-free CT image and an artifact-affected CT image, respectively; wherein the atlas image is a Pre-CT image of the region of a subject that is not in the plurality of CT image pairs; and wherein the networks comprise a first network for registering the atlas image to each Post-CT image and a second network for registering each Post-CT image to the atlas image; training the networks with the plurality of CT image pairs, so as to learn to register the artifact-affected CT images and the atlas image with assistance of the paired artifact-free CT images; inputting the CT image post-operatively acquired from the implant user to the trained networks to generate a dense deformation field (DDF) from the input Post-CT image to the atlas image; and warping a segmentation mesh of the SOI in the atlas image to the input Post-CT image using the DDF so as to generate the segmentation mesh of the SOI in the input Post-CT image, wherein the segmentation mesh of the SOI in the atlas image is generated by applying an active shape model-based (ASM) method to the atlas image.
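The final step of claim 1, warping the atlas segmentation mesh into the Post-CT image with the DDF, can be sketched as interpolating the dense displacement field at each mesh vertex. This is a minimal illustration, not the patent's implementation; the function name and the (3, X, Y, Z) DDF layout are assumptions.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_mesh_with_ddf(vertices, ddf):
    """Move mesh vertices by trilinearly interpolating a dense
    deformation field (DDF) at each vertex position.

    vertices : (N, 3) float array of vertex coordinates in voxel units.
    ddf      : (3, X, Y, Z) array; ddf[c] holds the displacement along
               axis c at every voxel of the grid.
    """
    coords = vertices.T  # shape (3, N), as expected by map_coordinates
    # Sample each displacement component at the vertex locations (order=1
    # is trilinear interpolation), then stack into an (N, 3) displacement.
    disp = np.stack(
        [map_coordinates(ddf[c], coords, order=1) for c in range(3)],
        axis=1,
    )
    return vertices + disp
```

Because the mesh vertices move while their connectivity and ordering are untouched, the point-to-point correspondence that the description emphasizes is preserved by construction.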
  2. The method of claim 1, wherein said providing the dataset comprises, for each CT image pair: rigidly registering the Pre-CT image to the Post-CT image; and aligning the registered Pre-CT and Post-CT image pair to the atlas image so that anatomical structures in the region of the respective implant recipient are roughly in the same spatial location and orientation.
  3. The method of claim 2, wherein said providing the dataset further comprises, for each CT image pair: applying the ASM method to the registered Pre-CT image to generate a segmentation mesh of the SOI in the registered Pre-CT image (Mesh_pre); transferring the segmentation mesh of the SOI in the registered Pre-CT image (Mesh_pre) to the Post-CT image, so as to generate a segmentation mesh of the SOI in the Post-CT image (Mesh_post); and converting the segmentation mesh of the SOI in the Post-CT image (Mesh_post) to segmentation masks of the SOI in the Post-CT image (Seg_post).
  4. The method of claim 1, wherein all of the images are resampled to an isotropic voxel size and cropped to 3D volumes containing the structures of interest.
  5. The method of claim 1, wherein said providing the dataset comprises applying image augmentation to the training set by rotating each image by a plurality of small random angles in the range of −25 to 25 degrees about the x-axis, y-axis, and z-axis, to create additional training images from each original image.
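The augmentation in claim 5 can be sketched as drawing one small random angle per axis and rotating the volume about each axis in turn. A hedged sketch; the function name, copy count, and interpolation settings are illustrative choices, not taken from the patent.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_volume(vol, rng, max_deg=25.0, n_copies=4):
    """Create augmented copies of a 3D volume by rotating it by small
    random angles (within ±max_deg) about the x-, y-, and z-axes."""
    out = []
    for _ in range(n_copies):
        aug = vol
        # axes=(1, 2), (0, 2), (0, 1) correspond to in-plane rotations
        # about the x-, y-, and z-axis of the volume, respectively.
        for axes in [(1, 2), (0, 2), (0, 1)]:
            angle = rng.uniform(-max_deg, max_deg)
            aug = rotate(aug, angle, axes=axes, reshape=False,
                         order=1, mode='nearest')
        out.append(aug)
    return out
```

`reshape=False` keeps every augmented copy the same shape as the original, so the training pipeline's fixed input size is unaffected.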
  6. The method of claim 1, wherein the networks have a network architecture in which NET_sSpc-tSpc, designed for generating a DDF for warping a source image S to a target image T, comprises a Global-net and a Local-net, wherein the network architecture is configured such that after receiving the concatenation of S and T, the Global-net generates an affine transformation matrix; S is warped to T by using the affine transformation to generate an image S′; the Local-net takes the concatenation of S′ and T to generate a non-rigid local DDF; and the affine transformation and the local DDF are composed to produce an output DDF.
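The composition step at the end of claim 6 can be illustrated by converting the global affine transform into a displacement field on the target grid and adding the local DDF to it. This is a simplified sketch under the assumption that both transforms are expressed as displacements on the same grid; the exact composition used by the Global-net/Local-net pair may differ.

```python
import numpy as np

def compose_affine_and_local(affine, local_ddf, shape):
    """Compose a global affine transform with a non-rigid local DDF into a
    single dense deformation field of displacements.

    affine    : (3, 4) matrix acting on homogeneous voxel coordinates.
    local_ddf : (3, X, Y, Z) displacement field predicted on the target grid.
    shape     : (X, Y, Z) grid size.
    """
    # Voxel-coordinate grid, shape (3, X, Y, Z).
    grid = np.stack(
        np.meshgrid(*[np.arange(s) for s in shape], indexing='ij'), axis=0
    ).astype(float)
    ones = np.ones((1,) + tuple(shape))
    homog = np.concatenate([grid, ones], axis=0)           # (4, X, Y, Z)
    # Apply the affine to every voxel coordinate at once.
    affine_coords = np.einsum('ij,j...->i...', affine, homog)  # (3, X, Y, Z)
    affine_disp = affine_coords - grid                     # affine as displacement
    return affine_disp + local_ddf
```

With an identity affine and a zero local field, the composed DDF is zero everywhere, which is a convenient sanity check when wiring up the two sub-networks.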
  7. The method of claim 6, wherein said training the networks comprises: inputting a concatenation of the atlas image (Atlas_atlas) and each Post-CT image (Post_post) into the networks so that the first network (NET_atlas-post) generates a DDF from the atlas space to the Post-CT space (DDF_atlas-post) and the second network (NET_post-atlas) generates a DDF from the Post-CT space to the atlas space (DDF_post-atlas); warping the Pre-CT image (Pre_sSpc), the segmentation masks (Mask_sSpc), and the fiducial vertices (FidV_sSpc) in a source space to a target space by using the corresponding DDFs, to generate Pre_sSpc-tSpc, Mask_sSpc-tSpc, and FidV_sSpc-tSpc; and transferring Pre_sSpc-tSpc, Mask_sSpc-tSpc, and FidV_sSpc-tSpc back to sSpc using the corresponding DDF, to generate Pre_sSpc-tSpc-sSpc, Mask_sSpc-tSpc-sSpc, and FidV_sSpc-tSpc-sSpc, respectively, wherein O_xSpc denotes an object O in the x space, and sSpc and tSpc respectively denote the source and target spaces.
  8. The method of claim 7, wherein FidV_atlas and FidV_post are the fiducial vertices randomly sampled from Mesh_atlas and Mesh_post on the fly for calculating the fiducial registration error during training.
  9. The method of claim 7, wherein the training objective for NET_sSpc-tSpc is constructed using measurements of similarity between a target object in tSpc (O_tSpc) and a source object that is transferred to tSpc from sSpc (O_sSpc-tSpc).
  10. The method of claim 9, wherein the training objective for the networks is a weighted sum of the loss terms MSPDice, MeanFRE, NCC, CycConsis, BendE, and L2, wherein: MSPDice = MSPDice(Mask_post, Mask_atlas-post) + MSPDice(Mask_atlas, Mask_post-atlas), wherein MSPDice(Mask_tSpc, Mask_sSpc-tSpc) is a multiscale soft probabilistic Dice between Mask_tSpc and Mask_sSpc-tSpc that measures the similarity of the segmentation masks between Mask_tSpc and Mask_sSpc-tSpc; MeanFRE = FRE(FidV_post, FidV_atlas-post) + FRE(FidV_atlas, FidV_post-atlas), wherein FRE(FidV_tSpc, FidV_sSpc-tSpc) is a mean fiducial registration error that measures the similarity of the fiducial vertices between FidV_tSpc and FidV_sSpc-tSpc, and is calculated as the average Euclidean distance between the fiducial vertices in FidV_tSpc and the corresponding vertices in FidV_sSpc-tSpc; NCC = NCC(Pre_post, Atlas_atlas-post) + NCC(Atlas_atlas, Pre_post-atlas), wherein NCC(Pre_tSpc, Pre_sSpc-tSpc) is a normalized cross-correlation between Pre_tSpc and Pre_sSpc-tSpc that measures the similarity between the warped source image and the target image; CycConsis = CycConsis_atlas-post + CycConsis_post-atlas, wherein CycConsis_sSpc-tSpc is a cycle-consistency loss that measures the similarity between the original source objects in the source space and the source objects that are transferred from the source space to the target space and then transferred back to the source space, wherein CycConsis_sSpc-tSpc = MSPDice(Mask_sSpc, Mask_sSpc-tSpc-sSpc) + 2×FRE(FidV_sSpc, FidV_sSpc-tSpc-sSpc) + 0.5×NCC(Pre_sSpc, Pre_sSpc-tSpc-sSpc); BendE = BendE(DDF_atlas-post) + BendE(DDF_post-atlas), wherein BendE(DDF_sSpc-tSpc) is a bending loss for which the DDF from the source space to the target space, DDF_sSpc-tSpc, is regularized using bending energy; and L2 = L2(NET_atlas-post) + L2(NET_post-atlas), wherein L2(NET_sSpc-tSpc) is an L2 loss for which the learnable parameters of the registration network NET_sSpc-tSpc are regularized by an L2 term.
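Three of the similarity terms in claim 10 have standard closed forms that can be sketched directly: mean fiducial registration error, normalized cross-correlation, and soft Dice. These are single-scale, framework-agnostic sketches (the claimed MSPDice is multiscale, and the actual losses would be implemented in the training framework); the function names and the epsilon constants are illustrative.

```python
import numpy as np

def mean_fre(fid_target, fid_warped):
    """Mean fiducial registration error: average Euclidean distance
    between corresponding fiducial vertices, given as (N, 3) arrays."""
    return float(np.mean(np.linalg.norm(fid_target - fid_warped, axis=1)))

def ncc(a, b):
    """Normalized cross-correlation between two images
    (1.0 means perfectly correlated)."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float(np.mean(a * b))

def soft_dice(mask_a, mask_b, eps=1e-8):
    """Soft probabilistic Dice between two (possibly soft) masks."""
    inter = np.sum(mask_a * mask_b)
    return float(2.0 * inter / (np.sum(mask_a) + np.sum(mask_b) + eps))
```

In a training objective, the similarity terms (Dice, NCC) would enter with negated sign so that minimizing the weighted sum maximizes similarity, while MeanFRE, the bending energy, and the L2 term are minimized directly.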
  11. The method of claim 1, wherein the region of interest includes the ear, brain, heart, or other organs of a living subject, wherein the structures of interest comprise anatomical structures in the region of interest.
  12. The method of claim 11, wherein the anatomical structures comprise intracochlear anatomy (ICA).
  13. The method of claim 1, wherein the implant is a cochlear implant, a deep brain stimulator, or a pacemaker.
  14. A method for segmentation of structures of interest (SOI) in a computed tomography (CT) image post-operatively acquired from an implant user in a region of interest in which an implant is implanted, comprising: inputting the post-operatively acquired CT (Post-CT) image to trained networks to generate a dense deformation field (DDF) from the input Post-CT image to an atlas image, wherein the atlas image is a Pre-CT image; and warping a segmentation mesh of the SOI in the atlas image to the input Post-CT image using the DDF so as to generate the segmentation mesh of the SOI in the input Post-CT image, wherein the segmentation mesh of the SOI in the atlas image is generated by applying an active shape model-based (ASM) method to the atlas image.
  15. The method of claim 14, wherein the networks have a network architecture in which NET_sSpc-tSpc, designed for generating a DDF for warping a source image S to a target image T, comprises a Global-net and a Local-net, wherein the network architecture is configured such that after receiving the concatenation of S and T, the Global-net generates an affine transformation matrix; S is warped to T by using the affine transformation to generate an image S′; the Local-net takes the concatenation of S′ and T to generate a non-rigid local DDF; and the affine transformation and the local DDF are composed to produce an output DDF.
  16. The method of claim 15, wherein the networks are trained with a dataset comprising a plurality of CT image pairs, randomly partitioned into a training set, a validation set, and a testing set, wherein each CT image pair has a pre-implantation CT (Pre-CT) image and a post-implantation CT (Post-CT) image respectively acquired in a region of a respective implant recipient before and after an implant is implanted in the region, so that the Pre-CT image and the Post-CT image of each CT image pair are an artifact-free CT image and an artifact-affected CT image, respectively.
  17. The method of claim 16, wherein the Pre-CT image is rigidly registered to the Post-CT image for each CT image pair; and the registered Pre-CT and Post-CT image pair is aligned to the atlas image so that anatomical structures in the region of the respective implant recipient are roughly in the same spatial location and orientation.
  18. The method of claim 17, wherein for each CT image pair, the ASM method is applied to the registered Pre-CT image to generate a segmentation mesh of the SOI in the registered Pre-CT image (Mesh_pre); the segmentation mesh of the SOI in the registered Pre-CT image (Mesh_pre) is transferred to the Post-CT image so as to generate a segmentation mesh of the SOI in the Post-CT image (Mesh_post); and the segmentation mesh of the SOI in the Post-CT image (Mesh_post) is converted to segmentation masks of the SOI in the Post-CT image (Seg_post).
  19. The method of claim 18, wherein the networks are trained to learn to register the artifact-affected CT images and the atlas image with assistance of the paired artifact-free CT images.
  20. The method of claim 19, wherein the networks are trained by: inputting a concatenation of the atlas image (Atlas_atlas) and each Post-CT image (Post_post) into the networks so that the first network (NET_atlas-post) generates a DDF from the atlas space to the Post-CT space (DDF_atlas-post) and the second network (NET_post-atlas) generates a DDF from the Post-CT space to the atlas space (DDF_post-atlas); warping the Pre-CT image (Pre_sSpc), the segmentation masks (Mask_sSpc), and the fiducial vertices (FidV_sSpc) in a source space to a target space by using the corresponding DDFs, to generate Pre_sSpc-tSpc, Mask_sSpc-tSpc, and FidV_sSpc-tSpc; and transferring Pre_sSpc-tSpc, Mask_sSpc-tSpc, and FidV_sSpc-tSpc back to sSpc using the corresponding DDF, to generate Pre_sSpc-tSpc-sSpc, Mask_sSpc-tSpc-sSpc, and FidV_sSpc-tSpc-sSpc, respectively, wherein O_xSpc denotes an object O in the x space, and sSpc and tSpc respectively denote the source and target spaces.

Description

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

This application claims priority to and the benefit of U.S. Provisional Patent Application Ser. No. 63/179,655, filed Apr. 26, 2021, which is incorporated herein in its entirety by reference. This application is also a continuation-in-part application of U.S. patent application Ser. No. 17/266,180, filed Feb. 5, 2021, which is a national stage entry of PCT Patent Application No. PCT/US2019/045221, filed Aug. 6, 2019, which itself claims priority to and the benefit of U.S. Provisional Patent Application Ser. No. 62/714,831, filed Aug. 6, 2018, which are incorporated herein in their entireties by reference.

STATEMENT AS TO RIGHTS UNDER FEDERALLY-SPONSORED RESEARCH

This invention was made with government support under Grant Nos. R01DC014037 and R01DC014462 awarded by the National Institutes of Health (NIH). The government has certain rights in the invention.

FIELD OF THE INVENTION

The invention relates generally to cochlear implants, and more particularly, to atlas-based methods of automatic segmentation of intracochlear anatomy in metal artifact affected CT images of the ear with deep neural networks and applications of the same.

BACKGROUND OF THE INVENTION

The background description provided herein is for the purpose of generally presenting the context of the present invention. The subject matter discussed in the background of the invention section should not be assumed to be prior art merely as a result of its mention in the background of the invention section. Similarly, a problem mentioned in the background of the invention section or associated with the subject matter of the background of the invention section should not be assumed to have been previously recognized in the prior art. The subject matter in the background of the invention section merely represents different approaches, which in and of themselves may also be inventions.
Work of the presently named inventors, to the extent it is described in the background of the invention section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.

The cochlea (FIG. 1C) is a spiral-shaped structure that is part of the inner ear involved in hearing. It contains two main cavities: the scala tympani (ST) and the scala vestibuli (SV). The modiolus (MD) is a porous bone around which the cochlea is wrapped that hosts the auditory nerves. A cochlear implant (CI) is an implanted neuroprosthetic device that is designed to produce hearing sensations in a person with severe to profound deafness by electrically stimulating the auditory nerves.

CIs are programmed postoperatively in a process that involves activating all or a subset of the electrodes and adjusting the stimulus level for each of these to a level that is beneficial to the recipient. Adjustment of the programming parameters is influenced by the intracochlear position of the CI electrodes, which requires the accurate localization of the CI electrodes relative to the intracochlear anatomy (ICA) in the post-implantation CT (Post-CT) images of the CI recipients. This, in turn, requires the accurate segmentation of the ICA in the Post-CT images. Segmenting the ICA in the Post-CT images is challenging due to the strong artifacts produced by the metallic CI electrodes (FIG. 1B) that can obscure these structures, often severely.

For patients who have been scanned before implantation, the segmentation of the ICA can be obtained by segmenting their pre-implantation CT (Pre-CT) image (FIG. 1A) using an active shape model-based (ASM) method. The outputs of the ASM method are surface meshes of the ST, the SV, and the MD that have a predefined number of vertices.
Importantly, each vertex corresponds to a specific anatomical location on the surface of the structures, and the meshes are encoded with the information needed for the programming of the implant. Preserving point-to-point correspondence when registering the images is thus of critical importance in this application. The ICA in the Post-CT image of a patient can be obtained by registering the Pre-CT image to the Post-CT image and then transferring the segmentations of the ICA in the Pre-CT image to the Post-CT image using that transformation. This approach does not extend to CI recipients for whom a Pre-CT image is unavailable, which is the case for long-term recipients who were not scanned before surgery, or for recipients for whom images cannot be retrieved. Therefore, a heretofore unaddressed need exists in the art to address the aforementioned deficiencies and inadequacies.

SUMMARY OF THE INVENTION

In one aspect, the invention relates to a method for segmentation of structures of interest (SOI) in a computed tomography (CT) image post-operatively acquired from an implant user in a region of interest in which an implant is implanted. The method comprises providing an atlas image,