US-20260127872-A1 - FRAMEWORK FOR IDENTIFYING LANDMARKS IN AN IMAGE


Abstract

A framework for identifying landmarks in an image. The framework includes multiple agents, each agent being a functional module designed for seeking a predefined home-landmark and a number of further landmarks in a landmark-region around a position of the agent. Each agent defines or receives its start-position, estimates a home-displacement of its home-landmark, estimates vote-displacements of further landmarks in the landmark-region, and chooses an updated start-position. The updated start-position is determined from the estimated home-displacement and the vote-displacements for this home-landmark estimated by other agents.
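The update cycle summarized above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the application: the 2D toy landmark table `LANDMARKS`, the stand-in estimator `estimate_displacements` (true displacement plus bounded uniform noise in place of a trained estimator), the fixed iteration budget used as termination condition, and the `step` factor that moves each agent part of the way toward the averaged proposal.

```python
import random

# Hypothetical ground-truth landmark positions in a 2D toy image (arbitrary units).
LANDMARKS = {"A": (10.0, 20.0), "B": (40.0, 25.0), "C": (30.0, 60.0)}
NOISE = 1.0  # assumed bound on the stand-in estimator's per-coordinate error


def estimate_displacements(position, rng):
    """Stand-in for a trained estimator (assumption): the true displacement
    from `position` to every landmark, plus bounded uniform noise."""
    px, py = position
    return {
        name: (lx - px + rng.uniform(-NOISE, NOISE),
               ly - py + rng.uniform(-NOISE, NOISE))
        for name, (lx, ly) in LANDMARKS.items()
    }


def run_agents(start_positions, n_iters=20, step=0.5, seed=0):
    """Each agent is named after its home-landmark. Returns final positions."""
    rng = random.Random(seed)
    positions = dict(start_positions)  # each agent receives its start-position
    for _ in range(n_iters):  # termination condition: fixed iteration budget
        # Every agent estimates its home-displacement and its vote-displacements.
        estimates = {a: estimate_displacements(p, rng)
                     for a, p in positions.items()}
        updated = {}
        for home, (px, py) in positions.items():
            # Position proposals for this home-landmark: the agent's own
            # estimate plus the votes of all other agents.
            xs = [positions[a][0] + estimates[a][home][0] for a in positions]
            ys = [positions[a][1] + estimates[a][home][1] for a in positions]
            mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
            # Updated start-position: part of the way toward the average.
            updated[home] = (px + step * (mean_x - px),
                             py + step * (mean_y - py))
        positions = updated
    return positions


final = run_agents({"A": (0.0, 0.0), "B": (50.0, 50.0), "C": (0.0, 50.0)})
for name, pos in final.items():
    print(name, tuple(round(c, 2) for c in pos), "target", LANDMARKS[name])
```

In this toy setting the remaining error per coordinate is bounded by the assumed noise level, because every proposal is off by at most `NOISE` and averaging cannot increase that bound. A real agent would replace the stand-in estimator with a trained model.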

Inventors

Halid Yerebakan
Kritika Iyer
Gerardo Hermosillo Valadez

Assignees

Siemens Healthineers AG

Dates

Publication Date: 20260507
Application Date: 20251010
Priority Date: 20241104

Claims (20)

1. A system for identifying landmarks in an image, comprising: one or more processing units; and a non-transitory memory device communicatively coupled to the one or more processing units, the non-transitory memory device stores computer readable program code, the one or more processing units being operative with the computer readable program code to perform steps including: a) defining or receiving, by an agent, a start-position, b) estimating, by the agent, a home-displacement for a home-landmark, c) estimating, by the agent, vote-displacements for further landmarks in a landmark-region, d) choosing, by the agent, an updated start-position, determined from the home-displacement and vote-displacements for the home-landmark estimated by other agents, and e) repeating steps b) to d) with the updated start-position as start-position until a termination condition is reached.
2. The system according to claim 1 wherein the agent comprises a machine learning network trained for estimating the home-displacement of the home-landmark and the vote-displacements for the further landmarks in the landmark-region.
3. The system according to claim 2 wherein the machine learning network comprises a Resnet architecture.
4. The system according to claim 2 wherein the machine learning network is trained to compute relative displacements to go to all landmarks in the landmark-region.
5. The system according to claim 1, wherein the agent receives the vote-displacements of the home-landmark estimated by the other agents, wherein the agent determines the updated start-position based on the home-displacement and the vote-displacements received from other agents.
6. The system according to claim 1, wherein the agent receives current agent-positions of at least the other agents that provide the vote-displacements and the agent determines the updated start-position based on a weighted determination, wherein vote-displacements of nearer agents have a greater weight than vote-displacements of farther agents.
7. The system according to claim 1, wherein the agent determines the updated start-position based on a weighted determination, wherein farther vote-displacements having a distance greater than a predefined threshold from a current start-position or a home-displacement have a smaller weight than nearer vote-displacements.
8. The system according to claim 1, wherein the system comprises multiple independent calculation units and wherein different agents are processed with different calculation units.
9. The system according to claim 1, wherein the agent comprises multiple regression heads designed to determine the updated start-position based on the home-displacement and the vote-displacements.
10. The system according to claim 1, wherein the one or more processing units are operative with the computer readable program code to choose the updated start-position from a determined average position or a median position from an estimated home-displacement and the vote-displacements estimated by the other agents.
11. The system according to claim 1, wherein the one or more processing units are operative with the computer readable program code to choose the updated start-position from a determined average position or a position between a current start-position of the agent and the determined average position.
12. The system according to claim 1, wherein the agent comprises a residual network that receives an input descriptor of a current start-position of the agent in the image, and wherein the residual network is trained to output vote-displacements of landmarks in a world coordinate system.
13. The system according to claim 12, wherein the agent projects the input descriptor into lower dimension with a linear projection layer.
14. The system according to claim 12, wherein the agent applies several layers of the residual network with residual connection after an initial projection.
15. A method for identifying landmarks in an image, comprising: providing image-data; forwarding datasets of the image-data to agents, wherein each agent at least receives a landmark-region of the image-data as dataset; and processing, by the agents, the datasets, including a) defining or receiving, by at least one of the agents, a start-position, b) estimating, by the at least one of the agents, a home-displacement for a home-landmark, c) estimating, by the at least one of the agents, vote-displacements for further landmarks in the landmark-region, d) choosing, by the at least one of the agents, an updated start-position determined from the home-displacement and vote-displacements for the home-landmark estimated by other agents, and e) repeating steps b) to d) with the updated start-position as start-position until a termination condition is reached.
16. The method according to claim 15, further comprising: determining current agent-positions of at least the other agents that provide vote-displacements; and determining the updated start-position based on a weighted determination.
17. The method according to claim 15, wherein the updated start-position is determined based on a linear regression of the home-displacement and position of the home-landmark estimated by the other agents.
18. The method according to claim 15, wherein the at least one of the agents utilizes displacement vectors to get closer to the home-landmark, wherein the displacement vectors are determined based on a weighted average of vote estimation and assigned agent's displacement.
19. The method according to claim 15 wherein the processing, by the agents, the datasets comprises parallel-processing.
20. One or more non-transitory computer-readable media embodying instructions executable by machine to perform operations for identifying landmarks in an image, comprising: a) defining or receiving, by an agent, a start-position, b) estimating, by the agent, a home-displacement for a home-landmark, c) estimating, by the agent, vote-displacements for further landmarks in a landmark-region, d) choosing, by the agent, an updated start-position, determined from the home-displacement and vote-displacements for the home-landmark estimated by other agents, and e) repeating steps b) to d) with the updated start-position as start-position until a termination condition is reached.
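The weighted and median-based determinations recited in the claims can be sketched as follows. This is one possible reading, not a prescribed implementation. The function name `aggregate_votes`, the weight function `1 / (1 + d / scale)`, and the parameters `scale`, `outlier_threshold` and `outlier_factor` are hypothetical. A vote is represented here as the pair of the voting agent's position and the position it proposes for the home-landmark (its position plus its vote-displacement).

```python
from math import dist
from statistics import median


def aggregate_votes(current_position, home_estimate, votes,
                    scale=10.0, outlier_threshold=None, outlier_factor=0.1):
    """Combine an agent's own estimate with votes from other agents.

    votes: list of (voter_position, voted_position) pairs.
    Returns (weighted_average_position, median_position).
    """
    proposals = [home_estimate]
    weights = [1.0]  # assumption: the agent's own estimate gets full weight
    for voter_position, voted_position in votes:
        # Nearer agents get a greater weight than farther agents.
        weight = 1.0 / (1.0 + dist(current_position, voter_position) / scale)
        # Optionally down-weight votes that land far from the own estimate.
        if (outlier_threshold is not None
                and dist(home_estimate, voted_position) > outlier_threshold):
            weight *= outlier_factor
        proposals.append(voted_position)
        weights.append(weight)

    total = sum(weights)
    n_dims = len(home_estimate)
    weighted = tuple(
        sum(w * p[i] for w, p in zip(weights, proposals)) / total
        for i in range(n_dims)
    )
    # Coordinate-wise median as a robust alternative to the average.
    med = tuple(median(p[i] for p in proposals) for i in range(n_dims))
    return weighted, med


# Two votes: one from a nearby agent, one from an agent three times as far away.
votes = [((0.0, 10.0), (4.0, 0.0)), ((0.0, 30.0), (8.0, 0.0))]
weighted, med = aggregate_votes((0.0, 0.0), (2.0, 0.0), votes)
print(weighted, med)
```

With the sample votes, the weighted average stays closer to the agent's own estimate than a plain average would, because the farther agent's vote counts less. Passing `outlier_threshold` reduces the influence of distant proposals further.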

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of priority from DE Application No. 10 2024 132 018.4, filed on Nov. 4, 2024, the contents of which are incorporated by reference.

TECHNICAL FIELD

The present framework describes a system and a method for identifying landmarks in an image, a control-unit for a medical imaging system and a medical imaging system.

BACKGROUND

Landmark localization is important for automatically processing images. There is a large variety of known landmark-identification procedures that partly provide automation tools for various workflow steps. One example of a tool that automatically classifies pixels or voxels in an image as containing a given landmark is Adaboost.

One example of a module that can be used for automatically finding landmarks is “ALPHA Technology and Voting”. This landmarking algorithm is composed of several cascaded Adaboost models, which classify voxels in an image as containing a given landmark in a coarse-to-fine manner. After several landmarks have been detected using Adaboost, a voting model based on the spatial correlation of landmark groups is used to remove outliers and interpolate missing landmarks. Optionally, the Adaboost and voting steps may be repeated after aligning the image to a canonical space, which improves robustness and speed of landmark detection. Training the Adaboost and voting models requires at least 50-100 annotations per landmark. Once the classifier is trained, landmarks can be found by scanning an entire image, which takes about half a second per landmark.

Another example of an algorithm for automatically finding landmarks is “BodyGPS”. This algorithm is based on a self-supervised methodology for estimating normalized locations for landmarks using a regression network. This method generalizes to many types of landmark locations without explicit supervised training for them. However, since no supervision is used, the estimates are not precise and may vary by around 10 mm at 90% around the correct position. On the other hand, the runtime is fast, at around 50 ms with a single search agent.

Another example of an algorithm for automatically finding landmarks is “MedLSAM”. It is based on a method that may perform organ localization based on given templates. It uses similar regression methodologies by creating ground truth via relative displacements in the images. However, it does not consider relationships between target structures. Thus, the individual landmarks do not benefit from other found locations.

Noothout et al. showed in their work “Deep learning-based regression and classification for automatic landmark localization in medical images” (IEEE Transactions on Medical Imaging, 39(12), 4011-4022; 2020, which is herein incorporated by reference) that multiple landmark locations can be estimated using a global-to-local strategy with convolutional neural networks. The global network estimates all landmark locations at once using a combination of a classifier and a regressor. Then, a landmark-specific model refines the proposed location. They report fast processing times with good localization precision. However, the method still uses all patches within the image for localization estimation, which is not necessary. Moreover, the reduced processing times are achieved on GPU hardware, which is a limitation in a production environment.

Ghesu et al. describe in their paper “Multi-scale deep reinforcement learning for real-time 3D-landmark detection in CT scans” (IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(1), 176-189; 2017, which is herein incorporated by reference) a single-agent supervised landmark search algorithm. In their setting, the agent updates its state with a limited number of actions to find the desired location. A reinforcement learning algorithm is used to train the policy of the agent.

These methods are not able to identify landmarks in a manner that is robust and accurate yet also fast.

SUMMARY

A framework for identifying landmarks in an image is described herein. The framework includes multiple agents, each agent being a functional module designed for seeking a predefined home-landmark and a number of further landmarks in a landmark-region around a position of the agent. Each agent defines or receives its start-position, estimates a home-displacement of its home-landmark, estimates vote-displacements of further landmarks in the landmark-region, and chooses an updated start-position. The updated start-position is determined from the estimated home-displacement and the vote-displacements for this home-landmark estimated by other agents.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an exemplary system and predictions of an agent;
FIG. 2 shows the work of an exemplary system;
FIG. 3 shows multi-agent estimates to update positions of individual agents; and
FIG. 4 shows the process performed with independently working agents.

The various embodiments are described with reference to the drawings, wherein like reference numerals a