CN-121999037-A - System and method for identifying landmarks in images

Abstract

Systems and methods for identifying landmarks in images. The invention relates to a system for identifying landmarks in an image, comprising a plurality of agents, each agent being a functional module designed for finding a predefined home landmark and a plurality of further landmarks in a landmark region around the localization of the agent, wherein each agent is designed for a) defining or receiving its home localization, b) estimating the home displacement of its home landmark, c) estimating the voting displacements of further landmarks in the landmark region, d) choosing an updated home localization determined from the estimated home displacement and the voting displacements of said home landmark estimated by other agents, and e) repeating steps b) to d) using the updated home localization as the home localization until a termination condition is reached. Furthermore, a method, a control unit and a medical imaging system are described.
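
Steps a) to e) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `run_agents`, the spherical landmark region of size `radius`, the plain averaging of votes and the stopping rule are all assumptions made for illustration, and the trained estimator is replaced by a stand-in that returns the true offset plus Gaussian noise.

```python
import numpy as np

def run_agents(landmarks, starts, radius=50.0, noise=1.0,
               max_iter=20, tol=0.5, seed=0):
    """Toy multi-agent landmark search; agent i has home landmark i.

    landmarks: (K, D) true positions, used only by the stand-in estimator.
    starts:    (K, D) initial home localizations, step a).
    """
    rng = np.random.default_rng(seed)
    landmarks = np.asarray(landmarks, dtype=float)
    pos = np.asarray(starts, dtype=float).copy()
    k = len(pos)
    for _ in range(max_iter):
        # Steps b) and c): agent i estimates a displacement to landmark j.
        # Hypothetical estimator: true offset plus noise (no trained network).
        disp = landmarks[None, :, :] - pos[:, None, :]
        disp = disp + rng.normal(scale=noise, size=disp.shape)
        # Only landmarks inside an agent's landmark region produce a vote.
        in_region = np.linalg.norm(disp, axis=2) <= radius
        in_region[np.arange(k), np.arange(k)] = True  # own home estimate
        # Step d): position of landmark j as proposed by each agent i.
        candidates = pos[:, None, :] + disp
        new_pos = np.empty_like(pos)
        for j in range(k):
            voters = in_region[:, j]
            # Assumed fusion rule: unweighted mean of all available votes.
            new_pos[j] = candidates[voters, j].mean(axis=0)
        largest_move = np.linalg.norm(new_pos - pos, axis=1).max()
        pos = new_pos
        # Step e): assumed termination condition (small update or max_iter).
        if largest_move < tol:
            break
    return pos
```

With a noise-free estimator every agent lands on its home landmark after the first update; with noise, averaging the votes of several agents reduces the error of each individual estimate.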

Inventors

H. Yerebaken
K. Yi'er
G. Emosiyo Valdez

Assignees

Siemens Healthineers AG (西门子医疗股份公司)

Dates

Publication Date: 2026-05-08
Application Date: 2025-10-31
Priority Date: 2024-11-04

Claims (15)

1. A system (7a) for identifying landmarks (2) in an image (1), comprising a plurality of agents (7), preferably working in parallel, each agent (7) being a functional module designed for finding a predefined home landmark (2) and a plurality of further landmarks (2) in a landmark region (4) around the localization of the agent (7), wherein each agent (7) is designed for: a) defining or receiving its starting position (3), b) estimating a home displacement (5) of its home landmark (2), c) estimating voting displacements (6) of the further landmarks (2) in the landmark region (4), d) selecting an updated starting position (3a) determined from the estimated home displacement (5) and the voting displacements (6) of the home landmark (2) estimated by other agents (7), e) repeating steps b) to d) with the updated starting position (3a) as the starting position (3) until a termination condition is reached.
2. The system according to claim 1, wherein each agent (7) comprises a machine learning network (10), the machine learning network (10) being trained for estimating a home displacement (5) of its home landmark (2) and voting displacements (6) of further landmarks (2) in the landmark region (4), preferably wherein the machine learning network (10) comprises a ResNet architecture, preferably wherein the machine learning network (10) is trained for calculating the relative displacements of all landmarks (2) located within the landmark region (4), and/or is trained with randomly sampled points and supervised landmark (2) localizations.
3. The system according to one of the preceding claims, wherein the system (7a) is designed such that each agent (7) receives the voting displacements (6) of its home landmark (2) estimated by the other agents (7), wherein each agent (7) is designed to determine an updated starting position (3a) based on its own estimate of the home displacement (5) and the voting displacements (6) received from the other agents (7), preferably wherein the system (7a) is further designed such that each agent (7) receives the current agent position (3) of at least those other agents (7) providing a voting displacement (6), and each agent (7) is designed to determine the updated starting position (3a) based on a weighted determination, wherein the voting displacement (6) of a closer agent (7) has a greater weight than the voting displacement (6) of a more distant agent (7), and/or preferably wherein the system (7a), in particular each agent (7), is further designed to determine the updated starting position (3a) based on a weighted determination, wherein a voting displacement (6) whose distance from the current starting position (3) or from the home displacement (5) is larger than a predefined threshold has a smaller weight than a closer voting displacement (6).
4. The system according to one of the preceding claims, wherein the system (7a) comprises a plurality of independent computing units, and wherein different agents (7) are processed by different computing units, preferably wherein each agent (7) is processed by an individual computing unit, such that all agents (7) are processed in parallel.
5. The system according to one of the preceding claims, wherein the system (7a), in particular a plurality of agents (7), preferably each agent (7), comprises a plurality of regression heads designed to determine an updated starting position (3a) based on the home displacement (5) and the voting displacements (6).
6. The system according to one of the preceding claims, wherein the system (7a), preferably each agent (7), is designed to select an updated starting position (3a) depending on an average position or intermediate position determined from the estimated home displacement (5) and the voting displacements (6) estimated by the other agents (7), preferably wherein the updated starting position (3a) of the agent (7) is the determined average position or a position between the current starting position (3) of the agent (7) and the determined average position.
7. The system according to one of the preceding claims, wherein each agent (7) comprises a residual network (10), the residual network (10) being designed to receive an input descriptor (11) of a current starting position (3) of the agent (7) in the image (1), and wherein the residual network (10) is trained to output voting displacements (6) of landmarks (2) in the world coordinate system (7a), preferably wherein the agent (7) is designed to project the input descriptor (11) into a lower dimension using a linear projection layer, and preferably wherein the agent (7) is designed to apply several layers of the residual network (10) with residual connections after the initial projection.
8. A method for identifying landmarks (2) in an image (1) with a system (7a) according to one of the preceding claims, comprising the steps of: providing image data (1), forwarding a dataset of the image data (1) to the agents (7), wherein each agent (7) receives at least its landmark region (4) of the image data (1) as the dataset, and processing, preferably parallel processing, of the datasets by the agents (7), wherein each agent (7) performs: a) defining or receiving its starting position (3), b) estimating a home displacement (5) of its home landmark (2), c) estimating voting displacements (6) of the further landmarks (2) in the landmark region (4), d) selecting an updated starting position (3a) determined from the estimated home displacement (5) and the voting displacements (6) of the home landmark (2) estimated by other agents (7), e) repeating steps b) to d) with the updated starting position (3a) as the starting position (3) until a termination condition is reached.
9. The method according to claim 8, wherein additionally the system (7a), in particular each agent (7), also determines the current agent position (3) of at least those other agents (7) providing a voting displacement (6), and each agent (7) determines an updated starting position (3a) based on a weighted determination, wherein the voting displacement (6) of a closer agent (7) has a greater weight than the voting displacement (6) of a more distant agent (7), and/or a voting displacement (6) pointing to a position farther from the current starting position (3) of the agent (7) has a smaller weight than a voting displacement (6) pointing to a position closer to the current starting position (3) of the agent (7).
10. The method according to one of claims 8 or 9, wherein the updated starting position (3a) is determined based on a linear regression, in particular a weighted linear regression, of the estimated home displacement (5) and the localizations of the home landmark (2) estimated by the other agents (7).
11. The method according to one of claims 8 to 10, wherein the use of the displacement vectors is formulated as a weighted average of the voted displacement estimates and the displacement estimate of the assigned agent (7), preferably based on a formula [formula not reproduced in this text], wherein in said formula the displacement estimate d1 of an agent (7) is weighted by a scaling factor relative to the median estimate of the other agents (7), and wherein the scaling factor is preferably updated during the process so that the agent (7) gives its own estimate more weight once said agent (7) gets closer to its assigned landmark (2).
12. A control unit for a medical imaging system, comprising a system (7a) according to any of claims 1 to 7 and/or being designed to perform a method according to any of claims 8 to 11.
13. A medical imaging system comprising a control unit according to claim 12.
14. A computer program product comprising instructions which, when executed by a computer, cause the computer to carry out the steps of the method of any one of claims 8 to 11.
15. A computer readable storage medium comprising instructions which, when executed by a computer, cause the computer to carry out the steps of the method of any one of claims 8 to 11.
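
The weighted determination of the updated starting position described in claims 3, 6 and 9 can be sketched in Python as follows. This is only one possible reading: the function name `fuse_votes`, the Gaussian distance falloff `sigma`, the values `outlier_dist` and `outlier_weight`, and the `step` parameter are hypothetical choices that the claims do not prescribe. The distance of a vote is measured here from the agent's own estimate of its home landmark.

```python
import numpy as np

def fuse_votes(own_pos, home_disp, voter_pos, voter_disp,
               sigma=30.0, outlier_dist=40.0, outlier_weight=0.1,
               own_weight=1.0, step=1.0):
    """Return an updated starting position from own estimate and votes."""
    own_pos = np.asarray(own_pos, dtype=float)
    dim = own_pos.size
    # The agent's own estimate of where its home landmark lies.
    own_target = own_pos + np.asarray(home_disp, dtype=float)
    voter_pos = np.asarray(voter_pos, dtype=float).reshape(-1, dim)
    voter_disp = np.asarray(voter_disp, dtype=float).reshape(-1, dim)
    # Landmark positions proposed by the other agents.
    targets = voter_pos + voter_disp
    # Closer agents get a larger weight (Gaussian falloff is an assumption).
    agent_dist = np.linalg.norm(voter_pos - own_pos, axis=1)
    weights = np.exp(-0.5 * (agent_dist / sigma) ** 2)
    # Votes far from the agent's own estimate are down-weighted.
    far = np.linalg.norm(targets - own_target, axis=1) > outlier_dist
    weights = np.where(far, weights * outlier_weight, weights)
    all_targets = np.vstack([own_target[None, :], targets])
    all_weights = np.concatenate([[own_weight], weights])
    mean_target = (all_weights[:, None] * all_targets).sum(axis=0)
    mean_target = mean_target / all_weights.sum()
    # step=1 moves to the weighted average position; 0 < step < 1 selects
    # a position between the current position and that average.
    return own_pos + step * (mean_target - own_pos)
```

Calling `fuse_votes` with empty vote lists reduces to the agent following its own home displacement, so the same function covers an agent that currently receives no votes.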

Description

System and method for identifying landmarks in images

Technical Field

The invention describes a system and a method for identifying landmarks in images, a control unit for a medical imaging system, and a medical imaging system.

Background

Landmark localization is important for the automatic processing of images. A large variety of landmark identification procedures are known, some of which provide automated tools for various workflow steps.

One example of a tool that automatically classifies pixels or voxels in an image as containing a given landmark is Adaboost. One example of a module that may be used to automatically find landmarks is "ALPHA technique and voting". The landmarking algorithm consists of several cascaded Adaboost models that classify voxels in the image as containing a given landmark in a coarse-to-fine manner. After several landmarks have been detected using Adaboost, a voting model based on the spatial correlation of the set of landmarks is used to remove outliers and interpolate missing landmarks. Optionally, after the image has been aligned to the canonical space, the Adaboost and voting steps may be repeated, which increases the robustness and speed of landmark detection. Training the Adaboost and voting models requires at least 50-100 annotations per landmark. Once the classifier is trained, landmarks can be found by scanning the entire image, which takes about half a second per landmark.

Another example of an algorithm for automatically finding landmarks is "BodyGPS". The algorithm is based on a self-supervised methodology for estimating the location of landmarks using a regression network. The method generalizes to many types of landmark locations without explicit supervised training for them. However, since no supervision is used, the estimate is inaccurate and may deviate by about 10 mm from the correct location at the 90th percentile. On the other hand, the run time with a single search agent is short, about 50 ms.
Another example of an algorithm for automatically finding landmarks is "MedLSAM". It is based on a method that can be used for organ localization from a given template. It uses a similar regression methodology, creating ground truth values in the image via relative displacements. However, it does not consider the relationship between target structures, so individual landmarks do not benefit from other locations that have already been found.

Noothout et al. show in their work "Deep learning-based regression and classification for automatic landmark localization in medical images" (IEEE Transactions on Medical Imaging, 39(12), 4011-4022; 2020) that recent studies have reported algorithms for estimating multiple landmark positions using global-to-local strategies with convolutional neural networks. The global network estimates all landmark positions at once using a combination of classifiers and regressors. Landmark-specific models then refine the proposed locations. They report fast processing times and good localization accuracy. However, the approach still uses all patches within the image for the localization estimate, which is not necessary. The short processing time is achieved with GPU hardware, which is a limitation in production environments.

Ghesu et al. describe in their paper "Multi-scale deep reinforcement learning for real-time 3D landmark detection in CT scans" (IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(1), 176-189; 2017) a single-agent supervised landmark search algorithm. In their setup, the agent updates its state with a limited number of actions in order to find the desired location. A reinforcement learning algorithm is used to train the agent's policy.

However, it is still not possible to identify landmarks in a way that is robust and accurate yet also fast.
Disclosure of Invention

It is an object of the present invention to improve the known systems and methods and to provide a system and a method for identifying landmarks in images, a control unit for a medical imaging system, and a medical imaging system that overcome the problems described above. In particular, it is an object of the present invention to provide multi-agent landmark searching.

This object is achieved by a system according to claim 1, a method according to claim 8, a control unit according to claim 12 and a medical imaging system according to claim 13.

Hereinafter, a way of searching for landmarks while simultaneously using voting strategies in each step of the search is described, thereby improving both speed and robustness. It should be noted that the detection of landmarks as such is not central to the present invention. Many known models can identify landmarks; the present invention, however, provides accurate results in a short time by using known models in a special architecture.

The system according to the invention is used for identifying landmarks in images. The system comprises a plurality of agents, preferably working in parallel, each agent being a functional module designed to find a predefined home landmark (home-l