US-20260128152-A1 - CUSTOMIZED MACHINE-LEARNING TRAINING FOR RADIOTHERAPY CLINICS

Abstract

Disclosed herein are methods for selecting and preparing patient data to facilitate the adoption and customized training of machine-learning models in clinical settings, particularly for radiation therapy treatment planning. The disclosed embodiments streamline the customized training process through an automated workflow that includes prefiltering patient metadata, retrieving relevant DICOM files, optional data anonymization, and generation of training data. The data is then organized into a format suitable for machine-learning training. The embodiments discussed herein reduce manual labor, minimize errors, and accelerate the integration of machine-learning into clinical workflows, enabling clinics to train and implement predictive models that replicate specific clinical practices, thereby enhancing treatment precision and improving patient outcomes.
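The prefiltering and optional anonymization steps named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `PatientRecord` fields, the helper names, and the use of a salted SHA-256 digest as a stand-in for anonymization are hypothetical and are not taken from the disclosure. Hashing an identifier is pseudonymization only; a real deployment would also need to strip identifying attributes from the DICOM files themselves.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical metadata fields; a real clinic database will differ.
    patient_id: str
    treatment_site: str
    technique: str
    dicom_paths: tuple


def prefilter(records, **attributes):
    """Keep records whose metadata matches every requested attribute."""
    return [
        r for r in records
        if all(getattr(r, key) == value for key, value in attributes.items())
    ]


def pseudonymize(record, salt):
    """Replace the patient ID with a truncated, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + record.patient_id).encode("utf-8")).hexdigest()
    return replace(record, patient_id=digest[:16])


def build_training_manifest(records, salt, **attributes):
    """Prefilter on metadata first, then pseudonymize only the selected records."""
    return [pseudonymize(r, salt) for r in prefilter(records, **attributes)]


records = [
    PatientRecord("P001", "prostate", "VMAT", ("p001/ct.dcm", "p001/rtplan.dcm")),
    PatientRecord("P002", "breast", "IMRT", ("p002/ct.dcm",)),
    PatientRecord("P003", "prostate", "VMAT", ("p003/ct.dcm", "p003/rtdose.dcm")),
]
manifest = build_training_manifest(
    records, salt="clinic-secret", treatment_site="prostate", technique="VMAT"
)
```

Filtering on metadata before touching any files keeps the expensive step (retrieving the DICOM files) limited to the patients that actually match the requested treatment attributes.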

Inventors

Mikko Hakala
Elena CZEIZLER
Hannu Laaksonen

Assignees

SIEMENS HEALTHINEERS INTERNATIONAL AG

Dates

Publication Date: 2026-05-07
Application Date: 2024-11-07

Claims (20)

1. A method for training a machine learning (ML) model for radiation therapy treatment planning and customizing the ML model for a clinic, the method comprising: training, by at least one processor, the ML model using a primary training dataset comprising treatment data associated with a set of previously treated patients; receiving, by at least one processor, one or more radiotherapy treatment attributes for fine-tuning the ML model to the clinic; retrieving, by the at least one processor querying a database accessible to the clinic, the database storing treatment data associated with a second set of previously treated patients at the clinic, patient data corresponding to a subset of the set of the patients satisfying the one or more radiotherapy treatment attributes corresponding to a treatment technique; concatenating, by the at least one processor, the retrieved patient data with one or more digital imaging and communications in medicine files or other medical files and corresponding metadata associated with a previously performed treatment of each patient within the subset of the set of patients; generating, by the at least one processor, a secondary training dataset based on the patient data and the concatenated one or more digital imaging and communications in medicine files or other medical files and metadata by changing at least a file structure of the secondary training dataset in accordance with a configuration file of the ML model; and fine tuning, by the at least one processor, the ML model to the treatment technique at the clinic, using the secondary training dataset, such that the ML model is customized to the clinic.
2. The method of claim 1, wherein the machine learning model is only trained using the secondary training dataset.
3. (canceled)
4. The method of claim 1, further comprising: anonymizing, by the at least one processor, at least one of the patient data, the concatenated digital imaging and communications in medicine files or other medical files, or the metadata.
5. The method of claim 1, further comprising: clustering, by the at least one processor, the secondary training dataset into a plurality of consistent subsets.
6. The method of claim 1, further comprising: when an outlier data point is identified, removing, by the at least one processor, the outlier data point within the secondary training dataset.
7. (canceled)
8. The method of claim 1, further comprising: de-duplicating, by the at least one processor, the secondary training dataset by removing data associated with patients that satisfy a similarity threshold.
9. A non-transitory computer-readable medium for training a machine learning (ML) model for radiation therapy treatment planning and customizing the ML model for a clinic, the computer-readable medium comprising instructions that when executed cause a processor to: train the ML model using a primary training dataset comprising treatment data associated with a set of previously treated patients; receive one or more radiotherapy treatment attributes for fine-tuning the ML model to the clinic; retrieve by the at least one processor querying a database accessible to clinic, the database storing treatment data associated with a second set of previously treated patients at the clinic, patient data corresponding to a subset of the set of the patients satisfying the one or more radiotherapy treatment attributes corresponding to a treatment technique; concatenate the retrieved patient data with one or more digital imaging and communications in medicine files or other medical files and corresponding metadata associated with a previously performed treatment of each patient within the subset of the set of patients; generate a secondary training dataset based on the patient data and the concatenated one or more digital imaging and communications in medicine files or other medical files and metadata by changing at least a file structure of the secondary training dataset in accordance with a configuration file of the ML model; and fine tune the ML model to the treatment technique at the clinic, using the secondary training dataset, such that the ML model is customized to the clinic.
10. The computer-readable medium of claim 9, wherein the machine learning model is only trained using the secondary training dataset.
11. (canceled)
12. The computer-readable medium of claim 9, wherein the instructions further cause the processor to anonymize at least one of the patient data, the concatenated digital imaging and communications in medicine files or other medical files, or the metadata.
13. The computer-readable medium of claim 9, wherein the instructions further cause the processor to cluster the secondary training dataset into a plurality of homogeneous subsets.
14. The computer-readable medium of claim 9, wherein the instructions further cause the processor to, when an outlier data point is identified, remove the outlier data point within the secondary training dataset.
15. (canceled)
16. The computer-readable medium of claim 9, wherein the instructions further cause the processor to de-duplicate the secondary training dataset by removing data associated with patients that satisfy a similarity threshold.
17. A computer system for training a machine learning (ML) model for radiation therapy treatment planning and customizing the ML model for a clinic, the computer system comprising a processor configured to: train the ML model using a primary training dataset comprising treatment data associated with a set of previously treated patients; receive one or more radiotherapy treatment attributes for fine-tuning the ML model to the clinic; retrieve by the at least one processor querying a database accessible to clinic, the database storing treatment data associated with a second set of previously treated patients at the clinic, patient data corresponding to a subset of the set of the patients satisfying the one or more radiotherapy treatment attributes corresponding to a treatment technique; concatenate the retrieved patient data with one or more digital imaging and communications in medicine files or other medical files and corresponding metadata associated with a previously performed treatment of each patient within the subset of the set of patients; generate a secondary training dataset based on the patient data and the concatenated one or more digital imaging and communications in medicine files or other medical files and metadata by changing at least a file structure of the secondary training dataset in accordance with a configuration file of the ML model; and fine tune the ML model to the treatment technique at the clinic, using the secondary training dataset, such that the ML model is customized to the clinic.
18. The computer system of claim 17, wherein the machine learning model is only trained using the secondary training dataset.
19. (canceled)
20. The computer system of claim 17, wherein the processor is further configured to anonymize at least one of the patient data, the concatenated digital imaging and communications in medicine files or other medical files, or the metadata.
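
Two of the data-preparation steps recited in the claims, de-duplication against a similarity threshold and changing the file structure in accordance with a configuration file, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the `layout` template key, the `fingerprint` field (here a set of hypothetical file hashes), and the choice of Jaccard similarity are all invented for the example.

```python
from pathlib import PurePosixPath


def jaccard(a, b):
    """Jaccard similarity of two collections (1.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def deduplicate(cases, threshold):
    """Keep a case only if it is below the similarity threshold to every kept case."""
    kept = []
    for case in cases:
        if all(jaccard(case["fingerprint"], k["fingerprint"]) < threshold for k in kept):
            kept.append(case)
    return kept


def plan_layout(cases, config):
    """Map each source file to a destination path built from the config template."""
    mapping = {}
    for case in cases:
        for modality, src in case["files"].items():
            mapping[src] = config["layout"].format(
                case_id=case["case_id"],
                modality=modality,
                name=PurePosixPath(src).name,
            )
    return mapping


# Hypothetical cases; case002 is a second export of the same study as case001.
cases = [
    {"case_id": "case001",
     "files": {"CT": "export/a/ct_1.dcm", "RTPLAN": "export/a/plan.dcm"},
     "fingerprint": {"h1", "h2", "h3"}},
    {"case_id": "case002",
     "files": {"CT": "export/b/ct_1.dcm"},
     "fingerprint": {"h1", "h2", "h3"}},
    {"case_id": "case003",
     "files": {"CT": "export/c/ct_9.dcm"},
     "fingerprint": {"h7", "h8"}},
]
config = {"layout": "{case_id}/{modality}/{name}"}  # hypothetical configuration entry

unique_cases = deduplicate(cases, threshold=0.9)
layout = plan_layout(unique_cases, config)
```

The sketch only computes a source-to-destination mapping and does not copy any files, so the planned structure can be reviewed before anything is written to disk.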

Description

TECHNICAL FIELD

This application relates generally to clinic-specific radiotherapy planning systems and, in particular, to customizing the training of machine-learning models for radiotherapy planning to increase their operational efficiency.

BACKGROUND

Radiation therapy treatment planning (RTTP) is a complex process that involves specific guidelines, protocols, and instructions adopted by different medical professionals, such as clinicians, medical device manufacturers, treating physicians, and the like. Because of the hazardous nature of the radiation emitted by radiotherapy machines, it is imperative that all instructions are followed precisely. Field geometry, as used in the context of RTTP, refers to various attributes or settings of a radiotherapy machine while a patient receives a prescribed radiotherapy dose. For instance, a prescribing physician may identify a structure (e.g., the patient's organ to be treated or tumor to be eradicated) and a corresponding dosage. Moreover, other parties (e.g., clinicians or machine manufacturers) may determine positioning attributes (e.g., angles) of the gantry and of the patient on the couch to provide optimum treatment.

To increase efficiency in this process, many clinics use their own machine-learning models, which are initially trained by a central entity. For instance, an entity may provide a machine-learning model to a particular clinic, and the clinic can then customize or fine-tune the model using its own patient data and/or clinic-specific rules and protocols. This customized machine-learning approach can be integrated into plan optimizer platforms, in which a machine-learning model trained using a cohort of patients is fine-tuned for a particular clinic. These approaches primarily aim to assist medical professionals and dosimetrists in creating high-quality, consistent treatment plans for cancer patients by applying a clinic-customized model that draws on past treatment data.
Using this paradigm, each clinic can train its model (or adapt a pre-trained model through further training) using its own historical treatment data or any customized set of data. This allows the model to reflect that clinic's specific clinical practices and preferences, ensuring that the generated treatment plans align with the clinic's standards.

However, clinic-specific radiotherapy planning approaches have faced some challenges. First, the approach is labor-intensive and error-prone because each clinic must prepare the data for its own model training. For instance, clinicians must manually identify and select relevant patient data from a large and complex database to train a clinic-specific model. This process involves extracting treatment plans, ensuring the data is properly labeled, and organizing it into a format that can be used to train the model while ensuring data quality. This manual effort is time-consuming and requires a high level of expertise, making it a significant challenge for medical professionals. Moreover, the quality of the model depends directly on the subjective expertise of the medical professionals at each clinic, which leads to undesirable inconsistency. Finally, the models may not operate efficiently because they are sensitive to variability in the training data. For instance, some models require homogeneous datasets to function effectively, meaning that any inconsistencies or outliers in the data can negatively impact the model's performance. This further complicates the data preparation process, as it necessitates careful selection and curation of data to ensure that it meets the necessary criteria for training.

SUMMARY

For the aforementioned reasons, there is a desire for a system that can adapt or otherwise customize a computer model (e.g., an AI model, a machine-learning model, or a more conventional model) for a particular clinic.
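
The sensitivity to outliers described in the background can be made concrete with a small screening step applied before training. The sketch below is an assumption for illustration only: the per-case feature (a hypothetical mean target dose in Gy), the function name, and the choice of a modified z-score based on the median absolute deviation are not taken from the disclosure, which does not specify an outlier criterion in this passage.

```python
import statistics


def remove_outliers(values_by_case, cutoff=3.5):
    """Drop cases whose modified z-score (median/MAD based) exceeds the cutoff."""
    values = list(values_by_case.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be ranked as an outlier.
        return dict(values_by_case)
    return {
        case: v
        for case, v in values_by_case.items()
        if 0.6745 * abs(v - median) / mad <= cutoff
    }


# Hypothetical mean target dose per case, in Gy; "c5" is far from the rest.
mean_dose = {"c1": 60.1, "c2": 59.8, "c3": 60.4, "c4": 60.0, "c5": 78.0}
homogeneous = remove_outliers(mean_dose)
```

A median-based statistic is used here because a single extreme case shifts the mean and standard deviation enough to mask itself in a small cohort, whereas the median and MAD are largely unaffected.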
The methods and systems discussed herein may allow faster and more efficient training, sometimes using fewer computing resources. Moreover, machine-learning models trained using the methods and systems discussed herein may be customized to clinics, such that their predictions are more accurate. Therefore, the methods and systems discussed herein provide functional and technical improvements specific to the field of machine learning. The methods and systems discussed herein address the challenges associated with manual data handling in the training of machine-learning models for radiotherapy treatment planning. Currently, clinics must manually curate patient data from complex databases, a process that is both time-consuming and error-prone. This manual approach leads to inconsistencies and inefficiencies, as clinics need to extract relevant data, organize it, and ensure its quality for machine-learning model training. The methods and systems discussed herein provide an automated workflow that simplifies patient data selection, file mapping, anonymization, and fin