DE-102024132854-A1 - Computing equipment and methods for generating training data

Abstract

The invention provides a method for generating training data, as well as a corresponding computing device and system. The method comprises at least the following steps: Receiving (S100) sensor data (1) from at least one sensor (10; 20; 30) present in a clinical situation; Feeding (S200) the sensor data as first input data (11) into a first machine learning model (110); Processing (S300) the first input data (11) to generate first output data (19) which represents a recommendation for action, a control signal, or information for a user; Feeding (S400) the first output data (19) as second input data (21) into a second machine learning model (120); Processing (S500) the second input data (21) by the second machine learning model (120) to generate second output data (29) which indicates whether the received sensor data (1) should be used to improve the first machine learning model (110), and, if so: Storing at least some of the received sensor data (1) as a basis for training data for the first machine learning model (110).

Inventors

Sebastian Wenzler
Erhan Kenar

Assignees

KARL STORZ SE & CO. KG

Dates

Publication Date: 20260513
Application Date: 20241111

Claims (10)

A computer-implemented method for generating training data for a machine learning model, comprising the following steps: Receiving (S100) sensor data (1) from at least one sensor (10; 20; 30) present in a clinical situation; Feeding (S200) at least parts of the received sensor data (1) as at least part of first input data (11) into a first machine learning model (110); Processing (S300) the first input data (11) by the first machine learning model (110) to generate first output data (19) which represents a recommendation for action, a control signal for a medical device (200), or information for a user; Feeding (S400) at least parts of the first output data (19) and/or intermediate values (18) of the first machine learning model (110) as at least part of second input data (21) into a second machine learning model (120); Processing (S500) the second input data (21) by the second machine learning model (120) to generate second output data (29) indicating whether the received sensor data (1) should be used to improve the first machine learning model (110), and, if so: Storing and/or transmitting (S600) at least part of the received sensor data (1) to a data storage device (140) as a basis for training data for the first machine learning model (110).
Method according to claim 1, wherein the second machine learning model (120) includes a support vector machine, an artificial neural network, and/or an extreme learning machine.
Method according to claim 1 or 2, wherein the second machine learning model (120) is set up such that the second output data (29) includes a numerical value which is compared with a threshold value, and a result of the comparison indicates whether the received sensor data (1) should be used to improve the first machine learning model (110).
Method according to one of claims 1 to 3, further comprising: dynamically selecting an intermediate layer or a final layer of the first machine learning model (110); wherein at least one intermediate value (18) of the first machine learning model (110), which is used as part of the second input data (21), is taken from the dynamically selected intermediate layer or final layer.
Method according to one of claims 1 to 4, wherein the sensor data (1) of the at least one sensor (10, 20, 30) is or comprises at least one of the following: - image and/or video data of an image sensor (10); - temperature data of a temperature sensor; - position data of a position sensor; - orientation data of an orientation sensor; - vital data of a patient; - current data of a current sensor; - voltage data of a voltage sensor (20); - pressure data of a pressure sensor; - flow data of a flow sensor; - motor data of a motor, such as speed and/or acceleration and/or torque data; - device settings of a medical device (200).
Method according to one of claims 1 to 5, further comprising: supplementing the stored or transmitted portion of the sensor data (1) with at least one of the following: - information about a surgical procedure during which the transmitted portion of the sensor data (1) was generated; - a medical specialty to which the surgical procedure belongs; - at least a portion of the output data (29) of the second machine learning model (120); - a location of the surgical procedure; - a timestamp related to the surgical procedure; - time information relating to a last preceding transmission of sensor data; - a position of an input unit on a master; - information from a patient record of a patient undergoing the surgical procedure; - a complication identified during or noted in connection with the surgical procedure; in particular, each as a label for training data for supervised learning of the first machine learning model (110).
Method according to one of claims 1 to 6, further comprising: acquiring (S450) a user input (31); and feeding the acquired user input (31) as a further part of the second input data (21) into the second machine learning model (120).
Method according to one of claims 1 to 7, wherein the first machine learning model (110) is trained to detect and/or classify smoke development during a surgical procedure.
Computing device (100), configured to carry out the method according to one of claims 1 to 8, optionally integrated into a camera control system.
System (1000), comprising a computing device (100) according to claim 9 and at least one sensor (10, 20, 30) which is configured to acquire sensor data (1) from at least one clinical situation.
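
The dynamic layer selection in the claim on intermediate values (18) can be illustrated with a minimal sketch. It assumes a tiny fully connected network written in plain Python. The names dense_relu, forward_with_activations and intermediate_values, the (weights, biases) layer format, and the use of ReLU on every layer are illustrative assumptions and are not taken from the patent text.

```python
from typing import List, Sequence, Tuple

# Hypothetical layer format: (weight rows, biases). Not prescribed by the source.
Layer = Tuple[List[List[float]], List[float]]


def dense_relu(x: Sequence[float], layer: Layer) -> List[float]:
    """Apply one fully connected layer followed by ReLU."""
    weights, biases = layer
    return [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def forward_with_activations(
    x: Sequence[float], layers: Sequence[Layer]
) -> List[List[float]]:
    """Run the forward pass and keep the activations of every layer."""
    activations: List[List[float]] = []
    current = list(x)
    for layer in layers:
        current = dense_relu(current, layer)
        activations.append(current)
    return activations


def intermediate_values(
    activations: Sequence[Sequence[float]], layer_index: int
) -> List[float]:
    """Return the values of the layer chosen at run time.

    layer_index may point at an intermediate layer or, with -1, at the
    final layer. How the index is chosen is left open here.
    """
    return list(activations[layer_index])
```

In such a sketch, the returned list would form part of the second input data (21). The criterion used to pick layer_index at run time is deliberately left open, since the claim does not fix one.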

Description

Technical field of the invention

The invention relates to a computer-implemented method for generating training data for a machine learning model, and to a computing device which is configured to carry out such a method.

Background of the invention

Machine learning models, such as artificial neural networks, k-means algorithms, or support vector machines, are typically trained using training data (training phase) and then deployed at their intended location (deployment phase). An update is generated periodically, for example, by adding newly collected training data to the existing training data, and the machine learning model is retrained or further trained with this supplemented data. In other variations, continuous retraining takes place using all data accumulated in the interim. In both cases, these are relatively rigid processes whose timing depends neither on the quality of the previous training nor on the relevance of the data collected in the meantime.

Summary of the invention

The present invention aims to provide an improved method for generating training data for a machine learning model, which in particular eliminates the aforementioned disadvantages of the prior art.

Accordingly, in a first aspect of the present invention, a computer-implemented method for generating training data for a machine learning model is provided, comprising the steps: Receiving sensor data from at least one sensor present in a clinical situation; Feeding at least parts of the received sensor data as at least part of first input data into a first machine learning model; Processing the first input data by the first machine learning model to generate first output data, which represents a recommendation for action, a control signal for a medical device, or information for a user; Feeding at least parts of the first output data and/or intermediate values of the first machine learning model as at least part of second input data into a second machine learning model; Processing the second input data by the second machine learning model to generate second output data, which indicates whether the received sensor data should be used to improve the first machine learning model, and if so: Storing and/or transmitting at least some of the received sensor data (especially that which was part of the first input data) to a data storage device as a basis for (or as) training data for the first machine learning model.

The invention thus provides a method for checking whether the first machine learning model currently in use (i.e., in the deployment phase) still delivers results of sufficient quality, or whether improving it is already necessary and/or possible. This not only significantly reduces the required storage capacity and/or transmission bandwidth, but also significantly improves the training data set, or restricts the additional training data to data that actually offers potential for improving the first machine learning model.

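The steps of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent. The class TrainingDataCollector, the two toy models, the use of a list as data storage device, and the score-above-threshold rule are hypothetical. The threshold comparison follows the optional variant named in the claims.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# Hypothetical result format of the first model:
# (first output data, intermediate values).
FirstResult = Tuple[str, List[float]]


@dataclass
class TrainingDataCollector:
    """Runs both models and keeps only the samples the second model flags."""

    first_model: Callable[[Sequence[float]], FirstResult]
    second_model: Callable[[Sequence[float]], float]
    threshold: float = 0.5  # assumed decision threshold
    storage: List[List[float]] = field(default_factory=list)

    def process(self, sensor_data: Sequence[float]) -> str:
        # Feed the sensor data into the first model and process it.
        first_output, intermediates = self.first_model(sensor_data)
        # Feed the intermediate values into the second model.
        score = self.second_model(intermediates)
        # Store the sample only if the second model rates it as useful.
        if score > self.threshold:
            self.storage.append(list(sensor_data))
        return first_output


def toy_first_model(sensor_data: Sequence[float]) -> FirstResult:
    """Toy stand-in: reports smoke when the mean reading exceeds 0.5."""
    mean = sum(sensor_data) / len(sensor_data)
    message = "Smoke extraction required" if mean > 0.5 else "No action"
    return message, [mean]


def toy_second_model(second_input: Sequence[float]) -> float:
    """Toy stand-in: score is high when the mean is near the 0.5 boundary."""
    return 1.0 - 2.0 * abs(second_input[0] - 0.5)
```

In this sketch a clear-cut reading passes through without being stored, while a reading close to the decision boundary of the toy first model is kept as a candidate for future training data.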
For example, the second machine learning model can advantageously be used to check whether information provided to the user with the first output data (e.g., "Event X is imminent" or "Smoke extraction required") has subsequently proven to be correct. If not, there might be room for improvement at this point, that is, with regard to the sensor data that the first machine learning model processed to generate the incorrect first output data. The corresponding sensor data could then, prompted by the second machine learning model, be stored as future training data or transmitted to the data storage device.

Within the scope of this invention, a clinical situation can be understood to mean an operation, a surgical procedure, the treatment of a patient, an external examination of a patient, or the like. In particular, medical or surgical instruments may be used.

The processing of the second input data by the second machine learning model preferably takes place immediately after, or partially overlapping with, the processing of the first input data by the first machine learning model.

An intermediate value of the first machine learning model is a value generated by the first machine learning model but not part of its output data. Such intermediate values are also referred to as "hidden features." In an artificial neural network, for example, such intermediate values are generated by intermediate layers, i.e., layers located between the input layer and the output layer. Preferably, the intermediate values used as part of the second input data are taken from the penultimate or last layer before the output layer.

The stored and/or transmitted sensor data can themselves become part of the new (additional, future) training data. The second machine learning model can thus act as a kind of pre-filter that selects important or helpful training data for the existing, previously trained first machine learning model. This allows the model to react to changing application situations after