CN-116392260-B - Control device and method for vascular intervention operation
Abstract
The invention provides a control device and method for vascular intervention operation and relates to the technical field of control. The device comprises an acquisition module, a prediction module, a determination module and a control module. The acquisition module acquires first image data at the previous moment, second image data at the current moment and first action information at the previous moment, where the first image data and the second image data comprise vascular image data and instrument image data corresponding to the vascular intervention operation. The prediction module obtains the probability, output by a target operation model, of selecting each action at the current moment; the target operation model is trained based on an offline reinforcement learning method. The determination module determines a target control instruction based on the probability of selecting each action, and the control module controls the instrument to move in a blood vessel based on the target control instruction. The invention improves the accuracy of the trained target operation model and thereby the accuracy of the movement of the instrument in the blood vessel.
Inventors
- ZHOU XIAOHU
- XIANG TIANYU
- GUI MEIJIANG
- LI HAO
- XIE XIAOLIANG
- LIU SHIQI
- FENG ZHENQIU
- HOU ZENGGUANG
- YAO BOXIAN
- HUANG DEXING
- YU ZHE
Assignees
- Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Dates
- Publication Date: 20260505
- Application Date: 20230303
Claims (5)
- 1. A control device for vascular interventional procedures, the device comprising: an acquisition module for acquiring first image data at the previous moment, second image data at the current moment and first action information at the previous moment, wherein the first image data and the second image data comprise blood vessel image data and instrument image data corresponding to the vascular interventional operation; a prediction module for inputting the first image data, the second image data and the first action information into a target operation model to obtain the probability, output by the target operation model, of selecting each action at the current moment, wherein the target operation model is trained based on an offline reinforcement learning method using first image sample data, second image sample data and action sample information corresponding to the first image sample data; a determining module for determining a target control instruction based on the probability of selecting each of the actions; and a control module for controlling movement of the instrument in the blood vessel based on the target control instruction; wherein the target operation model is trained as follows: inputting the first image sample data and the second image sample data in the sample data into an encoder of an initial operation model to obtain a coded information sample output by the encoder; inputting the coded information sample and the action sample information in the sample data into a policy estimation sub-model and a function estimation sub-model of the initial operation model to obtain the predicted probability of selecting each action at a first moment, output by the policy estimation sub-model, and a function estimation value, output by the function estimation sub-model, wherein the first moment is the moment at which the second image sample data is acquired; determining a target action based on the predicted probability of selecting each action at the first moment, and determining a return value of the first moment based on the target action; and updating model parameters of the initial operation model based on the return value and the function estimation value to obtain the target operation model, wherein the accumulated return value corresponding to the target operation model is the largest; and wherein updating the model parameters of the initial operation model based on the return value and the function estimation value to obtain the target operation model comprises: determining a function estimation loss function based on the return value, the function estimation value, an estimation value of a target function, and the advantage of the sample data relative to an agent policy; determining a policy imitation loss function based on the agent policy and the sample data; and optimizing the model parameters of the initial operation model based on the function estimation loss function and the policy imitation loss function to obtain the target operation model.
- 2. The control device for vascular interventional procedures according to claim 1, wherein optimizing the model parameters of the initial operation model based on the function estimation loss function and the policy imitation loss function to obtain the target operation model comprises: optimizing the model parameters of the initial operation model based on the function estimation loss function and the policy imitation loss function to obtain an imitation operation model; determining a policy optimization loss function based on the agent policy and the estimation value of the target function; and optimizing model parameters of the imitation operation model based on the policy optimization loss function and the function estimation loss function to obtain the target operation model.
- 3. The control device for vascular interventional procedures according to claim 1, wherein determining the return value of the first moment based on the target action comprises: executing the target action to obtain the position of the instrument in the blood vessel at the first moment; and determining the return value of the first moment based on whether the position deviates from a target path, whether the contact force of the instrument is greater than or equal to a preset threshold, and the difference between the position at the first moment and the position at a second moment when the instrument moves from the position at the first moment to the position at the second moment.
- 4. The control device for vascular interventional procedures according to claim 2, wherein optimizing the model parameters of the imitation operation model based on the policy optimization loss function and the function estimation loss function to obtain the target operation model comprises: determining a sampling probability corresponding to each sample data based on the weight of the sample data; determining target sample data based on the sampling probability of each sample data; and inputting the target sample data into the imitation operation model and optimizing the model parameters of the imitation operation model based on the policy optimization loss function and the function estimation loss function to obtain the target operation model.
- 5. The control device for vascular interventional procedures according to claim 4, wherein the device further comprises: an updating module for updating the weight of the target sample data, after each round of optimizing the model parameters of the imitation operation model, based on the return value, the estimation value of the target function and the function estimation value.
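As an illustration of claims 3 and 4, the following is a minimal Python sketch, not the patented implementation. The claims name the inputs of the return value (path deviation, contact force threshold, position difference) and say that sampling probabilities are derived from sample weights, but they give no formulas. The additive form of the reward, the penalty constants, the scalar positions along the path, and the normalization of weights into probabilities are all assumptions, as are the function names.

```python
import random


def step_reward(off_path, contact_force, force_threshold,
                pos_first, pos_second,
                off_path_penalty=1.0, force_penalty=1.0, progress_scale=1.0):
    """Return value for one step (assumed additive form).

    Positions are assumed to be scalar distances along the target path,
    so (pos_second - pos_first) measures progress of the instrument.
    """
    reward = progress_scale * (pos_second - pos_first)
    if off_path:
        # Instrument position deviates from the target path.
        reward -= off_path_penalty
    if contact_force >= force_threshold:
        # Contact force reaches or exceeds the preset threshold.
        reward -= force_penalty
    return reward


def sampling_probabilities(weights):
    """Turn per-sample weights into sampling probabilities (assumed: normalize)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def sample_indices(weights, k, rng):
    """Draw k indices of target sample data according to the probabilities."""
    probs = sampling_probabilities(weights)
    return rng.choices(range(len(weights)), weights=probs, k=k)


rng = random.Random(0)
print(step_reward(False, 0.1, 0.5, 2.0, 3.5))
print(sample_indices([1.0, 3.0, 6.0], k=5, rng=rng))
```

The weight update of claim 5 is left out because the claim does not state how the return value, the target function estimate and the function estimate are combined.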
Description
Control device and method for vascular intervention operation
Technical Field
The invention relates to the technical field of control, and in particular to a control device and method for vascular intervention operation.
Background
Vascular intervention is a minimally invasive treatment modality performed using robotic systems. Under the guidance of an imaging system, a doctor operates the robot system to steer interventional instruments such as a guide wire through the vascular lumen to the lesion, where treatments such as thrombolysis or dilation of stenotic vessels are performed.
In the related art, a vascular intervention operation model is generally trained by an imitation learning method or a statistical learning method based on examples of doctor operation, so that the vascular intervention robot can deliver instruments autonomously.
However, because these methods optimize the model parameters to reproduce the doctor operation examples, the accuracy of the model depends on the quality of those examples. If the examples are of poor quality, the accuracy of the trained model is reduced, and with it the accuracy of the movement of the instrument in the blood vessel.
Disclosure of Invention
In view of the problems in the prior art, the embodiments of the invention provide a control device and a control method for vascular intervention operation.
The invention provides a control device for vascular intervention operation, comprising: an acquisition module for acquiring first image data at the previous moment, second image data at the current moment and first action information at the previous moment, wherein the first image data and the second image data comprise blood vessel image data and instrument image data corresponding to the vascular intervention operation; a prediction module for inputting the first image data, the second image data and the first action information into a target operation model to obtain the probability, output by the target operation model, of selecting each action at the current moment, wherein the target operation model is trained based on an offline reinforcement learning method using first image sample data, second image sample data and action sample information corresponding to the first image sample data; a determining module for determining a target control instruction based on the probability of selecting each of the actions; and a control module for controlling movement of the instrument in the blood vessel based on the target control instruction.
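The flow from the prediction module to the determining module can be sketched as follows. This is a hedged Python illustration only: the action set `ACTIONS`, the step sizes, the dictionary format of the control instruction, the greedy selection rule, and the `stub_model` stand-in are all hypothetical, since the text above does not enumerate the actions or specify how probabilities are mapped to an instruction.

```python
import random

# Hypothetical action set: the source does not enumerate the discrete actions.
ACTIONS = ("advance", "retract", "rotate_cw", "rotate_ccw")


def select_action(probs, greedy=True, rng=None):
    """Pick an action index from the per-action probabilities."""
    if len(probs) != len(ACTIONS):
        raise ValueError("expected one probability per action")
    if greedy:
        # Most probable action; ties resolve to the lowest index.
        return max(range(len(probs)), key=lambda i: probs[i])
    rng = rng or random.Random()
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def to_control_instruction(action_index, step_mm=1.0, step_deg=10.0):
    """Map an action index to a (hypothetical) robot command."""
    name = ACTIONS[action_index]
    if name == "advance":
        return {"translate_mm": step_mm, "rotate_deg": 0.0}
    if name == "retract":
        return {"translate_mm": -step_mm, "rotate_deg": 0.0}
    if name == "rotate_cw":
        return {"translate_mm": 0.0, "rotate_deg": step_deg}
    return {"translate_mm": 0.0, "rotate_deg": -step_deg}


def control_step(model, prev_image, curr_image, prev_action):
    """Run prediction, then determination; return (action, instruction)."""
    probs = model(prev_image, curr_image, prev_action)
    action = select_action(probs)
    return action, to_control_instruction(action)


def stub_model(prev_image, curr_image, prev_action):
    # Stand-in for the trained target operation model.
    return [0.1, 0.2, 0.6, 0.1]


action, instruction = control_step(stub_model, [0.0], [0.0], prev_action=0)
print(ACTIONS[action], instruction)
```

Greedy selection is used here for determinism; sampling from the probabilities (`greedy=False`) is an equally plausible reading of "based on the probability of selecting each action".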
According to the control device for vascular intervention operation provided by the invention, the target operation model is trained as follows: inputting the first image sample data and the second image sample data in the sample data into an encoder of an initial operation model to obtain a coded information sample output by the encoder; inputting the coded information sample and the action sample information in the sample data into a policy estimation sub-model and a function estimation sub-model of the initial operation model to obtain the predicted probability of selecting each action at a first moment, output by the policy estimation sub-model, and a function estimation value, output by the function estimation sub-model, wherein the first moment is the moment at which the second image sample data is acquired; determining a target action based on the predicted probability of selecting each action at the first moment, and determining a return value of the first moment based on the target action; and updating model parameters of the initial operation model based on the return value and the function estimation value to obtain the target operation model, wherein the accumulated return value corresponding to the target operation model is the largest.
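The data flow of one training forward pass (encoder, then the policy (strategy) estimation sub-model and the function estimation sub-model) might look like the toy Python sketch below. It only illustrates the wiring described above under stated assumptions: the mean-intensity encoder, the linear heads, the one-hot encoding of the previous action, one function estimate per action, and the greedy choice of target action are all hypothetical and stand in for networks the text does not specify.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def encode(first_image, second_image):
    """Toy encoder: mean intensity of each frame plus their difference."""
    mean1 = sum(first_image) / len(first_image)
    mean2 = sum(second_image) / len(second_image)
    return [mean1, mean2, mean2 - mean1]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def forward(first_image, second_image, prev_action, params, n_actions):
    """Return (action probabilities, function estimates, target action)."""
    code = encode(first_image, second_image)
    features = code + one_hot(prev_action, n_actions)
    # Policy estimation sub-model: probability of each action.
    probs = softmax(linear(params["policy_w"], params["policy_b"], features))
    # Function estimation sub-model: assumed to give one estimate per action.
    values = linear(params["value_w"], params["value_b"], features)
    # Target action: assumed greedy choice over the predicted probabilities.
    target_action = max(range(n_actions), key=lambda i: probs[i])
    return probs, values, target_action


n_actions = 4
n_features = 3 + n_actions
zero_rows = [[0.0] * n_features for _ in range(n_actions)]
params = {"policy_w": zero_rows, "policy_b": [0.0] * n_actions,
          "value_w": zero_rows, "value_b": [0.0] * n_actions}
probs, values, target = forward([0.2, 0.4], [0.3, 0.5], 1, params, n_actions)
print(probs, values, target)
```

The parameter update itself is omitted because it depends on the loss functions, whose exact form is not given in this passage.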
According to the control device for vascular intervention operation provided by the invention, updating the model parameters of the initial operation model based on the return value and the function estimation value to obtain the target operation model comprises: determining a function estimation loss function based on the return value, the function estimation value, an estimation value of a target function, and the advantage of the sample data relative to an agent policy; determining a policy imitation loss function based on the agent policy and the sample data; and optimizing the model parameters of the initial operation model based on the function estimation loss function and the policy imitation loss function to obtain the target operation model. According to the control device for vascular intervention operation provided by the invention, the model parameters of the initial operation model are optimized based on the function e