EP-4738244-A1 - DIFFUSION OPTIMAL CONTROL FOR INVERSE PROBLEMS


Abstract

A method for image generation utilizing a diffusion process with optimal control is disclosed. Optimal control is applied in an inverse diffusion problem to mitigate processing costs associated with image generation. The method includes an outer loop performed for a predetermined number of iterations, with two inner loops (a back propagation loop and a forward propagation loop) that calculate, over a number of time steps, the desired image from a beginning state (e.g., static noise or a measurement) after completion of the predetermined number of iterations.

Inventors

PEREIRA, MARCUS A.
LI, Henry Fangyi
LIN, WAN-YI

Assignees

Robert Bosch GmbH

Dates

Publication Date: 20260506
Application Date: 20251029

Claims (20)

A method for generating an image, the method comprising: receiving a step size value, a state vector, and a measurement value corresponding to an initial image; determining a first gradient value and a second gradient value based on the measurement value; executing a first loop in a first plurality of steps each having a particular increment size, from a first time value to a last time value, wherein executing the first loop comprises: updating the first and second gradient values using values of the state vector; and computing a feedforward gain value and a feedback gain value, using the first and second gradient values, for each of the first plurality of steps; and executing a second loop in a second plurality of steps each having the step size value, from the last time value to the first time value, wherein executing the second loop comprises: re-computing the values of the state vector using the feedforward gain value and feedback gain value computed during execution of the first loop; and generating updated values of a control vector based on the values of the state vector subsequent to the re-computing; repeating the execution of the first loop and the second loop for a predetermined number of iterations; and generating a final image based on values of the state vector and the control vector after completion of the repeating execution of the first loop and the second loop.
The method of claim 1, further comprising generating, using a control system, one or more control signals based on the final image.
The method of claim 2, wherein the control system includes one or more sensors configured to generate the measurement values.
The method of claim 3, wherein the one or more sensors include a video camera.
The method of claim 3, wherein the one or more sensors include one or more X-ray detectors.
The method of claim 3, wherein the one or more sensors include LiDAR.
The method of claim 2, further comprising the control system controlling a manufacturing process.
The method of claim 2, further comprising the control system controlling an autonomous personal assistant.
The method of claim 2, further comprising the control system generating control signals for an actuator based on the final image.
The method of claim 1, wherein executing the first loop comprises executing a diffusion model.
A system for generating an image based on measured data, the system comprising: one or more sensors configured to generate a measurement value corresponding to an initial image; a processor configured to: determine a first gradient value and a second gradient value based on the measurement value; execute a first loop in a first plurality of steps each having a particular increment size, from a first time value to a last time value, wherein to execute the first loop, the processor is further configured to: update the first and second gradient values using values of a state vector; and compute a feedforward gain value and a feedback gain value, using the first and second gradient values, for each of the first plurality of steps; and execute a second loop in a second plurality of steps each having a step size value, from the last time value to the first time value, wherein to execute the second loop, the processor is further configured to: re-compute the values of the state vector using the feedforward gain value and feedback gain value computed during execution of the first loop; and generate updated values of a control vector based on the values of the state vector subsequent to the re-computing; repeat execution of the first loop and the second loop for a predetermined number of iterations; and generate a final image based on values of the state vector and the control vector after completion of the repeating execution of the first loop and the second loop.
The system of claim 11, further comprising a control system configured to generate one or more control signals based on the final image.
The system of claim 12, wherein the control system is configured to generate one or more actuator control commands based on the one or more control signals.
The system of claim 11, wherein the one or more sensors comprise a video camera.
The system of claim 11, wherein the one or more sensors comprise an X-ray detector.
The system of claim 11, wherein the one or more sensors comprise LiDAR.
The system of claim 11, wherein, in executing the first loop, the processor is configured to execute a diffusion model.
A non-transitory computer-readable medium storing instructions that, when executed by a computer system, cause the computer system to carry out operations comprising: receiving a step size value, a state vector, and, from one or more sensors coupled to the computer system, a measurement value corresponding to an initial image; determining a first gradient value and a second gradient value based on the measurement value; executing a first loop in a first plurality of steps each having a particular increment size, from a first time value to a last time value, wherein executing the first loop comprises: updating the first and second gradient values using values of the state vector; and computing a feedforward gain value and a feedback gain value, using the first and second gradient values, for each of the first plurality of steps; and executing a second loop in a second plurality of steps each having the step size value, from the last time value to the first time value, wherein executing the second loop comprises: re-computing the values of the state vector using the feedforward gain value and feedback gain value computed during execution of the first loop; and generating updated values of a control vector based on the values of the state vector subsequent to the re-computing; repeating the execution of the first loop and the second loop for a predetermined number of iterations; and generating a final image based on values of the state vector and the control vector after completion of the repeating execution of the first loop and the second loop.
The computer-readable medium of claim 18, further comprising instructions that, when executed by the computer system, cause the computer system to: generate one or more control signals based on the final image; and generate one or more actuator control commands based on the control signals.
The computer-readable medium of claim 18, further comprising instructions to convert raw sensor data from the one or more sensors into the initial image from one or more of the following sensor types: a video camera; an X-ray sensor; a thermal sensor; a LiDAR sensor.
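
The loop structure recited in the independent claims (gradient initialization from the measurement, a gain-computing loop, a state-recomputing loop, and a fixed number of repetitions) can be read as an iterative linear-quadratic-regulator-style scheme. That reading is an interpretation, not something the text states. The sketch below illustrates it on a toy problem under loudly stated assumptions: `drift` is a hypothetical linear stand-in for a pretrained diffusion model, the measurement operator `A` is a simple coordinate selection, the cost weights and the names `rollout`, `total_cost`, and `optimal_control_sampler` are invented for illustration, and step index 0 denotes the beginning state. How these indices map onto the claims' "first" and "last" time values is also an assumption.

```python
import numpy as np

def drift(x):
    # Hypothetical stand-in for a diffusion model's drift (assumption, not from the source).
    return -x

def rollout(x0, U, dt):
    # Propagate the state through all steps under the control sequence U.
    X = [x0]
    for u in U:
        X.append(X[-1] + dt * (drift(X[-1]) + u))
    return np.array(X)

def total_cost(X, U, A, y, alpha):
    # Measurement mismatch at the final state plus a small control-effort penalty.
    r = A @ X[-1] - y
    return 0.5 * r @ r + 0.5 * alpha * np.sum(U ** 2)

def optimal_control_sampler(x0, y, A, dt=0.05, n_steps=20, n_iters=3, alpha=1e-3):
    n = x0.size
    I = np.eye(n)
    U = np.zeros((n_steps, n))      # control vector, one entry per step
    X = rollout(x0, U, dt)          # state vector along the nominal trajectory
    fx = (1.0 - dt) * I             # step Jacobian w.r.t. state (valid for drift(x) = -x)
    fu = dt * I                     # step Jacobian w.r.t. control
    for _ in range(n_iters):        # outer loop: predetermined number of iterations
        # First and second gradient values, determined from the measurement.
        Vx = A.T @ (A @ X[-1] - y)
        Vxx = A.T @ A
        k_ff = np.zeros((n_steps, n))
        K_fb = np.zeros((n_steps, n, n))
        # Gain loop: update the gradients and compute both gains at every step.
        for k in reversed(range(n_steps)):
            Qx = fx.T @ Vx
            Qu = alpha * U[k] + fu.T @ Vx
            Qxx = fx.T @ Vxx @ fx
            Quu = alpha * I + fu.T @ Vxx @ fu
            Qux = fu.T @ Vxx @ fx
            k_ff[k] = -np.linalg.solve(Quu, Qu)    # feedforward gain
            K_fb[k] = -np.linalg.solve(Quu, Qux)   # feedback gain
            Vx = Qx + Qux.T @ k_ff[k]
            Vxx = Qxx + Qux.T @ K_fb[k]
        # State loop: re-compute the state with the stored gains, update the controls.
        X_new = [x0]
        U_new = np.zeros_like(U)
        for k in range(n_steps):
            U_new[k] = U[k] + k_ff[k] + K_fb[k] @ (X_new[k] - X[k])
            X_new.append(X_new[k] + dt * (drift(X_new[k]) + U_new[k]))
        X, U = np.array(X_new), U_new
    return X, U

# Toy usage: observe the first two of four coordinates.
A = np.eye(4)[:2]
y = np.array([1.0, -1.0])
x0 = np.array([2.0, -2.0, 0.5, 0.5])
X, U = optimal_control_sampler(x0, y, A)
print("final state:", X[-1])
```

Because the toy dynamics are linear and the cost is quadratic, a single outer iteration already reaches the optimum here. With a nonlinear learned model the Jacobians would have to be re-evaluated along the trajectory at every iteration, which this sketch does not attempt.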

Description

TECHNICAL FIELD

The present disclosure relates to solving inverse problems for applications such as control signal generation and image generation, and more particularly, to incorporating optimal control into diffusion algorithms for generating control signals and/or images.

BACKGROUND

Diffusion models are a class of generative models that are used for various applications such as image generation. A diffusion model may implement a two-phase process that includes a forward diffusion process and a reverse, or inverse, diffusion process. The forward diffusion process may gradually add noise to the data, while the reverse diffusion process may remove the noise to reconstruct the original data (e.g., an image).

In a forward diffusion process, data points may be progressively transformed into noise by gradually or iteratively adding small amounts of Gaussian noise. The forward process may corrupt the data into a noisy distribution that may be mathematically modeled by, e.g., a Gaussian distribution.

The inverse diffusion process begins from a sample of pure noise, and the data is de-noised in a step-by-step manner. As the data is de-noised, the data is reconstructed. This may be carried out by predicting the noise that was added at each step and subtracting the predicted noise.

SUMMARY

A method for image generation utilizing a diffusion process with optimal control is disclosed. In various embodiments, the disclosed method uses optimal control in an inverse diffusion problem to mitigate processing costs associated with image generation. The method includes an outer loop performed for a number of iterations, with two inner loops (a back propagation loop and a forward propagation loop) that calculate, over a number of time steps, the desired image from a beginning state (e.g., static noise or a measurement).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a system 100 for training a neural network.
FIG. 2 shows a computer-implemented method 200 for training a neural network.
FIGS. 3A and 3B graphically illustrate the methodology of the present disclosure.
FIGS. 4A-4C illustrate examples of image generation of the present disclosure relative to other methods.
FIG. 5 depicts a schematic diagram of an interaction between computer-controlled machine 510 and control system 512.
FIG. 6 depicts a schematic diagram of the control system of FIG. 1 configured to control a manufacturing machine, such as a punch cutter, a cutter or a gun drill, of a manufacturing system, such as part of a production line.
FIG. 7 depicts a schematic diagram of the control system of FIG. 1 configured to control an automated personal assistant.
FIG. 8 depicts a schematic diagram of the control system of FIG. 1 configured to control a monitoring system, such as a control access system or a surveillance system.
FIG. 9 depicts a schematic diagram of the control system of FIG. 1 configured to control an imaging system, for example an MRI apparatus, x-ray imaging apparatus or ultrasonic apparatus.

DETAILED DESCRIPTION

Embodiments of the present disclosure are described herein. It is to be understood, however, that the disclosed embodiments are merely examples and other embodiments can take various and alternative forms. The figures are not necessarily to scale; some features could be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the embodiments. As those of ordinary skill in the art will understand, various features illustrated and described with reference to any one of the figures can be combined with features illustrated in one or more other figures to produce embodiments that are not explicitly illustrated or described. The combinations of features illustrated provide representative embodiments for typical applications.
Various combinations and modifications of the features consistent with the teachings of this disclosure, however, could be desired for particular applications or implementations.

"A", "an", and "the" as used herein refer to both singular and plural referents unless the context clearly dictates otherwise. By way of example, "a processor" programmed to perform various functions refers to one processor programmed to perform each and every function, or more than one processor collectively programmed to perform each of the various functions.

Diffusion models have been shown to be adept at conditional generation tasks in part due to their iterative sampling algorithm, which allows the dynamics of an uncontrolled prior score function ∇_x log p_t(x) to be directed towards an arbitrary posterior distribution by introducing an additive guidance term u. When this guidance term is the conditional score ∇_x log p_t(y|x), the resulting sample is provably drawn from the desired conditional distribution p(x|y). A central obstacle to this framework is