JP-7857334-B2 - A virtual training method for neural networks to operate technical devices.
Inventors
- シュテフェン グローテンヘーファー
Assignees
- ディスペース ゲー・エム・ベー・ハー
Dates
- Publication Date
- 20260512
- Application Date
- 20240326
- Priority Date
- 20230327
Claims (13)
- A computer-implemented method for training a neural network to operate a technical device, the method comprising the steps of: establishing a first data link between the neural network and a virtual first simulation of the technical device via a first data interface of the first simulation for reading state data from the first simulation and transferring control data to the first simulation; setting a first training objective for operating the first simulation; training the neural network by having the neural network operate the first simulation, and checking the training progress of the neural network against the first training objective; disconnecting the first data link once the first training objective is achieved; subsequently establishing a second data link between the neural network and a virtual second simulation of the technical device via a second data interface of the second simulation for reading state data from the second simulation and transferring control data to the second simulation, wherein the second simulation is configured to be more realistic than the first simulation and, owing to its higher realism, requires more mathematical operations per simulation cycle than the first simulation; and training the neural network by having the neural network operate the second simulation.
- The method according to claim 1, wherein the first data interface and the second data interface are configured identically.
- The method according to claim 1, further comprising the steps of: generating simulated sensor data that includes information about a virtual environment of the technical device; transferring the simulated sensor data to the neural network; and training the neural network to evaluate the sensor data and to take the sensor data into account when operating the first simulation and/or the second simulation.
- The method according to claim 1, wherein the technical device is a robot, in particular a robot belonging to at least one of the following categories: vehicles; robot arms; robots for positioning or mounting materials and/or objects; robots for cleaning surfaces; robots for inspecting spaces or surfaces; robots for applying chemical substances such as cleaning agents, disinfectants, paints, varnishes, or coatings; robots for performing medical procedures.
- The method according to claim 4, wherein the first training objective is at least one of the following: a section completed on a virtual training course or virtual test route; multiple motion patterns performed without any collision or within a predetermined range of motion; a period during which the neural network operates the first simulation properly without any unwanted event occurring; a threshold reached for a reward function.
- The method according to claim 1, wherein the second simulation takes into account more mechanical and/or electrical components of the technical device than the first simulation, and/or takes into account more mechanical degrees of freedom than the first simulation, and/or simulates physical phenomena and/or laws more realistically and/or with greater accuracy than the first simulation, and/or takes into account more physical forces and/or interactions than the first simulation, and/or has a smaller simulation step size than the first simulation.
- The method according to claim 1, further comprising the steps of: setting a second training objective for operating the second simulation; checking, while the second simulation is being operated, the training progress of the neural network against the second training objective; establishing, after the second training objective is achieved, a physical data link between the neural network and the technical device; and operating the technical device using the neural network.
- A virtual training environment for training a neural network to operate a technical device, the virtual training environment comprising: a first simulation of the technical device having a first data interface for reading state data from the first simulation and for transferring control data to the first simulation; a programming interface for establishing a first data link between the neural network and the first simulation; and a training function configured to set a training objective for the first simulation to be operated by the neural network and to check the training progress of the neural network against the training objective; characterized in that the virtual training environment includes a second simulation of the technical device; the second simulation is configured to be more realistic than the first simulation and, owing to its higher realism, requires more mathematical operations per simulation cycle than the first simulation; the second simulation includes a second data interface for reading state data from the second simulation and for transferring control data to the second simulation; and the training function is configured, once the training objective is achieved, to disconnect the first data link and to replace it with a second data link between the neural network and the second simulation, so that the neural network (8) is trained by the second simulation (14b) operated by the neural network (8).
- The virtual training environment according to claim 8, wherein the first data interface and the second data interface are configured identically.
- The virtual training environment according to claim 8, comprising a virtual environment for the technical device and a sensor simulation, wherein the virtual training environment is configured such that the sensor simulation generates simulated sensor data that includes information about the virtual environment, and the simulated sensor data is transmitted to the neural network via the programming interface.
- The virtual training environment according to claim 8, wherein the technical device is a robot, in particular a robot belonging to at least one of the following categories: vehicles; robot arms; robots for positioning or mounting materials and/or objects; robots for cleaning surfaces; robots for inspecting spaces or surfaces; robots for applying chemical substances such as cleaning agents, disinfectants, paints, varnishes, or coatings; robots for performing medical procedures.
- The virtual training environment according to claim 11, wherein the training objective is at least one of the following: a section completed on a virtual training course or virtual test route; multiple motion patterns performed without any collision or within a predetermined range of motion; a period during which the neural network operates the first simulation properly without any unwanted event occurring; a threshold reached for a reward function.
- The virtual training environment according to claim 8, wherein the second simulation takes into account more mechanical and/or electrical components of the technical device than the first simulation, and/or takes into account more mechanical degrees of freedom than the first simulation, and/or simulates physical phenomena and/or laws more realistically and/or with greater accuracy than the first simulation, and/or takes into account more physical forces and/or interactions than the first simulation, and/or has a smaller simulation step size than the first simulation.
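The staged procedure of claims 1, 2 and 6 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `SimulationInterface`, `PointMassSimulation`, `GainController`, `run_episode`, `train_until` and `staged_training` are hypothetical, a single trainable gain stands in for the neural network, a simple hill-climbing rule stands in for the actual training algorithm, and the reward threshold of -0.05 is arbitrary. The second simulation uses a smaller step size and an additional disturbance force, so it costs more operations per simulation cycle, and both simulations expose the same interface.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Tuple


class SimulationInterface(Protocol):
    """Hypothetical common data interface, identical for every simulation."""

    def read_state(self) -> float: ...
    def write_control(self, control: float) -> None: ...
    def step(self) -> None: ...


@dataclass
class PointMassSimulation:
    """Toy 1-D plant; the controller should drive `position` towards 0."""
    dt: float
    substeps: int
    disturbance: float = 0.0  # extra force modelled only by the finer simulation
    position: float = 1.0
    control: float = 0.0
    operations: int = 0       # counts arithmetic updates, a proxy for cost

    def read_state(self) -> float:
        return self.position

    def write_control(self, control: float) -> None:
        self.control = control

    def step(self) -> None:
        # One simulation cycle consists of `substeps` integration steps.
        for _ in range(self.substeps):
            self.position += (self.control + self.disturbance) * self.dt
            self.operations += 1


@dataclass
class GainController:
    """Stand-in for the neural network: one trainable parameter."""
    gain: float = 0.0

    def act(self, state: float) -> float:
        return -self.gain * state


def run_episode(sim: SimulationInterface, controller: GainController,
                cycles: int = 20) -> float:
    """Operate the simulation through its interface; higher reward is better."""
    for _ in range(cycles):
        sim.write_control(controller.act(sim.read_state()))
        sim.step()
    return -abs(sim.read_state())


def train_until(make_sim: Callable[[], SimulationInterface],
                controller: GainController, threshold: float,
                step: float = 0.1, max_iterations: int = 200) -> int:
    """Train on one simulation until the reward threshold is reached."""
    for iteration in range(1, max_iterations + 1):
        reward = run_episode(make_sim(), controller)
        if reward >= threshold:
            return iteration  # training objective achieved
        # Placeholder learning rule (hill climbing), not the patent's method.
        candidate = GainController(controller.gain + step)
        if run_episode(make_sim(), candidate) > reward:
            controller.gain = candidate.gain
    raise RuntimeError("training objective not reached")


def make_first_simulation() -> PointMassSimulation:
    return PointMassSimulation(dt=0.1, substeps=1)


def make_second_simulation() -> PointMassSimulation:
    return PointMassSimulation(dt=0.01, substeps=10, disturbance=0.1)


def staged_training() -> Tuple[float, float]:
    controller = GainController()
    # Stage 1: coarse, cheap simulation.
    train_until(make_first_simulation, controller, threshold=-0.05)
    gain_after_first = controller.gain
    # Objective reached: the same controller is linked to the finer simulation.
    train_until(make_second_simulation, controller, threshold=-0.05)
    return gain_after_first, controller.gain
```

Because `run_episode` only uses the three interface calls, the controller is handed from the first to the second simulation without any change to its code. The physical data link of claim 7 is not covered by this sketch.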
Description
In recent years, neural networks have enabled significant progress in the development of autonomous systems capable of independently performing complex motion patterns or monitoring tasks in changing environments. One problem in this regard is the sheer volume of training data required to train neural networks adequately, particularly when they are intended for safety-critical tasks where misrecognition could lead to catastrophic consequences. Examples include self-driving cars intended for road traffic, smart surveillance cameras that recognize people drowning in lakes, or rescue robots that search for injured individuals in a region or building. Systems of this type not only require a large amount of diverse training data in order to rule out, as far as possible, malfunctions caused by insufficient or biased training; the training data must also contain a large number of catastrophic or critical scenarios that the neural network must learn to recognize and handle reliably. Because such scenarios are rare by their nature, corresponding data are also scarce, so that collecting sufficient training data involves an enormous amount of work.
One solution to this problem known from the prior art is to generate synthetic training data using computer simulations. In this case, scenarios for training a neural network are reconstructed in a virtual environment, and training data is generated from these reconstructed scenarios in a format that the neural network can process. The virtual scenarios can then be modified as desired with little effort. However, since synthetic data generally does not replicate real-world data perfectly, there is a fundamental risk that the neural network will adapt to the synthetic training data and will not be adequately trained for use in the real world. The scientific paper “Synthetic Data for Deep Learning” by Sergey I. Nikolenko (2019) contains a comprehensive overview of related conventional techniques. The paper describes various examples of neural networks successfully trained in a virtual environment. However, these academic studies only demonstrate that such training is possible in principle. This does not change the fact that the more realistic the synthesized training data, the more reliable the neural network trained using that data becomes.
The drawings show: a schematic diagram of a virtual training environment; a flowchart of the training procedure; a virtual environment for simulating highly automated vehicles; a sensor simulation for simulating lane recognition in highly automated vehicles; a first simulation of a highly automated vehicle; a second simulation of a highly automated vehicle; a third simulation of a highly automated vehicle.
Figure 1 shows a computer system 2, on which a virtual training environment 4 and a development environment 6 for a neural network 8 are set up. The neural network 8, which is intended to operate the technical device and is to be trained for that task, is embedded in the development environment 6. The development environment 6 is logically separated from the training environment 4 and includes, in addition to the neural network 8, a first development routine 10 and a second development routine 12 for executing a reward function T. The second development routine 12 is configured to monitor the function value of the reward function T and to modify the configuration of the neural network 8 based on this function value; that is, to modify the values stored in the neurons and synapses of the neural network 8 so as to optimize the function value of T in a predetermined manner. The function value of T depends on a matrix x, and the entries of the matrix x are supplied by the virtual training environment 4.
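The relationship between the matrix x, the reward function T and the adjustment of the network can be made concrete with a small Python sketch. It is purely illustrative and rests on assumptions that the text does not state: the names `reward_T`, `rollout` and `adjust_weights` are hypothetical, the matrix x is assumed to hold one row of state data per simulation cycle, T is assumed to be the negative mean squared value of the first column, two weights stand in for the values stored in the network, and a coordinate search stands in for whatever optimization the development routine actually applies.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def reward_T(x: Matrix) -> float:
    """Assumed reward: negative mean squared offset (column 0 of x)."""
    return -sum(row[0] ** 2 for row in x) / len(x)


def rollout(weights: List[float], cycles: int = 30, dt: float = 0.1) -> Matrix:
    """Stand-in for the training environment: returns the matrix x.

    Toy plant with an offset and its rate of change; the control value is a
    weighted sum of both, playing the role of the network output.
    """
    offset, rate = 1.0, 0.0
    x: Matrix = []
    for _ in range(cycles):
        control = -(weights[0] * offset + weights[1] * rate)
        rate += control * dt
        offset += rate * dt
        x.append([offset, rate])  # one row of state data per cycle
    return x


def adjust_weights(weights: List[float], iterations: int = 50,
                   delta: float = 0.2) -> Tuple[List[float], float]:
    """Monitor T(x) and keep only weight changes that increase it."""
    weights = list(weights)
    best = reward_T(rollout(weights))
    for _ in range(iterations):
        for i in range(len(weights)):
            for change in (delta, -delta):
                candidate = list(weights)
                candidate[i] += change
                value = reward_T(rollout(candidate))
                if value > best:  # accept improvements only
                    weights, best = candidate, value
                    break
    return weights, best
```

Calling `adjust_weights([0.0, 0.0])` returns the adjusted weights together with the reward they achieve. A development environment would normally use gradient-based training rather than this search; the sketch only shows the data flow from x through T to the parameter update.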
Development environments such as the development environment 6 shown in the figure are known in the prior art and are commercially available. Examples include PyTorch and TensorFlow. By way of example, the virtual training environment 4 includes three simulations of the technical device, namely a first simulation 14a, a second simulation 14b, and a third simulation 14c. The first simulation 14a includes a first data interface 16a through which the neural network 8 operates the first simulation 14a. While being trained on the first simulation 14a, the neural network 8 reads state data from the first simulation 14a and transfers control data to the first simulation 14a. Similarly, the second simulation 14b includes a second data interface 16b, and the third simulation 14c includes a third data interface 16c. Both the second data interface 16b and the third data interface 16c are functionally identical to the first data interface 16a. As a result, the three simulations 14a, 14b, and 14c can be instantly swapped for one another, and the neural network 8 cannot directly identify which of the three simulations 14a, 14b, and 14c it is currently operating. The degr