CN-120603683-B - Autonomous control of devices
Abstract
The present invention relates to a method for autonomous control of a device. The device is typically mobile in the real world and the device need not be limited in physical design. The advantage of this approach is that the device can perform an autonomous learning process and continuously improve the learned knowledge or behavior. The shortcomings of the prior art in which training data must first be generated are generally overcome, as is the case with conventional artificial intelligence methods. In general, the method is widely applicable, and the device can automatically learn and continuously revise its own knowledge base in the process. The invention furthermore relates to an apparatus designed to carry out the method, a system component comprising a plurality of the proposed apparatuses, a computer program product and a computer-readable storage medium, which carries out the method or causes a computer to carry out the method.
Inventors
- Carl Albert SCHREIBER
Assignees
- Carl Albert Schreiber
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2023-06-19
- Priority Date
- 2022-12-15
Claims (12)
- 1. A method for autonomous control of a device, the method comprising the steps of: initializing (100) the device by randomly driving the actuator and reading a plurality of internal sensor units arranged to measure internal device properties as internal states, and reading external environment properties as external states by means of a plurality of external sensor units arranged for that measurement; wherein an object is detected by the external sensor units, a parameter detected by the external sensor units is compared with device properties and with action instructions of the device, and the detected parameter is used to predict a behavior of the detected object; wherein an action instruction, as a mapping function, converts an output triplet of actuator drive, device properties and environment properties into a target triplet specifying a new internal state, a new external state and possible actions to be performed in the new state; storing (104) a correlation existing between the drive (101) of the actuator and the read device properties and environment properties as an action instruction; and controlling (105) the device in accordance with at least one predefined target device property and/or at least one predefined target environment property using at least one stored (104) action instruction, the at least one stored action instruction indicating how to use the output triplet and the target triplet to set the at least one predefined target device property and/or the at least one predefined target environment property; the method iterating in a learning manner, new action instructions being continuously identified during the iteration, each of which converts an output triplet into a target triplet, and the action instructions being stored for controlling the device.
- 2. A method according to claim 1, characterized in that the method iterates in a learning manner such that new action instructions are continuously identified, each of which converts an output triplet into a target triplet, and are stored for driving the device.
- 3. Method according to claim 1 or 2, characterized in that the reading of the internal sensor units and/or the reading (103) of the external sensor units is performed in such a way that the respective support values are stored and intermediate values are interpolated and/or extrapolated.
- 4. The method according to claim 1 or 2, characterized in that the target device property and/or the target environment property is at least temporarily influenced by a device property and/or an environment property.
- 5. A method according to claim 1 or 2, wherein a plurality of action instructions are combined to form an orchestration to convert an output triplet into a target triplet.
- 6. A method according to claim 1 or 2, characterized in that the interrelationship between the parameters of the detected object and the movements of the detected object is used to generate the movement instructions of the device.
- 7. The method according to claim 1 or 2, wherein the actuator is a movement unit, a drive, a wiper, a lamp, a steering gear or an actuator unit.
- 8. A method according to claim 1 or 2, wherein the actuator is a motor.
- 9. A method according to claim 1 or 2, wherein the actuator is a gripping arm.
- 10. The method according to claim 1 or 2, wherein the internal sensor unit detects an actuator state, an actuator setting, an actuator position, a size, a weight, a specification, a speed, an acceleration, a deceleration and/or a battery level.
- 11. The method according to claim 1 or 2, wherein the external sensor unit detects a distance, a lidar signal, an image signal, a position, a size, a specification, and/or an acoustic signal.
- 12. An apparatus adapted to carry out the method of any one of the preceding claims, the apparatus comprising: an initialization unit arranged to initialize (100) by randomly driving the actuator and reading a plurality of internal sensor units arranged to measure internal device properties as internal states, and reading external environment properties as external states by means of a plurality of external sensor units arranged for that measurement; wherein the apparatus is designed to detect an object by means of the external sensor units, to compare the parameters detected by the external sensor units with device properties and with action instructions of the device, and to use the detected parameters to predict the behaviour of the detected object; wherein an action instruction, as a mapping function, converts an output triplet of actuator drive, device properties and environment properties into a target triplet specifying a new internal state, a new external state and possible actions to be performed in the new state; a memory unit arranged to store (104) a correlation existing between the drive (101) of the actuator and the read device properties and environment properties as an action instruction; wherein the apparatus is further adapted to be controlled in accordance with at least one predefined target device property and/or at least one predefined target environment property using at least one stored action instruction indicating how to use the output triplet and the target triplet to set the at least one predefined target device property and/or the at least one predefined target environment property; and wherein the apparatus is further designed such that the method iterates in a learning manner, in which iteration new action instructions are continuously identified, each of which converts an output triplet into a target triplet, and these action instructions are stored for controlling the apparatus.
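The learning loop recited in claims 1 and 12 can be illustrated with a small sketch. The following Python code is a hypothetical, simplified interpretation, not part of the patent: all names, the discrete drive space and the toy one-dimensional device are assumptions. Action instructions are stored as mappings from an output triplet (drive, device properties, environment properties) to a target triplet (new internal state, new external state), and control is performed by looking up a stored instruction whose target matches a predefined target property.

```python
import random


class AutonomousController:
    """Hypothetical sketch of the claimed learning loop.

    An action instruction maps an output triplet
    (drive, device properties, environment properties) to a target
    triplet (new internal state, new external state).
    """

    def __init__(self, actuate, read_internal, read_external):
        self.actuate = actuate              # drives the actuator
        self.read_internal = read_internal  # internal sensor units (device properties)
        self.read_external = read_external  # external sensor units (environment properties)
        self.instructions = []              # stored action instructions

    def initialize(self, steps, drive_space):
        # Initializing (100): randomly drive the actuator (101) and read
        # the sensor units before and after each actuation.
        for _ in range(steps):
            internal, external = self.read_internal(), self.read_external()
            drive = random.choice(drive_space)
            self.actuate(drive)
            # Storing (104): the observed correlation between the drive and
            # the read properties becomes an action instruction.
            self.instructions.append(
                ((drive, internal, external),
                 (self.read_internal(), self.read_external()))
            )

    def control(self, target_internal=None, target_external=None):
        # Controlling (105): apply a stored instruction whose target triplet
        # matches the predefined target device/environment property.
        for (drive, _, _), (new_int, new_ext) in self.instructions:
            if ((target_internal is None or new_int == target_internal)
                    and (target_external is None or new_ext == target_external)):
                self.actuate(drive)
                return drive
        return None  # no matching instruction learned yet


# Toy one-dimensional device: the internal state is a position, drives shift it.
state = {"pos": 0}
ctrl = AutonomousController(
    actuate=lambda d: state.update(pos=state["pos"] + d),
    read_internal=lambda: state["pos"],
    read_external=lambda: 0,  # static environment in this toy example
)
random.seed(0)
ctrl.initialize(steps=20, drive_space=[-1, 0, 1])
reached = ctrl.control(target_internal=ctrl.instructions[0][1][0])
```

In the claimed method this loop iterates continuously, identifying and storing new action instructions during operation; a real implementation would also interpolate and extrapolate between stored support values (claim 3) rather than requiring exact matches as this sketch does.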
Description
Autonomous control of devices

Technical Field

The present invention relates to a method for autonomous control of a device, wherein the device typically moves in the real physical world and need not be further specified in terms of its physical configuration. The advantage achieved by this method is that the device learns autonomously and continuously improves the learned knowledge or behavior. The shortcomings of the prior art, in which training data must first be created, as is the case with conventional artificial intelligence methods, are thereby generally overcome. The method has broad applicability, and the device automatically learns and continuously revises its own knowledge base in the process. Furthermore, an apparatus for carrying out the method and a system arrangement comprising a plurality of the proposed apparatuses are proposed, as are a computer program product and a computer-readable storage medium for carrying out the method steps or for causing a computer to carry out the method.

Background

TAKAHASHI KUNIYUKI et al., "Dynamic motion learning for multi-DOF flexible-joint robots using active-passive motor babbling through deep learning", Advanced Robotics, vol. 31, no. 18, 17 September 2017, pages 1002-1015, XP 055840673, discloses a learning strategy by which a multi-degree-of-freedom flexible-joint robot performs dynamic motion tasks using deep learning. While robots with flexible joints have a number of potential advantages, such as exploiting their inherent dynamics and adapting passively to environmental changes through mechanical compliance, the control of such robots remains challenging because of their increased dynamic complexity. Various artificial intelligence (AI) methods are known in the art that involve selecting and providing training data and letting an algorithm recognize rules during a training phase.
Here, a specific algorithm can identify a rule through the specification of initial data and target values, and can then be put into use. In this way, implicit knowledge contained in large amounts of data can be exploited. After the training phase is completed, the appropriately trained algorithm is applied to real data, so that hidden dependencies are uncovered to solve practical problems. One disadvantage of this approach is that training data must first be selected and specified, which limits the resulting artificial intelligence to that training data; errors in this selection can have a detrimental effect on the learning phase, which is not only error-prone but also costly. Swarm intelligence is also known in the art, in which multiple devices cooperatively solve a problem in a distributed manner. In this case, the control and coordination of the individual participants is likewise complicated and sometimes prone to error. This is especially true if distributed learning requiring further coordination is to be performed. Artificial neural networks (ANNs; also abbreviated KNN, from the German "künstliches neuronales Netz") are also known from the prior art; they provide neurons and connections, i.e. nodes and edges, based on graph theory. Such networks are well known to mimic the function of the human brain and can also learn: edge weights may change, new edges may be added or old edges deleted, and existing nodes may be deactivated or new nodes added. This constitutes an overall system of dynamic learning. However, there are at least four problems with using an ANN in the manner described above:
1. For an object on which the ANN has not been trained, the ANN does not provide any information with which to call the routine assigned to that object.
2. The programmed properties or behavior of the detected and identified objects may be incorrect, may have changed in the meantime, or may remain unknown.
3. One or more scientists, experts, etc. select the training data and specify the results to be achieved. Even with unsupervised learning, the training data and hyperparameters are still specified by humans, which cannot exclude the possibility of errors.
4. An ANN is unable to correct errors. Each piece of individual information is distributed among all parameters of the ANN, like a hologram in which each data point contains information about the entire image. Therefore, the ANN must always be completely deleted and completely retrained.
In the approach proposed here, by contrast, no recognition failure occurs when the entity (the ego, or self) encounters a previously unknown object, unlike with an ANN, for example in video footage where a self-driving car suddenly sees two boxes in the lane. There is a need in the art to achieve autonomous control of a device so that time-consuming preparation work, such as providing training data and teaching, can be avoided or minimized. Thus, there is a need for a self-learning system that can perform collective learning, or that can estimate what other participants may or can do. Thus, there is a need for a system comprising